C/C++ and Fortran

Discover the traps when using stringstream.str()

By Archive User posted Fri July 26, 2013 04:50 AM

  

Originally posted by: FangLu


Note: This article was originally written in Chinese by Ke Wen Lin/China/IBM. I translated this article into English. 

Strings are used frequently in C++ programs. The C++ Standard Library provides the <string> and <sstream> headers, which simplify string handling through object encapsulation, safe and automatic type conversion, direct concatenation, and protection against buffer overruns. This article examines a trap that is easy to fall into when using stringstream.str(). Here is an example:

  Example 1:

    #include <string>
    #include <sstream>
    #include <iostream>

    using namespace std;

    int main()
    {
        stringstream ss("012345678901234567890123456789012345678901234567890123456789");
        stringstream t_ss("abcdefghijklmnopqrstuvwxyz");
        string str1(ss.str());

        const char* cstr1 = str1.c_str();
        const char* cstr2 = ss.str().c_str();
        const char* cstr3 = ss.str().c_str();
        const char* cstr4 = ss.str().c_str();
        const char* t_cstr = t_ss.str().c_str();

        cout << "------ The results ----------" << endl
             << "cstr1:\t" << cstr1 << endl
             << "cstr2:\t" << cstr2 << endl
             << "cstr3:\t" << cstr3 << endl
             << "cstr4:\t" << cstr4 << endl
             << "t_cstr:\t" << t_cstr << endl
             << "-----------------------------"  << endl;

        return 0;
    }

The output is:

        ------ The results ----------
        cstr1:  012345678901234567890123456789012345678901234567890123456789
        cstr2:  012345678901234567890123456789012345678901234567890123456789
        cstr3:  abcdefghijklmnopqrstuvwxyz
        cstr4:  abcdefghijklmnopqrstuvwxyz
        t_cstr: abcdefghijklmnopqrstuvwxyz
        -----------------------------

From the output, we can see, surprisingly, that cstr3 and cstr4 do not hold the contents of ss but those of t_ss. Let's add several statements to example 1 to find out why this happens.

Example 2:

     1  #include <string>
     2  #include <sstream>
     3  #include <iostream>
     4  #include <cstdio>
     5  using namespace std;
     6 
     7  #define PRINT_CSTR(no) printf("cstr" #no " addr:\t%p\n", static_cast<const void*>(cstr##no))
     8  #define PRINT_T_CSTR(no) printf("t_cstr" #no " addr:\t%p\n", static_cast<const void*>(t_cstr##no))
     9 
    10  int main()
    11  {
    12      stringstream ss("012345678901234567890123456789012345678901234567890123456789");
    13      stringstream t_ss("abcdefghijklmnopqrstuvwxyz");
    14      string str1(ss.str());
    15 
    16      const char* cstr1 = str1.c_str();
    17      const char* cstr2 = ss.str().c_str();
    18      const char* cstr3 = ss.str().c_str();
    19      const char* cstr4 = ss.str().c_str();
    20      const char* t_cstr = t_ss.str().c_str();
    21 
    22      cout << "------ The results ----------" << endl
    23           << "cstr1:\t" << cstr1 << endl
    24           << "cstr2:\t" << cstr2 << endl
    25           << "cstr3:\t" << cstr3 << endl
    26           << "cstr4:\t" << cstr4 << endl
    27           << "t_cstr:\t" << t_cstr << endl
    28           << "-----------------------------"  << endl;
    29      printf("\n------ Char pointers ----------\n");
    30      PRINT_CSTR(1);
    31      PRINT_CSTR(2);
    32      PRINT_CSTR(3);
    33      PRINT_CSTR(4);
    34      PRINT_T_CSTR();
    35 
    36      return 0;
    37  }

In example 2, the addresses of the strings are printed out. The output is:

        ------ The results ----------
        cstr1:  012345678901234567890123456789012345678901234567890123456789
        cstr2:  012345678901234567890123456789012345678901234567890123456789
        cstr3:  abcdefghijklmnopqrstuvwxyz
        cstr4:  abcdefghijklmnopqrstuvwxyz
        t_cstr: abcdefghijklmnopqrstuvwxyz
        -----------------------------

        ------ Char pointers ----------
        cstr1 addr:     0x100200e4
        cstr2 addr:     0x10020134
        cstr3 addr:     0x10020014
        cstr4 addr:     0x10020014
        t_cstr addr:    0x10020014

  
      
From the output, we can see that cstr3, cstr4, and t_cstr all hold the same address, which explains why they print the same value. We might have assumed that the three calls to ss.str() in lines 17 to 19 of example 2 would create three string objects, each at a different address, and that those objects would stay alive.

However, the output shows otherwise. When str() is called on a stringstream, it returns a temporary string object, which is destroyed at the end of the full expression that created it, that is, at the semicolon that ends the statement. c_str() is called on that temporary and returns a pointer to the temporary's internal character buffer. Once the statement finishes, the buffer is released and the pointer dangles: the memory can be reused and overwritten by a later allocation, which is apparently what happened here when the same block was handed to the following temporaries. Reading through such a pointer is undefined behavior. In some cases (for example, if line 20 is deleted from example 2), the memory might not be overwritten and the old string can still be read, but the result is not guaranteed.
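
The point at which the temporary is destroyed can be made visible with a small logging type. The following is only a sketch: Tracer, make_tracer, and trace_temporary are hypothetical names that do not appear in the examples, and Tracer merely stands in for the string that str() returns by value.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for std::string that logs its own lifetime.
struct Tracer {
    std::vector<std::string>* log;
    explicit Tracer(std::vector<std::string>* l) : log(l) { log->push_back("construct"); }
    Tracer(const Tracer& other) : log(other.log) { log->push_back("copy"); }
    ~Tracer() { log->push_back("destroy"); }
    const char* c_str() const { return "payload"; }
};

// Returns by value, as stringstream::str() does.
Tracer make_tracer(std::vector<std::string>* log) { return Tracer(log); }

// Mirrors `const char* p = ss.str().c_str();` and records what happens next.
std::vector<std::string> trace_temporary() {
    std::vector<std::string> log;
    const char* p = make_tracer(&log).c_str();  // the temporary is destroyed at this ';'
    log.push_back("next statement");            // recorded after the destructor has run
    (void)p;  // p points at a string literal here, so it stays valid;
              // with a real std::string it would already dangle
    return log;
}
```

In the returned log, the "destroy" entry comes immediately before "next statement", so the object is already gone by the time the pointer could be used.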

Let's modify example 2 as below:

Example 3:

    #include <string>
    #include <sstream>
    #include <iostream>
    #include <cstdio>

    using namespace std;

    // %p expects a void pointer, so the char pointers are cast explicitly.
    #define PRINT_CSTR(no) printf("cstr" #no " addr:\t%p\n", static_cast<const void*>(cstr##no))
    #define PRINT_T_CSTR(no) printf("t_cstr" #no " addr:\t%p\n", static_cast<const void*>(t_cstr##no))

    int main()
    {
        stringstream ss("012345678901234567890123456789012345678901234567890123456789");
        stringstream t_ss("abcdefghijklmnopqrstuvwxyz");
        string str1(ss.str());

        const char* cstr1 = str1.c_str();
        // Binding each temporary to a const reference keeps it alive until the end of main().
        const string& str2 = ss.str();
        const char* cstr2 = str2.c_str();
        const string& str3 = ss.str();
        const char* cstr3 = str3.c_str();
        const string& str4 = ss.str();
        const char* cstr4 = str4.c_str();
        // Still a dangling pointer: nothing keeps this temporary alive.
        const char* t_cstr = t_ss.str().c_str();

        cout << "------ The results ----------" << endl
             << "cstr1:\t" << cstr1 << endl
             << "cstr2:\t" << cstr2 << endl
             << "cstr3:\t" << cstr3 << endl
             << "cstr4:\t" << cstr4 << endl
             << "t_cstr:\t" << t_cstr << endl
             << "-----------------------------"  << endl;
        printf("\n------ Char pointers ----------\n");
        PRINT_CSTR(1);
        PRINT_CSTR(2);
        PRINT_CSTR(3);
        PRINT_CSTR(4);
        PRINT_T_CSTR();

        return 0;
    }


      
The output is:

        ------ The results ----------
        cstr1:  012345678901234567890123456789012345678901234567890123456789
        cstr2:  012345678901234567890123456789012345678901234567890123456789
        cstr3:  012345678901234567890123456789012345678901234567890123456789
        cstr4:  012345678901234567890123456789012345678901234567890123456789
        t_cstr: abcdefghijklmnopqrstuvwxyz
        -----------------------------

        ------ Char pointers ----------
        cstr1 addr:     0x100200e4
        cstr2 addr:     0x10020134
        cstr3 addr:     0x10020184
        cstr4 addr:     0x100201d4
        t_cstr addr:    0x10020014
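
Example 3 behaves correctly because binding a temporary to a const reference extends the temporary's lifetime to that of the reference. The sketch below isolates that rule in a function whose result does not depend on luck; reads_back_correctly and the shorter stream contents are assumptions made for illustration only.

```cpp
#include <cstring>
#include <sstream>
#include <string>

// Returns true if the C string obtained through a const reference is still
// intact after another string has been allocated.
bool reads_back_correctly() {
    std::stringstream ss("0123456789");
    std::stringstream other("abcdefghij");

    const std::string& kept = ss.str();  // lifetime extended to the end of this scope
    const char* p = kept.c_str();        // valid for as long as kept is alive

    std::string noise = other.str();     // a later allocation cannot take over kept's buffer

    return std::strcmp(p, "0123456789") == 0 && noise == "abcdefghij";
}
```

A plain named copy (std::string kept = ss.str();) gives the same guarantee and is arguably easier to read.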

 
     
From the examples, we know that stringstream.str() returns a temporary string object, which is destroyed at the end of the statement that calls it. When we want to work with this string object further (for example, to obtain the corresponding C string), we must keep the string itself alive, either by copying it into a named string as with str1 or by binding it to a const reference as in example 3. Otherwise we risk unexpected results.

Because the memory of the destroyed temporary is not always overwritten, this trap might not show itself right away, but such code still does not guarantee correct results. To avoid wrong results, make sure that the string returned by str() outlives every pointer obtained from it.
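
As a rough guide, the two usage patterns below stay within the rules. This is a sketch under assumptions: length_in_one_expression and copy_contents are hypothetical helpers, not part of the original examples.

```cpp
#include <cstddef>
#include <cstring>
#include <sstream>
#include <string>

// Safe: the temporary string lives until the end of the full expression,
// so strlen() has finished before the temporary is destroyed.
std::size_t length_in_one_expression(const std::stringstream& ss) {
    return std::strlen(ss.str().c_str());
}

// Safe: the caller receives its own string and decides how long it lives.
// Call c_str() on that named string, not on the call expression itself.
std::string copy_contents(const std::stringstream& ss) {
    return ss.str();
}
```

A caller would write std::string s = copy_contents(ss); and then use s.c_str() for as long as s is in scope.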
     
System and compiler environment:

 Red Hat Enterprise Linux Server release 5.8 (Tikanga)
 Linux 2.6.18-308.el5 ppc64 GNU/Linux
 gcc version 4.1.2 20080704 (Red Hat 4.1.2-52)

Reference:
   http://stackoverflow.com/questions/1374468/c-stringstream-string-and-char-conversion-confusion

 
