C++
BEGINNING C++
One of the most frequently asked questions I receive is, "I've done some VB/Perl/PHP programming before. How should I begin learning C++?" First, remember that C is not a prerequisite for learning C++. In fact, as a beginner, you may find that learning C++ directly is easier. Programming courses are costly. You can find less expensive alternatives. First, get a good C++ primer book. Since this is a matter of personal taste, I cannot recommend a specific title. Some folks prefer Q&A format, while others are more comfortable with a structured organization of topics, so you should find the book that suits your preferences. Second, you need a C++ compiler to experiment with (there are plenty of free C++ compilers on the Web).
A good primer book and a compiler can provide you with all the theoretical and practical training you need to learn C++ from scratch when time is the only resource you have to invest.
THE BOOST WEB SITE
The Boost Web site offers assorted free C++ libraries and classes, including a random number generator, four smart pointer classes, file system tools, timer classes, numeric data types, and more. You can also download classes that are still being developed and tested. The Boost site is at http://www.boost.org

GUARANTEES ABOUT THE SIZES OF INTEGRAL TYPES

The integral types char, short, int, and long have implementation-defined sizes. However, there are a few guarantees in Standard C and C++ about their sizes: char must occupy at least 8 bits, and it may not be larger than short; short must occupy at least 16 bits, and it may not be larger than int; and long must be at least as large as int, and it must occupy no less than 32 bits.
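
The guarantees listed above can be checked at compile time. The following is a minimal sketch, assuming a compiler that supports static_assert (C++11 or later); the helper name sizes_are_ordered is made up for this illustration:

```cpp
#include <climits>  // CHAR_BIT: number of bits in a char

// Compile-time checks of the minimum widths (static_assert needs C++11).
static_assert(CHAR_BIT >= 8, "char must have at least 8 bits");
static_assert(sizeof(short) * CHAR_BIT >= 16, "short must have at least 16 bits");
static_assert(sizeof(long) * CHAR_BIT >= 32, "long must have at least 32 bits");

// Hypothetical helper: verifies the ordering char <= short <= int <= long.
bool sizes_are_ordered()
{
  return sizeof(char) <= sizeof(short)
      && sizeof(short) <= sizeof(int)
      && sizeof(int) <= sizeof(long);
}
```

If one of the static_assert conditions did not hold on some platform, the translation unit would simply fail to compile.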

TIME REPRESENTATION

C and C++ represent calendar time with the type time_t (declared in the header <ctime>, or <time.h> in C). On POSIX systems and most other platforms, a time_t value contains the number of seconds elapsed since January 1, 1970; the Standard itself requires only that time_t be an arithmetic type, so its size and signedness are platform dependent. As large as this unit may seem, a signed 32-bit time_t will roll over on January 19, 2038. As a result, applications that use time measurements beyond that date (such as mortgage calculations, pension plans, and health insurance) may produce incorrect results. Let's hope that before this problem becomes critical, platforms will have switched to a 64-bit time_t, which is sufficiently large to represent billions of years. To examine the size of time_t on your machine, simply use the expression sizeof(time_t).

A VIRTUAL MEMBER FUNCTION CAN BECOME PURE VIRTUAL IN A DERIVED CLASS

In general, a derived class that inherits a pure virtual member function from its abstract base class implements this function. However, the opposite is also true: A virtual member function that is implemented in a base class can become pure virtual in a derived class. For example:
 
class Base 
{
  virtual void func() { /*do something */ }
};
class Derived : public Base
{
  virtual void func() = 0; /* redefined as pure virtual */
};
You wouldn't normally do that. Sometimes, however, this is the only way to overcome flaws in a base class that you are not allowed to fix.

GET A FREE C++ COMPILER FOR WINDOWS

Dev-C++ is a free graphical C++ compiler for Windows 95, 98, and NT. The package also includes a debugger and source file editor that you can use as a complete development suite. You can download it from http://www.bloodshed.nu/devc.html

INTEGRAL TYPES WITH PORTABLE SIZES

The actual size of built-in integral types such as char, short, int, and long is machine dependent. When you need portable sizes, you can use the following standardized typedefs instead:

int8_t  //signed 8 bits 
int16_t 
int32_t 
These integral types are defined in the standard header <cstdint> (the corresponding C header is <stdint.h>). They were introduced in C99 and became part of C++ in C++11. Most platforms also define int64_t. Unsigned versions of these typedefs also exist:
 
uint8_t  //unsigned 8 bits 
uint16_t 
uint32_t
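
As a rough illustration of fixed-width typedefs in use, here is a small sketch that assumes a C++11 compiler and the header <cstdint>, in which the typedefs are spelled int32_t, uint8_t, and so on. The function wrap_add is a made-up name for this example:

```cpp
#include <climits>  // CHAR_BIT
#include <cstdint>  // std::int32_t, std::uint8_t (C++11)

// The exact-width types have the same width on every platform that provides them.
static_assert(sizeof(std::int32_t) * CHAR_BIT == 32, "int32_t is exactly 32 bits");
static_assert(sizeof(std::uint8_t) * CHAR_BIT == 8, "uint8_t is exactly 8 bits");

// Hypothetical helper: adds two 8-bit values; the result wraps modulo 256.
std::uint8_t wrap_add(std::uint8_t a, std::uint8_t b)
{
  return static_cast<std::uint8_t>(a + b);
}
```

For instance, wrap_add(250, 10) yields 4 regardless of the size of int on the target machine.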

MINIMIZE THE USE OF DYNAMIC MEMORY ALLOCATION

Pointer-based operations are less frequently needed than they may seem. For example, examine the following class:

class PointerMisuse 
{ 
private: 
  CDC * m_pDeviceContext; 
public: 
  PointerMisuse(); 
  ~PointerMisuse(); 
}; 
 
PointerMisuse::PointerMisuse() 
{ 
  m_pDeviceContext = new CDC; 
} 
 
PointerMisuse::~PointerMisuse() 
{ 
 delete m_pDeviceContext; 
} 
Even experienced programmers are often inclined to use this programming style, allocating objects on the free store. This is not surprising--after all, classes like this are widely used in many commercial frameworks (MFC, ATL, and many others). But is the use of pointers and dynamic memory allocation really necessary here? It isn't. This class can be rewritten as follows:

class ProperUse 
{ 
private: 
  CDC * m_pDeviceContext; 
  CDC cdc; //automatic storage 
public: 
  ProperUse(); 
}; 
 
ProperUse::ProperUse() 
{ 
  m_pDeviceContext = & cdc; 
} 
Not only is this version safer (remember that dynamic memory allocations might fail), it is also simpler (for example, no destructor is required) and more efficient because it avoids the unnecessary overhead of memory allocation and deallocation at run-time.

THE MEMORY LAYOUT OF IDENTICAL STRING LITERALS

Whether identical string literals are treated as distinct objects depends on the implementation. Consider the following code snippet:

const char * const msg1 = "hello"; 
const char * const msg2 = "hello"; 
int main() 
{ 
  bool eq = (msg1 == msg2); //true or false? 
  return 0; 
} 
Some implementations might store the two identical literals at the same memory address (on such implementations, the value of the variable eq is true). Other implementations store them at two distinct memory addresses, and, in this case, eq is false. Note that msg1 and msg2 are pointers to the literals; had they been declared as arrays, each array would be a distinct object with its own copy of the text.

UNDERSTANDING MEMORY ALIGNMENT

Most CPUs require that objects and variables reside at particular offsets in the system's memory. For example, 32-bit processors require that a 4-byte integer reside at a memory address that is evenly divisible by four. This requirement is called "memory alignment." On such architectures, a 4-byte int can be located at memory address 0x2000 or 0x2004, but not at 0x2001. On most UNIX systems, an attempt to use misaligned data results in a bus error, which terminates the program altogether. On Intel processors, the use of misaligned data is supported, but at a substantial performance penalty. Therefore, most compilers automatically align data variables according to their type and the particular processor being used. This is why the size that structs and classes occupy may be larger than the sum of their members' size:
 
struct Employee 
{ 
  int ID; 
  char state[3]; /* CA, NY + null */ 
  int salary; 
};
Apparently, Employee should occupy 11 bytes (4+3+4). However, most compilers add an unused padding byte after the field "state" so that it aligns on a 4-byte boundary. Consequently, Employee occupies 12 bytes rather than 11. You can examine the actual size by using the expression sizeof(Employee).
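
To see what your own compiler does, you can measure the padding directly. The following sketch repeats the Employee struct from above; the helper names employee_padding and salary_offset are invented for this example, and the actual numbers depend on the platform:

```cpp
#include <cstddef>  // offsetof, std::size_t

struct Employee
{
  int ID;
  char state[3];
  int salary;
};

// Hypothetical helper: bytes of padding the compiler inserted into Employee.
std::size_t employee_padding()
{
  return sizeof(Employee) - (sizeof(int) + sizeof(char[3]) + sizeof(int));
}

// Hypothetical helper: position of 'salary' from the start of the struct.
std::size_t salary_offset()
{
  return offsetof(Employee, salary);
}
```

On a typical platform with a 4-byte int, you would expect one byte of padding and an offset of 8 for salary, but neither value is guaranteed.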

CONSTRUCTING OBJECTS ON PREALLOCATED CHAR ARRAYS

The memory buffer returned by operator new is guaranteed to have the suitable alignment properties for any type of object. This property enables you to construct objects on a preallocated char array. For example:
 
#include <new> 
#include <iostream> 
using namespace std; 
 
class Employee{ /*...*/}; 
 
void f() 
{ 
  char * pc = new char[sizeof(Employee)]; /*pre-allocate*/ 
  Employee *pemp = new (pc) Employee;  /* constructed on char array */ 
  //...use pemp 
  pemp -> ~Employee(); /* explicit destruction */ 
  delete [] pc; /* release buffer */ 
}

DELETING MULTIDIMENSIONAL ARRAYS

You can allocate a multidimensional array dynamically, as in the following example:
 
int*pp[10]; /* array of ten pointers */ 
for (int j=0; j < 10; j++) /* allocate sub-arrays */ 
{ 
  pp[j] = new int[5]; /* every element in pp points to an array of 5 
int's */ 
} 
pp[0][0] = 1; /* use pp as a multi-dimensional array */
Remember that you have to use a loop to delete a multidimensional array:
 
for (int k=0; k < 10; k++) 
{ 
  delete [] pp[k]; /* delete [] every element in  pp */ 
}
Never use the following forms to delete a multidimensional array:
 
delete pp; /* undefined behavior */ 
delete [] pp; /* undefined behavior */ 
None of them ensures that the elements of the multidimensional array are properly destroyed.

USE STATIC ALLOCATION FOR BUFFERS WITH A FIXED SIZE

Imagine that you have to write a stock quote application that accepts stock symbols and retrieves their current values. Using a std::string object to represent a stock symbol is inefficient because std::string allocates an initial buffer that has 16 to 64 characters, depending on the platform. Since a stock symbol never reaches this size, using std::string is overkill and causes a substantial waste of memory. At this point during the design process, designers often suggest writing a custom string class that handles small strings. This is a plausible suggestion, but a common design mistake is to let the custom string class allocate memory from the free store. Free store allocation is comparatively slow. In addition, remember that even if you allocate a single byte using operator new, the cost is much more than a single byte, because the system stores internal bookkeeping information alongside the allocated buffer, which therefore has to be much larger. A better design approach is to embed in the object itself a fixed-size buffer with the size of the largest possible string. For example:

class QuoteString 
{ 
private: 
  char stock[6]; //fixed size 
public: 
  //... 
};
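
One way to flesh out such a class is sketched below. The member functions assign, c_str, and length are assumptions made for this illustration, as is the decision to silently truncate symbols longer than five characters:

```cpp
#include <cstring>  // std::strncpy, std::strlen, std::size_t

class QuoteString
{
private:
  char stock[6]; // up to 5 characters plus the terminating null
public:
  QuoteString() { stock[0] = '\0'; }
  explicit QuoteString(const char * symbol) { assign(symbol); }

  // Copies at most 5 characters; longer input is truncated (an assumption).
  void assign(const char * symbol)
  {
    std::strncpy(stock, symbol, sizeof(stock) - 1);
    stock[sizeof(stock) - 1] = '\0'; // strncpy does not terminate on overflow
  }

  const char * c_str() const { return stock; }
  std::size_t length() const { return std::strlen(stock); }
};
```

Because the buffer is a plain data member, creating, copying, and destroying a QuoteString never touches the free store.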

OVERLOADING THE FUNCTION CALL OPERATOR

Overloading the function call operator can be confusing because the overloaded operator has two pairs of parentheses; it may not be immediately obvious which of them declares the parameters. Another point to note is that the overloaded function call operator may take any number of arguments, unlike other overloaded operators. The following example shows how you can overload the () operator and use it:

class A 
{ 
private: 
  int n; 
public: 
 //... 
/* parameters appear in second pair of parentheses */ 
  void operator ()(bool debug) const; 
}; 
 
void A::operator ()(bool debug) const //definition 
{ 
  if (debug) 
    display(n); 
  else 
    display ("no debug"); 
} 
int main() 
{ 
  A a; 
  a(false); /* use overloaded operator */ 
  a(true); 
}

THE ROLE OF IMPLICITLY DECLARED CONSTRUCTORS

If a class has no user-declared constructor, and it doesn't have const or reference data members either, C++ implicitly declares a default constructor for it. An implicitly declared default constructor is an inline public member of its class. It performs the initialization operations that are needed by the implementation to create an object instance. Note, however, that these operations do NOT involve initialization of user-declared data members or allocation of memory from the free store. For example:

class C 
{ 
private: 
  int n; 
  char *p; 
public: 
  virtual ~C() {} 
}; 
 
void f() 
{ 
  C obj; //OK 
}
The programmer did not declare a constructor in class C. Therefore, the implementation created an implicit default constructor for C. The synthesized constructor does not initialize the data members n and p. These data members have an indeterminate value after obj has been constructed. However, the synthesized constructor initialized a pointer to the virtual table of class C, to ensure that the virtual destructor can be called.

THE C_STR() AND DATA() MEMBER FUNCTIONS OF STD::STRING

Class std::string provides two member functions that return the const char * representation of its string, namely string::c_str() and string::data(). c_str() returns a null-terminated const pointer to char that represents the object's string. For example:

void f() 
{ 
  bool identical; 
  string  s = "Hello"; 
  if(strcmp( s.c_str(), "Hello") == 0) 
    identical = true; 
 else 
   identical = false; 
}
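The member function string::data() also returns a const char * representation of its string. Before C++11, however, the array it returns might not be null-terminated, so data() should not be used in a context that requires a null-terminated character array unless you know that your compiler conforms to C++11 or later, in which case data() and c_str() return the same null-terminated array.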

ALTERNATIVE REPRESENTATIONS OF OPERATORS

C++ defines textual representations for the logical and bitwise operators. Platforms and hardware equipment that do not support all the special characters that are used in these operators can use the alternative representations. They are also useful when you want to transfer source files in a portable form to other countries and locales, in which special characters such as ^ and ~ are not part of the national character set. The following list contains the alternative keywords and their equivalent operators:

  • and &&
  • and_eq &=
  • bitand &
  • bitor |
  • compl ~
  • not !
  • not_eq !=
  • or ||
  • or_eq |=
  • xor ^
  • xor_eq ^=
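
The short sketch below uses a few of these keywords in place of the usual symbols. The function names are invented for the example; note also that a few compilers accept the keywords only in a strict conformance mode or after including the header <ciso646>:

```cpp
// Hypothetical helper: true if x lies in the closed range [lo, hi].
bool in_range(int x, int lo, int hi)
{
  return x >= lo and x <= hi;       // same as: x >= lo && x <= hi
}

// Hypothetical helper: clears in 'value' every bit that is set in 'mask'.
unsigned clear_bits(unsigned value, unsigned mask)
{
  return value bitand compl mask;   // same as: value & ~mask
}

// Hypothetical helper: true if exactly one of the two flags is set.
bool exactly_one(bool a, bool b)
{
  return (a or b) and not (a and b);
}
```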

COPY CTOR AND ASSIGNMENT OPERATOR

Although the copy constructor and assignment operator perform similar operations, they are used in different contexts. The copy constructor is invoked when you initialize an object with another object:

string first = "hi"; 
string second(first); //copy ctor 
On the other hand, the assignment operator is invoked when an already constructed object is assigned a new value:

string second; 
second = first; //assignment op 
Don't let the syntax mislead you. In the following example, the copy ctor--rather than the assignment operator--is invoked because d1 is being initialized:

Date Y2Ktest ("01/01/2000"); 
Date d1 = Y2Ktest; /* although = is used, the copy ctor is invoked */

EVALUATION ORDER OF A MEMBER-INITIALIZATION LIST

Whenever you use a member-initialization list, the compiler transforms the list so that its members are laid down in the order of declaration of the class data members. For example, the following class declares two data members: a and b. However, the constructor's member-initialization list first initializes the member b and then a:

 
class A 
{ 
private: 
  int a; 
  int b; 
public: 
  A(int aa, int bb) : b(bb), a(aa) {} /* reverse order */ 
}; 
Since a is declared before b, the member-initialization list is automatically transformed by the compiler into
 
A(int aa, int bb) : a(aa), b(bb) {}/* transformed by the compiler to 
fit class  declaration order */ 
While the transformation of initializers is harmless in this case, imagine what happens with a slightly different member-initialization list:

A(int bb) : b(bb), a(b) {} /* original code */ 
The compiler implicitly rearranges the list into
 
A(int bb) : a(b), b(bb) {} /* oops: 'a' has an undefined value */ 
Although some compilers catch this potential bug, many compilers don't. Therefore, it's best to adhere to the class order of declaration in a member-initialization list.
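
You can observe the actual order of initialization by giving the members a type whose constructor leaves a trace. The following is only a sketch: Tracer, Pair, init_log, and construction_order are names made up for this illustration, and the mismatched list is intentional (expect a warning from compilers that check for it):

```cpp
#include <string>

std::string init_log; // records the order in which members are constructed

struct Tracer
{
  explicit Tracer(char tag) { init_log += tag; }
};

class Pair
{
private:
  Tracer first;   // declared first
  Tracer second;  // declared second
public:
  // The list names 'second' before 'first', but members are still
  // constructed in declaration order: 'first', then 'second'.
  Pair() : second('2'), first('1') {}
};

// Hypothetical helper: constructs a Pair and reports the recorded order.
std::string construction_order()
{
  init_log.clear();
  Pair p;
  (void) p; // only the side effect of construction is of interest
  return init_log;
}
```

Here construction_order() returns "12", not "21", which confirms that the order written in the list has no effect.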

GUIDELINES FOR WRITING PORTABLE CODE

Contrary to what most people believe, portability doesn't guarantee that the same code can be compiled and run on every platform without any modifications. Although it's sometimes possible to write 100 percent portable code, in nontrivial applications such code is usually too complex and inefficient. A better approach is to separate platform-specific modules from the platform-independent parts of the entire application. GUI components, for instance, tend to be very platform specific, and therefore should be encapsulated in dedicated classes. On the other hand, business logic and numeric computations are more platform independent. Therefore, when the application is to be ported, only the GUI modules need to be rewritten, while the rest of the modules can be reused.

USING A TEMPLATE AS A TEMPLATE'S ARGUMENT

You can use a template as a template's argument. In the following example, a mail server class can store incoming messages in a vector of messages, where each message is represented as a vector of bytes:

 
vector < vector < unsigned char > > vmessages; 
Note that, before C++11, the space between the two closing angle brackets is mandatory. Otherwise, a sequence of two consecutive > signs is parsed as the right shift operator. A typedef can be used to improve readability both for the compiler and the human reader:
 
typedef  vector < unsigned char > msg; 
vector < msg > vmessages;

BEFORE PROFILING YOUR APPS...

If you intend to optimize your software's performance, be sure to profile the release version rather than the debug version. The debug version of the executable contains additional code (about 40 percent extra compared to the equivalent release executable) for symbol lookup and other debug "scaffolding." In addition, most compilers have distinct implementations of the run-time library--one for the debug version and one for the release version. For example, the debug version of operator new initializes the memory it allocates with a unique value so that memory overruns can be detected automatically. Conversely, the release version of new doesn't initialize the allocated block. Furthermore, an optimizing compiler may have already optimized the release version of an executable in several ways, such as eliminating unneeded variables, loop unrolling, storing variables in the CPU registers, and function inlining. These optimizations are not applied to the debug version. Therefore, you can't deduce from a debug version where the actual bottlenecks are located. To conclude, debugging and optimization are two distinct operations. Use the debug version to chase bugs and logical errors; profile the release version to optimize it.

UNDERSTANDING REFERENCE COUNTING

A reference counting class counts how many object instances have an identical state. When two or more instances share the same state, the implementation creates only a single copy and counts the number of existing references to this copy. For example, an array of strings can be represented as a single string object that holds the number of elements in the array. Since initially the array elements share the same state (they all are empty strings), only a single object is needed to represent the array, regardless of its size. When one of the array elements changes its state (the program writes a different value to it), the reference counting object creates one more object--this is called "copy on write." Under some conditions, reference counting can boost performance considerably both in terms of run-time speed and memory usage. For this reason, many implementations of std::string use copy on write.
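
To make the idea concrete, here is a minimal, single-threaded sketch of copy on write. It is not how any particular std::string is implemented; the class CowString and its members are invented for this illustration, and it leans on std::shared_ptr (C++11) to do the counting:

```cpp
#include <memory>  // std::shared_ptr, std::make_shared (C++11)
#include <string>

class CowString
{
private:
  std::shared_ptr<std::string> data; // shared state plus its reference count
public:
  explicit CowString(const std::string & s = "")
    : data(std::make_shared<std::string>(s)) {}

  // The compiler-generated copy operations share 'data' and bump the count.

  const std::string & read() const { return *data; }

  void write(const std::string & s)
  {
    if (data.use_count() > 1)                   // state is shared:
      data = std::make_shared<std::string>(s);  // detach into a fresh copy
    else
      *data = s;                                // sole owner: modify in place
  }

  long owners() const { return data.use_count(); }
};
```

After CowString b = a; both objects share one buffer and owners() reports 2. A subsequent b.write("y") gives b a buffer of its own and leaves a untouched.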

TEMPLATES AND INHERITANCE

A common mistake is to assume that a vector < Derived > is like a vector < Base > when Derived is a subclass of Base. For example:

#include <vector> 
using namespace std; 
class Base {/**/}; 
class Derived : public Base {/**/}; 
void func(vector < Base > & vb); 
int main() 
{ 
  vector < Derived  > vd; 
  func(vd); /* compilation error; vd is not a vector < Base > */ 
}
There is no relationship between classes generated from the same class template, however. Therefore, the is-a relationship does not exist between vector < Base > and vector < Derived >.

WHEN TO USE A VOID-CAST

Some rigorous compilers issue a warning message if you ignore a function's return value. For example:

 
int func(); 
int main() 
{ 
  func(); /* return value ignored */ 
}
Indeed, ignoring a function's return value may indicate a bug. However, sometimes it's a deliberate and well-behaved practice. To suppress the compiler warnings in such cases, you can explicitly cast the return value to void:
 
int main() 
{ 
  (void) func(); /* suppress compiler warnings about ignored return 
value */ 
}

BEGIN() AND END() MEMBER FUNCTIONS OF STL CONTAINERS

All STL containers provide the begin() and end() pair of member functions. begin() returns an iterator that points to the first element of the container. For example:

 
vector < int > v(1); 
v[0] = 10;   /* assign the first element */ 
vector < int > ::iterator p  = v.begin(); /* p points to v's first 
element */ 
However, end() returns an iterator pointing one position past the last valid element of the container. This sounds surprising at first, but there's nothing really unusual about it if you consider how strings in C are represented: an additional null character is automatically appended one position past the final element of the char array. The position returned by end() has a similar role--it marks the end of the container. Unlike the null character, however, it does not refer to an actual element, so you should never dereference or modify it. To get the last valid element of a nonempty vector, you should decrement the result of end() by 1, as in the following example:
 
vector < int > ::iterator p  = v.end()-1; /*p points to v's last 
element */
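
The half-open range formed by begin() and end() is what makes the usual iteration loop work, including for an empty container. A small sketch follows (the function name sum is chosen only for this example):

```cpp
#include <vector>

// Adds up all elements by walking the range [begin(), end()).
int sum(const std::vector<int> & v)
{
  int total = 0;
  // The loop stops as soon as p reaches end(), so end() itself is
  // never dereferenced; for an empty vector the body never runs.
  for (std::vector<int>::const_iterator p = v.begin(); p != v.end(); ++p)
    total += *p;
  return total;
}
```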

CONTAINER'S REALLOCATION AND ITS CONSEQUENCES

When a container such as vector has exhausted its free storage and additional elements have to be inserted to it, the container reallocates itself. The reallocation process consists of four steps. First, a new memory buffer large enough to store the container is allocated. Second, the container copy-constructs existing elements on the new memory location. Third, the destructors of the original elements are successively invoked. Finally, the original memory buffer is released. Clearly, frequent reallocations can impose a significant performance overhead. However, some techniques can minimize--and sometimes avoid--reallocation. For example, when you create a vector object, you can specify its initial size in advance:

 
vector < int > vi(1000); /* create 1000 elements, each initialized to 0 */ 
You can also request additional capacity after construction, without changing the number of elements. For example:
 
vi.reserve(2000); /* make room for at least 2000 elements */
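
The effect of reserve() can be measured by watching capacity() while elements are appended. In the sketch below, count_reallocations is a hypothetical helper written for this illustration; how often an unreserved vector reallocates depends on the library's growth policy:

```cpp
#include <vector>

// Hypothetical helper: appends n ints to v and counts how many times
// the capacity changed, i.e., how many reallocations took place.
int count_reallocations(std::vector<int> & v, int n)
{
  int reallocations = 0;
  for (int i = 0; i < n; ++i)
  {
    std::vector<int>::size_type before = v.capacity();
    v.push_back(i);
    if (v.capacity() != before)
      ++reallocations;
  }
  return reallocations;
}
```

Called on a vector for which reserve(1000) was invoked first, the function reports no reallocations while appending 1000 elements; called on a default-constructed vector, it reports several.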

CONTAINER'S REALLOCATION AND ITERATOR INVALIDATION

When a container reallocates its elements, their addresses change correspondingly. Consequently, the values of existing pointers and iterators are invalidated:

#include <iostream> 
#include <vector> 
using namespace std; 
 
int main() 
{ 
  vector < double > payroll; 
  payroll.push_back(5000.00); 
  vector < double > ::iterator p = payroll.begin(); 
  for (int i = 0 ; i < 10; i++) 
  { 
    payroll.push_back(4500.00); /* insert 10 more elements to payroll; reallocation may occur */ 
  } 
  /* DANGEROUS, p may have been invalidated */ 
  cout << "first element in payroll: " << *p ; 
}
In this example, it may well be the case that payroll reallocated itself during the insertion of 10 additional elements, thereby invalidating the iterator p. (A vector is used here because node-based containers such as list never reallocate their elements, so their iterators remain valid after an insertion.) Using an invalid iterator is undefined--it's exactly as if you were using a pointer with the address of a deleted object. To be on the safe side, it is advisable to reassign the iterator's value:
 
for (int i = 0 ; i < 10; i++) 
{ 
  payroll.push_back(4500.00); 
} 
p = payroll.begin(); /* re-assign p */ 
cout << "first element in payroll: " << *p; /* now safe */

DISPLAYING THE LITERAL REPRESENTATION OF BOOL VARIABLES

By default, iostream objects display bool variables as 0 and 1. You can override the default setting by inserting the formatting flag boolalpha to the object stream. Subsequently, 'false' and 'true' will be displayed instead of 0 and 1:

 
#include <iostream> 
using namespace std; 
int main() 
{ 
  bool b = true; 
  cout << b;  /* output  '1' */ 
  cout << boolalpha;  /* henceforth, use 'true' and 'false' */ 
  cout << b;  /* output 'true' */ 
  cout << ! b;  /* output  'false' */ 
}

THE COPY ALGORITHM

The Standard Library provides a generic function, copy(), which you can use for copying a sequence of objects to a specified target. The first and the second arguments of copy() are input iterators that mark the beginning and the end of the sequence being copied (or a fragment of that sequence). The third argument is an output iterator that points to the position in the target at which copying begins; the target must already contain enough elements to hold the copied sequence. The following example shows how to copy the elements of a list into a vector using the copy() algorithm:

 
#include <algorithm> /* definition of copy */ 
#include <list> 
#include <vector> 
using namespace std; 
 
int main() 
{ 
  list < int > li; 
  vector < int > vi; 
  li.push_back(1);  /* fill the list */ 
  li.push_back(2); 
  vi.resize( li.size() ); 
    /* must create the target elements in advance; reserve() is not 
       enough because it doesn't change the vector's size */ 
  copy (li.begin(), li.end(), vi.begin() ); 
    /* copy list elements into vector, at vector's beginning */ 
}
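
An alternative that avoids sizing the target in advance is to pass copy() an insert iterator created by std::back_inserter (declared in <iterator>), which calls push_back() for every copied element. A short sketch, with to_vector being a made-up helper name:

```cpp
#include <algorithm>  // std::copy
#include <iterator>   // std::back_inserter
#include <list>
#include <vector>

// Hypothetical helper: returns a vector holding a copy of the list's elements.
std::vector<int> to_vector(const std::list<int> & li)
{
  std::vector<int> vi;
  vi.reserve(li.size());  // optional: avoids reallocations while appending
  std::copy(li.begin(), li.end(), std::back_inserter(vi));
  return vi;
}
```

Here reserve() is only an optimization: the vector grows through push_back(), so its size is always correct.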

AVOID MISUSES OF EXCEPTION HANDLING

Although you can use exceptions instead of for-loops and break statements, doing so is not a recommended programming practice. Consider the following demo application that prompts the user to enter data until a certain condition is met:

 
#include <iostream> 
using namespace std; 
class Exit{}; 
int main() 
{ 
  int num; 
  cout << "enter a number; 99 to exit" << endl; 
  try 
  { 
    while (true) /* infinite */ 
    { 
      cin >> num; 
      if (num == 99) 
        throw Exit(); /* leave loop */ 
      cout << "you entered: " << num; 
      cout << "enter another number " << endl; 
    } 
  } 
  catch (Exit &) /* reached only through the throw above */ 
  { 
  } 
}
This code is inefficient because of the performance overhead of exception handling. Furthermore, it's verbose and less readable. The use of ordinary for-loops and break statements rather than exception handling would make this code more efficient and readable. As a rule, use exceptions to report and handle run-time errors exclusively.

DEFINING A STATIC MEMBER FUNCTION

A static member function in a class can directly access only the static members of its class, because it has no this pointer. Unlike ordinary member functions, a static member function can be invoked even when no object instance exists:

class Clock 
{ 
public: 
  static void showTime(); 
}; 
int main() 
{ 
  Clock::showTime(); /* called without an object */ 
  Clock c; 
  c.showTime(); /* called from an object */ 
}
Static member functions are used when all the data members of a class are also static, when the function does not depend on any other object member, or, simply, when a global function is undesirable (in the pre-namespaces era, this was the predominant method of preventing global functions).

FORWARD-DECLARING I/O CLASSES AND TEMPLATES

The standard header <iosfwd> contains forward declarations of the standard I/O classes and templates. This header is sufficient to refer to any of the I/O classes and templates, but not to apply operations to them. For example:

#include <iosfwd> 
using namespace std; 
class C 
{ 
public: 
  friend ostream & operator << (ostream & os, const C & d); 
}; 
ostream & operator << (ostream & os, const C & d); 
In the example, the declaration of the friend function does not need a complete definition of the ostream class; a forward declaration is sufficient in this case. Therefore, <iosfwd> is #included instead of the full-blown <iostream>. The result is a significantly reduced compilation time.

INTERACTING WITH THE OPERATING SYSTEM

In general, API functions and classes enable you to interact with the operating system. Sometimes, however, it is much simpler to execute a system command directly. For this purpose, you can use the standard function system() that takes a const char * argument containing a shell command. For example, on a DOS/Windows operating system, you can display the files in the current directory like this:

#include <cstdlib> 
int main() 
{ 
  system("dir"); /* execute the "dir" command */ 
}

OVERLOADING OPERATORS FOR ENUM TYPES

For some enum types, it may be useful to overload operators such as ++ and -- that can iterate through the enumerator values. You can do it like this:

#include <iostream> 
using namespace std; 
enum Days {Mon, Tue, Wed, Thur, Fri, Sat, Sun}; 
Days operator++(Days & d, int) /* int indicates postfix ++ */ 
{ 
  Days old = d; 
  d = (d == Sun) ? Mon : static_cast < Days > (d + 1); /* rollover */ 
  return old; /* postfix ++ returns the previous value */ 
} 
int main() 
{ 
  Days day = Mon; 
  for (;;) /* display days as int's */ 
  { 
    cout << day << endl; 
    day++; 
    if (day == Mon) 
      break; 
  } 
}

STRING LITERALS AND TEMPLATE ARGUMENTS

A template can take the address of an object with external linkage as an argument. Consequently, you cannot use a string literal as a template argument, since a string literal does not have external linkage. For the same reason, you cannot use local pointers for that purpose, either. For example:

template < class T, const char * > class A {/*...*/}; 
extern const char global_str[]; /* an array with external linkage */ 
void array_user() 
{ 
  const char * p ="illegal"; 
  A < int, "invalid" > aa; /* error, string literal used as argument */ 
  A < int, p > ab; /* error, p is a local pointer */ 
  A < int, global_str > ac; /* OK, global_str has external linkage */ 
}

THE "BIG THREE RULE" VERSUS THE "BIG TWO RULE"

The famous "Big Three Rule" says that if a class needs any of the Big Three member functions (copy constructor, assignment operator, and destructor), it needs all of them. In general, this rule refers to classes that allocate memory from the free store. However, many other classes require only that the Big Two (copy constructor and assignment operator) be defined by the user; the destructor, nonetheless, is not always required. Examine the following example:

class Year 
{ 
private: 
  int y; 
  bool cached; /* has this instance been cached? */ 
public: 
  Year(int y); 
  Year(const Year & other) /* make sure 'cached' isn't copied */ 
    : y(other.getYear()), cached(false) {} 
  Year & operator =(const Year & other) /* make sure 'cached' isn't copied */ 
  { 
    y = other.getYear(); 
    return *this; 
  } 
  int getYear() const { return y; } 
}; /* no destructor is required for class Year */
Class Year does not allocate memory from the free store, nor does it acquire any other resources during its construction. A destructor is therefore unnecessary. However, the class needs a user-defined copy constructor and assignment operator to ensure that the value of the member 'cached' is not copied, because it is calculated for every individual object separately.

TWO FORMS OF DYNAMIC_CAST

The operator dynamic_cast comes in two flavors. One uses pointers and the other uses references. Accordingly, dynamic_cast returns a pointer or a reference of the desired type when it succeeds. However, when dynamic_cast cannot perform the cast, it returns a null pointer, or in case of a reference, it throws a std::bad_cast exception:

void f(Shape & shape) 
{ 
  Circle * p = dynamic_cast < Circle * > ( & shape); /* is shape a Circle? */ 
  if ( p ) 
  { 
    p->fillArea (); 
  } 
  else {} /* shape is not Circle */ 
}
In this example, dynamic_cast examines the dynamic type of shape by casting its address to a pointer to Circle. If the cast is successful, the resultant pointer is used as a pointer to Circle. In contrast, the next example uses a reference dynamic_cast. You should always place a reference dynamic_cast within a try-block and include a suitable catch-statement to handle the potential std::bad_cast exception, as follows:

#include <typeinfo> /* for std::bad_cast */ 
void f(Shape & shape) 
{ 
  try 
  { 
    Circle & ref = dynamic_cast < Circle & > (shape); 
    ref.fillArea(); /* successful cast */ 
  } 
  catch (std::bad_cast & bc) /* shape is not a Circle */ 
  {/*..*/} 
}

WHAT ARE LVALUES AND RVALUES?

An object is a contiguous region of memory storage. An lvalue (pronounced L value) is an expression that refers to such an object (the original definition of lvalue referred to "an object that can appear on the left-hand side of an assignment"). An expression that can appear on the right-hand side of an assignment (but not on its left-hand side) is an rvalue. For example:

#include <string> 
using namespace std; 
int & f(); 
void func() 
{ 
  int n; 
  char buf[3]; 
  n = 5; /* n is an lvalue; 5 is an rvalue */ 
  buf[0] = 'a'; /* buf[0] is an lvalue, 'a' is an rvalue */ 
  string s1 = "a", s2 = "b", s3 = "c"; /* unlike other literals, string literals are lvalues */ 
  s1 = /* lvalue */ s2 +s3; /* s2 and s3 are lvalues; the result of s2 + s3 is an rvalue */ 
  s1 = string("z"); /* temporaries are rvalues */ 
  int * p = new int; /* p is an lvalue; 'new int' is an rvalue */ 
  f() = 0; /* a function call that returns a reference is an lvalue */ 
  s1.size(); /* otherwise, a function call is an rvalue expression */ 
}
An lvalue can appear in a context that requires an rvalue; in this case, the lvalue is implicitly converted to an rvalue. However, an rvalue cannot be converted to an lvalue. Therefore, it is possible to use every lvalue expression in the example as an rvalue, but not vice versa.

BOOSTING PERFORMANCE OF LEGACY SOFTWARE When you port pure C code to a C++ compiler, you may discover slight performance degradation. This is not a fault in the programming language or the compiler, but a matter of compiler setting. To regain the same (or better) performance as with a C compiler, simply disable the compiler's RTTI and exception-handling support. Why is this? To support these features, a C++ compiler appends additional "scaffolding" code to the program. Consequently, the program's speed may be reduced and its size may increase. The additional overhead incurred by RTTI and exception handling ranges from five percent to 20 percent, depending on the compiler. When you're using pure C, this overhead is unnecessary, so you can safely disable RTTI and exception-handling support. However, you should not attempt to apply this tweak to C++ code or C code that uses any C++ constructs, such as operator new.

CALLING A MEMBER FUNCTION FROM A CONSTRUCTOR You can safely call member functions--virtual and nonvirtual alike--of an object from its constructor. It is guaranteed that the invoked virtual is the one defined for the object whose constructor is executing. Note, however, that virtual member functions of an object derived from the one whose constructor is executing are not invoked: class A { public: virtual void f(); virtual void g(); }; class B: public A { public: void f (); /* overrides A::f() */ B() { f(); /* B::f() */ g(); /* A::g()*/ } }; class C: public B { public: void f (); /* overrides B::f() */ }; B * p = new B; B's constructor calls f() and g(). The calls are resolved to B::f() and A::g(), respectively, because B overrides f() but it doesn't override g(). Note that C::f() is not called from B's constructor, even when B's constructor runs as part of constructing a C object.

DYNAMIC_CAST AND ACCESS SPECIFICATIONS Operator dynamic_cast fails when it cannot convert its argument to the desired type. Usually, such a failure results from an attempt to cast an object to a nonrelated class. However, the failure may also be the result of an attempt to cast a derived object to a private base class. For example: class Container { public: virtual ~Container() {} /*..*/ }; class Allocator{/*..*/}; class Stack: public Container, private Allocator {/*..*/ }; int main() { Stack s; Container * pc = & s; /* attempt to cast a pointer to the Container part of a Stack to a pointer to Allocator */ Allocator* p = dynamic_cast < Allocator* > (pc); /* runtime failure: p is null */ } In the example above, dynamic_cast fails because Stack is not publicly derived from Allocator. Note that a direct cast from Stack* to Allocator* is a plain upcast to an inaccessible base class, which a conforming compiler rejects at compile time rather than at runtime.
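The effect of the access specifier can be observed at runtime with a small experiment. The following is only a sketch under stated assumptions: the classes PublicStack and PrivateStack and the helper can_reach_allocator() are hypothetical names invented for illustration, and Container is given a virtual destructor so that dynamic_cast can perform a runtime check.

```cpp
// Hypothetical stand-ins for the classes discussed in the text.
// Container must be polymorphic for a runtime-checked dynamic_cast.
class Container {
public:
    virtual ~Container() {}
};

class Allocator {};

// Same bases, different access to Allocator.
class PublicStack : public Container, public Allocator {};
class PrivateStack : public Container, private Allocator {};

// Attempts a cross-cast from the Container part of an object to its
// Allocator part; returns true when dynamic_cast succeeds.
bool can_reach_allocator(Container* pc) {
    return dynamic_cast<Allocator*>(pc) != 0;
}
```

With a PublicStack object the cast should succeed, whereas with a PrivateStack object it should yield a null pointer, because Allocator is not a public base of the complete object.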

EFFICIENCY OF STRING COMPARISONS Class std::string defines three versions of the overloaded == operator: bool operator == (const string & l, const string & r); bool operator == (const char* l, const string & r); bool operator == (const string & l, const char* r); This proliferation may seem redundant, since std::string has a constructor that automatically converts a const char * to a string object. Thus, one could make do with only the first version of operator ==, which in turn converts a char * to a temporary string object, and then performs the comparison. However, the overhead of creating a temporary string can be unacceptable under some circumstances. The C++ Standardization Committee's intent was to make comparison operation of std::string as efficient as possible. Therefore, the additional overloaded versions were added to allow direct comparisons, without the additional overhead of temporaries.
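As a brief illustration, the comparisons below pass a string literal straight to operator ==, so overload resolution can pick one of the const char* versions instead of building a temporary string. The function names is_quit() and is_quit_reversed() are hypothetical and exist only for this sketch.

```cpp
#include <string>

// Hypothetical helper: the literal is the right-hand operand, so the
// (const string &, const char *) overload is selected.
bool is_quit(const std::string& command) {
    return command == "quit";
}

// Same comparison with the operands swapped; this selects the
// (const char *, const string &) overload.
bool is_quit_reversed(const std::string& command) {
    return "quit" == command;
}
```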

FORWARD DECLARATIONS AND TYPEDEF NAMES It is illegal to use forward declarations with typedef names: class string; /* illegal, string is a typedef name */ void f(string & s); Even a typename won't do here: typename std::string; /* still illegal */ void f(std::string& s); Can you see what the problem is with these forward declarations? std::string is not a class, but a typedef name defined in the standard header < string > like this: /* note: this is a simplified form */ typedef basic_string < char, char_traits < char >, allocator < char > > string; These forward declarations don't provide the necessary information about the type of std::string. Yet, to generate the correct mangled name for the function f(), the compiler has to see the non-typedef'd form of its argument. For this reason, you cannot use a forward declaration of typedef'd names. There is no escape from including the header < string > in this case.

NAMESPACES DO NOT IMPOSE ANY OVERHEAD The use of namespaces in your application does not incur runtime or memory overhead. Therefore, you can use namespaces in embedded systems and time-critical applications without hesitation.

PREFER OBJECT INITIALIZATION TO ASSIGNMENT In general, initializing an object is more efficient than assignment. Consider the following code snippet: string s1 = "Hello "; string s2 = "World"; string both; both = s1+s2; /* assignment rather than initialization; inefficient */ The compiler transforms the statement both = s1+s2; into this: string temp (s1 + s2); /* store result in a temporary */ both = temp; /* assign */ temp.~string(); /* destroy temp */ Using initialization rather than assignment is more efficient: string both = s1+s2; /* initialization */ This time, the compiler transforms the statement string both = s1+s2; into this: string both (s1 + s2); In other words, the construction and destruction of a temporary are avoided in this case. To conclude, object assignment requires the creation and destruction of a temporary, whereas initialization does not. Whenever you have the choice, choose object initialization rather than assignment.

THE ROLE OF ALLOCATORS Every STL container uses an allocator. Allocators encapsulate the low-level details of the memory model of a given platform: They hide the size of pointers, reallocation policy, block and page size, heap location, etc. Containers are therefore rather portable, because a different allocator type can be plugged into them, depending on the platform. Your STL implementation provides an allocator that is suitable for your platform and application's needs. However, some applications may require custom-made allocators.
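To give a feel for what plugging in a custom allocator looks like, here is a minimal sketch. It assumes a compiler that supports C++11 or later, where this reduced allocator interface is sufficient; older compilers require a longer list of members. The names CountingAllocator, g_bytes_requested, and fill_vector() are hypothetical and serve only this illustration.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Illustrative global counter of the bytes requested through the allocator.
static std::size_t g_bytes_requested = 0;

// Hypothetical allocator that forwards to operator new/delete and
// records how much memory the container asks for.
template <class T>
struct CountingAllocator {
    typedef T value_type;

    CountingAllocator() {}
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) {}

    T* allocate(std::size_t n) {
        g_bytes_requested += n * sizeof(T);
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

// All CountingAllocator objects are interchangeable.
template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

// Fills a vector that uses the custom allocator and reports the
// number of bytes it requested along the way.
std::size_t fill_vector(int count) {
    g_bytes_requested = 0;
    std::vector<int, CountingAllocator<int> > v;
    for (int i = 0; i < count; ++i) v.push_back(i);
    return g_bytes_requested;
}
```

The container code itself is unchanged; only the second template argument differs, which is the portability point made above.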

VIRTUAL FUNCTIONS SHOULD NOT BE DECLARED PRIVATE It is customary to extend virtual functions in a derived class by first invoking the base's version of that function and then extending it with additional functionality. Therefore, there is little point in declaring a virtual member function private--by doing so, you disable the user's ability to extend it in a derived class.
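The extension idiom described above can be sketched as follows. The classes Logger and TimestampLogger and the function format_through_base() are hypothetical; the point is only that the derived override can call Logger::format() because that function is accessible.

```cpp
#include <string>

// Hypothetical base class with an accessible virtual function.
class Logger {
public:
    virtual ~Logger() {}
    virtual std::string format(const std::string& msg) const {
        return "[log] " + msg;
    }
};

// The derived class extends the base's behavior instead of replacing it.
class TimestampLogger : public Logger {
public:
    virtual std::string format(const std::string& msg) const {
        // First invoke the base's version, then add to its result.
        return Logger::format(msg) + " (stamped)";
    }
};

// Calls format() through a base reference, so the call is resolved dynamically.
std::string format_through_base(const Logger& logger, const std::string& msg) {
    return logger.format(msg);
}
```

Had Logger::format() been declared private, the qualified call inside TimestampLogger would not compile.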

REPORTING AN ERROR DURING OBJECT CONSTRUCTION Constructors have no return value (not even void). Therefore, you have to use alternative techniques of reporting errors during object construction. There are three common techniques of reporting such errors. The first technique uses C-style error handling: When an error occurs during construction, the constructor assigns an error value to a global variable, which is later examined. Although this is the least preferable technique, it can be useful when combining legacy code written in C with new C++ code: int errcode = OK; /* global flag */ Date::Date(const char *datestr) { if (isValid(datestr) == false) errcode = EINVALIDDATE; ... } int main() { Date d("02/29/1999"); /* invalid */ if (errcode == OK) /* proceed normally */ { ... } else {..} /* handle the error */ } Date's constructor turns on the global flag errcode to indicate a runtime error. main() subsequently checks the flag's value and handles the error.

THROWING EXCEPTIONS TO REPORT AN ERROR DURING OBJECT CONSTRUCTION Many implementations still don't support exception handling very well. Therefore, the methods of using global variables and passing a parameter to the constructor can be used instead. However, on implementations that support exception handling, throwing an exception is the preferred method of reporting runtime errors during construction. The advantages of exceptions are better support for object-oriented programming, simpler and more readable code, safety, and automation. The constructor in the following example throws an exception to report a runtime error: class X { public: X(const char* description); ... }; Date::Date(const char *datestr) throw (X) { if (isValid(datestr) == false) throw X("invalid date string"); ... } int main() { try { Date d("02/29/1999"); // ... proceed normally } catch (X & exception) { /* ... handle the exception */ } }

USING PARAMETERS TO REPORT AN ERROR DURING OBJECT CONSTRUCTION Using global variables to indicate runtime errors has many disadvantages. An alternative method of reporting runtime errors during object construction is to pass an additional argument to the constructor. The constructor in turn assigns an appropriate value to that argument that indicates success or failure. For example: Date::Date(const char *datestr, int & status) { if (isValid(datestr) == false) status = EINVALIDDATE; ... } int main() { int status = OK; /* the constructor changes it only on failure */ Date d("02/29/1999", status); if (status == OK) /* proceed normally */ { ... } }

FUNCTION ARGUMENTS EVALUATION ORDER The order of evaluation of function arguments is unspecified. This means that for a function that takes the arguments (a, b, c), any permutation of this set is a valid argument evaluation sequence. To demonstrate that, suppose you write a function f() and you call it like this: bool x; /* global */ f( g(false), 1, x, 3); Suppose also that g() changes the value of x. One cannot tell which value of x is passed to f() as its third argument--it could be either the value of x before or after the call to g(false). Therefore, writing code that relies on a specific evaluation order of function arguments is bad programming practice and should be avoided.
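When the order does matter, one way to make it explicit is to evaluate the arguments in separate statements before the call. The sketch below mirrors the scenario in the text; the bodies of f() and g(), and the helper call_with_fixed_order(), are assumptions made up for illustration.

```cpp
// Global flag that g() modifies as a side effect, as in the text.
static bool x = false;

// Hypothetical g(): changes x and returns a value.
int g(bool value) {
    x = true;
    return value ? 1 : 0;
}

// Hypothetical f(): its result depends on the third argument.
int f(int a, int b, bool c, int d) {
    return c ? a + b + d : -1;
}

// Each argument is evaluated in its own statement, so the order is
// fixed by the sequence of statements rather than left to the compiler.
int call_with_fixed_order() {
    x = false;
    const int first = g(false);  // g() is guaranteed to run first
    const bool third = x;        // x is read after g() has changed it
    return f(first, 1, third, 3);
}
```

Written this way, f() always receives the value of x after the call to g(), on every compiler.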

THE NEW LONG LONG DATA TYPE The long long type denotes an integral data type that is at least as large as a long int. On 32-bit architectures and 64-bit architectures, long long occupies 64 bits (eight bytes). Although long long is not defined by the ANSI/ISO C++ standard yet, it is defined by the C9X draft standard and, in practice, many platforms and compilers already support it. As with other integral types, a long long has an unsigned counterpart: unsigned long long distance_to_star; The suffix ll or LL can be added to a literal integer to indicate a long long type: const long long year_light_in_km = 9460800000000LL; Likewise, the suffixes ull and ULL can be added to a literal integer to indicate an unsigned long long type.

ALTERNATIVE STL IMPLEMENTATION The STLport organization offers an alternative implementation of the Standard Template Library that you can install instead of the original STL implementation supplied with your compiler. The creators of this implementation claim that it's fully compatible with Microsoft's Visual C++ 5.0 and 6.0, as well as many other platforms. You can download the STLport implementation from http://www.stlport.org/index.shtml The site contains installation directions and extensive documentation about the STLport version of STL.

RETURNING A VOID EXPRESSION FROM A VOID FUNCTION Examine the following code snippet. Does your compiler accept it? void a_func_returning_void(); void func() { return a_func_returning_void(); /* returning a value from a void function? */ } At present, most compilers will not accept this code, although it's perfectly legal. The problem is that until not long ago, returning an expression from a function that returns void was an error, even if the returned expression itself was void. This restriction was changed recently: A function with a void return type may now return an expression that evaluates to void. This change was required to enable better support of generic algorithms in the STL.
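The benefit for generic code can be seen in a small forwarding template. This is a sketch only: forward_call(), add_to_total(), twice(), and g_total are hypothetical names, and the code assumes a compiler that already implements the relaxed rule.

```cpp
// Illustrative global that the void function updates.
static int g_total = 0;

void add_to_total(int n) { g_total += n; }  // returns void
int twice(int n) { return 2 * n; }          // returns a value

// Generic forwarder: the same return statement works whether R is
// void or a real type, so no special case for void is needed.
template <class R, class A>
R forward_call(R (*fn)(A), A arg) {
    return fn(arg);  // legal even when R is void
}
```

Without the rule change, the template would need a separate version for functions that return void.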

STATIC CLASS MEMBERS MAY NOT BE INITIALIZED IN A CONSTRUCTOR You cannot initialize static data members inside the constructor or a member-initialization list: class File { private: static bool locked; public: File(): locked(false) {} /*error*/ }; Although compilers flag such ill-formed initializations as errors, programmers often wonder why this is an error. Remember that a constructor is called as many times as the number of objects created, whereas a static data member may be initialized only once, since it is shared by all the class objects. Therefore, static member initialization should be done outside the class, as in this example: class File {/*..*/}; bool File::locked = false; /* OK */ Alternatively, for const static members of an integral type, the Standard now allows in-class initialization as follows (note that some compilers don't support this feature yet): class Screen { private: const static int pixels = 768*1024; /* OK */ public: Screen() {/*..*/} ... };

CATCH EXCEPTIONS BY REFERENCE Although you can catch an exception by value, as in catch( Exception except) /* by value */ { /* ... */ } it's better to catch an exception by reference: catch( Exception & except) /* by reference */ { /* ... */ } Catching an exception by reference has several advantages. First, pass by reference is more efficient, since it avoids unnecessary copying of objects. Second, you avoid slicing of derived exception objects. Finally, you ensure that modifications applied to the exception within a catch-clause are preserved in the exception object when it is rethrown.
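The slicing problem mentioned above is easy to demonstrate. In this sketch, Ex and DerivedEx are hypothetical exception classes, and the two functions differ only in how the handler declares its parameter.

```cpp
#include <string>

// Hypothetical exception hierarchy.
class Ex {
public:
    virtual ~Ex() {}
    virtual std::string what() const { return "Ex"; }
};

class DerivedEx : public Ex {
public:
    virtual std::string what() const { return "DerivedEx"; }
};

// The handler receives a sliced copy, so Ex::what() is called.
std::string caught_by_value() {
    try {
        throw DerivedEx();
    } catch (Ex e) {
        return e.what();
    }
    return std::string();
}

// The handler refers to the original exception object, so the
// derived override is called.
std::string caught_by_reference() {
    try {
        throw DerivedEx();
    } catch (Ex& e) {
        return e.what();
    }
    return std::string();
}
```

Some compilers issue a warning for the by-value handler, which is a useful hint in itself.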

DATA AGGREGATES The term "aggregate" refers to an array or a class with no constructors, no private or protected data members, no base classes, and no virtual functions. In other words, an aggregate is a POD (Plain Old Data) object or an array thereof. Aggregates differ from class objects and arrays of class objects in several ways. In particular, you can initialize every aggregate of every type and size by the '={0};' initializer list. This initializer list guarantees that the entire aggregate is zero-initialized: struct Employee { int ID; int rank; char name[12]; int salary; }; /* using the '={0}' initializer to initialize aggregates */ Employee emp = {0}; /* all emp members are zero-initialized */ Employee emp_arr[10] = {0}; /* all array elements are zero-initialized */ Note that nonaggregate objects and arrays cannot be initialized this way; they are initialized by a constructor.

GET A FREE COPY OF THE GNU C/C++ COMPILER 2.95 The GNU project focuses on the development and distribution of open source software. The newly released GCC 2.95 is the latest version of GNU C/C++ compiler. Release 2.95 includes numerous bug fixes, new optimizations, and support for new platforms. In addition, its standard compliance has been improved. You can download GCC 2.95 from the GNU site at http://www.gnu.org/software/gcc/gcc-2.95/gcc-2.95.2.html

THE C++ EXCEPTION HANDLING MODEL In a resumptive model, after an exception has been handled, the program continues execution at the point where the exception was thrown. The C++ exception-handling model is nonresumptive, though--program execution resumes at the next statement following the catch-block. This is a source of confusion among C++ novices, who mistakenly assume that the program automatically returns to the point where an exception was thrown from, but it doesn't.
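The following sketch records the path that execution takes. The names fail() and trace_nonresumptive() are hypothetical; each letter appended to the string marks a point the program actually reached.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical function that always throws.
static void fail() {
    throw std::runtime_error("failure");
}

// Builds a trace of the statements that were executed.
std::string trace_nonresumptive() {
    std::string trace;
    try {
        trace += "A";   // before the throw
        fail();
        trace += "B";   // skipped: control never returns to this point
    } catch (const std::runtime_error&) {
        trace += "C";   // the handler runs
    }
    trace += "D";       // execution continues after the catch-block
    return trace;
}
```

The statement that follows the call to fail() is never executed; after the handler finishes, the program carries on with the first statement after the catch-block.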

THE CC++ PARALLEL PROGRAMMING LANGUAGE CC++ (this is not a typo) is a parallel programming language based on C++. A free CC++ compiler as well as an overview of CC++ features, syntax, and supported platforms can be found at http://globus.isi.edu/ccpp/ Although CC++ is not standardized yet, it's interesting to examine the elegance of its straightforward and relatively simple handling of parallel processing, resource locking and unlocking mechanisms, and its platform-neutrality.

WHY THE COPY CONSTRUCTOR'S PARAMETER MUST BE PASSED BY REFERENCE The canonical form of a copy constructor of a class called C is C::C(const C & rhs); You can also declare the copy constructor's parameter as follows: C::C(C & rhs); /*non-const, OK*/ That is, the const specifier is not mandatory. Note, however, that the parameter may not be passed by value: C::C(const C rhs); /* Error */ Can you see why the parameter must be passed by reference? Whenever you pass an object by value, you implicitly invoke its copy constructor (because the object being passed by value is in fact a copy of the original argument). Attempting to pass a copy constructor's argument by value will result in an infinite recursion: The copy constructor calls another copy constructor, which in turn calls another copy constructor, and so on. To avoid this, C++ requires that a copy constructor's parameter be passed by reference.

INITIALIZING ARRAYS IN A CLASS You cannot initialize an array class member by a member-initialization list. For example: class A { private: char buff[100]; public: A::A() : buff("") /* error */ {} }; The following forms won't compile either: A::A() : buff('\0') {} /* ill-formed */ A::A() : buff(NULL) {} /* ill-formed */ Instead, you should initialize arrays inside the constructor body: A::A() { memset(buff, '\0', sizeof(buff)); }

OVERLOADED SUBSCRIPT OPERATOR SHOULD RETURN A REFERENCE When you overload the operator [], remember that its non-const version should return an object by reference, not by value. Otherwise, you won't be able to use it in assignment expressions. For example: class IntArray { private: /*...*/ public: int & operator [] (int index); /*return a reference */ }; IntArray iar; iar[0] = 1; /*to make this work, operator [] must return int & */
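A fuller version of such a class might look like the sketch below. The fixed size, the data member, and the helper read_first() are assumptions added for illustration; the const overload is included so that read-only access works on const objects as well.

```cpp
// Hypothetical fixed-size array class; SIZE is an arbitrary choice.
class IntArray {
private:
    enum { SIZE = 8 };
    int data[SIZE];
public:
    IntArray() {
        for (int i = 0; i < SIZE; ++i) data[i] = 0;
    }
    // Non-const version returns a reference, so it can be assigned to.
    int& operator[](int index) { return data[index]; }
    // Const version allows reading elements of a const IntArray.
    const int& operator[](int index) const { return data[index]; }
};

// Reads through the const overload.
int read_first(const IntArray& arr) {
    return arr[0];
}
```

Note that this sketch performs no bounds checking; a production class would probably validate the index.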

OPTIMIZING LOOPS Consider the following code: class Foo{/*..*/}; std::vector < Foo > v; for (int j =0; j < MAX; j++) { v.push_back ( Foo() ); /* using a temporary */ } This for loop is very inefficient: It constructs and destroys a temporary Foo object on every iteration. You can easily avoid the unnecessary overhead of constructing and destroying this object if you declare it outside the loop: Foo something; // avoiding a temporary for (int j =0; j < MAX; j++) { v.push_back ( something ); } This improved version constructs and destroys the object only once, regardless of how many times the loop executes. In general, you should avoid declaring temporary objects inside loops. Instead, create them outside the loop.

THE USEFULNESS OF PTRDIFF_T C and C++ define a special typedef for pointer arithmetic, ptrdiff_t, which is a platform-specific signed integral type. You can use a variable of type ptrdiff_t to store the result of subtracting two pointers. For example: #include < stddef.h > /* for ptrdiff_t */ int main() { int buff[4]; ptrdiff_t diff = ( & buff[3] ) - buff; /*diff = 3*/ diff = buff -( & buff[3] ); /*diff = -3*/ } There are two advantages in using ptrdiff_t instead of int. First, the name ptrdiff_t is self-documenting and helps the reader understand that the variable is used in pointer arithmetic. Second, ptrdiff_t is portable: Its underlying type may vary across platforms, but you don't need to make changes in the source code when porting your software to other platforms.

USE EXPLICIT ACCESS SPECIFIERS IN A CLASS DECLARATION In many contexts, C++ provides default access specifiers when you don't spell them out explicitly: struct Employee{ /*..*/}; class Manager : Employee /* implicit private inheritance, because Manager is declared with 'class' */ { int rank; /* implicitly private */ bool StockHolder; /* ditto */ }; Relying on the default access specifiers is not recommended, though. Always spell out the access specifier explicitly: class Manager : public Employee { private: int rank; bool StockHolder; }; class BoardMember : private Manager {/*..*/}; By using explicit access specifiers, you ensure that the code is more readable and you avoid future maintenance problems. The defaults depend on the keyword used to declare the derived class: 'class' implies private bases and members, whereas 'struct' implies public ones. Imagine that the creator of class Manager decides one day to change it into a struct. Without explicit specifiers, Employee would silently become a public base class of Manager, and rank and StockHolder would become public members.

A COMMON MISTAKE IN TEMPLATE INSTANTIATION The type that a template takes as an argument must have external linkage. In other words, you cannot instantiate a template with a locally declared type: int main() { struct S { int current_state; char description[10]; }; queue < S > qs; /* 1. compilation error */ } The line numbered 1 causes a compilation error because it attempts to instantiate a template with a locally declared type, S. S has no linkage, and as such, it cannot serve as a valid template argument. To fix this, you should move the declaration of struct S outside of main(): struct S { /* ... */ }; int main() { queue < S > qs; /* now OK */ }

AVOID TYPEDEF'S IN STRUCTS DEFINITIONS In olden days, before C++ appeared, it was customary to declare a struct as a typedef: typedef struct DATE_TAG { int day; int month; int year; } Date; /* 'Date' is a typedef */ This way, one could create an instance of the struct without having to use the keyword 'struct': /* C code */ Date date; /* 'struct' not required */ struct DATE_TAG another_date; In C++, the use of a typedef in this context is unnecessary because you don't need the elaborated type specifier (for example, struct, union, and class) to create an instance: // C++ code DATE_TAG another_date; Therefore, avoid declaring structs as typedef names in C++ code.

FUNCTION SIGNATURES A function's signature provides the information needed to perform the overload resolution. The signature consists of the function's parameter list and their ordering. Note that a function's return type is not considered part of its signature, and it does not participate in overload resolution.
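A short sketch makes the rule concrete. The overloaded function describe() is hypothetical; each version differs from the others in the types or the ordering of its parameters, which is all that overload resolution looks at.

```cpp
#include <string>

// Four overloads distinguished by parameter types and their order.
std::string describe(int)         { return "int"; }
std::string describe(double)      { return "double"; }
std::string describe(int, double) { return "int, double"; }
std::string describe(double, int) { return "double, int"; }

// The following declaration would be rejected, because it differs
// from describe(int) only in its return type:
// int describe(int);
```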

GETTING ALONG WITHOUT NULL REFERENCES In C, algorithms such as bsearch() and lfind() return a null pointer to indicate that the sought-after element wasn't found. Unlike a pointer, a reference must always be bound to a valid object. Therefore, a reference can never be null. I've often seen the following dreadful hack as an attempt to fabricate a null reference: int & locate(int * pi, int size, int value) { if (find_int(pi, size, value) != NULL) // ... else return *((int *) (0)); /* very bad */ } The return statement fakes a null reference by binding a reference to a dereferenced null pointer. This hack is wrong for two reasons: Dereferencing a null pointer yields undefined behavior, so the returned reference is never bound to a valid object. Worse yet, the caller has no portable way to test whether the reference is "null," and any attempt to use it will most likely cause a runtime crash. Unfortunately, many compilers will accept this code, letting the user detect its disastrous results at runtime. You can resolve the lack of null references with several approaches. Throwing an exception is one of them. However, the overhead and complexity of exception handling may be overkill for simple functions such as locate(). In this case, you can return a reference to a special object (such as a negative integer when an array subscript cannot be found): int & locate(int * pi, int size, int value) { static int unfound = -1; if (find_int(pi, size, value) != NULL) // ... else return unfound; /* fine */ }

INITIALIZE CLASS'S DATA MEMBERS EASILY Suppose you want to initialize all the data members of class Book inside its constructor: class Book { private: char ISBN[11]; int year; char name[100]; public: virtual ~Book(); Book(); }; One way to do that is by assigning a value to every member inside the constructor body. However, this is tedious. Don't be tempted to use memset() to initialize the members: Book::Book() { memset(this, 0, sizeof (*this)); /* very bad idea */ } memset() also overwrites the object's virtual table pointer. Consequently, the program will crash at runtime. Instead, move all the class's data members to a dumb struct that doesn't have any virtual member functions and add a constructor to the struct: struct BookData { char ISBN[11]; /*... rest of the data members */ BookData() {memset(this, 0, sizeof (*this)); } /* now OK, struct has no vptr */ }; Next, derive Book from that struct using private inheritance: class Book : private BookData {...}; And that's all. Now every data member of Book is automatically initialized by BookData's constructor.

OVERLOADING OPERATOR + PROPERLY The built-in operator + takes two operands of the same type and returns their sum without changing their values. In addition, + is a commutative operator: You can swap the operands' positions and get the same result. Therefore, an overloaded version of operator + should reflect all these characteristics. You can either declare operator + as a member function of its class or as a friend function: class Date { public: Date operator +(const Date & other) const; /* member */ }; class Year { friend Year operator+ (const Year & y1, const Year & y2); /* friend */ }; The friend version is preferred because it reflects symmetry between the two operands. Because built-in + does not modify any of its operands, the parameters of the overloaded + are const (and the member version is itself declared const). Finally, overloaded + should return the result of its operation by value, not by reference.
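Here is one possible complete version of the friend form. The data member value, the constructor, and the accessor get() are assumptions added so that the sketch is self-contained; they are not part of the declaration shown above.

```cpp
// Hypothetical Year class with a symmetric, non-modifying operator +.
class Year {
public:
    explicit Year(int v) : value(v) {}
    int get() const { return value; }

    // Friend version: both operands are treated alike.
    friend Year operator+(const Year& y1, const Year& y2);
private:
    int value;
};

// Takes both operands by reference to const and returns the sum by value.
Year operator+(const Year& y1, const Year& y2) {
    return Year(y1.value + y2.value);
}
```

Swapping the operands gives the same result, and neither operand is changed by the operation.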

THE DANGERS OF OBJECT SLICING Passing a derived object to a function by value may cause a problem known as "object slicing"--every additional data member declared in the derived object is sliced from the copy that the function receives. Note also that the dynamic binding mechanism is inoperative in sliced objects: class Date { private: int year, month, day; public: virtual void Display() const; /* mm-dd-yy */ }; class DateTime: public Date { private: int hrs, min, sec; /* additional members; might be sliced off */ public: void Display() const; /* mm-dd-yy hh:mm:ss */ }; void f(Date b) /* by value */ { b.Display(); /* no dynamic binding; calls Date::Display() */ } int main() { DateTime dt; f(dt); /* sliced */ } Object slicing may result in unexpected behavior. Therefore, pass objects by reference whenever possible.

USES OF THE OFFSETOF MACRO The standard macro offsetof (defined in < stddef.h > ) calculates the offset of a struct member. The result is the number of bytes between the beginning of the struct and that particular member. The following example uses this macro to calculate the offset of two data members in the struct S: #include < stddef.h > struct S { int a; int b; void * p; }; int main() { int n = offsetof(S, a); /*n = 0 */ n = offsetof(S, p); /* n = 8 on a typical platform with 4-byte int */ } Note that offsetof works with POD (Plain Old Data) structs and unions. The result of applying this macro to a field that is a static data member of a class is undefined. Likewise, applying offsetof to a class that has virtual functions, base classes, or template members is undefined, as well.

VISIT SILICON GRAPHICS' STL PAGES Silicon Graphics hosts a Web site dedicated to the Standard Template Library (STL). This site documents all of the classes, functions, and concepts in the SGI implementation of STL. Each page describes a single component and includes links to related components. The documentation assumes a general familiarity with C++, and particularly with C++ templates. The site also has a comprehensive introduction to STL that can be useful for programmers who are familiar with C++ but have never used STL. You can find the SGI STL pages at http://www.sgi.com/Technology/STL/

VISIT THE C/C++ USERS GROUP WEB SITE The C/C++ Users Group (CUG) is an independent organization that offers hundreds of classes, functions, and code libraries that you can download from the site. CUG focuses primarily on cross-platform compatibility with UNIX, Linux, Windows, DOS, and others. The code packages include algorithms, mathematical libraries, logging and reporting utilities, interpreters, compilers, file managers, networking tools, and more. To find the CUG Web site, go to http://www.hal9k.com/cug/

A FUNCTION TEMPLATE CANNOT BE VIRTUAL
A reader tried to declare a template as a virtual member function of a class. However, his compiler refused to compile the code. He wanted to know whether it was a compiler bug: template < class T > struct A { template < class C > virtual void f(C); /* illegal, virtual template function */ virtual void g(); /* OK, g is not a template */ }; It's not a compiler bug. The C++ Standard says (14.5.2 p 3): "A member function template shall not be virtual."
ADD A CATCH(...) BLOCK TO YOUR EXCEPTION HANDLERS HIERARCHY
If your application uses exception handling, it is recommended that you always add a catch(...) statement after all the existing catch blocks. Remember that even if your code contains the appropriate handlers for expected exceptions, other unexpected exceptions may still be thrown by standard functions, algorithms, and containers. By adding a catch(...) statement to the end of existing handlers, you guarantee that no exception is left unhandled. Furthermore, a catch(...) statement ensures that the program properly destroys all automatic and static objects as part of the stack unwinding process, even if an unexpected exception is thrown. For example: int main() { try { func(); } catch(DerivedEx & de) /*most derived*/ {} catch(Ex & exc) {} catch(...) /*catch all unexpected exceptions*/ {} } Note that a catch(...) block cannot detect the type of the exception it handles. Therefore, it should perform only general cleanup and logging operations.
APPLYING OPERATOR TYPEID TO NONPOLYMORPHIC OBJECTS
Operator typeid retrieves the runtime type information of a polymorphic object (an object having at least one virtual member function). Using typeid to retrieve the type information of nonpolymorphic objects and fundamental types is also legal. However, in this case typeid returns the static type of the object rather than its dynamic type. For example: #include < typeinfo > #include < iostream > using namespace std; class B{}; class D: public B{}; void f() { typedef int I; B* p = new D; cout << typeid(I).name(); /*display 'int' */ cout << typeid(*p).name(); /* display 'B' rather than 'D'*/ } Adding a virtual member function to B will force the typeid operator to access the dynamic type of its argument. Consequently, the following line will display D rather than B: cout << typeid(*p).name(); Note that applying dynamic_cast to fundamental types or nonpolymorphic classes is an error.
AUTO_PTR SHOULD NOT HOLD ARRAYS OF OBJECTS
The auto_ptr class template automatically destroys an object that is allocated on the free store when the current scope is exited. For example: #include < memory > /* for auto_ptr */ #include < string > using namespace std; int main() { auto_ptr < string > ps (new string); /* .. use ps */ } /* ps is automatically deleted */ Note, however, that binding auto_ptr to an array of objects results in undefined behavior, because auto_ptr's destructor calls scalar delete; it never calls delete [] to destroy its bound object. Therefore, never bind auto_ptr to an array: auto_ptr < string > ps (new string[2]); /* undefined behavior */
AVOIDING RECURRENT POINTER DELETION
The result of applying delete to the same pointer more than once is undefined: std::string * ps = new std::string; delete ps; // ..many code lines delete ps; /* undefined behavior, ps deleted twice*/ Of course, the solution to this bug is to avoid deleting the same pointer twice. However, if code modifications are impossible--for example, when the bug exists in a software library that is maintained by a third party--a temporary workaround to this problem is to assign a null value to a pointer right after it has been deleted. C++ guarantees that a null pointer deletion is harmless: std::string * ps = new std::string; delete ps; ps = NULL; // ..many code lines delete ps; /* harmless*/
DEFINING A FUNCTION IN AN UNNAMED NAMESPACE
The use of the keyword "static" to limit the linkage of classes and functions to a translation unit is considered deprecated. Unnamed namespaces are the recommended alternative. However, avoid the common mistake of declaring a function in an unnamed namespace and defining it later, somewhere outside that unnamed namespace: namespace { void func(); /* only a declaration */ } void func() /* most likely an error */ { // ... } The programmer's intention was to define the function that was previously declared in the unnamed namespace. However, the compiler cannot guess the programmer's intention because the definition of func() looks exactly like an ordinary extern function. Therefore, the compiler treats the two instances of func() as referring to two distinct functions: One appears in the unnamed namespace and the other in the global namespace. To fix this, func() must be defined inside the unnamed namespace, too: namespace { void func() // OK { // ... } }
DYNAMIC BINDING AND STATIC BINDING
Not every call to a virtual function is resolved dynamically. For example: class A { public: virtual void func() {/*..*/} }; int main() { A a; a.func(); /* resolved at compile time */ } Dynamic binding applies in two cases: when calling a virtual function through a pointer to a polymorphic object, and when calling a virtual function through a reference to a polymorphic object (a polymorphic object is one whose class declares or inherits at least one virtual member function): void f(A & ref, A* pa) { ref.func(); /* resolved dynamically */ pa->func(); /* ditto */ } int main() { A a; f(a, & a); } You can bypass the dynamic binding mechanism by using the member function's fully qualified name: pa->A::func(); /* always resolved statically */
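To see the difference at work, the sketch below adds a derived class B (hypothetical, not part of the example above) and makes func() return a string so that the chosen override is observable.

```cpp
#include <string>

class A {
public:
    virtual ~A() {}
    virtual std::string func() const { return "A::func"; }
};

// Hypothetical derived class that overrides func().
class B : public A {
public:
    virtual std::string func() const { return "B::func"; }
};

// Call through a reference: resolved dynamically.
std::string call_dynamic(const A& ref) { return ref.func(); }

// Fully qualified call: always resolved statically to A::func.
std::string call_static(const A* pa) { return pa->A::func(); }
```

Passing a B object to call_dynamic() yields "B::func", whereas call_static() yields "A::func" for the very same object.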
HELPING THE COMPILER DEDUCE THE TYPE OF A TEMPLATE ARGUMENT
In general, the compiler deduces the type of a function template argument automatically. For example: int n = max (5,10); /* max< int > deduced because 5 and 10 are int's */ However, sometimes the compiler needs more explicit clues regarding the type of the argument, as in the following example: template < class T >T f() { return static_cast< T > (0); } int main() { int n = f();/*error, can't deduce argument type*/ double d = f(); /*ditto*/ } The program fails to compile because the compiler cannot deduce what the type of f's argument is (the return type is insufficient for that purpose). In situations like these, you can state the type of the template argument explicitly: int i = f < int > (); /*now OK */ double d = f < double >(); /* OK*/
INLINE OR __FORCEINLINE
The decision whether a function declared inline is actually inline-expanded is left to the sole discretion of the compiler. In other words, inline is only a recommendation. The compiler may refuse to inline a function that has loops, for example. Visual C++ and several other compilers offer nonstandard keywords that control the inline expansion of a function, in addition to the standard inline keyword. What are the uses of the nonstandard keywords? The nonstandard keyword __forceinline overrides the compiler's heuristics and forces it to inline a function that it would normally refuse to inline. I'm not sure I can think of a good reason to use __forceinline, since it may cause a bloated executable file and a reduced instruction-cache hit rate. Furthermore, under extreme conditions, the compiler may not respect the __forceinline request, either. So, in general, you should stick to good old inline, because it keeps your code portable and lets the compiler "do the right thing." Use __forceinline only as a last resort.
MAKING A CLASS TEMPLATE A FRIEND OF ANOTHER CLASS TEMPLATE
A class template can be a friend of another class template. For example: template < class U > class D{/*...*/}; template < class T > class Vector { public: // .. template < class U > friend class D; }; In this example, every specialization (instantiation) of the class template D is a friend of every specialization of the class template Vector.
PACK A LONG PARAMETER LIST IN A STRUCT
A function having a long list of parameters, as in void retrieve(const string & title, const string & author, int ISBN, int year, bool & inStore); often becomes a maintenance problem, because its parameter list is likely to be changed in the future. For example, you may add a link to the book's cover as a sixth parameter. Consequently, every line of code that calls this function needs to be fixed. A better solution is to pack the entire parameter list in a single struct or class and pass it by reference to the function: struct Item { string title; /* ..rest of the parameters */ }; void retrieve(Item & book); Packing the parameters in a single struct has two advantages. First, every modification of the parameter list is localized to the definition of the struct. You don't need to change every line of code that calls this function. In addition, you may gain a slight performance boost, because passing a single reference to a function is usually cheaper than passing several arguments.
STRUCTURED EXCEPTION HANDLING VERSUS STANDARD EXCEPTION HANDLING
Many C compilers provide a feature called structured exception handling (SEH), which is not defined by the ANSI standard. SEH can trap asynchronous, platform-specific exceptions such as division by zero, floating point overflow and underflow, and other hardware exceptions. In contrast, the standard C++ exception handling (EH) applies only to synchronous exceptions--that is, exceptions that are generated by the program itself and that are explicitly created by a throw statement. The types of exceptions raised by SEH and EH, as well as their underlying mechanisms and recovery methods, are incompatible. Therefore, never confuse SEH with EH, or vice versa. In C, use SEH if necessary. In C++, stick to EH exclusively.
USING FUNDAMENTAL TYPES AS A TEMPLATE PARAMETER
A template can take values of fundamental types such as int and char as its (non-type) parameters: template < class T, int n > class Array {/*..*/}; However, the template argument in this case must be a constant or a constant expression. For example: const int cn = 5; int num = 10; Array < char, 5 > ac; /*OK, 5 is a const*/ Array < float, cn > af; /* OK, cn is const*/ Array < unsigned char, sizeof(float) > au; /*OK, constant expression*/ Array < bool, num > ab; /*error, num is not a constant */
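Since the body of Array is elided above, here is one possible way to fill it in, shown purely as an illustration. The members size(), operator[] and data_, as well as the typedef names, are assumptions made for this sketch.

```cpp
// A minimal fixed-size array whose length is a non-type template parameter.
template <class T, int n>
class Array {
public:
    // The length is known at compile time.
    static int size() { return n; }

    T& operator[](int i) { return data_[i]; }

private:
    T data_[n];   // n must be a constant expression
};

const int cn = 5;

typedef Array<char, 5> CharArray;                       // literal constant
typedef Array<float, cn> FloatArray;                    // const int
typedef Array<unsigned char, sizeof(float)> ByteArray;  // constant expression
```

Every distinct value of n produces a distinct type, so Array < char, 5 > and Array < char, 6 > are unrelated classes.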
DEFAULT PARAMETER VALUES ARE NOT PART OF A FUNCTION'S TYPE
Although the default parameter values of a function must appear in its declaration, they are not considered part of its type. Therefore, you cannot overload a function by using different default parameter values: void f(int n = 6); void f(int n = 0); /*error, redefinition of f()*/ The attempt to overload f() is illegal because the compiler considers both versions of f() identical.
FORWARD-DECLARING A NAMESPACE MEMBER
Suppose you need to forward-declare class Interest, which is a member of namespace Bank. Seemingly, all you need to do is use the class's fully qualified name: class Bank::Interest; void Analyze(Bank::Interest & in); /*error*/ Alas, the qualified name Bank::Interest will cause a compilation error because the compiler doesn't know that Bank is a namespace. The solution is simple: you need to open the class's namespace, forward-declare the class in that namespace, and immediately close that namespace: namespace Bank { class Interest; /*fwd declaring class Interest, which is a member of namespace Bank*/ } void Analyze(Bank::Interest & in); /*OK*/ The example above reopens namespace Bank (which is defined in another source file already). It then forward-declares class Interest inside that namespace and immediately closes it. Remember that a forward-declaration is just a declaration, so it doesn't cause any link errors about "duplicate symbols."
OPEN MODE FLAGS
In earlier stages of C++, the ios::nocreate and ios::noreplace open modes were part of the < fstream.h > family of classes. However, the revised < fstream > family of file access classes, which should be used instead of the now deprecated < fstream.h >, does not recognize these flags anymore. What happened to them? Along with the templatization of < fstream >, the C++ standardization committee decided to remove from std::ios_base (which now replaces ios) the open mode flags that were considered too platform-specific, including ios_base::nocreate and ios_base::noreplace. Programmers who migrate from < fstream.h > to < fstream > may find that their code refuses to compile because it contains these disused open mode flags. The alternative is to use either platform-specific APIs or coding conventions that do not rely on these flags.
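One such coding convention is sketched below. It relies on the fact that opening a std::fstream with in | out never creates a file, which approximates the old nocreate behavior. The helper names open_existing() and file_exists() are hypothetical.

```cpp
#include <fstream>

// Hypothetical helper emulating ios::nocreate: the in | out mode
// combination fails if the file does not already exist.
bool open_existing(std::fstream& file, const char* path)
{
    file.open(path, std::ios_base::in | std::ios_base::out);
    return file.is_open();
}

// Hypothetical helper for emulating ios::noreplace: probe the file
// for reading before deciding whether to create it.
bool file_exists(const char* path)
{
    std::ifstream probe(path);
    return probe.is_open();
}
```

To approximate noreplace, call file_exists() first and open the file for writing only if it returns false. Bear in mind that this check-then-create sequence is not atomic; another process may create the file in between, so a platform-specific API is the safer choice when that matters.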
THE WELL-FORMEDNESS OF ZERO SIZED ARRAYS
In standard C++, declaring arrays with zero elements is ill-formed: int n[0]; /* compilation error */ However, certain compilers do support arrays of zero size as nonstandard extensions. In contrast, dynamic allocation of zero-sized arrays is valid C++: int *n = new int[0]; The standard requires that, in this case, new allocate an array with no elements. The pointer returned by new is non-null, and it is distinct from a pointer to any other object. Similarly, deleting such a pointer (with delete[]) is a legal operation. While zero-sized dynamic arrays may seem like another piece of C++ trivia that no one will ever need, this feature is chiefly important when implementing custom memory allocators. A custom allocation function can take any non-negative (i.e., unsigned) argument without worrying whether it's zero: void * allocate_mem(unsigned int size) { /* no need to check whether size equals zero*/ return new char[size]; }
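The guarantee is easy to check. In the sketch below, allocate_mem() is a variant of the function above that returns char* so that the result can be released with delete[]; release_mem() and zero_sized_allocation_is_non_null() are hypothetical names added for the demonstration.

```cpp
#include <cstddef>   // std::size_t, NULL

// No special case is needed for size == 0.
char* allocate_mem(std::size_t size)
{
    return new char[size];
}

// Memory obtained from new[] must be released with delete[].
void release_mem(char* p)
{
    delete[] p;
}

// A zero-sized request still yields a valid, non-null pointer.
bool zero_sized_allocation_is_non_null()
{
    char* p = allocate_mem(0);
    bool non_null = (p != NULL);
    release_mem(p);   // legal even though the array has no elements
    return non_null;
}
```

The returned pointer may be compared and deleted, but it must never be dereferenced, because there is no element to access.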
WHY YOU SHOULDN'T STORE AUTO_PTR OBJECTS IN STL CONTAINERS
You must have heard that you shouldn't use auto_ptr objects as elements of STL containers. Here's why. The C++ Standard says that an element of an STL container must be "copy-constructible" and "assignable." These fancy words basically mean that for a given class, assigning and copying one object to another are well-behaved operations. In particular, the state of the original object shouldn't change when it is copied to the target object. This is not the case with auto_ptr, though: Copying or assigning one auto_ptr to another makes changes to the original, in addition to the expected changes in the copy. To be more specific, the original object transfers ownership of the pointer to the target, and the pointer in the original object becomes null. Imagine what would happen if you did something like this: vector < auto_ptr < Foo > > vf; /*a vector of auto_ptr's*/ /* ..fill vf */ void g() { auto_ptr < Foo > temp = vf[0]; /*vf[0] becomes null*/ } When temp is initialized, the member vf[0] is changed: Its pointer becomes null. Any attempt to dereference that element is likely to cause a runtime crash. This situation is likely to occur whenever you copy an element from the container. Remember that even if your code doesn't perform any explicit copy or assignment operations, some algorithms (swap(), random_shuffle(), sort(), and many others) create a temporary copy of one or more container elements. Furthermore, certain member functions of the container create a temporary copy of one or more elements, thereby nullifying them. The result of any subsequent attempt to access the container's elements is therefore undefined. Several Visual C++ users report that they have never encountered any problems with using auto_ptr in STL containers. This is because the auto_ptr implementation of Visual C++ (all versions thereof) is outdated and relies on an obsolete specification. 
When the vendor decides to catch up with the current ANSI/ISO C++ Standard and change its Standard Library accordingly, code that uses auto_ptr in STL containers will manifest serious malfunctions. So, don't use auto_ptr in STL containers. Either use bare pointers or use other smart pointer classes instead of auto_ptr. You can find such classes at http://www.Boost.org
CATCHING EXCEPTION IN A HIERARCHY
You should catch exceptions in a bottom-up order with respect to the exception class hierarchy. The first handler should catch the most derived exception, the next handler should catch more general exceptions (for instance, a base class exception), and finally, the last handler should be a catch-all. For example: class DerivedException: public BaseException {}; int main() { try { func_that_might_throw(); } catch(DerivedException & d) { /* .. */ } catch(BaseException & b) { /* .. */ } catch(...) /* catch all other exceptions */ { /* .. */ } } Remember that if you don't follow this order, some of the catch blocks in the code may become unreachable (that is, they will become "dead code"). This is because when the exception-handling mechanism looks for an appropriate handler, it stops searching when it finds the first handler that matches the exception, not necessarily the best match. For example, in the following handler hierarchy, the second handler will never be invoked because the first handler will handle its exceptions: catch(BaseException & b) { } catch(DerivedException & d) /* dead code */ { }
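The recommended order can be exercised with a small test function. This is only a sketch: the Thrown enumeration and handled_by() are hypothetical names, and the exception classes are left empty for brevity.

```cpp
#include <string>

class BaseException {};
class DerivedException : public BaseException {};

// Hypothetical selector for the kind of exception to throw.
enum Thrown { throw_derived, throw_base, throw_other };

// Throws the requested exception and reports which handler caught it.
std::string handled_by(Thrown what)
{
    std::string handler;
    try {
        if (what == throw_derived) throw DerivedException();
        if (what == throw_base)    throw BaseException();
        throw 42;   // an unrelated exception type
    }
    catch (DerivedException&) { handler = "derived"; }    // most derived first
    catch (BaseException&)    { handler = "base"; }       // then the base class
    catch (...)               { handler = "catch-all"; }  // catch-all last
    return handler;
}
```

With the handlers in this order, each exception reaches the most specific handler available. Swapping the first two handlers would route DerivedException to the BaseException handler instead.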
COPYING AND ASSIGNMENT OF CONTAINER OBJECTS
STL containers define an appropriate copy constructor and overload the assignment operator, thereby enabling you to copy construct and assign container objects easily: int main() { vector < int > first; /*.. fill first */ vector < int > second (first); /* copy construct */ second.push_back(4); first = second; /* copy all elements of second to first */ }
DECLARING REFERENCES TO FUNCTIONS
You can declare a reference to a function. For example: void f(int n) { ++n; } int main() { void (&rf) (int) = f; /* bind rf as a reference to f() */ rf(5); /*call f() through its reference */ } Unlike a function pointer, a reference to a function must always be bound to an existing function (that is, it may not be null or dangling), nor can you rebind another function to it after its initialization.
DESIGNING EXCEPTION CLASSES
An exception is just like any other object: It can have data members and member functions. A well-designed exception class should have at least one member function that returns a verbal description of the exception. In applications that use error codes, you should also include a member function that returns the exception's code. This way, the handler of that exception doesn't need to look for this information elsewhere. Remember also that sometimes the same exception can be thrown in similar but not identical situations. For example, a DBMS system can throw the same exception when it encounters a referential integrity problem or an attempt to insert a duplicate key. To distinguish between these errors, the handler can query the exception object and obtain a precise description of the actual error that occurred.
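As an illustration of these guidelines, here is a minimal sketch of such an exception class. All of the names in it (DbException, its Code values, insert_row() and try_insert()) are hypothetical and are not taken from any real DBMS API.

```cpp
#include <string>

// One exception class that covers several related database errors.
class DbException {
public:
    enum Code { referential_integrity = 1, duplicate_key = 2 };

    DbException(Code code, const std::string& description)
        : code_(code), description_(description) {}

    // The handler can query both the error code and a verbal description.
    Code code() const { return code_; }
    const std::string& description() const { return description_; }

private:
    Code code_;
    std::string description_;
};

// Hypothetical operation that fails when the key already exists.
void insert_row(int key, int existing_key)
{
    if (key == existing_key)
        throw DbException(DbException::duplicate_key, "duplicate key");
}

// Returns the exception's description, or "ok" if nothing was thrown.
std::string try_insert(int key, int existing_key)
{
    try {
        insert_row(key, existing_key);
    }
    catch (const DbException& e) {
        return e.description();
    }
    return "ok";
}
```

Because the exception carries its own code and description, the handler does not have to consult a global error variable or a lookup table to find out what went wrong.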
DON'T CHANGE THE ACCESS OF A VIRTUAL MEMBER FUNCTION IN A DERIVED CLASS
Although it's perfectly legal to change the access specifier of a virtual member function in a derived class, as in class Base { public: virtual void Say(); }; class Derived : public Base { private: /* access specifier changed */ void Say(); }; you should never do that. When you use a pointer or a reference to a base class, the object bound to that pointer or reference will behave unpredictably regarding the access specification of its members: Derived d; Base *p = & d; p->Say(); /* surprise */ Because the binding of a virtual member function is postponed to runtime, the compiler cannot detect that a nonpublic member function is called; although p points to an object of type Derived, in which Say() is a private member function, the compiler assumes that p points to an object of type Base, of which Say() is a public member. Therefore, the code compiles fine when, seemingly, it shouldn't.
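The surprise can be made visible with a slightly modified version of the classes above. In this sketch Say() returns a string (an assumption made so that the result can be inspected), and say_through_base() is a hypothetical helper.

```cpp
#include <string>

class Base {
public:
    virtual ~Base() {}
    virtual std::string Say() const { return "Base"; }
};

class Derived : public Base {
private:   // access specifier changed
    virtual std::string Say() const { return "Derived"; }
};

// Access is checked against the static type (Base), in which Say() is
// public; the call is then dispatched dynamically to the private override.
std::string say_through_base(const Base& b)
{
    return b.Say();
}
```

Calling say_through_base() with a Derived object returns "Derived", even though the direct call d.Say() on a Derived object would be rejected by the compiler.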
FORWARD DECLARATIONS OF NESTED CLASSES
You cannot forward declare a nested class. For this reason, the following code will not compile: /* assuming class B is nested in class A */ int main() { class A::B; /* error */ A::B * ptr; } The only way to convince the compiler to accept this code is by making the declaration of class A visible--for example, by #including the appropriate header before main(): // file a.h class A { public: class B {/*..*/}; /*nested*/ }; #include "a.h" int main() { class A::B; /*fine but redundant*/ A::B * ptr; /*OK*/ } However, once you #include the declaration of A, the forward declaration in main() becomes redundant. Remember that nested classes are meant to hide implementation details. If you need to access a nested class anywhere outside its enclosing class, perhaps it shouldn't be a nested class in the first place.
INITIALIZING A VECTOR WITH THE CONTENTS OF AN ARRAY
Built-in arrays meet the criteria of an STL sequence. Therefore, you can directly initialize a vector with a built-in array: int main() { int arr[3] = {1, 2, 3}; vector < int > vi ( arr, arr+3 ); /* points one element past array's end */ } To copy the entire array arr into vi, the first argument in the declaration of vi must be the address of the first array element, whereas the second argument must point to the first element PAST the array's end.
MEASURING SMALL TIME UNITS
Most of the time functions of standard C/C++ are limited to the resolution of a second. For measuring smaller time units, you can read the system's clock ticks. The standard header < time.h > defines the constant CLOCKS_PER_SEC (some older compilers call it CLK_TCK), which holds the number of clock ticks per second. Usually, there are at least 1,000 clock ticks per second, so you can get time resolution of up to a millisecond. You obtain the current tick count by calling the function clock(). The following program measures how long it takes to perform a loop that allocates and deallocates an int 10,000 times: #include < ctime > #include < iostream > using namespace std; int main() { clock_t t = clock(); /*get current tick count*/ for (int i = 0; i < 10000; i++) { int *p = new int; delete p; } clock_t n = clock() - t; /* subtract the tick count we previously obtained from the new one */ cout << "loop took " << n << " ticks to execute"; double ticks = n; /* ticks are kept as integers; convert to a floating type */ cout << " or " << ticks/CLOCKS_PER_SEC << " second(s)"; }
MORE ON OPERATOR OVERLOADING
Here are some pointers on operator overloading. First, you cannot invent a new operator during overloading: void operator @ (int) ; /* illegal */ Second, neither the precedence nor the number of arguments that the operator takes can be changed. For example, an overloaded binary + takes exactly two arguments (remember that when you overload an operator as a class member, the object itself is always passed as an extra hidden argument). Finally, the following operators cannot be overloaded: . .* :: ?: sizeof Similarly, static_cast, dynamic_cast, reinterpret_cast, and const_cast, as well as the # and ## preprocessor tokens, may not be overloaded.
NAMESPACE ALIASES
Choosing a short namespace name might eventually lead to name clashes. On the other hand, very long namespaces tend to be laborious. For this purpose, you can use a namespace alias. A namespace alias is a synonym for another namespace. For example: namespace spring_project { class Date{ /*..*/ }; const int MAX_NUM = 512; /* .. more stuff */ } int main() { namespace sp = spring_project; /* sp is a namespace alias */ sp::Date d; /* equivalent to spring_project::Date but shorter */ }
OPTIMIZATIONS THAT ARE CARRIED OUT BY THE LINKER
While most code optimizations are performed at compile time, there are optimizations that only a linker can perform. For example, it can detect unreferenced functions. These are functions that exist in one or more source files and were compiled, but are never called. If the linker detects that the application never calls these functions, it can remove their text (binary code) from the resultant executable, thereby producing a smaller and possibly faster program. With some linkers, you can control the number of passes the linker makes. The more passes you enable, the more unreferenced functions are removed, though the cost is a longer linkage time.
THE ADDRESS OF A CLASS'S CONSTRUCTOR AND DESTRUCTOR
You can take the addresses of a class's static and nonstatic member functions. However, the class's constructor(s) and destructor are an exception to the rule. You cannot take their addresses, nor can you invoke these members through a callback mechanism.
THE LIFETIME OF AN EXCEPTION OBJECT
The memory for the temporary copy of an exception that is being thrown is allocated in a platform-defined way. In general, exception objects are allocated on a special exception stack. The exception object persists as long as a handler for that exception is executing. If a handler exits by executing a throw statement (for example, the handler rethrows the original exception), control passes to another handler for the same exception, so the temporary remains. The exception object is destroyed only when the last handler for that exception has terminated. For example: class X {}; int main() { try { throw X(); } catch (X x) /* catch by value */ { cout<<"an exception has occurred"; return 1; } /* x is destroyed at this point */ } Therefore, you can throw temporary exception objects safely.
DIFFERENCES BETWEEN VECTOR'S RESERVE() AND RESIZE() MEMBER FUNCTIONS
Although the reserve() and resize() member functions of std::vector perform similar operations, they differ in two respects. Unlike resize(), which allocates memory in the requested size and fills it with initialized elements, reserve() allocates memory in the requested size without initializing it. In addition, reserve() does not change the size (that is, the number of elements) of a vector; it changes only the vector's capacity. Note that both these functions take the number of elements, not bytes, as their argument. For example: vector < int > vi; vi.resize(1000); /* vi now has exactly 1000 int's, all initialized to 0 */ vi.reserve(2000); /* make room for at least 2000 int's without initialization; the size remains 1000 */
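The difference shows up in what size() and capacity() report afterwards. The two checking functions below are hypothetical names used for this sketch only.

```cpp
#include <vector>

// resize() changes the number of elements and initializes the new ones.
bool resize_changes_size()
{
    std::vector<int> vi;
    vi.resize(1000);
    return vi.size() == 1000 && vi[999] == 0;
}

// reserve() changes only the capacity; the vector still has no elements.
bool reserve_changes_only_capacity()
{
    std::vector<int> vi;
    vi.reserve(2000);
    return vi.size() == 0 && vi.capacity() >= 2000;
}
```

After reserve() alone, indexing the vector (for example, vi[0]) is still an error, because no elements exist yet. Use push_back() or resize() to create them.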
OPERATORS CAN BE OVERLOADED ONLY FOR USER-DEFINED TYPES
An overloaded operator must take at least one argument of a user-defined type (operators new and delete are the only exception). This rule ensures that users cannot alter the meaning of expressions that contain only built-in types. For example: int i, j, k; k = i + j; /* uses built-in = and + operators */ Note that C++ differs in this respect from other languages that support operator overloading, such as Ada. Recall that enum types are user-defined types; as such, you can define overloaded operators for them, too.
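For instance, a prefix ++ can be defined for an enumeration so that it wraps around at the last enumerator. The Day type below is a hypothetical example introduced for this sketch.

```cpp
// A user-defined enum type.
enum Day { Mon, Tue, Wed, Thu, Fri, Sat, Sun };

// Prefix increment for Day: advances to the next day, wrapping from
// Sun back to Mon. Legal because Day is a user-defined type.
Day& operator++(Day& d)
{
    d = (d == Sun) ? Mon : static_cast<Day>(d + 1);
    return d;
}
```

Without this overload, the expression ++d would not compile for an enum operand, since the built-in ++ does not apply to enumeration types.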
THE NAMED RETURN VALUE OPTIMIZATION
The named return value (NRV) optimization is an automatic optimization that a compiler applies to eliminate the construction and destruction of an object returned by value from a function. When is the NRV optimization applied? When a temporary object is copied to another object using a copy constructor, and none of these objects is const or volatile, C++ allows the implementation to treat the two objects as one and not perform a copy at all. For example: class A { public: A(); ~A(); A(const A & ); A & operator=(const A & ); }; A f() /* returns A by value */ { A a1; return a1; } A a2 = f(); Theoretically, f() constructs a local object a1 and creates a temporary copy of a1 on the stack when the return statement is executed. Finally, a2 is copy-constructed using that temporary object. This is very inefficient, though. The NRV optimization eliminates the construction of the local object in f() as well as the creation of its temporary copy on the stack. Instead, the return value of f() is constructed directly into the object a2, thereby avoiding the construction and destruction of both the local object a1 and its temporary copy on the stack.
DENNIS RITCHIE'S HOME PAGE
Dennis Ritchie was the creator of C and a co-author of the legendary book "The C Programming Language." His Web site is probably the best virtual museum of the C programming language. The site includes interesting historical documents, pictures, and memos from the early days of the C programming language. Amazingly enough, the original source files of the primeval C compiler (from 1972 or so) are available online! You can find the site at http://cm.bell-labs.com/cm/cs/who/dmr/
DECLARING POINTERS TO DATA MEMBERS
Although the syntax of pointers to members may seem a bit confusing at first, it's consistent and resembles the form of ordinary pointers, with the addition of the class name followed by the operator :: before the asterisk. For example, if an ordinary pointer to int looks like this: int * pi; you define a pointer to an int member of class A like this: int A::*pmi; /* pmi is a pointer to an int member of class A */ You initialize a pointer to a member like this: class A { public: int num; int x; }; int A::*pmi = & A::num; /* 1 */ The statement numbered 1 declares a pointer to an int member of class A and initializes it with the address of the member num. Using pmi and the built-in operator .*, you can examine and modify the value of num in any object of class A: A a1, a2; int n = a1.*pmi; /* copy a1.num to n */ a1.*pmi = 5; /* assign the value 5 to a1.num */ a2.*pmi = 6; /* assign the value 6 to a2.num */ If you have a pointer to A, you need to use the built-in operator ->* instead: A * pa = new A; int n = pa->*pmi; pa->*pmi = 5;
INTRODUCING POINTERS TO MEMBERS
A class can have two general categories of members: function members and data members. Likewise, there are two categories of pointers to members: pointers to member functions, and pointers to data members. The latter are less common, because in general, classes do not have public data members. However, when using legacy C code that contains structs or classes that happen to have public data members, pointers to data members are useful. Pointers to members are one of the most intricate syntactic constructs in C++, and yet, they are a very powerful feature, too. They enable you to invoke a member function of an object without having to know the name of that function. This is very handy when implementing callbacks. Similarly, you can use a pointer to a data member to examine and alter the value of a data member without knowing its name. In future tips, I will discuss pointers to members in further detail and show you how to define and use them.
WHERE DEFAULT ARGUMENTS CAN APPEAR
Default arguments can be specified only in the parameter list of a function declaration or in a template-parameter. This means that default arguments cannot appear in declarations of pointers to functions, pointers to member functions, references to functions, or typedef declarations. For example: void f( int n = 0); /* OK */ /* the following declarations are all ill-formed */ void ( & rf) (int n = 0) = f; /* reference to function */ void ( * pf) (int n = 0) ; /* pointer to function */ typedef void ( * pfi) (int n = 0); /* typedef */
BORLAND'S C++ 5.5 IS NOW GIVEN AWAY--FREE
Inprise (formerly Borland) is now distributing its C++ command-line compiler, Borland C++ 5.5, for free. You can download the compiler here (registration is required): http://community.borland.com/
CORRECTLY DECLARING GLOBAL OBJECTS
Although the use of global variables is not recommended in general, sometimes it's unavoidable. When using a global variable or object, never define it in a header file because the header is usually #included in several separate source files. Consequently, the linker will complain about multiple definitions of the same object. Instead, define the global in a single .cpp file. This way, you ensure that it's defined only once, regardless of the number of source files used in the project. All other source and header files in the project that refer to that global object need to declare it extern. Here is an example: /* File a.h */ /*declaration only; x is defined in another file*/ extern int x; struct Counter { Counter() {++x;} ~Counter() {--x;} }; /* File b.cpp */ int x; /* definition of a global variable */ /* File main.cpp */ #include "a.h" int main() { Counter count; int n = x; } The two source files, b.cpp and main.cpp, are compiled separately. At link time, the linker resolves all references to x to the variable defined in b.cpp.
DECLARING A TYPEDEF
As you may know, typedef names can hide intricate syntactic constructs such as pointers to functions or complex template declarations. However, many novices simply don't know how to write or interpret a typedef declaration. To write a typedef declaration, start by defining a variable of the desired type. Suppose you want to create a typedef name that represents "pointer to int". First, define a pointer to int: int * pi; Then, prepend the keyword "typedef" to the previous declaration: typedef int * pi; Now the name pi serves as a synonym for "pointer to int". You can use pi in any context that requires a pointer to int: pi ptr = new int; Similarly, to create a typedef name that is synonymous with "pointer to function that returns int and takes int", you first declare such a pointer: int (*pfi)(int); In the definition above, pfi is a pointer to a function that returns int and takes int. Now, add the keyword "typedef" before the definition: typedef int (*pfi)(int); And pfi becomes a typedef name for "pointer to function that returns int and takes int". For example: int func(int); void callback( pfi some_func) { int result = some_func(5); }
DECLARING POINTERS TO MEMBER FUNCTIONS
Pointers to member functions consist of the member function's return type, the class name followed by ::, the pointer's name, and the function's parameter list. For example, a pointer to a member function of class A that returns an int and takes no arguments is defined like this (note that both pairs of parentheses are mandatory): class A { public: int func (); }; int (A::*pmf) (); In other words, pmf is a pointer to a member function of class A that returns int and takes no arguments. In fact, a pointer to a member function looks just like an ordinary pointer to a function, except that it also contains the class's name immediately followed by the :: operator. Using operator .*, you invoke the member function to which pmf points like this: pmf = & A::func; A a; (a.*pmf)(); /* invoke a.func() */ If you have a pointer to an object, you use the operator ->* instead: A *pa = & a; (pa->*pmf)(); /* calls pa->func() */ Pointers to member functions respect polymorphism. Thus, if you call a virtual member function through such a pointer, the call will be resolved dynamically. Note that you can't take the address of a class's constructor(s) and destructor.
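The claim about polymorphism can be checked with a short sketch. The classes Shape and Triangle, the typedef SidesFn, and the call_through() helpers are all hypothetical names introduced for this illustration.

```cpp
class Shape {
public:
    virtual ~Shape() {}
    virtual int sides() const { return 0; }
};

class Triangle : public Shape {
public:
    virtual int sides() const { return 3; }
};

// Pointer to a const member function of Shape that returns int.
typedef int (Shape::*SidesFn)() const;

// Invocation through an object (or reference) uses operator .*
int call_through(const Shape& s, SidesFn pmf)
{
    return (s.*pmf)();
}

// Invocation through a pointer uses operator ->*
int call_through(const Shape* ps, SidesFn pmf)
{
    return (ps->*pmf)();
}
```

Although the pointer is initialized with & Shape::sides, calling it on a Triangle object invokes Triangle::sides(), because the call is resolved dynamically.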
DON'T CONFUSE DELETE WITH DELETE[]
There's a common myth among programmers that it's okay to use delete instead of delete[] to release arrays of built-in types. For example, int *p = new int[10]; delete p; /* bad; should be: delete[] p */ This is totally wrong. The C++ standard specifically says that using delete to release dynamically allocated arrays of any type yields undefined behavior. The fact that, on some platforms, applications that use delete instead of delete[] don't crash can be attributed to sheer luck: Visual C++, for example, implements both delete[] and delete for built-in types by calling free(). However, there is no guarantee that future releases of Visual C++ will adhere to this convention. Furthermore, there's no guarantee that this code will work on other compilers. So, using delete instead of delete[], and vice versa, is precarious and should be avoided.
SOME DESIGN AND CODING RULES OF THUMB
Writing code from scratch, without any prior design or plan, is bad programming practice. However, the opposite--over-engineering--can be just as harmful. Over-engineering is the use of costly, redundant, or "cute" features that aren't truly necessary or that have simpler and more efficient alternatives. A good example of this is using exception handling as an alternative to while and for statements. Similarly, templatizing a class or function that is never used for more than a single type is an expensive and redundant task. "Cute" features and puns such as macro trickery can also result in indecipherable code. However, the worst of all is probably reinventing the wheel. Writing custom container classes and algorithms instead of using the ones that are readily available in C++ (in the form of STL) is expensive, bug prone, and totally uncalled for. One of the phases of design and code reviews should consist of catching and sifting out such instances of over-engineering, wheel reinvention, and "poetry."
STATIC INITIALIZATION AND DYNAMIC INITIALIZATION
C++ distinguishes between two initialization types for objects with static storage duration (global objects, local static objects, and objects declared in namespace scope have static storage). Static initialization consists of either zero-initialization or initialization with a constant expression; any other initialization is considered dynamic initialization. These two types roughly correspond to compile-time initialization and runtime initialization. For example: int x = func(); int main() { } The global variable x has static storage. Therefore, it's initialized to 0 at the static initialization phase (by default, objects with static storage are zero-initialized). The subsequent dynamic initialization phase initializes x with the value returned from the function func(). This operation can take place only at runtime. In other words, objects with static storage may be initialized twice: once by static initialization, during which their memory storage is initialized to binary zeroes, and afterwards, when their constructors perform additional dynamic initialization.
THE LINKAGE TYPE OF GLOBAL CONST OBJECTS
In C++ (but not in C), a const object declared in the global scope has internal linkage by default. This means that a const object that looks like a global one is visible only within its source file; it isn't visible from other files unless you explicitly declare it extern:
/* File a.cpp */
const int x = 0;
/* File b.cpp */
const int x = 2; /* a different x */
/* File main.cpp */
int main() { }
Both a.cpp and b.cpp define a const variable called x. However, the two definitions don't clash because they refer to distinct variables, each of which has a different value and is visible only from the scope of its source file. If you remove the const qualifier from the definition of both x's, recompile all the source files, and relink them, you will receive a linker error similar to this one:
Public symbol x defined in both module a.obj and b.obj
This is because x has external linkage when it's not declared const. As a result, the two definitions of x clash. There are two lessons from this example. First, if you wish to make a const object declared in the global scope globally accessible, you must declare it extern. Second, avoid declaring nonlocal const objects static, as in
static const int x = 0;
This is redundant, because the object already has internal linkage. (C++98 also deprecated this use of static at namespace scope, although C++11 later withdrew that deprecation.)
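The first lesson can be sketched as follows. The names max_users, buffer_size, and total_capacity are hypothetical, and the file layout described in the comments is an assumption made for illustration; everything is shown in one file so that it compiles on its own.

```cpp
// Imagine this is a.cpp. Another source file that needs max_users
// would contain the matching declaration:
//     extern const int max_users;

extern const int max_users = 100; // extern gives this const external linkage
const int buffer_size = 512;      // no extern: internal linkage, private to this file

// Uses both constants; only max_users can be named from other files.
int total_capacity()
{
    return max_users * buffer_size;
}
```

Another file may define its own buffer_size without any clash, but a second definition of max_users elsewhere would cause a linker error.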
THE UNDERLYING REPRESENTATION OF POINTERS TO MEMBERS
Although pointers to members behave very much like ordinary pointers, behind the scenes, they aren't necessarily represented as pointers. In fact, a pointer to a member is usually a struct that can contain up to four members. This is because pointers to members don't support only ordinary member functions; they also support virtual member functions, member functions of objects that have multiple base classes, and member functions of virtual base classes. A pointer to the simplest member function can be represented as a set of two fields: one holding the physical memory address of the function, and a second that holds the this pointer (in practice, usually an offset used to adjust it). However, in cases like a virtual member function and multiple inheritance, the pointer to a member must store additional information; hence, its underlying structure gets larger and more complex. Therefore, you cannot cast pointers to members to ordinary pointers, nor can you safely cast a pointer to a member of one type to another; in both cases, the results are undefined. To get an idea of how your compiler represents pointers to members, you can measure their size. In the following example, the pointer to a data member and the pointer to a member function have different sizes and, hence, different representations (of course, the results may vary across platforms and compilers):
struct A
{
  int x;
  void f() { }
};
int A::*pmi = &A::x;
void (A::*pmf)() = &A::f;
int n = sizeof(pmi); /* 8 bytes on my machine */
int m = sizeof(pmf); /* 12 bytes on my machine */
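For readers who haven't used pointers to members much, here is a small sketch that both uses them and reports their sizes next to an ordinary pointer. The member function twice and the helpers read_through_pointers and print_sizes are invented for this example, and the sizes printed depend entirely on your compiler and platform, so no particular numbers should be expected.

```cpp
#include <iostream>

struct A
{
    int x;
    int twice() const { return 2 * x; }
};

int A::*pmi = &A::x;                 // pointer to a data member
int (A::*pmf)() const = &A::twice;   // pointer to a member function

// Reads x and calls twice() on the given object through the pointers.
int read_through_pointers(const A& a)
{
    return a.*pmi + (a.*pmf)();
}

// Prints the size of each kind of pointer; call it from main().
void print_sizes()
{
    std::cout << "ordinary pointer:           " << sizeof(void*) << '\n'
              << "pointer to data member:     " << sizeof(pmi) << '\n'
              << "pointer to member function: " << sizeof(pmf) << '\n';
}
```

Note that a pointer to a member is never used on its own: the .* and ->* operators always combine it with an object, which is one reason it cannot be treated as an ordinary address.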
Most tips are from TipWorld (http://www.tipworld.com), "The Internet's #1 Source for Computer Tips, News, and Gossip."
Danny Kalev is a system analyst and software engineer with more than ten years of experience, specializing in C++ and object-oriented analysis and design. He is a member of the ANSI C++ standardization committee and the author of The ANSI/ISO C++ Professional Programmer's Handbook (Que, 1999).