12 Jun 2006, 02:11

Initializing CComBSTRs

Anyone who’s done any programming in C/C++ knows about the dangers of using an uninitialized primitive data type. There’s nothing quite like performing a calculation on a double with a junk value, or dereferencing an uninitialized pointer that happens to point at some random block of memory and so makes it past a NULL check. However, there are also cases where attempting to initialize an object that has a default constructor can cause you some trouble if you’re not careful (in addition to being unnecessary).

Take, for example, the Microsoft ATL class CComBSTR. CComBSTR is a wrapper class around the BSTR data type, which is a Microsoft-specific type of Unicode string. It’s basically just a pointer (a wchar_t*, or an unsigned short* on older compilers) to a contiguous block of characters in memory, much like the good old C-style char* strings. The difference is that, besides ending in a null terminator, a BSTR has its length (in bytes) stored in the 4 bytes preceding the address that the pointer references. As with char* strings, you’ve got to handle the memory management yourself, so the CComBSTR wrapper keeps a BSTR member variable and takes care of the allocation and deallocation for you. However, it makes that member variable public, which is useful in some cases but can also be dangerous if you’re not careful.

Which brings me back to the problem of initialization. If you declare a CComBSTR and attempt to initialize it to NULL like:

CComBSTR sTheString = NULL;

while you might assume that this would just set the BSTR member variable to NULL, what it actually does is allocate a BSTR of 0 length. That empty string still takes up memory: 2 bytes for the null terminator plus the 4-byte length field, so 6 bytes before any allocator overhead. This isn’t a problem in and of itself, since if you just use the CComBSTR correctly it will deallocate that memory any time you assign another string to the CComBSTR, or when it goes out of scope and is destroyed. What can be a problem, though, is passing the address of the BSTR member variable to a function that takes a BSTR*, as most COM interface methods/properties that return strings do. For example:

pSomeCOMInterface->get_String(&sTheString.m_str);

If the CComBSTR had just been initialized with its default constructor, you’d be fine: m_str would start out as NULL, and the BSTR copied into it would be deallocated when the CComBSTR was destroyed or reassigned. In this case, though, the call overwrites the only pointer the CComBSTR had to the 0 length string without freeing it first, and that causes a memory leak. So in addition to being unnecessary, the initialization to NULL actually made it easier to leak memory, since you wouldn’t expect that CComBSTR to have allocated anything. Sure, it’s only a few bytes, but depending on how often that code runs, a tiny leak like that can be just as bad as losing track of some massive chunk of memory.
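To make the mechanics concrete, here’s a rough, portable sketch of the leak. It does not use the real Windows or ATL headers: MockBstr, MockAllocStringLen, MockFreeString, MockStringLen, MiniBstr and MockGetString are all hypothetical stand-ins I made up for illustration, and the allocation counter is just a crude way to spot a leak. The only assumption about the real thing is the layout described above (a 4-byte length prefix, the characters, then a terminator) and a wrapper with a public m_str that frees in its destructor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Everything here is a hypothetical, portable stand-in. None of it is the
// real Windows or ATL API.
using MockBstr = char16_t*;

static int g_liveAllocations = 0;  // mock BSTRs allocated but not yet freed

// Layout: [4-byte length in bytes][len characters][2-byte terminator].
// The returned pointer addresses the first character, not the prefix.
MockBstr MockAllocStringLen(const char16_t* src, std::uint32_t len) {
    const std::size_t total = sizeof(std::uint32_t) + (len + 1u) * sizeof(char16_t);
    unsigned char* raw = static_cast<unsigned char*>(std::calloc(1, total));
    if (raw == nullptr) return nullptr;
    const std::uint32_t byteLen = static_cast<std::uint32_t>(len * sizeof(char16_t));
    std::memcpy(raw, &byteLen, sizeof byteLen);             // write the length prefix
    unsigned char* chars = raw + sizeof byteLen;
    if (src != nullptr && len > 0) std::memcpy(chars, src, byteLen);
    ++g_liveAllocations;                                    // terminator is already zero (calloc)
    return reinterpret_cast<MockBstr>(chars);
}

void MockFreeString(MockBstr s) {
    if (s == nullptr) return;  // freeing NULL is a no-op
    std::free(reinterpret_cast<unsigned char*>(s) - sizeof(std::uint32_t));
    --g_liveAllocations;
}

// Reads the character count back out of the prefix in front of the pointer.
std::uint32_t MockStringLen(MockBstr s) {
    if (s == nullptr) return 0;
    std::uint32_t byteLen = 0;
    std::memcpy(&byteLen, reinterpret_cast<unsigned char*>(s) - sizeof byteLen, sizeof byteLen);
    return static_cast<std::uint32_t>(byteLen / sizeof(char16_t));
}

// Minimal wrapper in the spirit of CComBSTR: public m_str, frees on destruction.
struct MiniBstr {
    MockBstr m_str;
    MiniBstr() : m_str(nullptr) {}                                        // allocates nothing
    explicit MiniBstr(int nSize)
        : m_str(MockAllocStringLen(nullptr, static_cast<std::uint32_t>(nSize))) {}
    ~MiniBstr() { MockFreeString(m_str); }
    void Empty() { MockFreeString(m_str); m_str = nullptr; }
    MiniBstr(const MiniBstr&) = delete;
    MiniBstr& operator=(const MiniBstr&) = delete;
};

// Mimics a COM-style getter: overwrites *out with a fresh string and never
// frees whatever *out pointed to before.
void MockGetString(MockBstr* out) {
    assert(out != nullptr);
    static const char16_t kValue[] = u"hello";
    *out = MockAllocStringLen(kValue, 5);
}

// Each function returns how many allocations are still live after the
// wrapper has gone out of scope, i.e. how many strings were leaked.
int LeaksWithDefaultCtor() {
    const int before = g_liveAllocations;
    { MiniBstr s; MockGetString(&s.m_str); }
    return g_liveAllocations - before;
}

int LeaksWithZeroLengthInit() {
    const int before = g_liveAllocations;
    { MiniBstr s(0); MockGetString(&s.m_str); }   // stands in for "= NULL"
    return g_liveAllocations - before;
}

int LeaksWithEmptyFirst() {
    const int before = g_liveAllocations;
    { MiniBstr s(0); s.Empty(); MockGetString(&s.m_str); }
    return g_liveAllocations - before;
}
```

In this mock, LeaksWithDefaultCtor() returns 0, LeaksWithZeroLengthInit() returns 1 (the orphaned empty string), and LeaksWithEmptyFirst() returns 0. The last case corresponds to calling CComBSTR’s Empty() before handing &m_str to a getter, which is a reasonable habit whenever you can’t be sure the wrapper is still empty.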