String handling is one of the most error-prone
aspects of programming in C and C++. Errors in dealing with strings account for most of
the buffer overruns that result in security problems. In many languages, a string is an
elementary type, and several of the issues that cause problems in C and C++, such as
buffer overruns and problems with illegal pointers, don't occur as easily in these other
languages. Perhaps if C had been designed with a built-in string type, we would have fewer
problems with strings.
Let's examine strings and take a look at three C library calls that can compromise the
security of your code. Don't despair: I'll also introduce you to the Standard Template
Library (STL) and explain how it can help you avoid some of these security vulnerabilities.
As I pointed out last time, this column assumes that you have a basic familiarity with
programming in C.
What's a String?
A string is a series of characters ending with a null (\0) character, which tells the program where the string terminates. A Unicode string is a series of wide characters (WCHAR) that also terminates with a null character. At the lower levels (e.g., kernel level) of Windows 2000 (Win2K) and Windows NT, strings are often represented by the UNICODE_STRING type. This structure explicitly stores the length of the string and the maximum size of its buffer. Dealing with kernel-level code is beyond the scope of this article, but you should be aware that this approach is another way of handling strings. Almost without exception, the C library calls that deal with single-byte characters have wide-character equivalents for Unicode strings, and the same pitfalls apply to both single-byte and Unicode strings. Let's begin by examining some of the available library calls, starting with strcpy(). . . .
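Before we get to the library calls, it may help to see the two representations described above side by side. What follows is only a minimal sketch, under stated assumptions: counted_string, terminated_length(), and counted_assign() are hypothetical names invented for illustration. The struct borrows the idea behind UNICODE_STRING (a stored length plus a maximum buffer size) but is not the real Windows definition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counted-string type for illustration only. It mimics the
   idea of storing a length and a maximum size next to the buffer; it is
   NOT the actual UNICODE_STRING definition. */
typedef struct {
    size_t length;    /* characters currently in use */
    size_t capacity;  /* total characters the buffer can hold */
    char  *buffer;    /* storage; not necessarily null-terminated */
} counted_string;

/* Walk a null-terminated string until the '\0' is found. This is the
   same job strlen() does: the only way to learn the length is to scan. */
size_t terminated_length(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (s[n] != '\0')
        n++;
    return n;
}

/* Copy src into dst only if it fits within dst's recorded capacity.
   Returns 1 on success and 0 if src is too long, in which case dst is
   left unchanged. */
int counted_assign(counted_string *dst, const char *src)
{
    size_t n;

    assert(dst != NULL && src != NULL);
    n = strlen(src);
    if (n > dst->capacity)
        return 0;               /* refuse rather than overrun the buffer */
    memcpy(dst->buffer, src, n);
    dst->length = n;
    return 1;
}
```

With these definitions, terminated_length("Hello") returns 5, while sizeof("Hello") is 6 because the literal also stores the terminating null. A counted_string backed by an 8-character buffer accepts "Hello" but rejects a 12-character string, because the structure knows its own limit. A bare null-terminated buffer carries no such information, which is why the caller must keep track of the buffer size separately.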

