memset() Considered Harmful
I've recently enhanced the Pantheios logging API library for full wide-string support, and found that hubris still walks tall in my coding practice!
There are a host of reasons for not using
memcpy(), and another host of scenarios in which its use is, at best, done with great care. After all these years, I've pretty much got them down pat, and can't remember the last time I experienced a
Unfortunately, it seems like this pattern has made me over-confident with some other members of the
mem*()-family. Specifically, I've had a somewhat unpleasant experience with my use of
Inside some of Pantheios' inserter classes' implementation files,
memset are being used to prepare insertion strings. The following code is from the pantheios::integer's implementation, which all worked fine in a multibyte build:
memcpy(&buffer, szTemp, static_cast<size_t>(n) * sizeof(pan_char_t)); memset(&buffer + n, ' ', static_cast<size_t>(width - n) * sizeof(pan_char_t)); buffer[width] = '\0';
As you can see, I've anticipated the eventual wide string build support by coding in terms of Pantheios' character type
pan_char_t, and remembering to use
sizeof() when calculating the number of bytes (as opposed to characters) to manipulate. Consequently, the first line, using
memcpy() is quite correct.
The problem is the use of
memset(). Perhaps you're already ahead of me, and see the defect: The second line, which right-fills the string buffer with spaces, uses
memset(), which operates on bytes. Doh! Even though I've calculated the right number of bytes to fill, the wide characters are actually being written 0x2020, rather than the intended 0x0020.
Now, since Pantheios uses a lot of automated tests, this defect was caught well before release, and not exposed to any users. But still, I was irritated - not to say a little embarassed - at having to spend nearly an hour to diagnose and correct something that should never have been in the first place.
Since the offending uses of
memset() were all in C++ implementation files, a better alternative to is to use
std::fill_n(), as in:
memcpy(&buffer, szTemp, static_cast<size_t>(n) * sizeof(pan_char_t)); <strong>std::fill_n(&buffer + n, static_cast<size_t>(width - n), ' ');</strong> buffer[width] = '\0';
So what're the lessons here? I guess they're (i) ensure you have comprehensive test suites (and you use them), and (ii) avoid low-level memory management routines when you can, and (iii) avoid
memset() in particular.
Obviously, this is a bit awkward for me. Having written numerous articles, books and columns on C++, and making it a point of pride never to have had a system fault in production, making such a fundamental gaff doesn't look too impressive. Hopefully my embarassment can be of use to you, gentle readers, as it is now to me: I've spent the evening grepping and reviewing every use of
memset() in my codebases. Thankfully, I've found no other significant gaffes. Nonetheless, I'll be using
std::fill_n() or equivalent from here on in.