Overloaded Return Values Cause Bugs
In my secure programming class I have denounced the overloading of return values as a bad practice, and recently I came across a new, concrete example of a bug (possibly a vulnerability) that results from this widespread practice. A search in the NVD and some googling didn’t reveal any mention of similar issues, but I have probably just missed them (I have trouble imagining that nobody has pointed this out before). In any case, it may be worth repeating with this example.
The practice is to return a negative value to indicate an error, whereas a positive value has meaning, e.g., a length. Imagine a function looking like this:
    int bad_idea(char *buf, unsigned int size)
    {
        int length;

        if (<some_error_condition>) {
            length = -ERROR_CODE;
        } else {
            /* substitute any operations that could overflow the signed int;
               if size > INT_MAX, this conversion alone already yields an
               implementation-defined (typically negative) value */
            length = size;
        }
        return length;
    }
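To make the failure concrete, here is a self-contained sketch (the stand-in error condition, names, and values are mine, not from any real code base). The caller receives a negative “error” even though the error branch never executes:

    #include <limits.h>
    #include <stdio.h>

    #define ERROR_CODE 1

    int bad_idea(char *buf, unsigned int size)
    {
        int length;
        if (buf == NULL) {   /* stand-in for <some_error_condition>; never true here */
            length = -ERROR_CODE;
        } else {
            length = size;   /* implementation-defined when size > INT_MAX;
                                wraps to a negative value on typical systems */
        }
        return length;
    }

    int main(void)
    {
        char buf[16];
        /* any size above INT_MAX triggers the problem */
        int length = bad_idea(buf, (unsigned int)INT_MAX + 1u);

        if (length < 0)
            printf("spurious error: %d\n", length);  /* fires with no real error */
        return 0;
    }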
This function can return bogus error values: any size greater than INT_MAX becomes a negative int on the way out, so the caller sees an “error” that never happened. Under the right conditions this could result in at least a DoS (imagine that this is a security-related function, e.g., for authentication). I suggest using separate channels to return error codes and meaningful values (see the sketch after the list below). Doing this lowers complexity in the assignment and meaning of the return value by removing the multiplexing. As a result:
- There is added complexity in the function prototype, but the decreased ambiguity inside the function is worth it. When the rest of the code uses unsigned integers, the likelihood of a signed/unsigned conversion mistake or an overflow is high. In the above example, the function is also defective because it cannot correctly process some valid inputs (the input is unsigned, yet the output needs to be able to return the same range), so in reality there is no choice but to decouple the error codes from the semantic results (the length). This discrepancy is easier to catch when the ambiguity of the code is decreased.
- It does away with the bad practice of using the same variable for two purposes: assigning negative error codes to a “length” is jarring.
- It disambiguates the purpose and meaning of checking the “returned” values (I’m including values passed by reference, loosely using the word “returned”). Is a given check there to catch an error, or is it a semantic check that’s part of the business logic?
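Here is the decoupled shape I have in mind, as a minimal sketch (the function and parameter names are illustrative, and the error condition is again a stand-in):

    #include <stddef.h>

    #define ERROR_CODE 1

    /* The return value only ever carries a status; the out-parameter only
       ever carries a length, so the full unsigned range stays representable. */
    int better_idea(char *buf, size_t size, size_t *length_out)
    {
        if (buf == NULL) {          /* stand-in for <some_error_condition> */
            return -ERROR_CODE;
        }
        *length_out = size;
        return 0;
    }

A check like if (better_idea(buf, size, &length) != 0) now unambiguously tests for an error, and a length can never masquerade as one.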
Integer overflows are a well-known problem; however, in this case they are more a symptom of conflicting requirements. The incompatible constraints of having to return a negative integer for errors and an unsigned integer otherwise are really to blame; the type mismatch (the “overflow”) is inevitable given those. My point is that the likelihood of developers getting confused and introducing bugs, for example by not realizing that they face incompatible constraints, is higher when the return value serves a dual purpose.
Edit (10/13): It just occurred to me that vsnprintf and snprintf are in trouble, on both BSD and Linux. They return an int, but take an unsigned integer (size_t) as the size input, AND are supposed to return either the number of printed characters or a negative number if there’s an error. Even if the size can be represented as a signed integer, they return the number of characters that *would* have been printed had the size been unlimited, so specifying a small size isn’t a fix. So it *might* be possible to make these functions seemingly return errors when none occurred.
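To illustrate the tight spot this puts callers in, here is what a defensive caller might look like (my sketch, assuming C99 semantics; the helper name is made up):

    #include <stdio.h>

    void format_checked(const char *input)
    {
        char buf[64];
        int n = snprintf(buf, sizeof buf, "%s", input);

        if (n < 0) {
            /* a genuine encoding error, or, per the concern above, possibly a
               would-be output length that does not fit in an int */
        } else if ((size_t)n >= sizeof buf) {
            /* truncated: n is the length that *would* have been written */
        } else {
            /* buf holds the complete output */
        }
    }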
Note (10/14): I’ve been checking the actual code for snprintf. In FreeBSD, it checks both the passed size and the return value against INT_MAX, and appears safe.
on Monday, October 13, 2008 at 03:47 PM