Breaking cin for fun and profit

This post was originally written for students in Bunker Hill’s CIT 120 course.

Breaking the cin object for validation

As you should know, cin is an object. Specifically, it is an object of type istream (or, an instance of the istream class). For those who know anything about C (as opposed to C++), it is very roughly equivalent to stdin from the cstdio (stdio.h) library.

In C++, classes can overload operators. This means that whoever programmed the class is able to determine the behavior of an operator when it works with objects of that class. An example of this is the addition (+) operator and a string object; the operator is overloaded to concatenate strings (whereas for the built-in numeric types, it simply adds values).
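
As a rough illustration of what overloading looks like from the programmer's side, here is a minimal sketch. The Money type and the joinWords function are made up for this example; only the + operator for string objects comes from the standard library.

```cpp
#include <string>

// Hypothetical type, invented only to illustrate operator overloading.
struct Money {
    int cents;
};

// Overload + so that two Money values can be added like built-in numbers.
Money operator+(const Money& a, const Money& b) {
    Money result;
    result.cents = a.cents + b.cents;
    return result;
}

// std::string already overloads + to mean concatenation.
std::string joinWords(const std::string& a, const std::string& b) {
    return a + " " + b;
}
```

With these in place, adding two Money values calls the operator+ defined above, while joinWords("stream", "operators") glues the two words together with a space.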

In fact, this is how the stream extraction operator (>>) and stream insertion operator (<<) work. By default, in C++, they are not stream operators at all, but the “bitshift right” and “bitshift left” operators; they are overloaded when used with stream objects.

What does this have to do with “breaking” cin? Well, behind the scenes, the cin object has a series of flags. Flags are simple values that can be either “1” or “0” – they are similar to the boolean type, but only take up one bit, rather than a whole byte. Flags usually represent the “status” of the object. The ones that concern us here are badbit and failbit. If either of those bits is set, the cin object is considered “broken.” Any attempt to use the cin object’s input methods (such as cin.ignore() or cin.getline()) or the stream extraction operator (>>) won’t work at all until the flags are cleared.

This will happen when cin attempts to extract data into a variable type that won’t support that data. For example, say you had this code:

int x;
cin >> x;

Now, let’s say the user enters “B”. There’s no way to interpret the letter “B” as an integer, so the cin object’s failbit is set to 1, and it will stop working.

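One thing to remember is that when cin fails, the offending characters are not removed from the input stream. In the example above, “B” will still be in the input stream. As for the variable: under C++11 and later, a failed extraction sets x to 0; under older standards x is left untouched, so an uninitialized x will hold logical garbage. Either way, don’t rely on its value.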

In order to “clear out” that flag, and get the cin object to work again, you need to call the clear() method; it is usually called with no arguments (this “fixes” the cin object). Note, however, that the clear() method only clears out the flags; it doesn’t actually do anything to the input stream. In other words, that “B” will still be in the stream, just waiting to break the cin object again. So, after calling clear(), you must also call ignore() to get rid of the bad input:

cin.clear();
cin.ignore(80, '\n');

Now, of course you don’t want to do this if the cin object doesn’t fail – that is, if the user did in fact enter the “correct” input. So, we need a way to check to see if the cin object’s failbit is set; or, colloquially, if cin is broken.

The cin object does have a method to test this, called fail(). It also takes no arguments. So, you could do this:

if ( cin.fail() ) {
  cin.clear();
  cin.ignore(80, '\n');
}

…but, in fact, there’s a much easier technique. And that is where operator overloading comes in.

In addition to the stream operators, the cin object (and, indeed, all stream objects) overloads the logical not (!) operator, and can also be evaluated as boolean “true” or “false” when the object itself is used as a condition. In the case of the cin object, the operator is overloaded in a useful way: the “not” operator returns true if cin is “broken,” and false otherwise. (It operates exactly the same as the fail() method.) In other words, you can test to see if cin is working simply by testing the object itself:

if ( !cin ) {
  cin.clear();
  cin.ignore(80, '\n');
}

This is very useful for input validation. Remember that when validating input, you should use a while-loop to test if the input is invalid, and skip the loop if it’s not. So, you could get the input from the user, then test the cin object to see if it is “broken,” and handle the situation if it is. But there’s an even easier way.

Remember that the stream extraction operator may be chained:

cin >> x >> y;

Once an extraction has been attempted, whether or not it succeeded, the result of the operation is the cin object itself. This means that we can extract the value from the stream, and validate the cin object, in one fell swoop.

One final note: Even if the operation is successful, there might still be characters left in the stream. Say the user entered “123ABC” – 123 would be streamed into x, but “ABC” would still be in the input stream. You almost certainly don’t want that, so you will have to make a call to cin.ignore() right after the body of the while loop.

Here’s the complete code:

while ( !(cin >> x) ) {
  cin.clear();
  cin.ignore(80, '\n');
  cout << "Hey, that wasn't a number! Try again, jerk: ";
}
cin.ignore(80, '\n');

Breaking cin.getline()

This issue is not documented in any book I’ve read, nor in any of the C++ websites I’ve seen.

Above, I detailed how to “break” the cin object for validation. What most people do NOT realize is that this happens with cin.getline() as well.

It’s a familiar scenario: you have a cstring (null-terminated char array), and you allocate enough memory to hold a line of text in a typical console, which is 80 characters. You then use cin.getline(theCString, 80, '\n') to read a line into the theCString char array.

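But what happens in the unlikely event that a user enters more characters than will fit? (With an 80-element array, that means more than 79, since one element is reserved for the null terminator.) Or, if you only want the array to hold, say, 5 characters?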

It depends.

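If you are using the latest version of MinGW (which comes with Dev-C++ among others) or Visual Studio 2010, then you are going to break the cin object. This is the behavior the C++ standard describes: failbit is set when the array fills up before the delimiter is found. You must check cin.fail() and, if it is true, call cin.clear() followed by cin.ignore(), after every single call to cin.getline().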

On the other hand, if you are using an older compiler (e.g. Visual Studio 6 or Visual Studio 2008), cin will not break.

Either way, once cin is “up and running,” the characters will still be in the input stream. If you don’t want them, you must make a call to cin.ignore() or similar.

Of course, you could just use string objects, and use the getline() function from the string library. Since that version of getline() doesn’t specify the number of characters to read, this problem doesn’t occur.