This post was originally written for students in Bunker Hill’s CIT 120 course.
Breaking the cin object for validation
As you should know, cin is an object. Specifically, it is an object of type istream (or, an instance of the istream class). For those who know anything about C (as opposed to C++), it is very roughly equivalent to stdin from the cstdio (stdio.h) library.
In C++, classes can overload operators. This means that whoever programmed the class is able to determine the behavior of an operator when it is applied to objects of that class. An example of this is the addition (+) operator and string objects; the operator is overloaded to concatenate strings (whereas for the built-in numeric types, it simply adds the values).
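To make the idea concrete, here is a minimal sketch of operator overloading. The Money type and the fullName() function are made up for this illustration; they are not part of any library or of the course material.

```cpp
#include <string>

// Hypothetical example type, invented for this sketch.
struct Money {
    long cents;
};

// Overload + so that two Money values can be added with ordinary syntax.
Money operator+(const Money& a, const Money& b) {
    return Money{a.cents + b.cents};
}

// The same + symbol concatenates when the operands are std::string objects,
// because the string library overloads it that way.
std::string fullName(const std::string& first, const std::string& last) {
    return first + " " + last;
}
```

With these definitions, Money{150} + Money{275} produces a Money holding 425 cents, and fullName("Ada", "Lovelace") produces one string with a space in the middle. Same symbol, different behavior, depending on the types involved.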
In fact, this is how the stream extraction operator (>>) and stream insertion operator (<<) work. By default, in C++, they are not stream operators at all, but the “bitshift right” and “bitshift left” operators; they are overloaded when used with stream objects.
What does this have to do with “breaking” cin? Well, behind the scenes, the cin object has a series of flags. Flags are simple values that can be either “1” or “0” – they are similar to the boolean type, but only take up one bit, rather than a whole byte. Flags usually represent the “status” of the object. The ones that concern us here are badbit and failbit. If either of those bits is set, the cin object is considered “broken.” Any attempt to use the cin object’s methods (such as cin.ignore() or cin.getline()) or the stream extraction operator (>>) won’t work at all.
This will happen when cin attempts to extract data into a variable type that won’t support that data. For example, say you had this code:
int x; cin >> x;
Now, let’s say the user enters “B”. There’s no way to interpret the letter “B” as an integer, so the cin object’s failbit is set to 1, and it will stop working.
One thing to remember is that when cin fails this way, the offending characters are not removed from the input stream: in the example above, “B” will still be there. Don’t rely on the value of x, either. Under older compilers it is left unchanged (so, if you never initialized it, it holds logical garbage); under C++11 and later, it is set to 0.
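If you want to see this for yourself without typing at the keyboard, here is a rough sketch that uses an istringstream as a stand-in for cin (it is also an input stream, so it carries the same flags). The ReadReport type and the tryReadInt() function are hypothetical names invented for this demonstration.

```cpp
#include <sstream>
#include <string>

// Hypothetical record of what happened during the experiment.
struct ReadReport {
    bool failedAfterRead;   // was the stream "broken" right after >> ?
    bool secondReadWorked;  // did another >> work without calling clear()?
    char nextChar;          // after a failure: the character still waiting in the stream
};

ReadReport tryReadInt(const std::string& typed) {
    std::istringstream in(typed);   // stands in for cin
    ReadReport report{false, false, '\0'};

    int x = 0;
    in >> x;                        // the extraction under test
    report.failedAfterRead = in.fail();

    // Try to read again WITHOUT repairing the stream first.
    std::string word;
    report.secondReadWorked = static_cast<bool>(in >> word);

    if (report.failedAfterRead) {
        in.clear();                 // reset the flags so we can look at the stream
        report.nextChar = static_cast<char>(in.peek());
    }
    return report;
}
```

Calling tryReadInt("B\n") should report a failure, a second read that did not work, and 'B' as the character still sitting in the stream. Calling tryReadInt("42 rest") should report no failure, and the second read picks up the next word.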
In order to “clear out” that flag, and get the cin object to work again, you need to call the clear() method; it is usually called with no arguments (this “fixes” the cin object). Note, however, that the clear() method only clears out the flags; it doesn’t actually do anything to the input stream. In other words, that “B” will still be in the stream, just waiting to break the cin object again. So, after calling clear(), you must also call ignore() to get rid of the bad input:
cin.clear(); cin.ignore(80, '\n');
Now, of course you don’t want to do this if the cin object doesn’t fail – that is, if the user did in fact enter the “correct” input. So, we need a way to check to see if the cin object’s failbit is set; or, colloquially, if cin is broken.
The cin object does have a method to test this, called fail(). It also takes no arguments. So, you could do this:
if ( cin.fail() ) {
    cin.clear();
    cin.ignore(80, '\n');
}
…but, in fact, there’s a much easier technique. And that is where operator overloading comes in.
In addition to the stream operators, the cin object (and, indeed, all stream objects) overloads the logical not (!) operator. Normally, ! is applied to a boolean value; because the stream class overloads it, you can apply it directly to the stream object. In the case of the cin object, the operator is overloaded in a useful way: the “not” operator returns true if cin is “broken,” and false otherwise. (It returns exactly what the fail() method returns.) Stream objects also define a conversion that lets the object itself be used as a condition, which gives the opposite answer. In other words, you can test to see if cin is working simply by testing the object itself:
if ( !cin ) {
    cin.clear();
    cin.ignore(80, '\n');
}
This is very useful for input validation. Remember that when validating input, you should use a while-loop to test if the input is invalid, and skip the loop if it’s not. So, you could get the input from the user, then test the cin object to see if it is “broken,” and handle the situation if it is. But there’s an even easier way.
Remember that the stream extraction operator may be chained:
cin >> x >> y;
Once an extraction has been attempted, whether or not it succeeded, the result of the operation is the cin object itself. This means that we can extract the value from the stream, and validate the cin object, in one fell swoop.
One final note: Even if the operation is successful, there might still be characters left in the stream. Say the user entered “123ABC” – 123 would be streamed into x, but “ABC” would still be in the input stream. You almost certainly don’t want that, so you will have to make a call to cin.ignore() right after the body of the while loop.
Here’s the complete code:
while ( !(cin >> x) ) {
    cin.clear();
    cin.ignore(80, '\n');
    cout << "Hey, that wasn't a number! Try again, jerk: ";
}
cin.ignore(80, '\n');
Breaking cin.getline()
This issue is not documented in any book I’ve read, nor in any of the C++ websites I’ve seen.
Above, I detailed how to “break” the cin object for validation. What most people do NOT realize is that this happens with cin.getline() as well.
It’s a familiar scenario: you have a cstring (null-terminated char array), and you allocate enough memory to hold a line of text in a typical console, which is 80 characters. You then use cin.getline(theCString, 80, '\n') to read a line into the theCString char array.
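But what happens in the unlikely event that a user enters more characters than will fit? (With a size of 80, that means more than 79, since one slot is reserved for the terminating null.) Or, if you only want the array to hold, say, 5 characters?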
It depends.
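If you are using the latest version of MinGW (which comes with Dev-C++ among others) or Visual Studio 2010, then you are going to break the cin object: getline() sets the failbit when it fills the array without reaching the end of the line. You must check cin.fail() after every single call to cin.getline(), and if it is true, call cin.clear() followed by cin.ignore(). (The clear() has to come first, because ignore() won’t work on a broken cin.)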
On the other hand, if you are using an older compiler (e.g. Visual Studio 6 or Visual Studio 2008), cin will not break.
Either way, once cin is “up and running,” the characters will still be in the input stream. If you don’t want them, you must make a call to cin.ignore() or similar.
Of course, you could just use string objects, and use the getline() function from the string library. Since that version of getline() grows the string to fit the line, rather than reading into a fixed number of characters, this problem doesn’t occur.