Tuesday, 23 April 2019

integer - Store hex input into int variable without using scanf() function in C



Pre-History:
I had the issue that a call to getchar() was not processed correctly: the program did not wait for any input and simply continued processing.



I searched the internet for what might cause this and found that if scanf() is called in a program before getchar(), then getchar() does not behave as expected, which matches the issue I was seeing.



Citation:





I will bet you ONE HUNDRED DOLLARS you only see this problem when the call to getchar() is preceded by a scanf().



Don't use scanf for interactive programs. There are two main reasons for this:



1) scanf can't recover from malformed input. You have to get the format string right, every time, or else it just throws away whatever input it couldn't match and returns a value indicating failure. This might be fine if you're parsing a fixed-format file when poor formatting is unrecoverable anyway, but it's the exact opposite of what you want to do with user input. Use fgets() and sscanf(), fgets() and strtok(), or write your own user input routines using getchar() and putchar().



1.5) Even properly used, scanf inevitably discards input (whitespace) that can sometimes be important.




2) scanf has a nasty habit of leaving newlines in the input stream. This is fine if you never use anything but scanf, since scanf will usually skip over any whitespace characters in its eagerness to find whatever it's expecting next. But if you mix scanf with fgets/getchar, it quickly becomes a total mess trying to figure out what might or might not be left hanging out in the input stream. Especially if you do any looping -- it's quite common for the input stream to be different on the first iteration, which results in a potentially weird bug and even weirder attempts to fix it.



tl;dr -- scanf is for formatted input. User input is not formatted.




Here is the link to that thread: https://bbs.archlinux.org/viewtopic.php?id=161294






scanf() with:




scanf("%x", &integer_variable);


seems for me as a newbie to the scene as the only way possible to input a hex number from the keyboard (or, better said, from stdin) and store it in an int variable.



Is there a different way to read a hex value from stdin and store it in an integer variable?



Bonus challenge: It would also be nice if I could write negative values (through negative hex input, of course) into a signed int variable.







INFO: I have read many threads about C here on Stack Overflow covering similar problems, but none of them answers my specific question, so I've posted this one.



I work on Ubuntu Linux.


Answer



The quote about the hundred dollar bet is accurate. Mixing scanf with getchar is almost always a bad idea; it almost always leads to trouble. It's not that they can't be used together, though. It's possible to use them together -- but usually, it's just way too difficult. There are too many fussy little details and "gotcha!"s to keep track of. It's more trouble than it's worth.
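To make the problem concrete, here is a minimal sketch of what happens when the two are mixed. The helper name char_after_scanf is made up for illustration, and it takes a FILE * instead of reading stdin directly so that it can be tried on any stream; with stdin, the same thing happens when scanf is followed by getchar.

```c
#include <stdio.h>

/* Hypothetical helper: read one integer with fscanf, then return the
   very next character on the stream (or EOF if nothing could be read). */
int char_after_scanf(FILE *in)
{
    int n;

    if(fscanf(in, "%d", &n) != 1)
        return EOF;          /* no number could be read */
    return fgetc(in);        /* whatever %d left behind */
}
```

If the user types 42 and hits Return, the character returned is the '\n' that %d left in the stream. That is why a later getchar() appears not to wait: it already has a character to hand back.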



At first you had said





scanf() with ... %d ... seems for me as a newbie to the scene as the only way possible to input a hex number from the keyboard




There was some side confusion there, because of course %d is for decimal input. But since I'd written this answer by the time you corrected that, let's proceed with decimal for the moment.
(Also for the moment I'm leaving out error checking -- that is, these code fragments don't check for or do anything graceful if the user doesn't type the requested number.) Anyway, here are several ways of reading an integer:




  1. scanf("%d", &integer_variable);
    You're right, this is the (superficially) easiest way.


  2. char buf[100];
    fgets(buf, sizeof(buf), stdin);
    integer_variable = atoi(buf);
    This is, I think, the easiest way that doesn't use scanf. But most people these days frown on using atoi, because it doesn't do much useful error checking.



  3. char buf[100];
    fgets(buf, sizeof(buf), stdin);
    integer_variable = strtol(buf, NULL, 10);
    This is almost the same as before, but avoids atoi in favor of the preferred strtol.


  4. char buf[100];
    fgets(buf, sizeof(buf), stdin);
    sscanf(buf, "%d", &integer_variable);
    This reads a line and then uses sscanf to parse it, another popular and general technique.




All of these will work; all of these will handle negative numbers. It's important to think about error conditions, though -- I'll have more to say about that later.



If you want to input hexadecimal numbers, the techniques are similar:




  1. scanf("%x", &integer_variable);



  2. char buf[100];
    fgets(buf, sizeof(buf), stdin);
    integer_variable = strtol(buf, NULL, 16);


  3. char buf[100];
    fgets(buf, sizeof(buf), stdin);
    sscanf(buf, "%x", &integer_variable);




These should all work, too. I wouldn't necessarily expect them to handle "negative hexadecimal", though, because that's an unusual requirement. Most of the time, hexadecimal notation is used for unsigned integers. (In fact, strictly speaking, %x with scanf and sscanf must be used with an integer_variable that has been declared as unsigned int, not plain int.)
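That said, strtol itself is specified to accept an optional leading + or - sign (and, for base 16, an optional 0x prefix), so the fgets-plus-strtol technique does cope with input such as -1A. A minimal sketch, using the made-up helper name hex_to_int and still leaving out error checking:

```c
#include <stdlib.h>

/* Hypothetical helper: convert a hex string, optionally signed and
   optionally prefixed with 0x, to an int. No error checking yet;
   values outside the range of int are not handled. */
int hex_to_int(const char *s)
{
    return (int)strtol(s, NULL, 16);
}
```

With this, hex_to_int("-1A") gives -26 and hex_to_int("0xff\n") gives 255; strtol skips leading whitespace and stops at the first character that can't be part of the number, so the newline left by fgets does no harm.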



Sometimes it's useful or necessary to do this sort of thing "by hand". Here's a code fragment that reads exactly two hexadecimal digits. I'll start out with the version using getchar:



int c1 = getchar();
if(c1 != EOF && isascii(c1) && isxdigit(c1)) {
    int c2 = getchar();
    if(c2 != EOF && isascii(c2) && isxdigit(c2)) {
        /* convert the first digit */
        if(isdigit(c1)) integer_variable = c1 - '0';
        else if(isupper(c1)) integer_variable = 10 + c1 - 'A';
        else if(islower(c1)) integer_variable = 10 + c1 - 'a';

        integer_variable = integer_variable * 16;

        /* add in the second digit */
        if(isdigit(c2)) integer_variable += c2 - '0';
        else if(isupper(c2)) integer_variable += 10 + c2 - 'A';
        else if(islower(c2)) integer_variable += 10 + c2 - 'a';
    }
}


As you can see, it's a bit of a jawbreaker. Me, although I almost never use members of the scanf family, this is one place where I sometimes do, precisely because doing it "by hand" is so much work. You can simplify it considerably by using an auxiliary function or macro to do the digit conversion:



int c1 = getchar();
if(c1 != EOF && isascii(c1) && isxdigit(c1)) {
    int c2 = getchar();
    if(c2 != EOF && isascii(c2) && isxdigit(c2)) {
        integer_variable = Xctod(c1);
        integer_variable = integer_variable * 16;
        integer_variable += Xctod(c2);
    }
}


Or you could collapse those inner expressions down to just




        integer_variable = 16 * Xctod(c1) + Xctod(c2);


These work in terms of an auxiliary function:



int Xctod(int c)
{
    if(!isascii(c)) return 0;
    else if(isdigit(c)) return c - '0';
    else if(isupper(c)) return 10 + c - 'A';
    else if(islower(c)) return 10 + c - 'a';
    else return 0;
}


Or perhaps a macro (though this is definitely an old-school sort of thing):



#define Xctod(c) (isdigit(c) ? (c) - '0' : (c) - (isupper(c) ? 'A' : 'a') + 10)



Often I'm parsing hexadecimal digits like this not from stdin using getchar(), but from a string. Often I'm using a character pointer (char *p) to step through the string, meaning that I end up with code more like this:



char c1 = *p++;
if(isascii(c1) && isxdigit(c1)) {
    char c2 = *p++;
    if(isascii(c2) && isxdigit(c2))
        integer_variable = 16 * Xctod(c1) + Xctod(c2);
}



It's tempting to omit the temporary variables and the error checking and boil this down still further:



integer_variable = 16 * Xctod(*p++) + Xctod(*p++);


But don't do this! Besides the lack of error checking, this expression is probably undefined, and it definitely won't always do what you want, because there's no longer any guarantee about what order you read the characters in. If you know p points at the first of two hex digits, you don't want to collapse it any further than



integer_variable = Xctod(*p++);
integer_variable = 16 * integer_variable + Xctod(*p++);



and even then, this will work only with the function version of Xctod, not the macro, since the macro evaluates its argument multiple times.
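One way to keep the error checking while still stepping through a string is to wrap the two-digit conversion in a small function. The sketch below is only an illustration: the names hex_digit and parse_hex_pair are made up, and it indexes the string instead of using *p++ twice, so no order-of-evaluation question arises. Like the fragments above, it assumes the letters a-f are contiguous in the character set, as they are in ASCII.

```c
#include <ctype.h>

/* Value of a single hex digit; the caller must have checked isxdigit(). */
static int hex_digit(unsigned char c)
{
    if(isdigit(c)) return c - '0';
    return 10 + (tolower(c) - 'a');
}

/* Hypothetical helper: parse exactly two hex digits at *pp.
   On success, store the value in *out, advance *pp past both digits,
   and return 1. On failure, return 0 and leave *pp and *out untouched. */
int parse_hex_pair(const char **pp, int *out)
{
    const unsigned char *p = (const unsigned char *)*pp;

    /* If p[0] is the terminating '\0', the test short-circuits and
       p[1] is never read. */
    if(!isxdigit(p[0]) || !isxdigit(p[1]))
        return 0;
    *out = 16 * hex_digit(p[0]) + hex_digit(p[1]);
    *pp += 2;
    return 1;
}
```

Given const char *s = "7fG1", a first call parse_hex_pair(&s, &value) stores 127 and leaves s pointing at the 'G'; a second call returns 0 and changes nothing.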






Finally, let's talk about error handling. There are quite a few possibilities to worry about:




  1. The user hits Return without typing anything.

  2. The user types whitespace before or after the number.


  3. The user types extra garbage after the number.

  4. The user types non-numeric input instead of a number.

  5. The code hits end-of-file; there are no characters to read at all.



And then how you handle these depends on what input techniques you're using. Here are the basic rules:



A. If you're calling scanf, fscanf, or sscanf, always check the return value. If it's not 1 (or, in the case where you had multiple % specifiers, it's not the number of values you expected to read), it means something went wrong. This will generally catch problems 4 and 5, and will handle case 2 gracefully. But it will often quietly ignore problems 1 and 3. (In particular, scanf and fscanf treat an extra \n just like leading whitespace.)



B. If you're calling fgets, again, always check the return value. You'll get NULL on EOF (problem 5). Handling the other problems depends on what you do with the line you read.




C. If you're calling atoi, it will deal gracefully with problem 2, but it will ignore problem 3, and it will quietly turn problem 4 into the number 0 (which is why atoi is usually not recommended any more).



D. If you're calling strtol or any of the other "strto" functions, they will deal gracefully with problem 2, and if you let them give you back an "end pointer", you can check for and deal with problems 3 and 4. (Note that I left the end-pointer handling out of my two strtol examples above.)



E. Finally, if you're doing something down-and-dirty like my "hard way" two-digit hex converter, you generally have to take care of all these problems, explicitly, yourself. If you want to skip leading whitespace you have to do so (the isspace function from <ctype.h> can help), and if there might be unexpected non-digit characters, you have to check for those, too. (That's what the calls to isascii and isxdigit are doing in my "hard way" two-digit hex converter.)
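Putting rules B and D together, here is one possible sketch of the fgets-plus-strtol approach with the end pointer actually checked. The function name read_hex_line and its return convention are my own invention for this example, and the fixed 100-byte buffer is the same assumption as in the fragments above (a longer line would be left partly unread).

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line from `in` and parse it as a
   (possibly signed) hexadecimal int.
   Returns 1 on success, 0 on bad input, EOF at end-of-file. */
int read_hex_line(FILE *in, int *out)
{
    char buf[100];
    char *end;
    long val;

    if(fgets(buf, sizeof buf, in) == NULL)
        return EOF;                  /* problem 5: nothing to read */

    errno = 0;
    val = strtol(buf, &end, 16);     /* skips leading whitespace (problem 2) */
    if(end == buf)
        return 0;                    /* no digits at all: problems 1 and 4 */
    if(errno == ERANGE || val < INT_MIN || val > INT_MAX)
        return 0;                    /* value does not fit in an int */

    while(isspace((unsigned char)*end))
        end++;                       /* trailing whitespace, including '\n' */
    if(*end != '\0')
        return 0;                    /* problem 3: garbage after the number */

    *out = (int)val;
    return 1;
}
```

A caller would typically loop on read_hex_line(stdin, &integer_variable), re-prompting when it returns 0 and giving up when it returns EOF. Because the whole line is consumed by fgets each time, a bad line doesn't linger in the input stream the way it can with scanf.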

