Character classification is used in regular expression processing.
character classification - the assignment of a character to one or more character classes.
character class - one of the following: upper, lower, digit, space, graph, print, punct, cntrl, xdigit, and alpha. Not to be confused with equivalence class.
POSIX.2 also defines the class blank. All characters in the blank class are
automatically included in the space class. If characters are not explicitly
assigned to blank, then
POSIX.2 specifies whether a character belonging to one class may also belong
to other classes. See 2.5.2.1 LC_CTYPE.
The concept of character class finds application in regular expressions and
other string manipulation applications.
character classification function - any of the is*() functions or the
isw*() functions.
is*() function - a collection of character classification functions which take
as an argument the int representation of an 8-bit code point. One of isalpha(),
isupper(), islower(), isdigit(), isxdigit(), isalnum(), ispunct(), isprint(),
isgraph(), isspace(), iscntrl(). Except for isalnum(), there is a one-to-one
correspondence between the is*() functions and character classes.
isw*() function - a collection of character classification functions which
take as an argument the wchar_t representation of any character of any
codeset. One of iswalpha(), iswupper(), iswlower(), iswdigit(), iswxdigit(),
iswalnum(), iswpunct(), iswprint(), iswgraph(), iswspace(), iswcntrl().
isalpha() - returns true if character is upper, lower, or alpha.
isupper() - returns true if character is upper.
islower() - returns true if character is lower.
isdigit() - returns true if character is digit.
isxdigit() - returns true if character is xdigit.
isalnum() - returns true if character is upper, lower, digit, or alpha.
ispunct() - returns true if character is punct.
isprint() - returns true if character is print: upper, lower, alpha, digit,
punct, or the space character.
isgraph() - returns true if character is upper, lower, alpha, digit, graph, or
punct.
isspace() - returns true if character is space.
iscntrl() - returns true if character is cntrl.
character classification in 7-bit ASCII - Each character in 7-bit ASCII is
assigned to one or more character classes in the following way (escaped
numbers are in decimal): \0 - \8 control; \9 control, space, and blank;
\10 - \13 control and space; \14 - \31 control; \32 space and blank;
\33 - \47 punctuation; '0' - '9' numeric and hexadecimal; \58 - \64
punctuation; 'A' - 'F' uppercase and hexadecimal; 'G' - 'Z' uppercase;
\91 - \96 punctuation; 'a' - 'f' lowercase and hexadecimal; 'g' - 'z'
lowercase; \123 - \126 punctuation; \127 control.