XPG4

X/Open Common Applications Environment - the API defined by X/Open.

XSI - same as X/Open System Interface.

XPG - X/Open Portability Guide.

XPG3 was fully aligned with POSIX.1 and ANSI C except in the area of multi-byte codeset operation and localeconv().

The Uniforum Technical Subcommittee on I18N have, in collaboration with X/Open, developed a formal definition for internationalized regular expressions which will be published in POSIX.2.

WPI - same as Worldwide Portability Interfaces.

Worldwide Portability Interfaces - the set of X/Open functions recommended for use by character-based portable applications that take wide character arguments. Designed to be a superset of SIGMA interface functions to support SIGMA multibyte codesets.

How and where localization data is stored is not defined by X/Open, nor are the permitted settings of a locale-name.

language information - same as cultural data.

langinfo - same as language information.

cultural data - localization data which doesn't include character set information and message catalogs.

message catalog - file or storage area containing program messages, command prompts, and responses to prompts for a particular language.

announcement mechanism - mechanism by which the locale is specified by the user of a program or the system as a whole.

character - same as multibyte character.

multibyte character - sequence of one or more bytes which represents an abstract character.

character string - contiguous sequence of bytes terminated by and including the NULL byte.

NULL byte - byte with value (char)0.

wide character - any encoded character of type wchar_t.

wchar_t - an integral data type defined in or large enough to hold any member of the codeset which is permitted to be defined as a single byte.

wide character string - contiguous sequence of wide characters terminated by and including the NULL wide character.

NULL wide character - integer with value (wchar_t)0.

empty string - character string whose first element is the NULL byte.

empty wide character string - a wide character string whose first element is the NULL wide character.

ISO 6937:1983 - 7bit or 8bit codeset for text communication using public communication networks, private communication networks, or interchange media such as magnetic tape and disks. Used for X.400 and ISO mail systems.

portable character set - character set supported in both the compile-time and run-time environment. The set contains the 26 uppercase and 26 lowercase letters of the English alphabet, the 10 decimal digits, the 32 graphics characters !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~, and the space, horizontal tab, vertical tab, and form feed characters. The execution character set is also defined to contain the alert, backspace, carriage return, and newline characters.

source portable character set - subset of the portable character set that may be used by a program at compile-time.

execution portable character set - subset of the portable character set that may be used by a program at execution-time.

If any characters from the extended character set are used in string literals, the code may not be portable to all X/Open systems.

string literal - in a C program, a sequence of character constants in double quotes.

character literal - a numeric constant used to represent the encoding of a character in a program.

character constant - a symbol, such as 'a', used to represent a character in a program.

A character constant which is not a member of the source portable character set should not be used in a program.

message system - same as message facility.

message facility - X/Open API which creates and accesses message catalogs.

Programs that need to check for characters that are not in the portable character set should use the message facility.

trigraph sequence - three character sequence supported by X/Open C compilers designed to allow programmers to enter certain characters from the portable character set even if their keyboard does not contain the character.

multibyte codeset rules - set of rules which must be satisfied by all codesets supported by X/Open systems. For all source and execution codesets: 1) the encoding of the source and execution codesets need not be the same, 2) the portable codeset will be a subset of both, 3) the encoding of characters not in the portable codeset will be locale-dependent, 4) the X/Open state-dependent encoding rules will apply, 5) a byte with all bits zero is interpreted as a NULL character, independent of shift state, 6) a NULL byte will not occur in the second or subsequent bytes of a multibyte character. In addition, for all source character sets, 7) a comment, string literal, character constant or header name will begin and end in the initial shift state, 8) a comment, string literal, character constant or header name will consist of a sequence of valid multibyte characters.

X/Open state-dependent encoding rules - 1) A character may have a state-dependent encoding. Each sequence of characters begins in an initial shift state and enters other implementation-defined shift states when specific characters are encountered in the sequence. 2) While in the initial shift state, all characters from the portable character set retain their usual interpretation and do not alter the shift state. 3) The interpretation of subsequent bytes in the sequence is a function of the current shift state.
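As a rough illustration of these rules, the sketch below walks a multibyte string one character at a time with mblen(). The function name count_mb_chars is an assumption made for this example and is not part of XPG4; the result depends on the LC_CTYPE category of the current locale.

```c
#include <stdlib.h>
#include <string.h>

/* Count the multibyte characters in s under the current LC_CTYPE locale.
   Returns -1 if an invalid or incomplete sequence is found.
   (count_mb_chars is a hypothetical helper, not an X/Open function.) */
long count_mb_chars(const char *s)
{
    size_t remaining = strlen(s);
    long count = 0;

    (void)mblen(NULL, 0);          /* put mblen() back in the initial shift state */
    while (remaining > 0) {
        int n = mblen(s, remaining);
        if (n <= 0)                /* -1 means invalid; 0 would mean a NULL byte */
            return -1;
        s += n;                    /* skip every byte of this character */
        remaining -= (size_t)n;
        count++;
    }
    return count;
}
```

In a single-byte locale such as "C" the count equals the byte length for plain ASCII text; in a multibyte locale it can be smaller.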

XPG4 provides functions for character classification, case conversion, and string comparison.
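A minimal sketch of locale-dependent string comparison, assuming the caller has already selected a locale with setlocale(): strcoll() is used as the comparison step for qsort(). The names collate_cmp and sort_by_locale are hypothetical helpers for this example.

```c
#include <stdlib.h>
#include <string.h>

/* qsort() callback: compare two strings with the collation rules of the
   current LC_COLLATE locale. (Hypothetical helper.) */
static int collate_cmp(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcoll(*sa, *sb);
}

/* Sort an array of strings in the collation order of the current locale. */
void sort_by_locale(const char **items, size_t count)
{
    qsort(items, count, sizeof *items, collate_cmp);
}
```

In the "C" locale strcoll() orders strings the same way as strcmp(); other locales may order them differently.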

nl_langinfo() - function which returns a pointer to the value associated with a specified cultural data item from the locale database.

language database - same as locale database.

locale database - database which contains all of the localization data for a given locale.

cultural data item - the name of a single cultural fact from the locale database passed as an argument to nl_langinfo(). Defined in <langinfo.h>.
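The following sketch shows one way to query a cultural data item. The wrapper name cultural_item is an assumption for illustration; RADIXCHAR and D_FMT are two of the standard item names.

```c
#include <langinfo.h>
#include <locale.h>

/* Select the named locale and return the value of one cultural data item,
   or NULL if the locale is not available on this system.
   (cultural_item is a hypothetical wrapper, not an X/Open function.)
   The returned string may be overwritten by later calls to nl_langinfo()
   or setlocale(), so copy it if it must be kept. */
const char *cultural_item(const char *locale_name, nl_item item)
{
    if (setlocale(LC_ALL, locale_name) == NULL)
        return NULL;
    return nl_langinfo(item);
}
```

For example, cultural_item("C", RADIXCHAR) yields the radix character of the "C" locale and cultural_item("C", D_FMT) yields its date format string.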

strftime() - function which loads a specified buffer with a formatted locale-dependent date/time string.
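A small sketch of strftime() in use, assuming the caller has already selected a locale. The helper name format_local_date and the "%A %x" format (full weekday name followed by the locale's date representation) are choices made for this example only.

```c
#include <time.h>

/* Format a calendar date using the current locale's weekday names and
   date representation. Returns the number of bytes written, or 0 on
   failure. (format_local_date is a hypothetical helper.) */
size_t format_local_date(char *buf, size_t size, int year, int month, int day)
{
    struct tm tm = {0};

    tm.tm_year = year - 1900;      /* struct tm counts years from 1900 */
    tm.tm_mon = month - 1;         /* and months from 0 */
    tm.tm_mday = day;
    tm.tm_hour = 12;               /* midday keeps clear of DST boundaries */
    tm.tm_isdst = -1;              /* let mktime() decide about DST */

    if (mktime(&tm) == (time_t)-1) /* also fills in tm_wday and tm_yday */
        return 0;
    return strftime(buf, size, "%A %x", &tm);
}
```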

localeconv() - function which loads and returns a pointer to the lconv structure, defined in and sometimes in .
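As an illustration, the sketch below reads a few members of the lconv structure. The helper name describe_numeric_format and the output layout are assumptions for this example; snprintf() is a later ISO C addition and is used here only for safe buffer handling.

```c
#include <locale.h>
#include <stdio.h>

/* Write a short description of the current locale's numeric conventions
   into buf. Returns the value of snprintf().
   (describe_numeric_format is a hypothetical helper.) */
int describe_numeric_format(char *buf, size_t size)
{
    const struct lconv *lc = localeconv();

    return snprintf(buf, size,
                    "decimal_point=\"%s\" thousands_sep=\"%s\" currency_symbol=\"%s\"",
                    lc->decimal_point, lc->thousands_sep, lc->currency_symbol);
}
```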

The X/Open announcement mechanism uses setlocale() defined by ANSI C and extended by POSIX.1.

ANSI C compatible X/Open functions - atof(), fprintf(), fscanf(), is*(), localeconv(), mblen(), mbstowcs(), mbtowc(), perror(), printf(), scanf(), setlocale(), sprintf(), sscanf(), strcoll(), strerror(), strftime(), strtod(), strxfrm(), tolower(), toupper(), vfprintf(), vprintf(), vsprintf(), wcstombs(), wctomb().

X/Open-enhanced ANSI C compatible functions - fprintf(), fscanf(), perror(), printf(), scanf(), sprintf(), sscanf(), vfprintf(), vprintf(), vsprintf().

X/Open regular expression functions - advance(), compile(), step().

X/Open message facility functions - catclose(), catopen(), catgets().
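A minimal sketch of the message facility, assuming a catalog may or may not be installed: the message is looked up with catgets() and the caller's default string is used when the catalog cannot be opened or the message is missing. The helper name lookup_message is hypothetical, and any catalog name passed to it is up to the application.

```c
#include <nl_types.h>
#include <stdio.h>

/* Copy message msg_id of set set_id from the named catalog into out.
   If the catalog cannot be opened, copy the fallback text instead.
   Returns 1 if the catalog was opened, 0 otherwise.
   (lookup_message is a hypothetical helper, not an X/Open function.) */
int lookup_message(const char *catalog, int set_id, int msg_id,
                   const char *fallback, char *out, size_t out_size)
{
    nl_catd cat = catopen(catalog, 0);

    if (cat == (nl_catd)-1) {
        snprintf(out, out_size, "%s", fallback);
        return 0;
    }
    /* catgets() itself returns the fallback if the message is absent. */
    snprintf(out, out_size, "%s", catgets(cat, set_id, msg_id, fallback));
    catclose(cat);                 /* the copy in out stays valid after this */
    return 1;
}
```

Copying the text before catclose() matters because the pointer returned by catgets() may refer to storage owned by the open catalog.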

single byte transparent - same as 8bit clean.

multibyte transparent - said of a program which is capable of supporting multibyte codesets.

WPI functions - fgetwc(), fgetws(), fputwc(), fputws(), getwc(), getwchar(), getws(), iconv(), iconv_close(), iconv_open(), iswalnum(), iswalpha(), iswcntrl(), iswdigit(), iswgraph(), iswlower(), iswprint(), iswpunct(), iswspace(), iswupper(), iswxdigit(), putwc(), putwchar(), putws(), strptime(), towlower(), towupper(), ungetwc(), vwsprintf(), wcscat(), wcschr(), wcscmp(), wcscoll(), wcscpy(), wcscspn(), wcsftime(), wcslen(), wcsncat(), wcsncmp(), wcsncpy(), wcspbrk(), wcsrchr(), wcsspn(), wcstod(), wcstok(), wcstol(), wcstoul(), wcswcs(), wcswidth(), wcsxfrm(), wcwidth(), wsprintf(), wsscanf().

wint_t - integer data type defined in <wchar.h> large enough to hold WEOF.

The wint_t data type is larger than wchar_t because there must be a way to distinguish WEOF from any character that might be returned from a call to the wide character I/O functions.

The isw*() and wide character case conversion functions take wint_t arguments.
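The sketch below applies a wide character case conversion function to each element of a wide character string. The helper name upcase_wide is an assumption; the example also assumes a current system where towupper() is declared in <wctype.h>, which may differ from the header layout of an XPG4 system.

```c
#include <wchar.h>
#include <wctype.h>

/* Convert a wide character string to upper case in place, using the
   rules of the current LC_CTYPE locale. (Hypothetical helper.) */
void upcase_wide(wchar_t *s)
{
    for (; *s != L'\0'; s++) {
        /* towupper() takes and returns wint_t, so convert explicitly. */
        *s = (wchar_t)towupper((wint_t)*s);
    }
}
```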

WEOF - wide EOF, defined in <wchar.h>, defined to be distinguishable from any wchar_t.
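To show why the return value of the wide character I/O functions is kept in a wint_t, here is a small sketch that reads a stream until WEOF. The helper name count_wide_chars is hypothetical.

```c
#include <stdio.h>
#include <wchar.h>

/* Count the wide characters that can be read from fp before WEOF.
   WEOF is also returned on a read or conversion error, so callers that
   care can check ferror(fp) afterwards. (Hypothetical helper.) */
long count_wide_chars(FILE *fp)
{
    long count = 0;
    wint_t wc;                     /* wint_t, so WEOF is not mistaken for data */

    while ((wc = fgetwc(fp)) != WEOF)
        count++;
    return count;
}
```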

The wcs*() functions are functionally equivalent to the str*() functions except that they take wchar_t arguments.

strptime() - performs the reverse of strftime(), converting a string which represents a locale-dependent time to a struct tm value.
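A minimal sketch of strptime(), assuming a system that exposes it when _XOPEN_SOURCE is defined. The helper name parse_time is hypothetical; the format string is supplied by the caller, so a locale-dependent format such as "%x" can be passed as well as a fixed numeric one.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700          /* request the declaration of strptime() */
#endif
#include <string.h>
#include <time.h>

/* Parse text according to format into *out.
   Returns 1 only if the whole of text was consumed, 0 otherwise.
   (parse_time is a hypothetical helper, not an X/Open function.) */
int parse_time(const char *text, const char *format, struct tm *out)
{
    struct tm tmp;
    const char *end;

    memset(&tmp, 0, sizeof tmp);   /* strptime() only sets the fields it parses */
    end = strptime(text, format, &tmp);
    if (end == NULL || *end != '\0')
        return 0;
    *out = tmp;
    return 1;
}
```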

X/Open printing functions - fprintf(), printf(), printw(), mvprintw(), mvwprintw(), sprintf(), vfprintf(), vprintf(), vsprintf(), vwsprintf(), wprintw(), wsprintf().

numbered conversion specifier - same as numbered format specifier.

numbered format specifier - format specifier of the form %n$s where n is an integer and s is a conversion specifier, used to associate a format specifier with a numbered argument, for use in locale-dependent ordering of format specifiers.

unnumbered format specifier - format specifier of the form %s where s is a conversion specifier.

Numbered and unnumbered format specifiers should not be mixed in a single format string.
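The sketch below illustrates why numbered format specifiers help with translation: the argument order in the call stays fixed while the format string, which would normally come from a message catalog, decides the output order. The helper name format_name is an assumption for this example.

```c
#include <stdio.h>

/* Format a personal name. The caller always passes the given name first
   and the family name second; fmt chooses the order in which they appear.
   (format_name is a hypothetical helper.) */
int format_name(char *buf, size_t size, const char *fmt,
                const char *given, const char *family)
{
    return snprintf(buf, size, fmt, given, family);
}
```

With the format "%1$s %2$s" the given name is printed first; with "%2$s %1$s" the family name is printed first, without any change to the calling code.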

X/Open scanning functions - fscanf(), scanf(), sscanf(), wsscanf().

Both X/Open scanning functions and X/Open printing functions may use numbered format specifiers.

X/Open number conversion functions - wcstod(), wcstol(), wcstoul().

ANSI C multibyte functions - mblen(), mbstowcs(), mbtowc(), wcstombs(), wctomb().
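As a sketch of how these functions are typically used, the helper below converts a multibyte string to a newly allocated wide character string. The name to_wide is hypothetical, and the length query with a null destination pointer is assumed to be supported, which is the case on X/Open and POSIX systems.

```c
#include <stdlib.h>
#include <wchar.h>

/* Convert a multibyte string to a malloc()ed wide character string using
   the current LC_CTYPE locale. Returns NULL on an invalid sequence or
   allocation failure; the caller frees the result.
   (to_wide is a hypothetical helper.) */
wchar_t *to_wide(const char *mbs)
{
    size_t n = mbstowcs(NULL, mbs, 0);   /* length query, no output written */
    wchar_t *w;

    if (n == (size_t)-1)
        return NULL;
    w = malloc((n + 1) * sizeof *w);     /* +1 for the NULL wide character */
    if (w == NULL)
        return NULL;
    mbstowcs(w, mbs, n + 1);
    return w;
}
```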

wide character I/O functions - getwc(), getwchar(), getws(), fgetwc(), fgetws(), fputwc(), fputws(), putwc(), putwchar(), putws(), ungetwc().

wide character string functions - wcscat(), wcschr(), wcscmp(), wcscpy(), wcscspn(), wcslen(), wcsncat(), wcsncmp(), wcsncpy(), wcspbrk(), wcsrchr(), wcsspn(), wcstok(), wcswcs(), wcswidth(), wcwidth().

error handling functions - perror(), strerror().

The X/Open versions of perror() and strerror() have been enhanced to produce locale-dependent error messages.

codeset conversion functions - iconv_open(), iconv(), iconv_close().

iconv_open() - performs initializations to convert character encodings from the source codeset to the target codeset and returns a conversion descriptor of type iconv_t.

iconv() - converts a sequence of characters in the specified inbuf and places the results in the specified outbuf according to the conversion for the given conversion descriptor.

iconv_close() - closes the conversion stream given by the conversion descriptor.

conversion stream - a stream opened by a call to iconv_open() for converting character encodings with the function iconv().

conversion descriptor - descriptor which identifies a conversion stream, returned by a call to iconv_open().
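The three codeset conversion functions fit together roughly as in the sketch below. The helper name convert_codeset is hypothetical, and the codeset names a system accepts are implementation-defined, so names such as "ASCII" or "UTF-8" are assumptions about the host system.

```c
#include <iconv.h>
#include <string.h>

/* Convert the string in from codeset `from` to codeset `to`, writing a
   NULL-terminated result into out. Returns the number of bytes written
   (excluding the terminator) or (size_t)-1 on failure.
   (convert_codeset is a hypothetical helper.) */
size_t convert_codeset(const char *from, const char *to,
                       const char *in, char *out, size_t out_size)
{
    iconv_t cd;
    char *inbuf = (char *)in;      /* iconv() takes a non-const char ** */
    size_t inleft = strlen(in);
    char *outbuf = out;
    size_t outleft;
    size_t rc;

    if (out_size == 0)
        return (size_t)-1;
    outleft = out_size - 1;        /* keep room for the terminator */

    cd = iconv_open(to, from);     /* note the order: target, then source */
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    rc = iconv(cd, &inbuf, &inleft, &outbuf, &outleft);
    if (rc != (size_t)-1)          /* emit any final shift sequence */
        rc = iconv(cd, NULL, NULL, &outbuf, &outleft);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;
    *outbuf = '\0';
    return (size_t)(outbuf - out);
}
```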

locale-name - character string of the form language[_territory][.codeset][@modifier] which identifies an allowable name of a locale. The length of this string may not exceed NL_LANGMAX.
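The form language[_territory][.codeset][@modifier] can be taken apart with ordinary string functions. The sketch below is one possible way to do it; the struct locale_parts type, the 32-byte field size, and the function names are all assumptions made for this example.

```c
#include <string.h>

/* Hypothetical container for the four parts of a locale-name. */
struct locale_parts {
    char language[32];
    char territory[32];
    char codeset[32];
    char modifier[32];
};

/* Copy the bytes in [start, end) into dst, truncating if necessary. */
static void copy_span(char *dst, size_t dst_size,
                      const char *start, const char *end)
{
    size_t len = (size_t)(end - start);

    if (len >= dst_size)
        len = dst_size - 1;
    memcpy(dst, start, len);
    dst[len] = '\0';
}

/* Split name into its parts; parts that are absent become empty strings. */
void parse_locale_name(const char *name, struct locale_parts *out)
{
    const char *end = name + strlen(name);
    const char *at = strchr(name, '@');
    const char *stop = at ? at : end;          /* end of the text before '@' */
    const char *dot = memchr(name, '.', (size_t)(stop - name));
    const char *base_end = dot ? dot : stop;   /* end of language[_territory] */
    const char *us = memchr(name, '_', (size_t)(base_end - name));

    copy_span(out->language, sizeof out->language, name, us ? us : base_end);
    copy_span(out->territory, sizeof out->territory,
              us ? us + 1 : base_end, base_end);
    copy_span(out->codeset, sizeof out->codeset, dot ? dot + 1 : stop, stop);
    copy_span(out->modifier, sizeof out->modifier, at ? at + 1 : end, end);
}
```

For a name such as "fr_CA.ISO8859-1@dict" this yields the language "fr", the territory "CA", the codeset "ISO8859-1", and the modifier "dict".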

modifier - a character string which gives more information about the locale. X/Open suggests using it rarely for specifying things like dictionary ordering. Motif suggests using it to specify an Input Method.

multi-language-working - the ability of a user to specify different locales for different aspects of program operation, implemented by the existence of different categories for such things as collation, character classification, etc.

mixed-language-working - the ability to associate a different locale with each process of a program.

single-language-working - the ability of the system to specify a default locale in the event neither mixed-language-working nor multi-language-working mechanisms are used by a program.

iconv - a command line utility which calls iconv() to convert the contents of a given file from one codeset to another.

tcs - the Plan 9 utility for translating between character sets. It is nearly equivalent to the iconv utility in POSIX 1003.2b.

The tcs utility is available by anonymous ftp from research.att.com in the file dist/tcs.shar.Z. It requires ANSI/ISO C.

localedef - a command line utility which converts source definitions for locale categories into the locale database suitable for use by functions and utilities defined to support internationalized behavior.

charmap - character set description file loaded by localedef.
