[I18n-sig] Perhaps the locale should matter?
Markus Kuhn
Markus.Kuhn@cl.cam.ac.uk
Fri, 05 May 2000 20:59:02 +0100
Guido van Rossum <guido@python.org> writes:
> Problem: I have no idea how to go from the locale setting (a
> two-character language abbreviation) to a specific character encoding
> -- but that might conceivably be a fixed table.
Starting with glibc 2.2, you can ask for the encoding name with

  #include <langinfo.h>
  encoding_string = nl_langinfo(CODESET);

as described on
http://www.opengroup.org/onlinepubs/7908799/xsh/langinfo.h.html
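
A minimal sketch of that call in context, assuming a POSIX system that
provides <langinfo.h>. The helper name current_codeset() is made up for
illustration. Note that the program has to adopt the user's locale with
setlocale() first, otherwise it stays in the "C" locale and reports
that locale's codeset instead.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper (not part of any library): returns the name of
 * the character encoding used by the locale that the environment
 * (LANG, LC_ALL, LC_CTYPE) selects. */
const char *current_codeset(void)
{
    /* Adopt the locale from the environment. If this fails, the
     * program simply remains in the "C" locale. */
    setlocale(LC_ALL, "");

    /* The returned string may be overwritten by later calls to
     * nl_langinfo() or setlocale(), so copy it if you need to keep it. */
    return nl_langinfo(CODESET);
}
```

A caller would typically just print the result, for example with
printf("%s\n", current_codeset()). The exact spelling of the name
(such as "UTF-8" or "ISO-8859-1") depends on the C library.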
But are you really interested in the name of the encoding, or rather in
the already Unicode-converted string? In that case, simply use the C
library's wide character I/O functions getwc(), fwscanf(), etc. as
described in
http://www.unix-systems.org/version2/whatsnew/login_mse.html
or
http://www.cl.cam.ac.uk/~mgk25/volatile/ISO-C-FDIS.1999-04.pdf
http://www.cl.cam.ac.uk/~mgk25/volatile/ISO-C-FDIS.1999-04.txt
(section 7.24)
and the locale-dependent conversion to Unicode will be done for you by
the C library. Under glibc 2.2, wchar_t always contains UCS-4 values.
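
As a rough sketch of the wide character approach, assuming a C library
with the <wchar.h> I/O functions: the helper name read_wide_chars() is
hypothetical. The caller is expected to have called
setlocale(LC_ALL, "") beforehand, so that the library decodes the
multibyte input according to the user's locale, and the stream must not
already have been used with byte-oriented I/O.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: reads up to max-1 wide characters from stream
 * into buf, terminates buf with L'\0', and returns the number of
 * characters stored. The C library converts from the locale's
 * multibyte encoding to wchar_t while reading. */
size_t read_wide_chars(FILE *stream, wchar_t *buf, size_t max)
{
    size_t n = 0;
    wint_t wc;

    if (max == 0)
        return 0;

    /* getwc() returns WEOF at end of file or on a conversion error. */
    while (n + 1 < max && (wc = getwc(stream)) != WEOF)
        buf[n++] = (wchar_t)wc;

    buf[n] = L'\0';
    return n;
}
```

Whether the stored values are UCS-4 code points depends on the C
library; the statement above about glibc 2.2 is what this sketch relies
on, and other platforms may use a different wchar_t representation.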
Markus
--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>