From webmaster@pferdemarkt.ws Wed Jan 15 12:25:51 2003
From: webmaster@pferdemarkt.ws (webmaster@pferdemarkt.ws)
Date: Wed, 15 Jan 2003 04:25:51 -0800
Subject: [I18n-sig] Pferdemarkt.ws informiert! Newsletter 01/2003
Message-ID: <200301151225.EAA02785@eagle.he.net>

http://www.pferdemarkt.ws

We have made a successful start into the new "horse year" 2003. We would
like to thank you for the rapid success of our marketplace. Today,
15 January 2003, we have been online for exactly 14 days! Our database
grows by about 30 new listings every day. As a private individual, you
too can post your horses for sale directly on the Internet, completely
free of charge. To make your listings more visible, you can add up to one
picture to your horse listing free of charge!

Click here to log in directly: http://www.Pferdemarkt.ws

Offer for free, search for free! Directly from private seller to private
buyer!

If you have any questions: mailto: webmaster@pferdemarkt.ws

From gp@pooryorick.com Wed Jan 15 22:48:49 2003
From: gp@pooryorick.com (Poor Yorick)
Date: Wed, 15 Jan 2003 15:48:49 -0700
Subject: [I18n-sig] codecs module, readlines and xreadlines
Message-ID: <3E25E551.4010202@pooryorick.com>

The following code shows an inconsistency between open.readlines and
codecs.open.readlines, and also between open.xreadlines and
codecs.open.xreadlines. The call to open.readlines returns '\n' as the
line ending, whereas codecs.open.readlines returns '\r\n'. Any plans to
fix this?
>>> fh = open('test2.txt', 'r')
>>> lines = fh.readlines()
>>> print lines
['1120, "Serial Number", 1016993947\n',
 '1122, "msconfig.exe", 1016994129\n',
 '1123, "Microsoft Windows XP", 1016994141\n',
 '1124, "Version", 1016994143\n',
 '1125, "XP", 1016994156\n',
 '1126, "Microsoft Windows", 1016994169\n',
 '1127, "Component", 1016994468']

>>> fh = codecs.open('test1.txt', 'r', 'utf-16')
>>> lines = fh.readlines()
>>> print lines
[u'1120, "Serial Number", 1016993947\r\n',
 u'1122, "msconfig.exe", 1016994129\r\n',
 u'1123, "Microsoft Windows XP", 1016994141\r\n',
 u'1124, "Version", 1016994143\r\n',
 u'1125, "XP", 1016994156\r\n',
 u'1126, "Microsoft Windows", 1016994169\r\n',
 u'1127, "Component", 1016994468']

>>> fh = open('test2.txt', 'r')
>>> lines = fh.xreadlines()
>>> lines.next()
'1120, "Serial Number", 1016993947\n'
>>> lines.next()
'1122, "msconfig.exe", 1016994129\n'

>>> fh = codecs.open('test1.txt', 'r', 'utf-16')
>>> lines = fh.xreadlines()
>>> lines.next()
'\xff\xfe1\x001\x002\x000\x00,\x00 \x00"\x00S\x00e\x00r\x00i\x00a\x00l\x00 \x00N\x00u\x00m\x00b\x00e\x00r\x00"\x00,\x00 \x001\x000\x001\x006\x009\x009\x003\x009\x004\x007\x00\r\x00\n'
>>> lines.next()
'\x001\x001\x002\x002\x00,\x00 \x00"\x00m\x00s\x00c\x00o\x00n\x00f\x00i\x00g\x00.\x00e\x00x\x00e\x00"\x00,\x00 \x001\x000\x001\x006\x009\x009\x004\x001\x002\x009\x00\r\x00\n'
>>>

Poor Yorick
gp@pooryorick.com

From martin@v.loewis.de Wed Jan 15 23:06:15 2003
From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: 16 Jan 2003 00:06:15 +0100
Subject: [I18n-sig] codecs module, readlines and xreadlines
In-Reply-To: <3E25E551.4010202@pooryorick.com>
References: <3E25E551.4010202@pooryorick.com>
Message-ID:

Poor Yorick writes:

> The following code shows an inconsistency between open.readlines and
> codecs.open.readlines, and also between open.xreadlines and
> codecs.open.xreadlines. the call to open.readlines returns '\n' as the
> whereas codecs.open.readlines returns '\r\n'. Any plans to fix this?
Not without a bug report, or better yet, an actual patch. I think it
would be best if codecs supported the "universal newlines" feature of
Python 2.3.

Regards,
Martin

From mal@lemburg.com Thu Jan 16 09:15:37 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 16 Jan 2003 10:15:37 +0100
Subject: [I18n-sig] codecs module, readlines and xreadlines
In-Reply-To: <3E25E551.4010202@pooryorick.com>
References: <3E25E551.4010202@pooryorick.com>
Message-ID: <3E267839.50301@lemburg.com>

Poor Yorick wrote:
> The following code shows an inconsistency between open.readlines and
> codecs.open.readlines, and also between open.xreadlines and
> codecs.open.xreadlines. the call to open.readlines returns '\n' as the
> whereas codecs.open.readlines returns '\r\n'. Any plans to fix this?

On Windows, the 'r' opens the file in text mode, which mangles the
line-end information. You should try to open the file in 'rb' (binary)
mode for comparison.

codecs.open() automatically appends the 'b' to the 'r' for you, so this
is probably the cause of the problem.
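A minimal sketch of the behaviour discussed so far, for readers who want
to reproduce it on a current interpreter. Assumptions: it targets Python 3
rather than the Python 2.2 used in this thread, the helper name
read_both_ways is hypothetical, and the text-mode translation here comes
from the newline=None (universal newlines) handling of the Python 3
builtin open(), not from the Windows C library, so it only approximates
the 2003 behaviour.

```python
import codecs
import os
import tempfile


def read_both_ways(raw_bytes, encoding):
    """Write raw_bytes to a temporary file and read its lines back twice.

    Returns (codec_lines, text_lines):
      codec_lines -- via codecs.open(), which opens the file in binary
                     mode and leaves line endings untouched
      text_lines  -- via the builtin open() in text mode with
                     newline=None, which translates '\r\n' to '\n'
                     after decoding
    """
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(raw_bytes)
        with codecs.open(path, "r", encoding) as fh:
            codec_lines = fh.readlines()
        with open(path, "r", encoding=encoding, newline=None) as fh:
            text_lines = fh.readlines()
    finally:
        os.remove(path)
    return codec_lines, text_lines


# Hypothetical sample data: two lines with a Windows line ending, UTF-16.
sample = "first\r\nsecond".encode("utf-16")
codec_lines, text_lines = read_both_ways(sample, "utf-16")
print(repr(codec_lines))
print(repr(text_lines))
```

The first list keeps the '\r\n' terminator while the second shows a bare
'\n', which is the same mismatch reported at the start of this thread.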
--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From martin@v.loewis.de Thu Jan 16 10:09:05 2003
From: martin@v.loewis.de (Martin v.
 =?iso-8859-15?q?L=F6wis?=)
Date: 16 Jan 2003 11:09:05 +0100
Subject: [I18n-sig] codecs module, readlines and xreadlines
In-Reply-To: <3E267839.50301@lemburg.com>
References: <3E25E551.4010202@pooryorick.com> <3E267839.50301@lemburg.com>
Message-ID:

"M.-A. Lemburg" writes:

> On Windows, the 'r' opens the file in text which mangles the line-end
> information. You should try to open the file in 'rb' (binary) mode
> for comparison.

The issue is, of course, that codecs.open is usually meant for text
data, so comparing 'r' to 'r' is fair, IMO.

> codecs.open() automatically appends the 'b' to the 'r' for you,
> so this is probably the cause of the problem.

That is an implementation detail which shouldn't be visible to the
user. I understand that it is necessary to open the underlying stream
in binary mode, but then the higher layers should hide that fact.

Regards,
Martin

From gp@pooryorick.com Thu Jan 16 15:59:48 2003
From: gp@pooryorick.com (Poor Yorick)
Date: Thu, 16 Jan 2003 08:59:48 -0700
Subject: [I18n-sig] codecs module, readlines and xreadlines
References: <3E25E551.4010202@pooryorick.com> <3E267839.50301@lemburg.com>
Message-ID: <3E26D6F4.8090802@pooryorick.com>

Martin v. Löwis wrote:

>"M.-A. Lemburg" writes:
>
>>On Windows, the 'r' opens the file in text which mangles the line-end
>>information. You should try to open the file in 'rb' (binary) mode
>>for comparison.
>>
>
>The issue is, of course, that codecs.open is usually meant for text
>data, so comparing 'r' to 'r' is fair, IMO.
>
>>codecs.open() automatically appends the 'b' to the 'r' for you,
>>so this is probably the cause of the problem.
>>
>
Whether the file is opened in binary mode or in text mode, the '\r'
character is still there. It isn't mangled, it's just that in the
utf-16 encoding all characters are encoded as double-byte characters,
and \r\n becomes \x00\r\x00\n.

The thing is that I AM processing text data. It just happens to be
unicode text data.
The example I used turns into perfectly legible chinese characters once
it's decoded in Python. I think that people using the codecs module on
Windows to read Unicode text files would expect codecs.open.readlines
to behave exactly like the builtin open.readlines.

open.readlines automatically removes the "\r" character on Windows
systems when the file is opened and read in text mode, and inserts a \r
character when a \n is written to a file, so to be consistent,
codecs.open.readlines should do the same thing and remove \x00\r when
the file is opened in text mode.

Poor Yorick
gp@pooryorick.com

From martin@v.loewis.de Thu Jan 16 16:08:28 2003
From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: 16 Jan 2003 17:08:28 +0100
Subject: [I18n-sig] codecs module, readlines and xreadlines
In-Reply-To: <3E26D6F4.8090802@pooryorick.com>
References: <3E25E551.4010202@pooryorick.com> <3E267839.50301@lemburg.com> <3E26D6F4.8090802@pooryorick.com>
Message-ID:

Poor Yorick writes:

> The thing is that I AM processing text data. It just happens to be
> unicode text data. The example I used turns into perfectly legible
> chinese characters once it's decoded in Python. I think that people
> using the codecs module on Windows to read Unicode text files would
> expect codecs.open.readlines to behave exactly like the builtin
> open.readlines.

Would you like to work on a patch to fix this problem?

> open.readlines automatically removes the "\r" character on Windows
> systems when the file is opened and read in text mode, and inserts a
> \r character when a \n is written to a file, so to be consistent,
> codecs.open.readlines should do the same thing and remove \x00\r
> when the file is opened in text mode.

It is not Python code which does that, though: instead, the Microsoft C
library does the removal/insertion of \r. For Unicode, this is useless,
since we cannot open the file in text mode: The C library would *still*
remove \r (only), leaving us with an extra null byte.
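Why byte-level stripping cannot work for UTF-16 can be sketched as
follows. This is an illustration in Python 3 under a stated assumption:
strip_cr_bytes is a hypothetical stand-in that simply drops every 0x0D
byte, which is the effect described in this message, and is not the
Microsoft C library's actual algorithm. decode_then_translate is likewise
an illustrative name.

```python
def strip_cr_bytes(data):
    """Assumed byte-oriented translation: drop every 0x0D byte."""
    return data.replace(b"\r", b"")


def decode_then_translate(data, encoding):
    """Decode first, then translate line endings on the decoded text."""
    return data.decode(encoding).replace("\r\n", "\n")


# 'ab\r\ncd' in UTF-16-LE: every character occupies two bytes.
raw = "ab\r\ncd".encode("utf-16-le")

# Dropping the 0x0D byte leaves its 0x00 partner behind, so the data is
# now an odd number of bytes and every later code unit is misaligned.
damaged = strip_cr_bytes(raw)

try:
    damaged.decode("utf-16-le")
    damaged_decodes = True
except UnicodeDecodeError:
    damaged_decodes = False

print(len(raw), len(damaged), damaged_decodes)
print(repr(decode_then_translate(raw, "utf-16-le")))
```

Translating after decoding sidesteps the problem because the '\r' is
removed as a whole character rather than as half of a two-byte unit.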
Notice that a similar problem exists on the Mac, where \r should be
replaced by \n.

Regards,
Martin

From mal@lemburg.com Thu Jan 16 16:14:38 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Thu, 16 Jan 2003 17:14:38 +0100
Subject: [I18n-sig] codecs module, readlines and xreadlines
In-Reply-To: <3E26D6F4.8090802@pooryorick.com>
References: <3E25E551.4010202@pooryorick.com> <3E267839.50301@lemburg.com> <3E26D6F4.8090802@pooryorick.com>
Message-ID: <3E26DA6E.9000306@lemburg.com>

Poor Yorick wrote:
>
>
> Martin v. Löwis wrote:
>
>> "M.-A. Lemburg" writes:
>>
>>> On Windows, the 'r' opens the file in text which mangles the line-end
>>> information. You should try to open the file in 'rb' (binary) mode
>>> for comparison.
>>>
>>
>> The issue is, of course, that codecs.open is usually meant for text
>> data, so comparing 'r' to 'r' is fair, IMO.
>>
>>> codecs.open() automatically appends the 'b' to the 'r' for you,
>>> so this is probably the cause of the problem.
>>>
>>
> Whether the file is opened in binary mode or in text mode, the '\r'
> character is still there. It isn't mangled, it's just that in the
> utf-16 encoding all characters are encoded as double-byte characters,
> and \r\n becomes \x00\r\x00\n.
>
> The thing is that I AM processing text data. It just happens to be
> unicode text data. The example I used turns into perfectly legible
> chinese characters once it's decoded in Python. I think that people
> using the codecs module on Windows to read Unicode text files would
> expect codecs.open.readlines to behave exactly like the builtin
> open.readlines.
> open.readlines automatically removes the "\r" character on Windows
> systems when the file is opened and read in text mode, and inserts a \r
> character when a \n is written to a file,

That's what I meant with mangling. I don't see any code in fileobject.c
which would do the above, so unless I've overlooked something the MS C
lib must apply this operation.
> so to be consistent,
> codecs.open.readlines should do the same thing and remove \x00\r when
> the file is opened in text mode.

But only on Windows, right? (On Unix text mode and binary mode behave
identically)

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From gp@pooryorick.com Thu Jan 16 16:32:34 2003
From: gp@pooryorick.com (Poor Yorick)
Date: Thu, 16 Jan 2003 09:32:34 -0700
Subject: [I18n-sig] codecs module, readlines and xreadlines
References: <3E25E551.4010202@pooryorick.com> <3E267839.50301@lemburg.com> <3E26D6F4.8090802@pooryorick.com>
Message-ID: <3E26DEA2.3040902@pooryorick.com>

Martin v. Löwis wrote:

>Poor Yorick writes:
>
>>The thing is that I AM processing text data. It just happens to be
>>unicode text data. The example I used turns into perfectly legible
>>chinese characters once it's decoded in Python. I think that people
>>using the codecs module on Windows to read Unicode text files would
>>expect codecs.open.readlines to behave exactly like the builtin
>>open.readlines.
>>
>
>Would you like to work on a patch to fix this problem?
>
Alas, I only wish I had the skills that would require! Perhaps someday.
I just wanted to point the issue out. I could, however, probably help
by creating a tutorial about the subject for others like me, which I
will try to do.
Thank you,

Poor Yorick,
gp@pooryorick.com

From Scott.Daniels@Acm.Org Thu Jan 16 21:45:52 2003
From: Scott.Daniels@Acm.Org (Scott David Daniels)
Date: Thu, 16 Jan 2003 13:45:52 -0800
Subject: [I18n-sig] codecs module, readlines and xreadlines
In-Reply-To: <3E26DA6E.9000306@lemburg.com>
References: <3E25E551.4010202@pooryorick.com> <3E267839.50301@lemburg.com> <3E26D6F4.8090802@pooryorick.com> <3E26DA6E.9000306@lemburg.com>
Message-ID: <3E272810.9050600@dsl-only.net>

M.-A. Lemburg wrote:
> Poor Yorick wrote:
>> so to be consistent, codecs.open.readlines should do the same thing
>> and remove \x00\r when the file is opened in text mode.
> But only on Windows, right ? (On Unix text mode and binary mode
> behave identically)

Actually, on Apple's systems, lines are delimited with \r, removing the
\n. As painful as it is for me to acknowledge this, Microsoft is
actually the most standards-compliant of the three major
interpretations.

C (and hence Unix) considered that it was redundant to have two distinct
characters indicating end-of-line. The Unix choice was the only
irreversible character of the pair (the line-feed). For a while, MIT had
a non-standard control character that they called the "line-starve"
which reversed the effect of the line feed. On the old teletype model
33s, though, the line feed was irreversible, while the carriage return
was simple horizontal positioning (and equivalent to the appropriate
number of backspaces).

Apple, I suspect, was thinking of the analogue to the keyboard. Very few
typists ever type the line feed character; they type a return which
emits the \r character. Unix solves this by conversion if the terminal
is not in "raw" mode; Apple doesn't have to make a distinction.

The least reasonable (but most standard-conforming) choice is \r\n,
which (if you interpret the early ASCII standards literally), should be
interpreted the same as \n\r. It is also uncomfortably true that \r\n\n
should be exactly equivalent to \r\n\r\n.
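The three conventions described here (\n on Unix, \r on Apple systems of
the time, \r\n on Windows) can be folded into a single '\n' once the text
has already been decoded. A minimal Python 3 sketch, with
normalize_newlines as a hypothetical helper name; it deliberately ignores
any other Unicode line terminators.

```python
def normalize_newlines(text):
    """Fold '\r\n' and lone '\r' into '\n' in already-decoded text.

    The two-character sequence is replaced first so that a Windows line
    ending becomes one '\n' rather than two.
    """
    return text.replace("\r\n", "\n").replace("\r", "\n")


# Hypothetical sample mixing all three conventions.
mixed = "unix\nmac\rwindows\r\nend"
print(repr(normalize_newlines(mixed)))
```

The order of the two replacements is the only design decision here:
swapping them would turn every '\r\n' into two line breaks.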
So, a lot of code is simplified if there is a single EOL (End-Of-Line)
character. C declared this so, and anyone who does not use LF (\n) as a
line delimiter in the environment where their C runtimes work is
supposed to translate their local convention to the C-standard in the
I/O runtimes.

To summarize briefly, after being hopelessly long-winded, Apple non-raw
should probably convert \r to \n, Microsoft non-raw should similarly
convert \r\n to \n. What should be done in non-binary mode for the other
line terminators in UniCode (I _think_ some exist) might be a source of
hopelessly long-winded debate.

From guido@python.org Thu Jan 16 21:57:09 2003
From: guido@python.org (Guido van Rossum)
Date: Thu, 16 Jan 2003 16:57:09 -0500
Subject: [I18n-sig] codecs module, readlines and xreadlines
In-Reply-To: Your message of "Thu, 16 Jan 2003 13:45:52 PST." <3E272810.9050600@dsl-only.net>
References: <3E25E551.4010202@pooryorick.com> <3E267839.50301@lemburg.com> <3E26D6F4.8090802@pooryorick.com> <3E26DA6E.9000306@lemburg.com> <3E272810.9050600@dsl-only.net>
Message-ID: <200301162157.h0GLv9Z14045@odiug.zope.com>

> To summarize briefly, after being hopelessly long-winded, Apple
> non-raw should probably convert \r to \n, Microsoft non-raw
> should similarly convert \r\n to \n. What should be done in
> non-binary mode for the other line terminators in UniCode (I
> _think_ some exist) might be a source of hopelessly long-winded
> debate.

That's exactly what Universal newlines does. Have I missed something?

--Guido van Rossum (home page: http://www.python.org/~guido/)

From martin@v.loewis.de Thu Jan 16 23:52:09 2003
From: martin@v.loewis.de (Martin v.
 =?iso-8859-15?q?L=F6wis?=)
Date: 17 Jan 2003 00:52:09 +0100
Subject: [I18n-sig] codecs module, readlines and xreadlines
In-Reply-To: <200301162157.h0GLv9Z14045@odiug.zope.com>
References: <3E25E551.4010202@pooryorick.com> <3E267839.50301@lemburg.com> <3E26D6F4.8090802@pooryorick.com> <3E26DA6E.9000306@lemburg.com> <3E272810.9050600@dsl-only.net> <200301162157.h0GLv9Z14045@odiug.zope.com>
Message-ID:

Guido van Rossum writes:

> That's exactly what Universal newlines does. Have I missed something?

The issue is whether codecs.open should follow the platform conventions
for text mode if neither "b" nor "U" is passed. Builtin open currently
does, codecs.open does not (instead, it treats a plain "r" just as if
"rb" had been passed).

Furthermore, the *implementation* of universal newlines is useless for
codecs.open, as the newline conversion must happen after decoding, not
before.

Regards,
Martin

From mal@lemburg.com Fri Jan 17 09:04:57 2003
From: mal@lemburg.com (M.-A. Lemburg)
Date: Fri, 17 Jan 2003 10:04:57 +0100
Subject: [I18n-sig] codecs module, readlines and xreadlines
In-Reply-To:
References: <3E25E551.4010202@pooryorick.com> <3E267839.50301@lemburg.com> <3E26D6F4.8090802@pooryorick.com> <3E26DA6E.9000306@lemburg.com> <3E272810.9050600@dsl-only.net> <200301162157.h0GLv9Z14045@odiug.zope.com>
Message-ID: <3E27C739.9050802@lemburg.com>

Martin v. Löwis wrote:
> Guido van Rossum writes:
>
>>That's exactly what Universal newlines does. Have I missed something?
>
> The issue is whether codecs.open should follow the platform
> conventions for text mode if neither "b" nor "U" is passed. Builtin
> open currently does, codecs.open does not (instead, it treats a plain
> "r" just as if "rb" had been passed).

I'd say: let the codecs decide what to do here. After all, codecs.open()
only provides an interface to the codecs and leaves all the processing
to them. If a codec thinks that line ends should all be converted to
'\n' then so be it.
That's also why codecs.open() appends a 'b' to the mode in case it is
not already there: otherwise opening files in e.g. UTF-16 on Windows
would lose big.

I think that the codecs.open() kind of treatment is more reliable than
the open() one for text files. Simply because you always know what will
happen and can then apply whatever conversion needs to be done in the
program.

> Furthermore, the *implementation* of universal newlines is useless for
> codecs.open, as the newline conversion must happen after decoding, not
> before.

Right.

--
Marc-Andre Lemburg
CEO eGenix.com Software GmbH
_______________________________________________________________________
eGenix.com -- Makers of the Python mx Extensions: mxDateTime,mxODBC,...
Python Consulting: http://www.egenix.com/
Python Software: http://www.egenix.com/files/python/

From martin@v.loewis.de Fri Jan 17 10:15:31 2003
From: martin@v.loewis.de (Martin v. =?iso-8859-15?q?L=F6wis?=)
Date: 17 Jan 2003 11:15:31 +0100
Subject: [I18n-sig] codecs module, readlines and xreadlines
In-Reply-To: <3E27C739.9050802@lemburg.com>
References: <3E25E551.4010202@pooryorick.com> <3E267839.50301@lemburg.com> <3E26D6F4.8090802@pooryorick.com> <3E26DA6E.9000306@lemburg.com> <3E272810.9050600@dsl-only.net> <200301162157.h0GLv9Z14045@odiug.zope.com> <3E27C739.9050802@lemburg.com>
Message-ID:

"M.-A. Lemburg" writes:

> I'd say: let the codecs decide what to do here.

Certainly. Unfortunately, this is not possible at the moment, since it
is already codecs.open which uses binary mode, and the codec has no way
of knowing what the original opening mode was.

> After all, codecs.open() only provide an interface to the codecs and
> leaves all the processing to them. If a codec thinks that line ends
> should all be converted to '\n' then so be it. That's also why
> codecs.open() appends an 'b' to the mode in case it is not already
> there: otherwise opening files in e.g. UTF-16 on Windows would lose
> big.
Again: I do think that it is correct to open the underlying stream in
binary. The question is whether the codec should perform newline
translation (in addition to decoding, and probably after it).

> I think that the codecs.open() kind of treatment is more reliable
> than the open() one for text files. Simply because you always know
> what will happen [...]

This is not really true. The OP complains that you *cannot* know how
line ends will be represented. For the builtin open, you know that a
line end will be always \n in text mode, even more so in universal mode.
As it is, the representation of a line end in the Unicode data is
platform dependent, which is bad for portability.

Regards,
Martin

From Tex Texin Fri Jan 17 12:41:22 2003
From: Tex Texin (Tex Texin)
Date: Fri, 17 Jan 2003 07:41:22 -0500
Subject: [I18n-sig] Register now for early bird rates for the 23rd Unicode conference, Prague
Message-ID: <3E27F9F2.2A4DC1BE@i18nguy.com>

It is time - time to register for the 23rd Unicode conference in Prague!
Please see the details below, and check out the program which has been
updated.

**************************************************************************
Twenty-third Internationalization and Unicode Conference (IUC23)
Unicode, Internationalization, the Web: The Global Connection
http://www.unicode.org/iuc/iuc23
March 24-26, 2003
Prague, Czech Republic
*************************************************************************
Register now! > Just 10 weeks to go > Register now! > Just 10 weeks to go
*************************************************************************

NEWS

> Visit the Conference Web site ( http://www.unicode.org/iuc/iuc23 ) to
  check the updated Conference program and register. To help you choose
  Conference sessions, we've included abstracts of talks and speakers'
  biographies.

> Hotel guest room group rate valid to March 1.

> Early bird registration rate valid to March 1.
> Find out about the Workshop on Managing Localization Projects,
  organised by XenCraft, and taking place in the same venue on
  27 March -- See: http://www.unicode.org/iuc/iuc23

CONFERENCE SPONSORS

Agfa Monotype Corporation
Basis Technology Corporation
Microsoft Corporation
Sun Microsystems, Inc.
World Wide Web Consortium (W3C)

GLOBAL COMPUTING SHOWCASE

Visit the Showcase to find out more about products supporting the
Unicode Standard, and products and services that can help you
globalize/localize your software, documentation and Internet content.
For the first time, we will have an Exhibitors' track as part of the
Conference. For more information, please visit the Web site at:
http://www.unicode.org/iuc/iuc23/showcase.html

CONFERENCE VENUE

The Conference will take place at:

Marriott Prague Hotel
V Celnici 8
Prague, 110 00
Czech Republic
Tel: (+420 2) 2288 8888
Fax: (+420 2) 2288 8889

CONFERENCE MANAGEMENT

Global Meeting Services Inc.
8949 Lombard Place, #416
San Diego, CA 92122, USA
Tel: +1 858 638 0206 (voice)
     +1 858 638 0504 (fax)
Email: info@global-conference.com
or: conference@unicode.org

THE UNICODE CONSORTIUM

The Unicode Consortium was founded as a non-profit organization in 1991.
It is dedicated to the development, maintenance and promotion of The
Unicode Standard, a worldwide character encoding. The Unicode Standard
encodes the characters of the world's principal scripts and languages,
and is code-for-code identical to the international standard ISO/IEC
10646. In addition to cooperating with ISO on the future development of
ISO/IEC 10646, the Consortium is responsible for providing character
properties and algorithms for use in implementations. Today the
membership base of the Unicode Consortium includes major computer
corporations, software producers, database vendors, research
institutions, international agencies and various user groups.
For further information on the Unicode Standard, visit the Unicode Web
site at http://www.unicode.org or e-mail

* * * * *

Unicode(r) and the Unicode logo are registered trademarks of Unicode,
Inc. Used with permission.