From noreply at sourceforge.net Wed Aug 10 16:43:54 2005
From: noreply at sourceforge.net (SourceForge.net)
Date: Wed, 10 Aug 2005 07:43:54 -0700
Subject: [Expat-bugs] [ expat-Bugs-1255896 ] Expat 1.95.8 ReadMe
Message-ID:

Bugs item #1255896, was opened at 2005-08-10 10:43
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1255896&group_id=10127

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: Documentation
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: abhijitk (abhijitkankaria)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: Expat 1.95.8 ReadMe

Initial Comment:
From Expat ReadMe:
--------------------------------------------------------------------------------------
If you are interested in building Expat to provide document information in UTF-16 rather than the default UTF-8, follow these instructions:

1. For UTF-16 output as unsigned short (and version/error strings as char), run:

       ./configure CPPFLAGS=-DXML_UNICODE

   For UTF-16 output as wchar_t (incl. version/error strings), run:

       ./configure CFLAGS="-g -O2 -fshort-wchar" CPPFLAGS=-DXML_UNICODE_WCHAR_T

2. Edit the MakeFile, changing:

       LIBRARY = libexpat.la

   to:

       LIBRARY = libexpatw.la

   (Note the additional "w" in the library name.)

3. Run "make buildlib" (which builds the library only).

4. Run "make installlib" (which installs the library only).
--------------------------------------------------------------------------------------
As per the definition of -fshort-wchar:

-fshort-wchar
    Override the underlying type for wchar_t to be short unsigned int instead of the default for the target. This option is useful for building programs to run under WINE.

    Warning: the -fshort-wchar switch causes GCC to generate code that is not binary compatible with code generated without that switch. Use it to conform to a non-default application binary interface.
--------------------------------------------------------------------------------------
So this indicates that the option -fshort-wchar is to be used in case I need UTF-16 output as unsigned short. But, as the ReadMe suggests, should I use the option -fshort-wchar for UTF-16 output as wchar_t?

Please correct me if my understanding is incorrect.

----------------------------------------------------------------------

From noreply at sourceforge.net Wed Aug 10 17:02:50 2005
From: noreply at sourceforge.net (SourceForge.net)
Date: Wed, 10 Aug 2005 08:02:50 -0700
Subject: [Expat-bugs] [ expat-Bugs-1255896 ] Expat 1.95.8 ReadMe
Message-ID:

Bugs item #1255896, was opened at 2005-08-10 10:43
Message generated for change (Comment added) made by kwaclaw

>Comment By: Karl Waclawek (kwaclaw)
Date: 2005-08-10 11:02

Message:
Logged In: YES
user_id=290026

No, use -fshort-wchar only if you want UTF-16 output to be delivered as a *two-byte* wchar_t type (necessary because on Unix wchar_t is typically four bytes).
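
As a rough illustration of the point about wchar_t width, the following C sketch reports whether wchar_t in the current build is two bytes wide, and so could hold one UTF-16 code unit without waste. The helper names are hypothetical and are not part of Expat; the sketch also assumes 8-bit bytes, which the thread does not state.

```c
#include <stddef.h>

/* Width, in bytes, of one UTF-16 code unit (assumes 8-bit bytes). */
#define UTF16_UNIT_BYTES 2

/* Hypothetical helper (not part of Expat): returns 1 when wchar_t in this
   build is exactly two bytes wide, for example when GCC is invoked with
   -fshort-wchar, and 0 otherwise. */
int wchar_is_utf16_sized(void)
{
    return sizeof(wchar_t) == UTF16_UNIT_BYTES;
}

/* Hypothetical helper: returns 1 when unsigned short is two bytes wide.
   unsigned short is the type the ReadMe names for the XML_UNICODE build. */
int ushort_is_utf16_sized(void)
{
    return sizeof(unsigned short) == UTF16_UNIT_BYTES;
}
```

Compiling this once with and once without -fshort-wchar, and calling wchar_is_utf16_sized() from a small main, should show the difference described in the comment: on a typical Unix target the result changes from 0 to 1 when the switch is added.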

From noreply at sourceforge.net Wed Aug 10 17:43:13 2005
From: noreply at sourceforge.net (SourceForge.net)
Date: Wed, 10 Aug 2005 08:43:13 -0700
Subject: [Expat-bugs] [ expat-Bugs-1255896 ] Expat 1.95.8 ReadMe
Message-ID:

Bugs item #1255896, was opened at 2005-08-10 10:43
Message generated for change (Comment added) made by abhijitkankaria

>Comment By: abhijitk (abhijitkankaria)
Date: 2005-08-10 11:43

Message:
Logged In: YES
user_id=1312629

So wchar_t has to be two bytes for expat to work correctly? I am compiling a 32-bit application on Solaris, so if I don't use -fshort-wchar, wchar_t will be defined as long. My code does not depend on the size of wchar_t; will expat give the desired result in this scenario?

Basically, I am getting a segmentation fault in my application, so I am checking whether the -fshort-wchar switch, which causes GCC to generate code that is not binary compatible with code generated without that switch, is the reason. My own libraries are built with this option, but other system libraries (from /usr/lib) are not compiled with it.

From noreply at sourceforge.net Wed Aug 10 17:51:36 2005
From: noreply at sourceforge.net (SourceForge.net)
Date: Wed, 10 Aug 2005 08:51:36 -0700
Subject: [Expat-bugs] [ expat-Bugs-1255896 ] Expat 1.95.8 ReadMe
Message-ID:

Bugs item #1255896, was opened at 2005-08-10 10:43
Message generated for change (Comment added) made by kwaclaw

>Comment By: Karl Waclawek (kwaclaw)
Date: 2005-08-10 11:51

Message:
Logged In: YES
user_id=290026

Apparently, you want to process Unicode on Solaris not as UTF-8 but as UTF-16 encoded. Therefore you need a two-byte data type for the UTF-16 base character type. Which base data type for UTF-16 are you using elsewhere, unsigned short or a (redefined) wchar_t?

From noreply at sourceforge.net Wed Aug 10 19:13:40 2005
From: noreply at sourceforge.net (SourceForge.net)
Date: Wed, 10 Aug 2005 10:13:40 -0700
Subject: [Expat-bugs] [ expat-Bugs-1255896 ] Expat 1.95.8 ReadMe
Message-ID:

Bugs item #1255896, was opened at 2005-08-10 10:43
Message generated for change (Comment added) made by abhijitkankaria

>Comment By: abhijitk (abhijitkankaria)
Date: 2005-08-10 13:13

Message:
Logged In: YES
user_id=1312629

In my code I am using the wchar_t data type; I have not specifically defined it in my code. The existing library works on the Mac because the Mac has wchar_t defined as unsigned short in the system headers. On Solaris I use the -fshort-wchar option for compiling, so I guess wchar_t gets defined as unsigned short.

Let me explain briefly what I am trying to do here: I am porting the application from Mac to Solaris. On the Mac the expat library is compiled with XML_UNICODE, so XML_Char is defined as wchar_t and wchar_t is defined as unsigned short in the system headers. My library, which interfaces with expat, uses wchar_t everywhere because it is available on the Mac as unsigned short, so it worked fine.

Now on Solaris, if I have to use the -fshort-wchar option to get a two-byte wchar_t, I have two options:

1) Compile everything with the -fshort-wchar option, including the system libraries, so that everything uses a two-byte wchar_t.
OR
2) Change my entire library code to use something like XML_Char so I can control how it is defined. But in the second case the system libraries will still use wchar_t defined as long.

This is a somewhat off-topic question; please advise, or point me to any place where I can get more information on the wide build of expat.

Thanks.
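
The second option from the 13:13 comment (a dedicated character type instead of wchar_t) could look roughly like the C sketch below. Everything in it is an assumption made for illustration: the names utf16_unit, utf16_len and utf16_to_wchar_units are hypothetical and not part of Expat, unsigned short is assumed to be 16 bits wide, and the copy is per code unit, so surrogate pairs are passed through unchanged rather than combined.

```c
#include <stddef.h>

/* Hypothetical application-side code unit type: 16 bits on the platforms
   discussed here, regardless of how the compiler defines wchar_t. */
typedef unsigned short utf16_unit;

/* Length of a zero-terminated UTF-16 string, counted in code units. */
size_t utf16_len(const utf16_unit *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}

/* Copy a zero-terminated UTF-16 string into a wchar_t buffer, one code
   unit per wchar_t, so the result can be handed to code built with the
   platform's native wchar_t. Surrogate pairs are copied as-is.
   Returns the number of units copied (terminator excluded), or
   (size_t)-1 when dst_cap is too small to hold the string. */
size_t utf16_to_wchar_units(const utf16_unit *src, wchar_t *dst, size_t dst_cap)
{
    size_t n = utf16_len(src);
    size_t i;

    if (n >= dst_cap)
        return (size_t)-1;
    for (i = 0; i < n; i++)
        dst[i] = (wchar_t)src[i];
    dst[n] = L'\0';
    return n;
}
```

The idea behind this layout is that only the boundary with the system libraries depends on the native wchar_t width, so the application and the system libraries no longer need to agree on -fshort-wchar.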
---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2005-08-10 11:02 Message: Logged In: YES user_id=290026 No, use -fshort-wchar only if you want UTF-16 output to be deliveerd as a *two-byte* wchar_t type (necessary because on Unix wchar_t is typically four bytes). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1255896&group_id=10127 From noreply at sourceforge.net Wed Aug 10 19:25:31 2005 From: noreply at sourceforge.net (SourceForge.net) Date: Wed, 10 Aug 2005 10:25:31 -0700 Subject: [Expat-bugs] [ expat-Bugs-1255896 ] Expat 1.95.8 ReadMe Message-ID: Bugs item #1255896, was opened at 2005-08-10 10:43 Message generated for change (Comment added) made by kwaclaw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1255896&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Documentation Group: None Status: Open Resolution: None Priority: 5 Submitted By: abhijitk (abhijitkankaria) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: Expat 1.95.8 ReadMe Initial Comment: >From Expat ReadMe: -------------------------------------------------------------------------------------- If you are interested in building Expat to provide document information in UTF-16 rather than the default UTF-8, following these instructions: 1. For UTF-16 output as unsigned short (and version/error strings as char), run: ./configure CPPFLAGS=-DXML_UNICODE For UTF-16 output as wchar_t (incl. version/error strings), run: ./configure CFLAGS="-g -O2 -fshort-wchar" CPPFLAGS=-DXML_UNICODE_WCHAR_T 2. Edit the MakeFile, changing: LIBRARY = libexpat.la to: LIBRARY = libexpatw.la (Note the additional "w" in the library name.) 3. 
Run "make buildlib" (which builds the library only). 4. Run "make installlib" (which installs the library only). -------------------------------------------------------------------------------------- As per the defination of -fshort-wchar: -fshort-wchar Override the underlying type for wchar_t to be short unsigned int instead of the default for the target. This option is useful for building programs to run under WINE. Warning: the -fshort-wchar switch causes GCC to generate code that is not binary compatible with code generated without that switch. Use it to conform to a non-default application binary interface. -------------------------------------------------------------------------------------- So this indicates that the option -fshort-wchar is to be used in case I need UTF-16 output as unsigned short. But as the ReadMe suggests should I use the option -fshort-wchar for UTF-16 output as wchar_t? Please correct me if my understanding is incorrect. ---------------------------------------------------------------------- >Comment By: Karl Waclawek (kwaclaw) Date: 2005-08-10 13:25 Message: Logged In: YES user_id=290026 I don't know of any specific sources for wchar_t problems, but Google should help you there. I would think that if all your application code is built on the assumption of a 2-byte wchar_t, then it would make sense to use the system libraries compiled for a short wchar_t. ---------------------------------------------------------------------- Comment By: abhijitk (abhijitkankaria) Date: 2005-08-10 13:13 Message: Logged In: YES user_id=1312629 In my code I am using wchar_t data type, I have not specifically defined it in my code. The existing library is working on MAC as MAC has wchar_t defined as unsigned short in the system headers. On Solaris I use the -fshort-wchar option for compiling so i guess wchar_t gets defined as unsigned short. Let me explain in short what i am trying to do here, I am porting the application from MAC to Solaris. 
On Mac the expat library is compiled with XML_UNICODE so XML_Char is defined as wchar_t and wchar_t is defined as unsigned short in the system headers. My library which interfaces with expat has wchar_t used in all places as its avaliable on MAC as unsigned short, so it worked fine. Now on Solaris if i have to use the -fshort-wchar option to have two byte wchar_t, I have two options: 1) Compile everything with -fshort-wchar option including system libraries so that all use the wchar_t as two bytes. OR 2) Either change my entire library code to use some thing like XML_Char so I can control how its defined. But in second case still the system libraries will stil use wchar_t defined as long. This is out of way question, please guide or if there is any place where i can get more info on wide build of expat. Thanks. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2005-08-10 11:51 Message: Logged In: YES user_id=290026 Apparently, you want to process Unicode on Solaris not as UTF-8 but as UTF-16 encoded. Therefore you need a two-byte data type for the UTF-16 base character type. Which base data-type for UTF-16 are you using elsewhere, unsigned short or a (redefined) wchar_t? ---------------------------------------------------------------------- Comment By: abhijitk (abhijitkankaria) Date: 2005-08-10 11:43 Message: Logged In: YES user_id=1312629 So wchar_t has to be two byte for expat to work correctly? I am compiling a 32 bit applicaiton on Solaris, so if i dont use the -fshort-wchar, wchar_t will be defined as long. My code does not depend on the size of wchar_t, will expat give the desired result in this scenario? Basically I am getting senmentation fault in my application and so I am looking if the -fshort-wchar switch which causes GCC to generate code that is not binary compatible with code generated without that switchoption is the reason. 
My own libraries are build with this option but other system libraries (from /usr/lib) are not compiled with this option. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2005-08-10 11:02 Message: Logged In: YES user_id=290026 No, use -fshort-wchar only if you want UTF-16 output to be deliveerd as a *two-byte* wchar_t type (necessary because on Unix wchar_t is typically four bytes). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1255896&group_id=10127 From noreply at sourceforge.net Wed Aug 10 19:26:08 2005 From: noreply at sourceforge.net (SourceForge.net) Date: Wed, 10 Aug 2005 10:26:08 -0700 Subject: [Expat-bugs] [ expat-Bugs-1255896 ] Expat 1.95.8 ReadMe Message-ID: Bugs item #1255896, was opened at 2005-08-10 10:43 Message generated for change (Settings changed) made by kwaclaw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1255896&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Documentation >Group: Not a Bug Status: Open Resolution: None >Priority: 1 Submitted By: abhijitk (abhijitkankaria) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: Expat 1.95.8 ReadMe Initial Comment: >From Expat ReadMe: -------------------------------------------------------------------------------------- If you are interested in building Expat to provide document information in UTF-16 rather than the default UTF-8, following these instructions: 1. For UTF-16 output as unsigned short (and version/error strings as char), run: ./configure CPPFLAGS=-DXML_UNICODE For UTF-16 output as wchar_t (incl. version/error strings), run: ./configure CFLAGS="-g -O2 -fshort-wchar" CPPFLAGS=-DXML_UNICODE_WCHAR_T 2. 
Edit the MakeFile, changing: LIBRARY = libexpat.la to: LIBRARY = libexpatw.la (Note the additional "w" in the library name.) 3. Run "make buildlib" (which builds the library only). 4. Run "make installlib" (which installs the library only). -------------------------------------------------------------------------------------- As per the defination of -fshort-wchar: -fshort-wchar Override the underlying type for wchar_t to be short unsigned int instead of the default for the target. This option is useful for building programs to run under WINE. Warning: the -fshort-wchar switch causes GCC to generate code that is not binary compatible with code generated without that switch. Use it to conform to a non-default application binary interface. -------------------------------------------------------------------------------------- So this indicates that the option -fshort-wchar is to be used in case I need UTF-16 output as unsigned short. But as the ReadMe suggests should I use the option -fshort-wchar for UTF-16 output as wchar_t? Please correct me if my understanding is incorrect. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2005-08-10 13:25 Message: Logged In: YES user_id=290026 I don't know of any specific sources for wchar_t problems, but Google should help you there. I would think that if all your application code is built on the assumption of a 2-byte wchar_t, then it would make sense to use the system libraries compiled for a short wchar_t. ---------------------------------------------------------------------- Comment By: abhijitk (abhijitkankaria) Date: 2005-08-10 13:13 Message: Logged In: YES user_id=1312629 In my code I am using wchar_t data type, I have not specifically defined it in my code. The existing library is working on MAC as MAC has wchar_t defined as unsigned short in the system headers. 
On Solaris I use the -fshort-wchar option for compiling so i guess wchar_t gets defined as unsigned short. Let me explain in short what i am trying to do here, I am porting the application from MAC to Solaris. On Mac the expat library is compiled with XML_UNICODE so XML_Char is defined as wchar_t and wchar_t is defined as unsigned short in the system headers. My library which interfaces with expat has wchar_t used in all places as its avaliable on MAC as unsigned short, so it worked fine. Now on Solaris if i have to use the -fshort-wchar option to have two byte wchar_t, I have two options: 1) Compile everything with -fshort-wchar option including system libraries so that all use the wchar_t as two bytes. OR 2) Either change my entire library code to use some thing like XML_Char so I can control how its defined. But in second case still the system libraries will stil use wchar_t defined as long. This is out of way question, please guide or if there is any place where i can get more info on wide build of expat. Thanks. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2005-08-10 11:51 Message: Logged In: YES user_id=290026 Apparently, you want to process Unicode on Solaris not as UTF-8 but as UTF-16 encoded. Therefore you need a two-byte data type for the UTF-16 base character type. Which base data-type for UTF-16 are you using elsewhere, unsigned short or a (redefined) wchar_t? ---------------------------------------------------------------------- Comment By: abhijitk (abhijitkankaria) Date: 2005-08-10 11:43 Message: Logged In: YES user_id=1312629 So wchar_t has to be two byte for expat to work correctly? I am compiling a 32 bit applicaiton on Solaris, so if i dont use the -fshort-wchar, wchar_t will be defined as long. My code does not depend on the size of wchar_t, will expat give the desired result in this scenario? 
Basically I am getting senmentation fault in my application and so I am looking if the -fshort-wchar switch which causes GCC to generate code that is not binary compatible with code generated without that switchoption is the reason. My own libraries are build with this option but other system libraries (from /usr/lib) are not compiled with this option. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2005-08-10 11:02 Message: Logged In: YES user_id=290026 No, use -fshort-wchar only if you want UTF-16 output to be deliveerd as a *two-byte* wchar_t type (necessary because on Unix wchar_t is typically four bytes). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1255896&group_id=10127 From noreply at sourceforge.net Wed Aug 10 19:29:21 2005 From: noreply at sourceforge.net (SourceForge.net) Date: Wed, 10 Aug 2005 10:29:21 -0700 Subject: [Expat-bugs] [ expat-Bugs-1255896 ] Expat 1.95.8 ReadMe Message-ID: Bugs item #1255896, was opened at 2005-08-10 10:43 Message generated for change (Comment added) made by abhijitkankaria You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1255896&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Documentation Group: Not a Bug Status: Open Resolution: None Priority: 1 Submitted By: abhijitk (abhijitkankaria) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: Expat 1.95.8 ReadMe Initial Comment: >From Expat ReadMe: -------------------------------------------------------------------------------------- If you are interested in building Expat to provide document information in UTF-16 rather than the default UTF-8, following these instructions: 1. 
For UTF-16 output as unsigned short (and version/error strings as char), run: ./configure CPPFLAGS=-DXML_UNICODE For UTF-16 output as wchar_t (incl. version/error strings), run: ./configure CFLAGS="-g -O2 -fshort-wchar" CPPFLAGS=-DXML_UNICODE_WCHAR_T 2. Edit the MakeFile, changing: LIBRARY = libexpat.la to: LIBRARY = libexpatw.la (Note the additional "w" in the library name.) 3. Run "make buildlib" (which builds the library only). 4. Run "make installlib" (which installs the library only). -------------------------------------------------------------------------------------- As per the definition of -fshort-wchar: -fshort-wchar Override the underlying type for wchar_t to be short unsigned int instead of the default for the target. This option is useful for building programs to run under WINE. Warning: the -fshort-wchar switch causes GCC to generate code that is not binary compatible with code generated without that switch. Use it to conform to a non-default application binary interface. -------------------------------------------------------------------------------------- So this indicates that the option -fshort-wchar is to be used in case I need UTF-16 output as unsigned short. But, as the ReadMe suggests, should I use the option -fshort-wchar for UTF-16 output as wchar_t? Please correct me if my understanding is incorrect. ---------------------------------------------------------------------- >Comment By: abhijitk (abhijitkankaria) Date: 2005-08-10 13:29 Message: Logged In: YES user_id=1312629 Yes, Google is the only help I have been using for a few years now... Thanks for your time and info; it made things a bit clearer for me. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2005-08-10 13:25 Message: Logged In: YES user_id=290026 I don't know of any specific sources for wchar_t problems, but Google should help you there.
I would think that if all your application code is built on the assumption of a 2-byte wchar_t, then it would make sense to use the system libraries compiled for a short wchar_t. ---------------------------------------------------------------------- Comment By: abhijitk (abhijitkankaria) Date: 2005-08-10 13:13 Message: Logged In: YES user_id=1312629 In my code I am using the wchar_t data type; I have not specifically defined it in my code. The existing library works on Mac, as Mac has wchar_t defined as unsigned short in the system headers. On Solaris I use the -fshort-wchar option for compiling, so I guess wchar_t gets defined as unsigned short. Let me explain briefly what I am trying to do here: I am porting the application from Mac to Solaris. On Mac the expat library is compiled with XML_UNICODE so XML_Char is defined as wchar_t and wchar_t is defined as unsigned short in the system headers. My library, which interfaces with expat, uses wchar_t everywhere, as it's available on Mac as unsigned short, so it worked fine. Now on Solaris, if I have to use the -fshort-wchar option to have a two-byte wchar_t, I have two options: 1) Compile everything with the -fshort-wchar option, including system libraries, so that everything uses a two-byte wchar_t. OR 2) Change my entire library code to use something like XML_Char so I can control how it's defined. But in the second case the system libraries will still use wchar_t defined as long. This is a somewhat off-topic question, but please advise, or point me to any place where I can get more info on the wide build of expat. Thanks. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2005-08-10 11:51 Message: Logged In: YES user_id=290026 Apparently, you want to process Unicode on Solaris not as UTF-8 but as UTF-16 encoded. Therefore you need a two-byte data type for the UTF-16 base character type.
Which base data-type for UTF-16 are you using elsewhere, unsigned short or a (redefined) wchar_t? ---------------------------------------------------------------------- Comment By: abhijitk (abhijitkankaria) Date: 2005-08-10 11:43 Message: Logged In: YES user_id=1312629 So wchar_t has to be two bytes for expat to work correctly? I am compiling a 32-bit application on Solaris, so if I don't use -fshort-wchar, wchar_t will be defined as long. My code does not depend on the size of wchar_t; will expat give the desired result in this scenario? Basically, I am getting a segmentation fault in my application, so I am checking whether the -fshort-wchar switch, which causes GCC to generate code that is not binary compatible with code generated without that switch, is the reason. My own libraries are built with this option, but other system libraries (from /usr/lib) are not compiled with this option. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2005-08-10 11:02 Message: Logged In: YES user_id=290026 No, use -fshort-wchar only if you want UTF-16 output to be delivered as a *two-byte* wchar_t type (necessary because on Unix wchar_t is typically four bytes).
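To make the size mismatch discussed in this thread concrete, here is a minimal sketch in Python (used only for illustration; it does not call Expat or a C compiler). It encodes a short string as UTF-16 and reads the same bytes once as two-byte units and once as four-byte units, which is roughly what happens when code built for a four-byte wchar_t is handed a buffer of UTF-16 code units. The function names are hypothetical and little-endian byte order is assumed.

```python
import struct


def utf16_units(text: str) -> bytes:
    """Encode text as little-endian UTF-16 without a byte order mark."""
    return text.encode("utf-16-le")


def read_units(data: bytes, unit_size: int) -> tuple:
    """Interpret data as unsigned little-endian integers of unit_size bytes."""
    fmt = {2: "H", 4: "I"}[unit_size]
    count = len(data) // unit_size
    return struct.unpack("<%d%s" % (count, fmt), data[:count * unit_size])


if __name__ == "__main__":
    data = utf16_units("abcd")
    # Two-byte units: one value per character, as a UTF-16 consumer expects.
    print([hex(u) for u in read_units(data, 2)])
    # Four-byte units: each value fuses two neighbouring characters.
    print([hex(u) for u in read_units(data, 4)])
```

The second reading yields half as many values, none of which is a character from the original string. This is only an analogy for the binary incompatibility that the GCC documentation warns about; it says nothing about the cause of the reported segmentation fault.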
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1255896&group_id=10127 From noreply at sourceforge.net Wed Aug 10 21:33:58 2005 From: noreply at sourceforge.net (SourceForge.net) Date: Wed, 10 Aug 2005 12:33:58 -0700 Subject: [Expat-bugs] [ expat-Bugs-1255896 ] Expat 1.95.8 ReadMe Message-ID: Bugs item #1255896, was opened at 2005-08-10 10:43 Message generated for change (Comment added) made by abhijitkankaria You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1255896&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Documentation Group: Not a Bug Status: Open Resolution: None Priority: 1 Submitted By: abhijitk (abhijitkankaria) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: Expat 1.95.8 ReadMe Initial Comment: From Expat ReadMe: -------------------------------------------------------------------------------------- If you are interested in building Expat to provide document information in UTF-16 rather than the default UTF-8, following these instructions: 1. For UTF-16 output as unsigned short (and version/error strings as char), run: ./configure CPPFLAGS=-DXML_UNICODE For UTF-16 output as wchar_t (incl. version/error strings), run: ./configure CFLAGS="-g -O2 -fshort-wchar" CPPFLAGS=-DXML_UNICODE_WCHAR_T 2. Edit the MakeFile, changing: LIBRARY = libexpat.la to: LIBRARY = libexpatw.la (Note the additional "w" in the library name.) 3. Run "make buildlib" (which builds the library only). 4. Run "make installlib" (which installs the library only).
-------------------------------------------------------------------------------------- As per the definition of -fshort-wchar: -fshort-wchar Override the underlying type for wchar_t to be short unsigned int instead of the default for the target. This option is useful for building programs to run under WINE. Warning: the -fshort-wchar switch causes GCC to generate code that is not binary compatible with code generated without that switch. Use it to conform to a non-default application binary interface. -------------------------------------------------------------------------------------- So this indicates that the option -fshort-wchar is to be used in case I need UTF-16 output as unsigned short. But, as the ReadMe suggests, should I use the option -fshort-wchar for UTF-16 output as wchar_t? Please correct me if my understanding is incorrect. ---------------------------------------------------------------------- >Comment By: abhijitk (abhijitkankaria) Date: 2005-08-10 15:33 Message: Logged In: YES user_id=1312629 I came across this bug: [ 931546 ] Unicode support for Windows and Unix are not compatible. This is exactly the problem I am facing too. I got your answer there already. ---------------------------------------------------------------------- Comment By: abhijitk (abhijitkankaria) Date: 2005-08-10 13:29 Message: Logged In: YES user_id=1312629 Yes, Google is the only help I have been using for a few years now... Thanks for your time and info; it made things a bit clearer for me. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2005-08-10 13:25 Message: Logged In: YES user_id=290026 I don't know of any specific sources for wchar_t problems, but Google should help you there. I would think that if all your application code is built on the assumption of a 2-byte wchar_t, then it would make sense to use the system libraries compiled for a short wchar_t.
---------------------------------------------------------------------- Comment By: abhijitk (abhijitkankaria) Date: 2005-08-10 13:13 Message: Logged In: YES user_id=1312629 In my code I am using the wchar_t data type; I have not specifically defined it in my code. The existing library works on Mac, as Mac has wchar_t defined as unsigned short in the system headers. On Solaris I use the -fshort-wchar option for compiling, so I guess wchar_t gets defined as unsigned short. Let me explain briefly what I am trying to do here: I am porting the application from Mac to Solaris. On Mac the expat library is compiled with XML_UNICODE so XML_Char is defined as wchar_t and wchar_t is defined as unsigned short in the system headers. My library, which interfaces with expat, uses wchar_t everywhere, as it's available on Mac as unsigned short, so it worked fine. Now on Solaris, if I have to use the -fshort-wchar option to have a two-byte wchar_t, I have two options: 1) Compile everything with the -fshort-wchar option, including system libraries, so that everything uses a two-byte wchar_t. OR 2) Change my entire library code to use something like XML_Char so I can control how it's defined. But in the second case the system libraries will still use wchar_t defined as long. This is a somewhat off-topic question, but please advise, or point me to any place where I can get more info on the wide build of expat. Thanks. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2005-08-10 11:51 Message: Logged In: YES user_id=290026 Apparently, you want to process Unicode on Solaris not as UTF-8 but as UTF-16 encoded. Therefore you need a two-byte data type for the UTF-16 base character type. Which base data-type for UTF-16 are you using elsewhere, unsigned short or a (redefined) wchar_t?
---------------------------------------------------------------------- Comment By: abhijitk (abhijitkankaria) Date: 2005-08-10 11:43 Message: Logged In: YES user_id=1312629 So wchar_t has to be two bytes for expat to work correctly? I am compiling a 32-bit application on Solaris, so if I don't use -fshort-wchar, wchar_t will be defined as long. My code does not depend on the size of wchar_t; will expat give the desired result in this scenario? Basically, I am getting a segmentation fault in my application, so I am checking whether the -fshort-wchar switch, which causes GCC to generate code that is not binary compatible with code generated without that switch, is the reason. My own libraries are built with this option, but other system libraries (from /usr/lib) are not compiled with this option. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2005-08-10 11:02 Message: Logged In: YES user_id=290026 No, use -fshort-wchar only if you want UTF-16 output to be delivered as a *two-byte* wchar_t type (necessary because on Unix wchar_t is typically four bytes). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1255896&group_id=10127 From noreply at sourceforge.net Wed Aug 10 21:45:36 2005 From: noreply at sourceforge.net (SourceForge.net) Date: Wed, 10 Aug 2005 12:45:36 -0700 Subject: [Expat-bugs] [ expat-Bugs-1255896 ] Expat 1.95.8 ReadMe Message-ID: Bugs item #1255896, was opened at 2005-08-10 10:43 Message generated for change (Settings changed) made by kwaclaw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1255896&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.
Category: Documentation Group: Not a Bug >Status: Closed >Resolution: Rejected Priority: 1 Submitted By: abhijitk (abhijitkankaria) Assigned to: Fred L. Drake, Jr. (fdrake) Summary: Expat 1.95.8 ReadMe Initial Comment: From Expat ReadMe: -------------------------------------------------------------------------------------- If you are interested in building Expat to provide document information in UTF-16 rather than the default UTF-8, following these instructions: 1. For UTF-16 output as unsigned short (and version/error strings as char), run: ./configure CPPFLAGS=-DXML_UNICODE For UTF-16 output as wchar_t (incl. version/error strings), run: ./configure CFLAGS="-g -O2 -fshort-wchar" CPPFLAGS=-DXML_UNICODE_WCHAR_T 2. Edit the MakeFile, changing: LIBRARY = libexpat.la to: LIBRARY = libexpatw.la (Note the additional "w" in the library name.) 3. Run "make buildlib" (which builds the library only). 4. Run "make installlib" (which installs the library only). -------------------------------------------------------------------------------------- As per the definition of -fshort-wchar: -fshort-wchar Override the underlying type for wchar_t to be short unsigned int instead of the default for the target. This option is useful for building programs to run under WINE. Warning: the -fshort-wchar switch causes GCC to generate code that is not binary compatible with code generated without that switch. Use it to conform to a non-default application binary interface. -------------------------------------------------------------------------------------- So this indicates that the option -fshort-wchar is to be used in case I need UTF-16 output as unsigned short. But, as the ReadMe suggests, should I use the option -fshort-wchar for UTF-16 output as wchar_t? Please correct me if my understanding is incorrect.
---------------------------------------------------------------------- >Comment By: Karl Waclawek (kwaclaw) Date: 2005-08-10 15:45 Message: Logged In: YES user_id=290026 I am glad you found some answers, even though they are probably not what you wanted. Closing this issue. ---------------------------------------------------------------------- Comment By: abhijitk (abhijitkankaria) Date: 2005-08-10 15:33 Message: Logged In: YES user_id=1312629 I came across this bug: [ 931546 ] Unicode support for Windows and Unix are not compatible. This is exactly the problem I am facing too. I got your answer there already. ---------------------------------------------------------------------- Comment By: abhijitk (abhijitkankaria) Date: 2005-08-10 13:29 Message: Logged In: YES user_id=1312629 Yes, Google is the only help I have been using for a few years now... Thanks for your time and info; it made things a bit clearer for me. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2005-08-10 13:25 Message: Logged In: YES user_id=290026 I don't know of any specific sources for wchar_t problems, but Google should help you there. I would think that if all your application code is built on the assumption of a 2-byte wchar_t, then it would make sense to use the system libraries compiled for a short wchar_t. ---------------------------------------------------------------------- Comment By: abhijitk (abhijitkankaria) Date: 2005-08-10 13:13 Message: Logged In: YES user_id=1312629 In my code I am using the wchar_t data type; I have not specifically defined it in my code. The existing library works on Mac, as Mac has wchar_t defined as unsigned short in the system headers. On Solaris I use the -fshort-wchar option for compiling, so I guess wchar_t gets defined as unsigned short. Let me explain briefly what I am trying to do here: I am porting the application from Mac to Solaris.
On Mac the expat library is compiled with XML_UNICODE so XML_Char is defined as wchar_t and wchar_t is defined as unsigned short in the system headers. My library, which interfaces with expat, uses wchar_t everywhere, as it's available on Mac as unsigned short, so it worked fine. Now on Solaris, if I have to use the -fshort-wchar option to have a two-byte wchar_t, I have two options: 1) Compile everything with the -fshort-wchar option, including system libraries, so that everything uses a two-byte wchar_t. OR 2) Change my entire library code to use something like XML_Char so I can control how it's defined. But in the second case the system libraries will still use wchar_t defined as long. This is a somewhat off-topic question, but please advise, or point me to any place where I can get more info on the wide build of expat. Thanks. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2005-08-10 11:51 Message: Logged In: YES user_id=290026 Apparently, you want to process Unicode on Solaris not as UTF-8 but as UTF-16 encoded. Therefore you need a two-byte data type for the UTF-16 base character type. Which base data-type for UTF-16 are you using elsewhere, unsigned short or a (redefined) wchar_t? ---------------------------------------------------------------------- Comment By: abhijitk (abhijitkankaria) Date: 2005-08-10 11:43 Message: Logged In: YES user_id=1312629 So wchar_t has to be two bytes for expat to work correctly? I am compiling a 32-bit application on Solaris, so if I don't use -fshort-wchar, wchar_t will be defined as long. My code does not depend on the size of wchar_t; will expat give the desired result in this scenario? Basically, I am getting a segmentation fault in my application, so I am checking whether the -fshort-wchar switch, which causes GCC to generate code that is not binary compatible with code generated without that switch, is the reason.
My own libraries are built with this option, but other system libraries (from /usr/lib) are not compiled with this option. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2005-08-10 11:02 Message: Logged In: YES user_id=290026 No, use -fshort-wchar only if you want UTF-16 output to be delivered as a *two-byte* wchar_t type (necessary because on Unix wchar_t is typically four bytes). ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1255896&group_id=10127 From masroor at kalkitech.com Thu Aug 18 10:24:48 2005 From: masroor at kalkitech.com (Masroor) Date: Thu, 18 Aug 2005 13:54:48 +0530 Subject: [Expat-bugs] problem when calling XMLParserFree( ) Message-ID: hello, I am developing an application program on uClinux (kernel version 2.4). I could successfully build the expat library by cross-compiling with arm-elf-gcc (version 2.95.3). The configuration file for my program is in XML format. When I use a small config file (less than 20K), my application works properly. But if the file is big, my program hangs when it calls the XML_ParserFree() function. When I comment out the XML_ParserFree() call in my program, it also works. What happens when the parser is deallocated? with regards Masroor
From noreply at sourceforge.net Fri Aug 19 14:55:32 2005 From: noreply at sourceforge.net (SourceForge.net) Date: Fri, 19 Aug 2005 05:55:32 -0700 Subject: [Expat-bugs] [ expat-Bugs-1241534 ] Support for special character set Message-ID: Bugs item #1241534, was opened at 2005-07-20 09:08 Message generated for change (Comment added) made by kwaclaw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1241534&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: Feature Request Status: Open Resolution: None Priority: 5 Submitted By: Sukender (sukender) >Assigned to: Karl Waclawek (kwaclaw) Summary: Support for special character set Initial Comment: I'm not sure, but I think some characters produce errors. If you try you'll have a parsing error. Try a word with "?" and then replace it by "e": the error disappears. It's perhaps a problem with my program but it's strange behaviour! ---------------------------------------------------------------------- >Comment By: Karl Waclawek (kwaclaw) Date: 2005-08-19 08:55 Message: Logged In: YES user_id=290026 What encoding do you use, and what encoding is specified in the input file?
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1241534&group_id=10127 From noreply at sourceforge.net Sun Aug 21 21:30:16 2005 From: noreply at sourceforge.net (SourceForge.net) Date: Sun, 21 Aug 2005 12:30:16 -0700 Subject: [Expat-bugs] [ expat-Bugs-1241534 ] Support for special character set Message-ID: Bugs item #1241534, was opened at 2005-07-20 06:08 Message generated for change (Comment added) made by nobody You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1241534&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: Feature Request Status: Open Resolution: None Priority: 5 Submitted By: Sukender (sukender) Assigned to: Karl Waclawek (kwaclaw) Summary: Support for special character set Initial Comment: I'm not sure, but I think some characters produce errors. If you try you'll have a parsing error. Try a word with "?" and then replace it by "e": the error disappears. It's perhaps a problem with my program but it's strange behaviour! ---------------------------------------------------------------------- Comment By: Nobody/Anonymous (nobody) Date: 2005-08-21 12:30 Message: Logged In: NO I have the same issue. The file is encoded as UTF-8, and I've tried forcing other encodings such as US-ASCII and ISO-899.. Also, when I manually escape the character to &233; (example), it always seems to unescape to UTF-8. My workaround was simply to remove all high character values before a parse. ---------------------------------------------------------------------- Comment By: Karl Waclawek (kwaclaw) Date: 2005-08-19 05:55 Message: Logged In: YES user_id=290026 What encoding do you use, and what encoding is specified in the input file?
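The encoding question above can be explored with a small sketch. The following uses Python's standard xml.parsers.expat binding to Expat, purely for illustration; the sample documents and the helper names parse_text and is_well_formed are hypothetical and are not taken from the reporters' files. It assumes the offending character is "é" stored as the single byte 0xE9 (ISO-8859-1), which is a guess based on the "&233;" example in the thread. It is not known whether the reporter actually typed "&233;" or whether the "#" was lost in transcription.

```python
import xml.parsers.expat


def parse_text(data: bytes) -> str:
    """Parse an XML document and return its concatenated character data."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append
    parser.Parse(data, True)
    return "".join(chunks)


def is_well_formed(data: bytes) -> bool:
    """Return True if Expat parses data without raising an error."""
    try:
        parse_text(data)
    except xml.parsers.expat.ExpatError:
        return False
    return True


# Latin-1 byte 0xE9 with a matching encoding declaration.
DECLARED = b'<?xml version="1.0" encoding="ISO-8859-1"?><w>caf\xe9</w>'
# Same byte, no declaration: the parser assumes UTF-8.
UNDECLARED = b"<w>caf\xe9</w>"
# Numeric character reference with the required '#'.
CHAR_REF = b"<w>caf&#233;</w>"
# Reference without '#', as written in the comment above.
BAD_REF = b"<w>caf&233;</w>"

if __name__ == "__main__":
    for name, doc in [("declared", DECLARED), ("undeclared", UNDECLARED),
                      ("char ref", CHAR_REF), ("bad ref", BAD_REF)]:
        print(name, is_well_formed(doc))
```

In this sketch the undeclared Latin-1 byte and the reference without "#" are rejected, while the declared encoding and "&#233;" both yield the same decoded text. That the handler always receives text in one internal encoding regardless of the input encoding is consistent with the "always unescapes to UTF-8" observation, though this does not establish what went wrong in the original report.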
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1241534&group_id=10127 From noreply at sourceforge.net Thu Aug 25 13:26:09 2005 From: noreply at sourceforge.net (SourceForge.net) Date: Thu, 25 Aug 2005 04:26:09 -0700 Subject: [Expat-bugs] [ expat-Bugs-1271642 ] ill-formed output for `xmlwf -c -d` on ISO-8859-1 Message-ID: Bugs item #1271642, was opened at 2005-08-25 04:26 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1271642&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None Status: Open Resolution: None Priority: 5 Submitted By: Nobody/Anonymous (nobody) Assigned to: Nobody/Anonymous (nobody) Summary: ill-formed output for `xmlwf -c -d` on ISO-8859-1 Initial Comment: If I run: xmlwf -c -d /tmp bug.xml with bug.xml containing: 123 then the result is: é123 The start tag for e is lost and the entity ref in the attribute is copied. Note that this bug does not happen if the encoding is UTF-8 or US-ASCII. 
Bug reproduced by a third party, see: http://mail.libexpat.org/pipermail/expat-discuss/2005-August/001880.html ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1271642&group_id=10127 From noreply at sourceforge.net Thu Aug 25 17:46:42 2005 From: noreply at sourceforge.net (SourceForge.net) Date: Thu, 25 Aug 2005 08:46:42 -0700 Subject: [Expat-bugs] [ expat-Bugs-1271642 ] ill-formed output for `xmlwf -c -d` on ISO-8859-1 Message-ID: Bugs item #1271642, was opened at 2005-08-25 07:26 Message generated for change (Comment added) made by kwaclaw You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1271642&group_id=10127 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None >Group: Test Required Status: Open Resolution: None Priority: 5 Submitted By: Nobody/Anonymous (nobody) >Assigned to: Karl Waclawek (kwaclaw) Summary: ill-formed output for `xmlwf -c -d` on ISO-8859-1 Initial Comment: If I run: xmlwf -c -d /tmp bug.xml with bug.xml containing: 123 then the result is: é123 The start tag for e is lost and the entity ref in the attribute is copied. Note that this bug does not happen if the encoding is UTF-8 or US-ASCII. Bug reproduced by a third party, see: http://mail.libexpat.org/pipermail/expat-discuss/2005-August/001880.html ---------------------------------------------------------------------- >Comment By: Karl Waclawek (kwaclaw) Date: 2005-08-25 11:46 Message: Logged In: YES user_id=290026 It appears this is caused by an inappropriate call to the default handler in appendAttributeValue(). Removing this call seems to fix the problem. Please apply and test the patch attached as file patch1.diff.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110127&aid=1271642&group_id=10127