[Expat-discuss] problem adding new encoding to perl XML::Parser

Daben Liu dliu@bbn.com
Wed, 28 Aug 2002 12:04:36 -0400


The XML::Parser installed from CPAN does not come with GB2312 
encoding support, and I was not able to add it by following 
the instructions in the XML::Encoding package.

To add this support, I did the following:

1. Download GB2312.TXT from ftp.unicode.org
2. Download XML::Encoding 1.01, which provides two programs:
   make_encmap and compile_encoding
3. Run make_encmap as follows:
   make_encmap GB2312 GB2312.TXT > GB2312.encmap
4. Add expat='yes' to the first line of GB2312.encmap
5. Run compile_encoding:
   compile_encoding -o GB2312.enc GB2312.encmap
6. Copy GB2312.enc to 
   /usr/lib/perl5/site_perl/5.005/i386-linux/XML/Parser/Encodings

Then I wrote the following Perl script:
---------------
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;

# Path of the XML file to parse, taken from the command line.
my $xmlfile = $ARGV[0];
my $parser  = XML::Parser->new();
my $doc     = $parser->parsefile($xmlfile);
---------------

I ran this script on a well-formed XML file whose first line is:
<?xml version="1.0" encoding="GB2312"?>

The following error occurs:

unknown encoding at line 1, column 30, byte 30 at /usr/lib/perl5/site_perl/5.005/i386-linux/XML/Parser.pm line 185

Changing the encoding to one of the other supported encodings seems to 
work without error. I'm wondering whether I'm missing something in the 
process. 
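
One thing that may be worth checking is whether the parser can actually 
find the compiled map from step 6. The sketch below is only an 
illustration, not part of XML::Parser or XML::Encoding: find_encoding is 
a hypothetical helper name, and it rests on the assumption that the 
lookup uses the lowercased encoding name plus a ".enc" extension (that is 
how I read the description of load_encoding in the XML::Parser::Expat 
documentation). If that assumption holds, the file searched for would be 
gb2312.enc rather than GB2312.enc.

```shell
#!/bin/sh
# Sketch only: find_encoding is a hypothetical helper, not a real tool.
# Assumption: the encoding map is looked up as "<lowercased name>.enc"
# in each candidate directory, and the first match wins.
#
# Usage: find_encoding NAME DIR [DIR...]
# Prints the first matching path and returns 0; returns 1 if none exists.
find_encoding() {
    # Lowercase the encoding name, e.g. GB2312 -> gb2312.
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    shift
    for dir in "$@"; do
        if [ -e "$dir/$name.enc" ]; then
            printf '%s\n' "$dir/$name.enc"
            return 0
        fi
    done
    return 1
}
```

Calling it with GB2312 and the Encodings directory from step 6 would show 
whether a file with the lowercased name exists there. On a case-sensitive 
filesystem a file named GB2312.enc would not match.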

Thanks for any suggestions!


Daben