From sikora at inova.com.br Wed May 7 23:41:30 2003
From: sikora at inova.com.br (Rodolfo Sikora)
Date: Wed May 7 18:41:31 2003
Subject: [Expat-discuss] Help with encoding.
Message-ID: <20030507224126.7056.qmail@escudo-local.inova.com.br>

I'm having problems with my script when it tries to parse some XML files. I'm sending exactly what I'm using and doing. Thanks for any help.

-xml file sispub.xml
-import.pl #!/usr/bin/perl use INOVA::UTIL::XML; use strict; my $x = new INOVA::UTIL::XML(file=>'sispub.xml'); my $hash = $x->ler(); print $hash->{filhos}{nota115}{titulo}; print "\n"; exit; - XML.pm module package INOVA::UTIL::XML; use strict; sub new { my $self = shift; my %params = @_; my $class = ref($self) || $self; $self = \%params; bless $self, $class; if(!$self->{file}) { &_erro('Arquivo XML Obrigatório em: '.ref($self)); return undef; } return $self; } sub ler { my $self = shift; return {} unless -f $self->{file}; require XML::Parser; my ($xml, $xml_hash); $xml = new XML::Parser(Style => 'INOVA::UTIL::XML::XMLHandlers'); $xml_hash = $xml->parsefile($self->{file}); return 0 unless ref $xml_hash eq "HASH"; return $xml_hash; } sub gravar { my $self = shift; my $parms = shift; $self->{raiz} = $parms->{raiz} if ref $parms->{raiz} eq "HASH"; $self->{filhos} = $parms->{filhos} if ref $parms->{filhos} eq "HASH"; require IO::File; require XML::Writer; my ($output, $writer); $output = new IO::File(">".$self->{file}); $writer = new XML::Writer(OUTPUT => $output); $writer->startTag('main'); for (keys %{$self->{raiz}}) { $writer->startTag('reg', 'campo' => $_, 'valor' => $self->{raiz}{$_}); $writer->endTag('reg'); } $self->gravarFilhos($writer, $self->{filhos}); # for my $element (keys %{$self->{filhos}}) { # $writer->startTag('tipo', 'nome' => $element); # for (keys %{$self->{filhos}{$element}}) { # $writer->startTag('reg', 'campo' => $_, 'valor' => $self->{filhos}{$element}{$_}); # $writer->endTag('reg'); # } # $writer->endTag('tipo'); # } $writer->endTag('main'); $writer->end(); $output->close(); return 1; } sub gravarFilhos { my $self = shift; my $writer = shift; my $filhos = shift; for my $element (keys %{$filhos}) { $writer->startTag('tipo', 'nome' => $element); for (keys %{$filhos->{$element}}) { if (ref($filhos->{$element}{$_}) eq 'HASH') { #possui subfilhos $self->gravarFilhos($writer, $filhos->{$element}{$_}); } else { $writer->startTag('reg', 'campo' => $_, 
'valor' => $filhos->{$element}{$_}); $writer->endTag('reg'); } } $writer->endTag('tipo'); } } package INOVA::UTIL::XML::XMLHandlers; $IUX::tipoNome = ''; %IUX::raiz = (); %IUX::filhos = (); sub Start { my $p = shift; my $elem = shift; my %vars = @_; if ($elem eq "reg") { if ($IUX::tipoNome) { $IUX::filhos{$IUX::tipoNome}{$vars{campo}} = $vars{valor}; } else { $IUX::raiz{$vars{campo}} = $vars{valor} } } elsif ($elem eq "tipo") { $IUX::tipoNome = $vars{nome}; $IUX::filhos{$vars{nome}}{nome} = $vars{nome}; } } sub End { my $p = shift; my $elem = shift; if ($elem eq "tipo") { $IUX::tipoNome = ''; } } sub Final { return { raiz => \%IUX::raiz, filhos => \%IUX::filhos }; } 1; __END__ Here is what happens: [vo0do0@centauro:/tmp] $ perl import.pl éíção Rodolfo Sikora - Desenvolvimento e Operação Departamento de Operações e Tecnologia Inova Tecnologias - http://www.inova.com.br **************************************************** *Velop* - administração, controle e monitoramento da sua comunicação na Internet. http://www.inova.com.br From lphiri at nc.rr.com Thu May 8 22:23:08 2003 From: lphiri at nc.rr.com (Lindani Phiri) Date: Thu May 8 21:20:10 2003 Subject: [Expat-discuss] Running expat without install library Message-ID: <001301c315e2$b1b83ee0$96348c2f@ldantec> Has anyone tried running expat without installing the library on Vx Works or Solaris 7? (By this I mean compiling the source files directly into an executable) If so, any hints on setting up the makefile? I might need to encoporate Expat on a box where I am not allowed to install the libraries. Thanks, L. From dr at netscape.com Thu May 8 21:29:23 2003 From: dr at netscape.com (Dan Rosen) Date: Thu May 8 23:30:49 2003 Subject: [Expat-discuss] Running expat without install library In-Reply-To: <001301c315e2$b1b83ee0$96348c2f@ldantec> References: <001301c315e2$b1b83ee0$96348c2f@ldantec> Message-ID: <3EBB2093.2040705@netscape.com> You should be able to build expat as a static library rather than dynamic. 
When you build your executable, the necessary object code from the static lib will be included in the executable, and you won't have to install the lib. This method is basically equivalent to compiling the expat source directly into your executable, but I think it'll be easier (in terms of modifying the makefiles or whatever). Cheers, dr Lindani Phiri wrote: > Has anyone tried running expat without installing the library on Vx Works or Solaris 7? > (By this I mean compiling the source files directly into an executable) > If so, any hints on setting up the makefile? > > I might need to encoporate Expat on a box where I am not allowed to install the libraries. > > Thanks, > > L. > > _______________________________________________ > Expat-discuss mailing list > Expat-discuss@libexpat.org > http://mail.libexpat.org/mailman/listinfo/expat-discuss From mba2000 at ioplex.com Sun May 11 18:22:34 2003 From: mba2000 at ioplex.com (Michael B Allen) Date: Sun May 11 17:22:49 2003 Subject: [Expat-discuss] Running expat without install library In-Reply-To: <001301c315e2$b1b83ee0$96348c2f@ldantec> References: <001301c315e2$b1b83ee0$96348c2f@ldantec> Message-ID: <13712.199.43.48.22.1052688154.squirrel@miallen.com> Just link with the .a file. It's an "archive" of object files. > Has anyone tried running expat without installing the library on Vx Works > or Solaris 7? > (By this I mean compiling the source files directly into an executable) > If so, any hints on setting up the makefile? > > I might need to encoporate Expat on a box where I am not allowed to > install the libraries. > > Thanks, > > L. From fabio at soi.city.ac.uk Tue May 13 16:07:27 2003 From: fabio at soi.city.ac.uk (Fabio Venuti) Date: Tue May 13 11:20:18 2003 Subject: [Expat-discuss] getting offsets of attributes Message-ID: <006c01c31961$5dd1f9b0$4f5b288a@cadfael> Hello everybody, sometime ago I asked whether it was possible to get the offset of attributes in some way similar to XML_GetCurrentByteIndex. 
The answer was no. So I found my own way to do it and would like to share it. I am aware that there might be better (= more efficient, more robust) ways to do this.

Basically my idea is to get the file being parsed (assuming it's on stdin), then move the stdin file position so it points at the beginning of the current XML element, then read the whole element tag into a string, and finally find the relative offset of each attribute inside that string.

Here is the code, added to the start handler (adapted from the code I used in my program, so there might be some mistakes, though I tried looking at it carefully...).

static void start(void *data, const char *el, const char **attr) {
  /* Meaning of variables:
     current_pos    = position in xml file being parsed
     xml_byte_proc  = bytes already processed
                      Basically xml_byte_proc = BUFFSIZE * (Number of BUFFERS already processed)
     element_offset = absolute offset of current element
     element_tag    = contains the text of the whole current element tag
     att_info[]     = array containing attributes' offsets and lengths */
  ...
  current_pos = ftell(stdin);
  element_offset = XML_GetCurrentByteIndex(xmlp) + xml_byte_proc;
  fseek(stdin, element_offset, SEEK_SET);
  j=0;
  while((element_tag[j++] = getc(stdin)) != '>');
  element_tag[j] = '\0';
  for (j=0; attr[j]; j += 2) {
    att_info[j] = strlen(element_tag) - strlen(strstr(element_tag, attr[j]));
    att_info[j+1] = strlen(attr[j+1]);
  }
  /* I'm not sure it's necessary to record first current_pos in stdin
     and then restore it at the end of the processing, but I do it
     just to be sure... */
  fseek(stdin, current_pos, SEEK_SET);

It works for my needs.

Bye,
Fabio

From rbinse at profileup.com Tue May 20 20:38:27 2003
From: rbinse at profileup.com (Renaud Binse)
Date: Tue May 20 13:38:41 2003
Subject: [Expat-discuss] Pb with &amp; and & in parsing
Message-ID:

Hi

I have a strange problem with expat and perl. I'm parsing an XML file with things like &amp;. After parsing, they are translated to &.

For example, with gfgffghfh &amp; fdsfdsf I get gfgffghfh & fdsfdsf

And it's the same problem with quotes. Is there a way to deactivate that? Or is it an encoding problem?

I'm using Expat 1.95.4 and perl 5.8.

Renaud

From dino at aiesec.pwr.wroc.pl Tue May 20 20:59:07 2003
From: dino at aiesec.pwr.wroc.pl (Marcin Zdun)
Date: Tue May 20 13:54:36 2003
Subject: [Expat-discuss] Pb with &amp; and & in parsing
In-Reply-To:
Message-ID:

On Tue, 20 May 2003, Renaud Binse wrote:

> Hi
>
> I have a strange problem with expat and perl. I'm parsing an XML file with
> things like &amp;. After parsing, they are translated to &.
>
> For example, with gfgffghfh &amp; fdsfdsf I get gfgffghfh &
> fdsfdsf
>
> And it's the same problem with quotes. Is there a way to deactivate that?
> Or is it an encoding problem?
>

There is no way of deactivating this, and this is the behaviour described in the spec (http://www.w3.org/TR/REC-xml#sec-physical-struct): all recognized entities are expanded by the parser for the application. So &amp; becomes &, &lt; becomes <, and so on.

--
d.n.hotch/reloaded:came;revolutions:soon
"Huh, upgrades!" Thomas Anderson

From GSubhash at chn.cognizant.com Wed May 21 12:07:53 2003
From: GSubhash at chn.cognizant.com (Gururajan, Subhashini (Cognizant))
Date: Wed May 21 01:35:29 2003
Subject: [Expat-discuss] expat in C++
Message-ID: <14E2ECED6A08844980C5C5C9BDD247B1CAB081@ctsinentsxua.cts.com>

> Hi,
> I want to use expat for my C++ implementation. My objective is just to parse the XML file.
> The start, end and character data handlers are implemented as methods of a class. Now, I can't use expat directly, as the function pointer forces me to include the class name in the declaration.
> So, the start handler declaration in expat.h would have to include the name of the class where the start handler is actually implemented. How do I tackle this? Please help me out.
>
> Thanks & Regards,
>
> SUBHASHINI
>
-------------- next part --------------
This e-mail and any files transmitted with it are for the sole use of the intended recipient(s) and may contain confidential and privileged information. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message. Any unauthorised review, use, disclosure, dissemination, forwarding, printing or copying of this email or any action taken in reliance on this e-mail is strictly prohibited and may be unlawful.
Visit us at http://www.cognizant.com From karl at waclawek.net Wed May 21 09:56:58 2003 From: karl at waclawek.net (Karl Waclawek) Date: Wed May 21 09:07:07 2003 Subject: [Expat-discuss] expat in C++ References: <14E2ECED6A08844980C5C5C9BDD247B1CAB081@ctsinentsxua.cts.com> Message-ID: <003501c31f98$76d8de00$9e539696@citkwaclaww2k> ----- Original Message ----- From: "Gururajan, Subhashini (Cognizant)" To: Sent: Wednesday, May 21, 2003 1:37 AM > Hi, > I want to use expat for my C++ implementation. My objective is just to parse the XML file. > The start, end and character data handlers are implemented as methods of a class. > Now, I cant use the expat directly as, the function pointer forces me to include the class name while declaring. > So, the start handler declaration in expat.h should include the class name where the start handler is actually > implemented. How do I tackle this. Please help me out. > There are a few options: You could use global functions as handlers, and pass the object reference through the userData parameter. The function will then use this reference to call the actual class method. Or, you could use one of the C++ wrappers, like Arabica. Check out the links on http://www.libexpat.org/#wrappers. Karl From allan.saywitz at pb.com Wed May 21 10:53:17 2003 From: allan.saywitz at pb.com (allan.saywitz@pb.com) Date: Wed May 21 09:54:47 2003 Subject: [Expat-discuss] expat in C++ Message-ID: There are lots of c++ wrappers for expat on the web, just do a google search on expat C++ Wrapper. Here is two: http://www.codeproject.com/soap/expatimpl.asp http://www.oofile.com.au/xml/expatpp.html I actually wrote my own since we are doing c++ for vse if you can belive that! Using inheritance worked well for me. I wrote a c++ expat wrapper to be used as a base class. This base class has virtual functions for all the different events I want to respond to. 
So now all I have to do is inherit from my base class and voila, I have a c++ class that reads xml and functions for handling start tag, end tag, char data, etc... Expat is so cool you can easily write all kinds of wrapper classes to fit your needs! Here is some sample code. thanks allan (See attached file: XMLExpatParser.h)(See attached file: XMLExpatParser.cpp) "Gururajan, Subhashini (Cognizant)" To: Subject: [Expat-discuss] expat in C++ Sent by: expat-discuss-bounces@l ibexpat.org 05/21/2003 12:30 AM Hi, I want to use expat for my C++ implementation. My objective is just to parse the XML file. The start, end and character data handlers are implemented as methods of a class. Now, I cant use the expat directly as, the function pointer forces me to include the class name while declaring. So, the start handler declaration in expat.h should include the class name where the start handler is actually implemented. How do I tackle this. Please help me out. Thanks & Regards, SUBHASHINI (See attached file: InterScan_Disclaimer.txt) _______________________________________________ Expat-discuss mailing list Expat-discuss@libexpat.org http://mail.libexpat.org/mailman/listinfo/expat-discuss -------------- next part -------------- A non-text attachment was scrubbed... Name: XMLExpatParser.h Type: application/octet-stream Size: 5623 bytes Desc: not available Url : http://mail.libexpat.org/pipermail/expat-discuss/attachments/20030521/b726a4d5/XMLExpatParser-0002.obj -------------- next part -------------- A non-text attachment was scrubbed... Name: XMLExpatParser.cpp Type: application/octet-stream Size: 7482 bytes Desc: not available Url : http://mail.libexpat.org/pipermail/expat-discuss/attachments/20030521/b726a4d5/XMLExpatParser-0003.obj -------------- next part -------------- A non-text attachment was scrubbed... 
Name: InterScan_Disclaimer.txt Type: application/octet-stream Size: 524 bytes Desc: not available Url : http://mail.libexpat.org/pipermail/expat-discuss/attachments/20030521/b726a4d5/InterScan_Disclaimer-0001.obj From villas at del.ufrj.br Wed May 21 13:25:59 2003 From: villas at del.ufrj.br (Sergio Barbosa Villas-Boas) Date: Wed May 21 11:28:23 2003 Subject: [Expat-discuss] expat in C++ In-Reply-To: Message-ID: > I actually wrote my own since we are doing c++ for vse if you can belive that! Hi, Alan, and Hi all I got your C++ class for expat. But I have a problem. I can't compile a project from the source in my Visual C++. That's because the expat itself is in C. If I try to compile it in C++, many compatibility issues arise. Do you (Alan) have a working project in Visual C++, all , C++, with the source code of expat and your C++ class ? If you have, I would definitively appreciate to receive a copy of it. Thanks --------------------------------------------------------- +------+ Sergio Barbosa Villas-Boas /------/| villas@del.ufrj.br | sbVB |/ http://www.del.ufrj.br/~villas http://www.sbVB.net +------+ ICQ: 15360729 From allan.saywitz at pb.com Wed May 21 14:57:49 2003 From: allan.saywitz at pb.com (allan.saywitz@pb.com) Date: Wed May 21 13:59:44 2003 Subject: [Expat-discuss] expat in C++ Message-ID: Try this. Note you must go into project setting in VC++ 6.0 and change: 1. On the C++ tab, change Additional Include Path to point to directory that contains expat.h. 2. On the Link Tab, change Additional Librarry Path to point to directory that contains libexpat.lib. 3. Make sure libexpat.dll is in your path. Again, this code is a very very light wrapper. We did this on purpose because we like the flexibility of expat, but wanted an easier interface to use with c++. 
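The approach discussed in this thread (plain functions registered as handlers, with the object pointer passed through userData, and virtual functions for each event) can be sketched roughly as below. This is only a minimal sketch and not the attached XMLExpatParser code: the names XmlEvents, Recorder, fakeParse and demoTrampoline are made up for illustration, and fakeParse is a stand-in for the parser so that the sketch compiles without expat. With the real library the two thunks would be registered with XML_SetElementHandler() and the object pointer with XML_SetUserData().

```cpp
#include <cassert>
#include <string>
#include <vector>

// Callback shapes matching expat's element handlers when XML_Char is char.
typedef void (*StartHandler)(void *userData, const char *name, const char **atts);
typedef void (*EndHandler)(void *userData, const char *name);

// Hypothetical base class: subclasses override the virtual hooks.
class XmlEvents {
public:
    virtual ~XmlEvents() {}
    virtual void onStart(const char *name, const char **atts) = 0;
    virtual void onEnd(const char *name) = 0;

    // Static member functions are ordinary function pointers, so a C
    // library can call them. userData carries the object pointer.
    static void startThunk(void *userData, const char *name, const char **atts) {
        static_cast<XmlEvents *>(userData)->onStart(name, atts);
    }
    static void endThunk(void *userData, const char *name) {
        static_cast<XmlEvents *>(userData)->onEnd(name);
    }
};

// Example subclass: records every element event it receives.
class Recorder : public XmlEvents {
public:
    std::vector<std::string> log;
    void onStart(const char *name, const char **atts) override {
        std::string entry = std::string("start ") + name;
        for (int i = 0; atts[i]; i += 2)  // name/value pairs, null-terminated
            entry += std::string(" ") + atts[i] + "=" + atts[i + 1];
        log.push_back(entry);
    }
    void onEnd(const char *name) override {
        log.push_back(std::string("end ") + name);
    }
};

// Stand-in for the parser (an assumption, not expat): fires one
// start/end pair through the C-style callbacks.
void fakeParse(StartHandler start, EndHandler end, void *userData) {
    assert(start != nullptr && end != nullptr);
    const char *atts[] = {"campo", "titulo", nullptr};
    start(userData, "reg", atts);
    end(userData, "reg");
}

// Runs the stand-in parser against a Recorder and returns what it saw.
std::vector<std::string> demoTrampoline() {
    Recorder r;
    fakeParse(&XmlEvents::startThunk, &XmlEvents::endThunk, &r);
    return r.log;
}
```

Calling demoTrampoline() returns the two recorded events for the single "reg" element. The point of the design is that the C side never needs to know the class name: it only sees two plain function pointers and an opaque void pointer.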
allan (See attached file: TestExpatParser.zip) "Sergio Barbosa Villas-Boas" To: br> Subject: RE: [Expat-discuss] expat in C++ 05/21/2003 11:25 AM > I actually wrote my own since we are doing c++ for vse if you can belive that! Hi, Alan, and Hi all I got your C++ class for expat. But I have a problem. I can't compile a project from the source in my Visual C++. That's because the expat itself is in C. If I try to compile it in C++, many compatibility issues arise. Do you (Alan) have a working project in Visual C++, all , C++, with the source code of expat and your C++ class ? If you have, I would definitively appreciate to receive a copy of it. Thanks --------------------------------------------------------- +------+ Sergio Barbosa Villas-Boas /------/| villas@del.ufrj.br | sbVB |/ http://www.del.ufrj.br/~villas http://www.sbVB.net +------+ ICQ: 15360729 -------------- next part -------------- A non-text attachment was scrubbed... Name: TestExpatParser.zip Type: application/zip Size: 6255 bytes Desc: not available Url : http://mail.libexpat.org/pipermail/expat-discuss/attachments/20030521/0faf444e/TestExpatParser.zip From villas at del.ufrj.br Wed May 21 16:18:58 2003 From: villas at del.ufrj.br (Sergio Barbosa Villas-Boas) Date: Wed May 21 14:19:49 2003 Subject: [Expat-discuss] expat in C++ In-Reply-To: Message-ID: > Try this... Hi Alan, hi all Thanks the project you sent. But however working, that's not what I was looking for. I wanted a full-from-source-code project in C++ to work in Visual C++. That's because I quite often use unix and g++ as well. I would like to use the same source when using unix. It would be very good to have a set of source files that are easy to compile and use. That is, I want a working VC project that does not rely on a binary library. Expat does come with source code. But as I told you, Expat is written as a tricky C (not C++) code. For example: The source code of Expat includes *.c. 
C++ compilers won't accept the Expat C code as C++ (due to those tricks). I tried to "fix" the expat source, to make it fit for C++, but there are too many tricks, so I got confused. So, however having the expat source code, I still do not have a working C++ set of sources that if compiled produce the desired effect. That's what I'm looking for. I wonder if you or anyone can help me on that. Thanks in advance --------------------------------------------------------- +------+ Sergio Barbosa Villas-Boas /------/| villas@del.ufrj.br | sbVB |/ http://www.del.ufrj.br/~villas http://www.sbVB.net +------+ ICQ: 15360729 From dino at aiesec.pwr.wroc.pl Wed May 21 22:58:50 2003 From: dino at aiesec.pwr.wroc.pl (Marcin Zdun) Date: Wed May 21 15:52:29 2003 Subject: [Expat-discuss] expat in C++ In-Reply-To: Message-ID: On Wed, 21 May 2003, Sergio Barbosa Villas-Boas wrote: > > Try this... > > Hi Alan, hi all > > Thanks the project you sent. > But however working, that's not what I was looking for. > [...] > That is, I want a working VC project that does not rely on a binary library. > Expat does come with source code. But as I told you, [...] > So, however having the expat source code, I still do not have > a working C++ set of sources that if compiled produce the > desired effect. That's what I'm looking for. > I wonder if you or anyone can help me on that. But, if you want to have expat cooperate with C++ by means of VC++, why not use static linkage to *.lib file and access it through number of wrappers? VC++ knows how to link to *.objs in C-originated *.lib and call its functions, AFAIK gcc for sure must have that "feature". So, solution I prefer is to compile or download static LIB and tell linker to use it. It does _is_ external binary library, but is not any, after linker produces an exe. U have flexibility of expat written in "tricky C", avail to C++ and no DLL/so tossing around - that's what tiggers like best :) -- d.n.hotch/reloaded:revolutions:soon "Huh, upgrades!" 
Thomas Anderson

From cepek at gama.fsv.cvut.cz Wed May 21 20:54:31 2003
From: cepek at gama.fsv.cvut.cz (Ales Cepek)
Date: Wed May 21 15:54:32 2003
Subject: [Expat-discuss] Re: expat in C++
References:
Message-ID: <87el2s84bj.fsf@krtek.fsv.cvut.cz>

"Sergio Barbosa Villas-Boas" writes:

> So, however having the expat source code, I still do not have
> a working C++ set of sources that if compiled produce the
> desired effect. That's what I'm looking for.
> I wonder if you or anyone can help me on that.

...

I would like to join Sergio. This feature was available in expat v. 1.1 (its C source was platform independent and simple to incorporate into any other source code, C++ in my case). Of course I can understand why running the configure script is needed for all demanding applications, but as it is now I prefer to use the "obsolete" version 1.1.

Would it be possible to add a "platform independent configuration" to expat?

Ales

From GSubhash at chn.cognizant.com Fri May 23 15:28:43 2003
From: GSubhash at chn.cognizant.com (Gururajan, Subhashini (Cognizant))
Date: Fri May 23 04:57:23 2003
Subject: [Expat-discuss] expat in C++
Message-ID: <14E2ECED6A08844980C5C5C9BDD247B1DACB65@ctsinentsxua.cts.com>

Thanks a lot, it worked.

-----Original Message-----
From: allan.saywitz@pb.com [mailto:allan.saywitz@pb.com]
Sent: Wednesday, May 21, 2003 7:23 PM
To: Gururajan, Subhashini (Cognizant)
Cc: expat-discuss@libexpat.org
Subject: Re: [Expat-discuss] expat in C++

There are lots of c++ wrappers for expat on the web, just do a google search on expat C++ Wrapper. Here is two:
http://www.codeproject.com/soap/expatimpl.asp
http://www.oofile.com.au/xml/expatpp.html
[...]
Visit us at http://www.cognizant.com From Greg.Martin at TELUS.COM Mon May 26 15:32:52 2003 From: Greg.Martin at TELUS.COM (Greg Martin) Date: Mon May 26 17:32:56 2003 Subject: [Expat-discuss] problem parsing two files in application Message-ID: I'm getting the error : "junk after document element". I've looked through archives and see it usually comes up when there are multiple roots or text after the root element. I don't think this is the case here. I'm parsing two files each with different definitions - the first parses fine but it doesn't seem to matter what I take out of the second it doesn't parse. I did see mention in the archives of a problem parsing subsequent XML files delivered by socket - here they are read off disk. Here is the second file : ]> I'm calling XML_ParserFree() after parsing the first file and XML_ParserCreate() before. The parsing of both files is done in the same thread but in a different scope using C++ (C version of expat I believe). TIA, Greg Martin 780-493-2786 Application Development TP&E - Service Solution TELUS Communication Inc. From karl at waclawek.net Mon May 26 20:45:12 2003 From: karl at waclawek.net (Karl Waclawek) Date: Mon May 26 19:44:03 2003 Subject: [Expat-discuss] problem parsing two files in application References: Message-ID: <001a01c323e0$d9bb40c0$037ca8c0@karl> Which version of Expat? Please post your code (trimmed down to the essentials). Karl ----- Original Message ----- From: "Greg Martin" To: Sent: Monday, May 26, 2003 5:32 PM Subject: [Expat-discuss] problem parsing two files in application I'm getting the error : "junk after document element". I've looked through archives and see it usually comes up when there are multiple roots or text after the root element. I don't think this is the case here. I'm parsing two files each with different definitions - the first parses fine but it doesn't seem to matter what I take out of the second it doesn't parse. 
I did see mention in the archives of a problem parsing subsequent XML files delivered by socket - here they are read off disk.

Here is the second file :

]>

I'm calling XML_ParserFree() after parsing the first file and XML_ParserCreate() before. The parsing of both files is done in the same thread but in a different scope using C++ (C version of expat I believe).

TIA,

Greg Martin
780-493-2786
Application Development
TP&E - Service Solution
TELUS Communication Inc.

_______________________________________________
Expat-discuss mailing list
Expat-discuss@libexpat.org
http://mail.libexpat.org/mailman/listinfo/expat-discuss

From Greg.Martin at TELUS.COM Tue May 27 08:34:10 2003
From: Greg.Martin at TELUS.COM (Greg Martin)
Date: Tue May 27 10:34:15 2003
Subject: [Expat-discuss] problem parsing two files in application
Message-ID:

Thanks for the quick reply. I'm using expat-1.95.6 built for Compaq Tru64.

TaskConfig::TaskConfig() throw (TaskConfigError) : first(0)
{
    void taskstart(void *, const char *, const char **);
    void taskend(void *, const char *);

    first = new TaskList;
    first->appId = 0;
    first->numTasks = 0;
    first->task = 0;
    first->next = 0;

    FILE *fp = fopen(TASK_CONFIG_FILE, "r");
    if(fp == 0) {
        char msg[256];
        sprintf(msg, "Couldn't open %s : %s", TASK_CONFIG_FILE, strerror(errno));
        throw TaskConfigError(msg);
    }

    p = XML_ParserCreate(0);
    if(p == 0)
        throw TaskConfigError("Couldn't allocate memory for parser");

    XML_SetElementHandler(p, taskstart, taskend);
    XML_SetUserData(p, (void *)first);

    char buf[BUFSIZE];
    int done = 0;
    while(!done) {
        int done;
        int len;

        len = fread(buf, 1, BUFSIZE, fp);
        if(ferror(fp)) {
            char msg[256];
            sprintf(msg, "Read error %s : %s", TASK_CONFIG_FILE, strerror(errno));
            throw TaskConfigError(msg);
        }
        done = feof(fp);

        if(XML_Parse(p, buf, len, done) == XML_STATUS_ERROR) {
            char msg[256];
            sprintf(msg, "Parse error at line %d of %s: %s",
                    XML_GetCurrentLineNumber(p), TASK_CONFIG_FILE,
                    XML_ErrorString(XML_GetErrorCode(p)));
            throw TaskConfigError(msg);
        }
    }

    XML_ParserFree(p);
    fclose(fp);
}

TaskConfig::~TaskConfig()
{
    TaskList *tmp = taskConfig->first;
    while(taskConfig->first != 0) {
        tmp = tmp->next;
        delete taskConfig->first;
        taskConfig->first = tmp;
    }
}

void taskstart(void *data, const char *el, const char **attr)
{
    TaskList *tl = (TaskList *)data;

    if(strcmp(el, "task") == 0) {
        tl->task = new Tasks[MAX_NUM_TASKS + 1];
        tl->appId = strtol(attr[1], 0, 10);
        if(tl->appId == 0 && errno != 0)
            throw TaskConfigError(strerror(errno));
        tl->next = new TaskList;
        tl->numTasks = 0;
        tl->next->appId = 0;
        tl->next->task = 0;
        tl->next->next = 0;
        tl = tl->next;
    }
    else {
        for(int i = 0; i < NUM_TASKS; ++i) {
            if(strcmp(el, strTasks[i]) == 0) {
                tl->task[tl->numTasks] = (Tasks)i;
                ++tl->numTasks;
                break;
            }
        }
    }
}

void taskend(void *data, const char *el)
{
    TaskList *tl = (TaskList *)data;

    if(strcmp(el, "task") == 0) {
        tl->task[tl->numTasks] = DONE;
    }
}

-----Original Message-----
From: Karl Waclawek [mailto:karl@waclawek.net]
Sent: Monday, May 26, 2003 5:45 PM
To: Greg Martin; expat-discuss@libexpat.org
Subject: Re: [Expat-discuss] problem parsing two files in application

Which version of Expat? Please post your code (trimmed down to the essentials).

Karl

----- Original Message -----
From: "Greg Martin"
To:
Sent: Monday, May 26, 2003 5:32 PM
Subject: [Expat-discuss] problem parsing two files in application

I'm getting the error : "junk after document element". I've looked through archives and see it usually comes up when there are multiple roots or text after the root element. I don't think this is the case here. I'm parsing two files each with different definitions - the first parses fine but it doesn't seem to matter what I take out of the second it doesn't parse. I did see mention in the archives of a problem parsing subsequent XML files delivered by socket - here they are read off disk.

Here is the second file :

]>

I'm calling XML_ParserFree() after parsing the first file and XML_ParserCreate() before.
The parsing of both files is done in the same thread but in a different scope using C++ (C version of expat I believe).

TIA,
Greg Martin
780-493-2786
Application Development
TP&E - Service Solution
TELUS Communication Inc.

_______________________________________________
Expat-discuss mailing list
Expat-discuss@libexpat.org
http://mail.libexpat.org/mailman/listinfo/expat-discuss

From karl at waclawek.net Tue May 27 12:08:14 2003
From: karl at waclawek.net (Karl Waclawek)
Date: Tue May 27 11:08:21 2003
Subject: [Expat-discuss] problem parsing two files in application
References:
Message-ID: <003701c32461$cbac24f0$9e539696@citkwaclaww2k>

I couldn't see anything wrong with your code.
Does this always happen with the second file you parse, no matter which file? Or is it always the same file?
Have you checked the file for extra characters at the end?

You could also try this:

...
if(XML_Parse(p, buf, len, len == 0) == XML_STATUS_ERROR)
...

Karl

----- Original Message -----
From: "Greg Martin"
To:
Sent: Tuesday, May 27, 2003 10:34 AM
Subject: RE: [Expat-discuss] problem parsing two files in application

Thanks for the quick reply. I'm using expat-1.95.6 built for Compaq Tru64.

From Greg.Martin at TELUS.COM Tue May 27 09:58:01 2003
From: Greg.Martin at TELUS.COM (Greg Martin)
Date: Tue May 27 11:58:06 2003
Subject: FW: [Expat-discuss] problem parsing two files in application
Message-ID:

Thanks for your help Karl. Using your suggestion stopped the error and pointed to the real problem. For reasons beyond my ken I had shadowed the done variable so it was changing value with each loop - mea culpa not expat.

Greg.

From karl at waclawek.net Tue May 27 13:14:58 2003
From: karl at waclawek.net (Karl Waclawek)
Date: Tue May 27 12:15:04 2003
Subject: [Expat-discuss] problem parsing two files in application
References:
Message-ID: <001301c3246b$1e4fe940$9e539696@citkwaclaww2k>

I should have caught this. Well, at least we know now. ;-)

We are actually quite proud (knock on wood !!!) that we haven't received any real bug report (for a programming error) in a long time.

Karl

From GSubhash at chn.cognizant.com Thu May 29 13:51:29 2003
From: GSubhash at chn.cognizant.com (Gururajan, Subhashini (Cognizant))
Date: Thu May 29 03:19:57 2003
Subject: [Expat-discuss] problem parsing two files in application
Message-ID: <14E2ECED6A08844980C5C5C9BDD247B1F08985@ctsinentsxua.cts.com>

Hi,
I get the "not well-formed (invalid token)" error while parsing. I do not see any problem with my XML file. Please help me out.

Here is the piece of code that I am trying out. Attached is the XML file that I am parsing.

while(!feof(fp)){
    int done;
    int len;
    fgets(Buff, sizeof(Buff), fp);
    len = strlen(Buff);
    if (ferror(fp)) {
        fprintf(stderr, "Read error\n");
        exit(-1);
    }
    if (! parser.parse(Buff, len)) {
        fprintf(stderr, "Parse error at line :\n#%s#\n",Buff);
        printf("Error: %s",parser.getErrorString(parser.getErrorCode()));
        exit(-1);
    }
}

int XMLExpatParser::parse(const char* buffer, int len)
{
    m_pXmlBuffer = new char[255];
    strncpy(m_pXmlBuffer,buffer,strlen(buffer));
    m_nDepth = 0;
    if(len == 0)
        return XML_Parse(m_pXMLExpatParser, m_pXmlBuffer, strlen(m_pXmlBuffer), len);
    else
        return XML_Parse(m_pXMLExpatParser, m_pXmlBuffer, len, 0);
}

Regards,
-Subhashini
-------------- next part --------------
BDE Customer ID BDE TRANSACTION ID SUMMARISED Current Date FPC CFP Profile Id OPTIONAL FPCDSN.FIL1 OPTIONAL OPTIONAL MVSSYS OPTIONAL OPTIONAL
-------------- next part --------------
This e-mail and any files transmitted with it are for the sole use of the intended recipient(s) and may contain confidential and privileged information. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message. Any unauthorised review, use, disclosure, dissemination, forwarding, printing or copying of this email or any action taken in reliance on this e-mail is strictly prohibited and may be unlawful.

Visit us at http://www.cognizant.com

From karl at waclawek.net Thu May 29 10:01:19 2003
From: karl at waclawek.net (Karl Waclawek)
Date: Thu May 29 09:01:29 2003
Subject: [Expat-discuss] problem parsing two files in application
References: <14E2ECED6A08844980C5C5C9BDD247B1F08985@ctsinentsxua.cts.com>
Message-ID: <001801c325e2$6596e3d0$9e539696@citkwaclaww2k>

Do not treat the input as a string. This causes problems if there are null bytes, as in a UTF-16 encoding.

Just follow the examples in the docs and demo programs that come with Expat.

Karl

----- Original Message -----
From: "Gururajan, Subhashini (Cognizant)"
To: "Greg Martin" ;
Sent: Thursday, May 29, 2003 3:21 AM
Subject: RE: [Expat-discuss] problem parsing two files in application
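
Karl's advice (pass the byte count from the read, not strlen, and tell the parser when the input ends) can be sketched roughly as below. This is only an illustration under stated assumptions: feedFile and ChunkSink are hypothetical names, and the sink callback stands in for a call such as XML_Parse(parser, data, len, isFinal) so that the sketch builds without Expat installed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <functional>
#include <string>

// Hypothetical stand-in for a call such as
//   XML_Parse(parser, data, len, isFinal) != XML_STATUS_ERROR
// so that this sketch compiles without Expat.
using ChunkSink = std::function<bool(const char* data, int len, bool isFinal)>;

// Reads fp in fixed-size binary chunks and hands each chunk to sink together
// with its exact byte count. Returns false on a read error or if sink fails.
bool feedFile(std::FILE* fp, const ChunkSink& sink)
{
    char buf[256];
    for (;;) {
        // fread reports how many bytes were actually read; strlen would stop
        // at the first NUL byte, which is common in UTF-16 input.
        const std::size_t len = std::fread(buf, 1, sizeof buf, fp);
        if (std::ferror(fp))
            return false;
        // A zero-length read marks the end of input, so the last call
        // carries no data and isFinal == true (the "len == 0" test).
        const bool isFinal = (len == 0);
        if (!sink(buf, static_cast<int>(len), isFinal))
            return false;
        if (isFinal)
            return true;
    }
}
```

With Expat, the sink would presumably be a small lambda wrapping XML_Parse on an already created parser. No per-call copy of the buffer is needed, which also avoids the new char[255] allocation on every call in the posted code.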
From dr at netscape.com Thu May 29 12:50:55 2003
From: dr at netscape.com (Dan Rosen)
Date: Thu May 29 14:53:11 2003
Subject: [Expat-discuss] not well-formed (invalid token)
In-Reply-To: <14E2ECED6A08844980C5C5C9BDD247B1F08985@ctsinentsxua.cts.com>
References: <14E2ECED6A08844980C5C5C9BDD247B1F08985@ctsinentsxua.cts.com>
Message-ID: <3ED6568F.4070409@netscape.com>

Your main problem is that you're treating your file as a C string, which may not be a correct assumption. You're in C++, so use fstreams. Try something like this:

void parse (std::istream& data)
{
    /* ... setup ... */

    char buffer[256];
    while (!data.eof()) {
        data.read(buffer, 256);
        XML_Status status = XML_Parse(parser, buffer, data.gcount(),
                                      data.eof() ? 1 : 0);
    }

    /* ... error handling ... */
}

Cheers,
dr

From GSubhash at chn.cognizant.com Fri May 30 17:31:56 2003
From: GSubhash at chn.cognizant.com (Gururajan, Subhashini (Cognizant))
Date: Fri May 30 07:00:40 2003
Subject: [Expat-discuss] not well-formed (invalid token)
Message-ID: <14E2ECED6A08844980C5C5C9BDD247B1F5F08D@ctsinentsxua.cts.com>

Thanks for the response. The problem is solved. But I see that the character handler is called more than once even though I have just one block of words between my tags. The consecutive calls to the character handler do not have any text associated with them.

-Subha

-----Original Message-----
From: Dan Rosen [mailto:dr@netscape.com]
Sent: Friday, May 30, 2003 12:21 AM
To: Gururajan, Subhashini (Cognizant)
Cc: expat-discuss@libexpat.org
Subject: Re: [Expat-discuss] not well-formed (invalid token)

-------------- next part --------------
BDE BDEID SUMMARISED Currentdate FPC CFPProfileId OPTIONAL FPCDSN.FIL1 OPTIONAL OPTIONAL MVSSYS OPTIONAL OPTIONAL

From karl at waclawek.net Fri May 30 09:55:16 2003
From: karl at waclawek.net (Karl Waclawek)
Date: Fri May 30 08:55:25 2003
Subject: [Expat-discuss] not well-formed (invalid token)
References: <14E2ECED6A08844980C5C5C9BDD247B1F5F08D@ctsinentsxua.cts.com>
Message-ID: <001801c326aa$b7fca6f0$9e539696@citkwaclaww2k>

----- Original Message -----
From: "Gururajan, Subhashini (Cognizant)"
To: "Dan Rosen"

> Thanks for the response. The problem is solved. But I see that the character handler
> is called more than once even though I have just one block of word in between my tags.
> The consecutive calls to character handler do not have any text associated with them.

If there are line breaks, Expat will have to normalize them, which interrupts the character handler callbacks. So you should see line breaks in those calls that you consider empty text.

Karl

From dr at netscape.com Fri May 30 12:12:12 2003
From: dr at netscape.com (Dan Rosen)
Date: Fri May 30 14:14:29 2003
Subject: [Expat-discuss] not well-formed (invalid token)
In-Reply-To: <001801c326aa$b7fca6f0$9e539696@citkwaclaww2k>
References: <14E2ECED6A08844980C5C5C9BDD247B1F5F08D@ctsinentsxua.cts.com> <001801c326aa$b7fca6f0$9e539696@citkwaclaww2k>
Message-ID: <3ED79EFC.4080006@netscape.com>

Subhashini,

I didn't receive the full text of your reply to my post (only the part quoted in Karl's reply) but here's what I tend to do:

>>Thanks for the response. The problem is solved. But I see that the character handler
>>is called more than once even though I have just one block of word in between my tags.
>>The consecutive calls to character handler do not have any text associated with them.

You'll get multiple character callbacks in a number of instances. Line breaks are one such case; entity normalization and CDATA blocks also cause this. What I tend to do in my own code is coalesce the multiple character callbacks into one (with logic like "keep adding to my string until I get some other callback"), and pass the resulting complete string on to my application.

Hope this helps,
Dan
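
The coalescing approach Dan describes ("keep adding to my string until I get some other callback") might look roughly like the sketch below. It is not taken from anyone's actual code: TextCoalescer, onStart, onEnd and onCharacters are hypothetical names. The handler functions only mirror the shape of Expat's start-element, end-element and character-data handlers, with char assumed in place of XML_Char so that the sketch builds without Expat.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Collects character data across consecutive callbacks and emits it as one
// string when any other event arrives. All names here are hypothetical.
struct TextCoalescer {
    std::string pending;             // text gathered since the last flush
    std::vector<std::string> texts;  // completed runs handed to the application

    void flush() {
        if (!pending.empty()) {
            texts.push_back(pending);
            pending.clear();
        }
    }
};

// Same parameter shape as an Expat start-element handler
// (char is assumed here in place of XML_Char).
void onStart(void* userData, const char* /*name*/, const char** /*atts*/) {
    static_cast<TextCoalescer*>(userData)->flush();
}

// Same parameter shape as an Expat end-element handler.
void onEnd(void* userData, const char* /*name*/) {
    static_cast<TextCoalescer*>(userData)->flush();
}

// Same parameter shape as an Expat character-data handler.
void onCharacters(void* userData, const char* s, int len) {
    // s is not NUL-terminated, so append exactly len bytes.
    static_cast<TextCoalescer*>(userData)->pending.append(
        s, static_cast<std::string::size_type>(len));
}
```

In a real program the three functions would presumably be registered with XML_SetElementHandler and XML_SetCharacterDataHandler, with a TextCoalescer passed through XML_SetUserData. Flushing on both start and end tags means text in mixed content is delivered in pieces at each child element, which may or may not be what an application wants.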