[Expat-discuss] How to fetch information for the position of each element in a file

Stefano Sabatini stefano.sabatini-lala at poste.it
Mon Jan 21 12:31:33 CET 2008


Hi all, this is my first post here.

I have an application which needs to parse an XML file, and I would
like to print out the position of *each* element in the parsed file.

So far I have slightly modified outline.c (from the Expat examples) into this:

/*****************************************************************/
#include <stdio.h>
#include <stdlib.h>   /* exit() */
#include <expat.h>

#define BUFFSIZE        8192

char Buff[BUFFSIZE];

int Depth;

typedef struct UserData {
    const char *filename;
    XML_Parser *parser;   /* address of the parser handle */
} UserData;

/* start handler: called for the opening tag of each element */
static void XMLCALL
start(void *data, const char *el, const char **attr)
{
    int i;
    UserData *user_data = (UserData *)data;

    /* indent according to the indentation depth */
    for (i = 0; i < Depth; i++)
        printf("  ");

    printf("%s:%d:%s", user_data->filename,
           XML_GetCurrentLineNumber(user_data->parser), el);
    for (i = 0; attr[i]; i += 2) {
        printf(" %s='%s'", attr[i], attr[i + 1]);
    }

    printf("\n");
    Depth++;
}

/* end handler */
static void XMLCALL
end(void *data, const char *el)
{
    Depth--;
}

int main(int argc, char *argv[])
{
    UserData data;

    XML_Parser parser = XML_ParserCreate(NULL);
    if (!parser) {
        fprintf(stderr, "Couldn't allocate memory for parser\n");
        exit(-1);
    }

    /* set the start and end handler for each element of the document */
    XML_SetElementHandler(parser, start, end);

    data.filename = "stdin";
    data.parser = &parser;

    /* set the pointer that is passed as the first argument to every
     * handler; the struct must be filled in before parsing starts */
    XML_SetUserData(parser, &data);

    for (;;) {
        int done;
        int len;

        len = (int)fread(Buff, 1, BUFFSIZE, stdin);
        if (ferror(stdin)) {
            fprintf(stderr, "Read error\n");
            exit(-1);
        }
        done = feof(stdin);

        if (XML_Parse(parser, Buff, len, done) == XML_STATUS_ERROR) {
            fprintf(stderr, "Parse error at line %d:\n%s\n",
                    XML_GetCurrentLineNumber(parser),
                    XML_ErrorString(XML_GetErrorCode(parser)));
            exit(-1);
        }

        /* when it reads EOF then quit the loop */
        if (done)
            break;
    }
    return 0;
}
/*****************************************************************/

The start element handler retrieves the parser from the user data and
calls XML_GetCurrentLineNumber on it:

printf("%s:%d:%s", user_data->filename, XML_GetCurrentLineNumber(user_data->parser), el);

Unfortunately this doesn't work. For example, with this sample file:
<sample>

  <foo>it is me, foo</foo>
  <bar>it is you, bar</bar>

</sample>

I get this output, with every element reported on line 1:
$ cat sample.xml | outline-passing-data
stdin:1:sample
  stdin:1:foo
  stdin:1:bar

I wonder if there is some way to get the actual position of every
parsed element. This seems a very reasonable request, since such
information could be used, for example, during semantic analysis of
the XML tree to report exactly where a semantic error occurred.

Any help will be highly appreciated.

Regards.
-- 
Stefano Sabatini
Linux user number 337176 (see http://counter.li.org)

