[Bulk] Re: Alternatives to XML?

Roland Koebler rk-list at simple-is-better.org
Fri Aug 26 10:02:13 EDT 2016


Hi,

> It is *my* XML, and I know that I only use the offending characters inside
> attributes, and attributes are the only place where double-quote marks are
> allowed.
> 
> So this is my conversion routine -
> 
> lines = string.split('"')  # split on attributes
> for pos, line in enumerate(lines):
>    if pos%2:  # every 2nd line is an attribute
>        lines[pos] = line.replace('<', '<').replace('>', '>')
> return '"'.join(lines)
OMG!
So, you have a fileformat, which looks like XML, but actually isn't XML,
and will break if used with some "real" XML.

Although I don't like XML, if you want XML, you should follow Chris advice:
On Thu, Aug 25, 2016 at 09:40:03PM +1000, Chris Angelico wrote:
> just make sure it's always valid XML, rather
> than some "XML-like" file structure.

So, please:
- Don't try to write your own (not-quite-)XML-parser.
- Read how XML-files work.
- Read https://docs.python.org/3/library/xml.html
  and https://pypi.python.org/pypi/defusedxml/
- Think what you have done.
- Use a sensible XML-parser/dumper. This should escape most special-
  characters for you (at least: < > & " ').


Roland



More information about the Python-list mailing list