issue in handling CSV data

Peter J. Holzer hjp-python at hjp.at
Sun Sep 8 12:39:47 EDT 2019


On 2019-09-08 05:41:07 -0700, Sharan Basappa wrote:
> On Sunday, 8 September 2019 04:56:29 UTC-4, Andrea D'Amore  wrote:
> > On Sun, 8 Sep 2019 at 02:19, Sharan Basappa <sharan.basappa at gmail.com> wrote:
> > > As you can see, the string "\t"81 is causing the error.
> > > It seems to be due to char "\t".
> > 
> > It is not clear what format do you expect to be in the file.
> > You say "it is CSV" so your actual payload seems to be a pair of three
> > bytes (a tab and two hex digits in ASCII) per line.
> 
> The issue seems to be presence of tabs along with the numbers in a single string. So, when I try to convert strings to numbers, it fails due to presence of tabs.
> 
> Here is the hex dump:
> 
> 22 61 64 64 72 65 73 73 2c 22 09 22 6c 65 6e 67 
> 74 68 2c 22 09 22 38 31 2c 22 09 35 63 0d 0a 22 
> 61 64 64 72 65 73 73 2c 22 09 22 6c 65 6e 67 74 
...

Decoded, the dump looks like this:

"address,"      "length,"       "81,"   5c
"address,"      "length,"       "04,"   11
"address,"      "length,"       "e1,"   17
"address,"      "length,"       "6a,"   6c
...

Note that the commas are within the quotes. I'd say Andrea is correct:
This is a tab-separated file, not a comma-separated file. But for some
reason all fields except the last end with a comma. 
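
To check this, the first record of the hex dump can be decoded by hand. This
is only a sketch: the bytes are copied from the quoted dump, and the variable
names are mine, not anything from the original script.

```python
# First record of the quoted hex dump, up to and including the CR LF.
hex_dump = (
    "22 61 64 64 72 65 73 73 2c 22 09 22 6c 65 6e 67 "
    "74 68 2c 22 09 22 38 31 2c 22 09 35 63 0d 0a"
)

raw = bytes.fromhex(hex_dump)   # spaces between byte pairs are ignored
line = raw.decode("ascii")
print(repr(line))

# Splitting on tab (0x09) gives four fields; the commas stay inside the quotes.
fields = line.rstrip("\r\n").split("\t")
print(fields)
```

The only bytes sitting between the fields are 09 (tab); every 2c (comma) is
followed by a closing quote, so it belongs to the field's content.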

I would 

a) try to convince the person producing the file to clean up the mess

b) if that is not successful, use the csv module to read the file with
   tab as the delimiter and then strip the trailing comma from each field.
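
A minimal sketch of option b), assuming the whole file looks like the lines
shown above. The helper name read_rows and the in-memory sample are made up
for illustration, and treating the last two columns as hex numbers is a guess
based on Andrea's reading of the data.

```python
import csv
import io


def read_rows(f):
    """Yield rows of a tab-separated file, dropping one trailing comma per field."""
    reader = csv.reader(f, delimiter="\t")
    for row in reader:
        yield [field[:-1] if field.endswith(",") else field for field in row]


# Stand-in for the real file; with a file on disk, use something like
# open("data.csv", newline="") instead (the file name is hypothetical).
sample = io.StringIO(
    '"address,"\t"length,"\t"81,"\t5c\r\n'
    '"address,"\t"length,"\t"04,"\t11\r\n',
    newline="",
)

rows = list(read_rows(sample))
print(rows)

# Assumption: columns 3 and 4 hold hexadecimal values.
values = [(int(row[2], 16), int(row[3], 16)) for row in rows]
print(values)
```

The csv module removes the quotes itself, so only the commas are left to
strip. Checking endswith(",") removes exactly one trailing comma and leaves
any field without one (here the last column) untouched.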

        hp


-- 
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | hjp at hjp.at         | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>