String character encoding when converting data from one type/format to another

Dave Angel davea at davea.name
Wed Jan 7 08:52:42 EST 2015


On 01/07/2015 08:32 AM, Jacob Kruger wrote:
> Thanks.
>
Please don't top-post.  Put your responses after each quoted part you're 
responding to.  And if there are parts you're not responding to, please 
delete them.
>
> Issue with knowing encoding could just be that am pretty sure at least
> some of the data capture is done via copy/paste from one MS app to
> another, which could possibly result in a whole bunch of different
> character sets, etc. being copied across, so it comes down to that while
> can't control sources of data, need to manipulate/work with it to make
> it useful on our side now.
>

Copy/paste to/from properly written Windows programs is done in Unicode, 
so the problem should only be one of how the data was saved to disk.  
There, Windows is much sloppier.
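
To make the "how it was saved" point concrete, here is a small sketch in 
Python 3.  The sample string and the three encodings are only assumptions 
picked for illustration, not anything taken from your data.  It shows the 
same text turning into different bytes depending on the encoding used to 
save it, and what happens when those bytes are read back with the wrong one.

```python
# One piece of text, as it would exist in memory (Unicode).
text = "caf\u00e9"  # "café"

# The same text "saved" three different ways gives three different byte strings.
saved = {enc: text.encode(enc) for enc in ("utf-8", "cp1252", "utf-16-le")}
for enc, raw in saved.items():
    print(enc, raw)

# Reading UTF-8 bytes back as if they were cp1252 produces mojibake.
misread = saved["utf-8"].decode("cp1252")
print(misread)
```

The text itself never changed; only the bytes on disk and the guess made 
when reading them did.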

Chances are that a given machine will use a consistent encoding, so a 
given file should be consistent, unless it was written to from more than 
one machine over a network.  And if all the machines that generate this 
data are in the same company, they might all use the same encoding as well.
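
If each file really is consistent, one way to exploit that is to try a 
short list of likely encodings against the whole file and keep the first 
one that decodes without error.  The following is only a sketch: the 
function name decode_with_candidates and the candidate list (utf-8, then 
cp1252) are my assumptions, so substitute whatever encodings your machines 
are actually likely to use.

```python
def decode_with_candidates(raw, candidates=("utf-8", "cp1252")):
    """Return (text, encoding) for the first candidate that decodes raw cleanly."""
    for encoding in candidates:
        try:
            # Strict decoding: any invalid byte raises instead of being replaced.
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings fit: %r" % (candidates,))


# Hypothetical sample: "café" as a cp1252-configured program would save it.
text, used = decode_with_candidates(b"caf\xe9")
print(used, text)
```

UTF-8 goes first because it is strict and rarely accepts data that was 
written in some other encoding, while cp1252 accepts nearly any byte 
sequence.  A clean decode is therefore evidence, not proof, that the guess 
was right, so it is still worth eyeballing a sample of the result.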


-- 
DaveA


