From jni.soma at gmail.com Mon Aug 1 09:49:56 2016
From: jni.soma at gmail.com (Juan Nunez-Iglesias)
Date: Mon, 1 Aug 2016 06:49:56 -0700
Subject: [melbourne-pug] Some of the libraries/tutorials mentioned today
Message-ID:

Hi Melbourne Pythonistas,

The weather and lack of agenda conspired to make today's PUG little, cosy, and freeform. I don't know what *everyone's* experience of it was, but I had some very interesting discussions with various attendees. I hope you all got something out of it! (And I hope everyone signed up to this list!)

Here are some of the things we discussed:

- PyCon-Australia is in Melbourne this year and coming very soon! Aug 12-16, at the Melbourne Convention Centre. Evidently it's sold out but you can write to contact at pycon-au.org to get added to the waitlist, as well as to attend the miniconfs, which still have some tickets. If you're a student, you can also ask about financial assistance. The deadline for that also passed but someone might be able to help you out.
- SciPy 2016 was in Austin last month and it was spectacular. There is a YouTube playlist containing all the talks. I can recommend the Numba and parallel computing tutorials, (possibly) the scikit-image tutorial by yours truly, Matt Rocklin and Jim Crist's talk on Dask for distributed computing, and Dan Allan's talk on asyncio, among others. Most of these have slides/repositories online.
- [*Sponsored Link* =P] If you are learning Python for science, as some attendees were, you could do worse than Elegant SciPy, written by myself, Harriet Dashnow of the Murdoch Children's Research Institute (at the Royal Children's Hospital), and Stéfan van der Walt, creator of scikit-image and currently a fellow at the Berkeley Institute for Data Science. The book is in pre-release so we are still finalising repos and URLs, so you'll need to access the data files here: https://github.com/elegant-scipy/elegant-scipy-data. Follow @elegantscipy on Twitter for updates!
- The book is aimed at people with *some* programming experience. If you're just getting started, Software Carpentry has some fantastic beginner materials, completely free to use.
- If you want to get into Python development and are happy to donate your time, there is just an enormous number of exciting open source projects that could use your help, regardless of your level of ability. Find a project you're excited about, lurk on their GitHub issues page, and find a place to jump in! Some projects even create “easy” or “beginner” tags for issues that don't require heaps of Python expertise.

I'm probably missing things. Feel free to respond with more!

Thank you to everyone who came, and see you at PyCon and at next month's MPUG!

Juan.

From brianna.laugher at gmail.com Mon Aug 1 20:02:43 2016
From: brianna.laugher at gmail.com (Brianna Laugher)
Date: Tue, 2 Aug 2016 10:02:43 +1000
Subject: [melbourne-pug] PyConAU 2016 sells out
Message-ID:

We are thrilled to announce that *PyCon Australia 2016 has SOLD OUT!*

Not in the bad way, of course -- we still have our principles, that beautiful is better than ugly, and explicit is better than implicit, and the rest. Just in the way that 500 of our best Pythonista mates will be joining us in Melbourne in a mere two weeks for a huge PyConAU #7!

For would-be attendees who left it a little too late, you can email us with the subject line *"waiting list"* and we will put you in the queue in case there are cancellations. However, this is not a promising queue.

Another option is to purchase a miniconf-only ticket that lets you attend on Friday only. If that interests you, email us with the subject line *"miniconf-only ticket"*. There is a limited number of these tickets available too so don't hesitate any longer.

For attendees who did register and pay in time, if you forgot to buy a dinner ticket, you can still purchase one before 6 August.
Or, if you want to upgrade your ticket to become a Contributor, we would be thrilled to hear from you. Email away!

Finally, if you are a company feeling a little left out of all this Pythontastic fabulousness, never fear - it is not too late to become a sponsor and we would love to hear from you!

=== About PyCon Australia ===

PyCon Australia is the national conference for the Python programming community. The seventh PyCon Australia will be held on August 12-16 2016 in Melbourne, bringing together professional, student and enthusiast developers with a love for programming in Python. PyCon Australia informs the country's developers with presentations by experts and core developers of Python, as well as the libraries and frameworks that they rely on.

To find out more about PyCon Australia 2016, visit our website at http://pycon-au.org, follow us at @pyconau or e-mail us at contact at pycon-au.org.

PyCon Australia is presented by Linux Australia (www.linux.org.au) and acknowledges the support of our Platinum Sponsors, DevDemand.co and IRESS; and our Gold sponsors, Google Australia, Optiver and Hewlett Packard Enterprise. For full details of our sponsors, see our website.

From brianna.laugher at gmail.com Sun Aug 7 04:54:18 2016
From: brianna.laugher at gmail.com (Brianna Laugher)
Date: Sun, 7 Aug 2016 18:54:18 +1000
Subject: [melbourne-pug] PyCon AU related events: Data Carpentry, MelbDjango, PyLadies, Running group
Message-ID:

It's PyCon Australia week! We trust you are as excited as we are!

There are a few events happening this week in close association with PyCon AU. Check out the details below:

On Wednesday and Thursday, there will be a *Data Carpentry Workshop* at VLSCI (University of Melbourne). This is a hands-on workshop teaching basic concepts, skills and tools for working more effectively with data.
It's well suited to researchers and scientists, but open to anyone (including people not attending the conference). Topics covered will include OpenRefine, SQL, and Pandas. It's free to attend but see the web page to register.

Data Carpentry is a sibling organization of Software Carpentry. Where Software Carpentry teaches best practices in software development, the Data Carpentry focus is on the introductory computational skills needed for data management and analysis in all domains of research.

We are big fans of Software Carpentry and Data Carpentry at PyCon Australia; last year a Software Carpentry workshop was held alongside PyCon AU in Brisbane, so we are happy to see this tradition continue. Let's pythonify ALL the scientists!

On Thursday evening, there will be the *MelbDjango birthday party* to celebrate MelbDjango turning three! That's a huge achievement for any meetup to sustain, let alone the other things they have also done - organising MelbDjango Camp, and MelbDjango Schools to share their knowledge and love for Django to new audiences. Congratulations to Brenton and the Common Code crew -- here's to another three years.

Finally, on Saturday morning, the *PyLadies breakfast* will be happening at Left Bank Melbourne, just up the road on Southbank. This is also a free event, but please register by Friday for catering.

The PyLadies Melbourne chapter just formed this year, and it's an exciting and rare chance for women and genderqueer/non-binary Pythonistas from around the country to meet in person and make connections. *We are looking for a sponsor for this event.* As the conference is sold out we unfortunately can't offer any free tickets, but we would be heartened to see financial support forthcoming for diversity and inclusion in the tech industry. *If you or your company is interested, please contact us.*

On the days of the conference, there will be a *Running group*.
There are some great runs planned and it's the best way to prepare for a day of sitting down watching presentations, so pack your sneakers!

From paul at metrak.com Sun Aug 7 15:20:25 2016
From: paul at metrak.com (paul sorenson)
Date: Sun, 7 Aug 2016 12:20:25 -0700
Subject: [melbourne-pug] PyCon AU related events: Data Carpentry, MelbDjango, PyLadies, Running group
In-Reply-To:
References:
Message-ID:

Sounds like a great program. A couple of my colleagues, Cooper Lees and Jason Fried, will be in town so put on some warmish weather and check out stuff we use Python for at Facebook, from network management to data pipelines and machine learning.

cheers

From miked at dewhirst.com.au Mon Aug 8 01:17:10 2016
From: miked at dewhirst.com.au (Mike Dewhirst)
Date: Mon, 8 Aug 2016 15:17:10 +1000
Subject: [melbourne-pug] PyCon AU related events: Data Carpentry, MelbDjango, PyLadies, Running group
In-Reply-To:
References:
Message-ID:

Hi Paul

We should be able to arrange at least 15 Celsius. And beer!

Cheers

Mike

From pizza at netspace.net.au Mon Aug 8 09:12:05 2016
From: pizza at netspace.net.au (Jason King)
Date: Mon, 08 Aug 2016 23:12:05 +1000
Subject: [melbourne-pug] train cancellations friday night
Message-ID: <57A88525.1040602@netspace.net.au>

Note that on the Friday of the conference, it looks like Metro is cancelling trains on several lines from 4:30 pm onwards. It doesn't look like it's cancelling everything, but you might have to wait a while to get a ride home.

www.metrotrains.com.au/planned-works/

From ben+python at benfinney.id.au Mon Aug 8 20:38:19 2016
From: ben+python at benfinney.id.au (Ben Finney)
Date: Tue, 09 Aug 2016 10:38:19 +1000
Subject: [melbourne-pug] Specify the database for a Django ModelForm instance
Message-ID: <85shue22v8.fsf@benfinney.id.au>

Howdy all,

How can I specify which database (by its alias name) a Django ModelForm should use?

(I'm having trouble getting the message onto the Django forum, so I'm asking this Python-related question here too.)

A Django ModelForm knows its corresponding model, and the fields included.

The ModelForm instance clearly knows how to specify a database, internally. It can validate its fields against the database, and can save a new model instance to the database. This implies its operations have knowledge of which database to use.

What I need is to access that as an external user, when creating the instance. I can't find how to specify any database other than the default, when creating the ModelForm nor when it interacts with the database.

What is the equivalent for using='foo' when instantiating a ModelForm for the model, or calling its methods (ModelForm.clean, ModelForm.save, etc.)?

-- 
 \     “A free press is one where it's okay to state the conclusion |
  `\                  you're led to by the evidence.” --Bill Moyers |
_o__)                                                               |
Ben Finney

From anthony.briggs at gmail.com Mon Aug 8 21:12:00 2016
From: anthony.briggs at gmail.com (Anthony Briggs)
Date: Tue, 9 Aug 2016 11:12:00 +1000
Subject: [melbourne-pug] Specify the database for a Django ModelForm instance
In-Reply-To: <85shue22v8.fsf@benfinney.id.au>
References: <85shue22v8.fsf@benfinney.id.au>
Message-ID:

Hi Ben,

The database is normally routed via the model, rather than the form, so a ModelForm would generally just pick whatever the model uses. I would imagine that trying to hack on the form directly would be a Bad Plan(tm).

https://docs.djangoproject.com/en/1.10/topics/db/multi-db/#automatic-database-routing

That has info on setting up simple read/write replicas, 'using' in raw queries, etc. too.

Hope that helps,

Anthony

From ben+python at benfinney.id.au Mon Aug 8 21:32:53 2016
From: ben+python at benfinney.id.au (Ben Finney)
Date: Tue, 09 Aug 2016 11:32:53 +1000
Subject: [melbourne-pug] Specify the database for a Django ModelForm instance
References: <85shue22v8.fsf@benfinney.id.au>
Message-ID: <85oa5220ca.fsf@benfinney.id.au>

Anthony Briggs writes:

> The database is normally routed via the model, rather than the form,
> so a ModelForm would generally just pick whatever the model uses.

Okay, so how do I specify the arguments the ModelForm will use when interacting with the Model instance it creates?

I'm trying to hook into the ModelForm's validation (ModelForm.clean) and model creation (ModelForm.save) behaviour, to tell it which database to use for that operation.

> I would imagine that trying to hack on the form directly would be a
> Bad Plan(tm).

Sure, I'm trying to do this as an external user of the ModelForm.

> https://docs.djangoproject.com/en/1.10/topics/db/multi-db/#automatic-database-routing

I'm not interested in automatically routing all operations; I want to *specify* which database, for a particular operation.

If it matters: this is in a management command, where I need to be able to specify from the command line that a particular database alias is the context for a command.

The same way I can with the ‘using’ hook of ModelManager.using('foo'), or Model.save(using='foo'). What is the equivalent for a ModelForm instance?

-- 
 \       Fry: “Take that, poor people!” Leela: “But Fry, you're not |
  `\   rich.” Fry: “No, but I will be someday, and then people like |
_o__)                        me better watch out!” --Futurama |
Ben Finney

From anthony.briggs at gmail.com Mon Aug 8 22:06:38 2016
From: anthony.briggs at gmail.com (Anthony Briggs)
Date: Tue, 9 Aug 2016 12:06:38 +1000
Subject: [melbourne-pug] Specify the database for a Django ModelForm instance
In-Reply-To: <85oa5220ca.fsf@benfinney.id.au>
References: <85shue22v8.fsf@benfinney.id.au> <85oa5220ca.fsf@benfinney.id.au>
Message-ID:

Hi Ben,

I'm not sure what you're trying to do. Trying to write a management command using model forms makes no sense - wouldn't you just load the model instance directly in that case?

FWIW, there's only one instance of 'using' in Model forms, when saving over an existing instance: https://github.com/django/django/blob/master/django/forms/models.py#L819

Other than that, model forms would be using whatever's set for the model. Worst case, you can create a new subclass, override the ModelForm's save() method (https://github.com/django/django/blob/master/django/forms/models.py#L433) and make it do what you need, but that overrides a bunch of Django's guarantees (mainly that the model table will be there to save to, or that it'll be in sync with what you currently have in memory)

Anthony

From ben+python at benfinney.id.au Mon Aug 8 22:35:38 2016
From: ben+python at benfinney.id.au (Ben Finney)
Date: Tue, 09 Aug 2016 12:35:38 +1000
Subject: [melbourne-pug] Specify the database for a Django ModelForm instance
References: <85shue22v8.fsf@benfinney.id.au> <85oa5220ca.fsf@benfinney.id.au>
Message-ID: <85inva1xfp.fsf@benfinney.id.au>

Anthony Briggs writes:

> I'm not sure what you're trying to do. Trying to write a management
> command using model forms makes no sense - wouldn't you just load the
> model instance directly in that case?

The ModelForm validation functionality is being used, to validate input.

Pseudocode::

    import csv

    from cumquat_app.forms import CumquatImportForm

    db_alias = 'foo'

    reader = csv.DictReader(input_file)
    for row in reader:
        # Build the form's field values from the current input row.
        fields = make_fields_from_input_row(row)

        # Wanted: ‘form = CumquatImportForm(fields, using=db_alias)’.
        form = CumquatImportForm(fields)

        # Wanted: ‘if form.is_valid(using=db_alias)’.
        if form.is_valid():

            # Wanted: ‘form.save(using=db_alias)’.
            form.save()

At both points of interaction with the database -- validating the fields, and saving an instance -- I am expecting a way to specify *which* database to interact with.

I have the option to tell the CumquatImportForm (which is a ModelForm subclass) to avoid the database when it saves; it returns an instance, which I can then instruct to save to a specific database::

    cumquat = form.save(commit=False)
    cumquat.save(using='foo')

But I can't seem to get an equivalent hook for the ‘clean’ or ‘is_valid’ methods, which also interact with the database to validate fields (e.g. for unique constraints).

How can I tell the ModelForm that its interactions with the database, be they instance creation or querysets or anything else, should be to the database whose alias I specify? The equivalent of the ‘using='foo'’, above?

-- 
 \   “… a voice said reassuringly: cheer up, things could get worse. |
  `\   So I cheered up and, sure enough, things got worse.” --James C. |
_o__)                                          Hagerty, 1909–1981 |
Ben Finney

From anthony.briggs at gmail.com Mon Aug 8 22:48:16 2016
From: anthony.briggs at gmail.com (Anthony Briggs)
Date: Tue, 9 Aug 2016 12:48:16 +1000
Subject: [melbourne-pug] Specify the database for a Django ModelForm instance
In-Reply-To: <85inva1xfp.fsf@benfinney.id.au>
References: <85shue22v8.fsf@benfinney.id.au> <85oa5220ca.fsf@benfinney.id.au> <85inva1xfp.fsf@benfinney.id.au>
Message-ID:

Right, so you can override the ModelForm's save() method, but you lose all of the other nice Django things, like model init, checking, etc. which you'll also have to recreate manually.

Or you can do it the Right Way(tm) and write a custom database router as per the reply before that. Look for `PrimaryReplicaRouter` in https://docs.djangoproject.com/en/1.10/topics/db/multi-db/#automatic-database-routing, I'm pretty sure you can make it do what you need to (eg.
with a method/attribute to specify the database name)

Anthony

From miked at dewhirst.com.au Tue Aug 9 00:15:16 2016
From: miked at dewhirst.com.au (Mike Dewhirst)
Date: Tue, 9 Aug 2016 14:15:16 +1000
Subject: [melbourne-pug] Specify the database for a Django ModelForm instance
In-Reply-To: <85inva1xfp.fsf@benfinney.id.au>
References: <85shue22v8.fsf@benfinney.id.au> <85oa5220ca.fsf@benfinney.id.au> <85inva1xfp.fsf@benfinney.id.au>
Message-ID:

Ben

Can I ask why you are using manage.py to import csv data instead of a migration?

I ask because I'm just starting to think about an upcoming csv (xlsx actually) import task of my own.

Thanks

Mike

From ben+python at benfinney.id.au Tue Aug 9 00:29:49 2016
From: ben+python at benfinney.id.au (Ben Finney)
Date: Tue, 09 Aug 2016 14:29:49 +1000
Subject: [melbourne-pug] Specify the database for a Django ModelForm instance
References: <85shue22v8.fsf@benfinney.id.au> <85oa5220ca.fsf@benfinney.id.au> <85inva1xfp.fsf@benfinney.id.au>
Message-ID: <85eg5y1s5e.fsf@benfinney.id.au>

Anthony Briggs writes:

> Right, so you can override the ModelForm's save() method, but you lose
> all of the other nice Django things, like model init, checking, etc.
> which you'll also have to recreate manually.

Yes, exactly. All I need is to tell the ModelForm which database it
should specify when interacting with the Model.

> Or you can do it the Right Way(tm) and write a custom database router

I appreciate the advice, but no, a database router is *not* right for
this. Specifying the database routing policy in a configuration file is
no use to me; the database is not known at configuration time, and I'm
not talking about multiple simultaneously-connected databases. That is
not the use case.

Rather, I'm wanting to specify *exactly one* database at *run-time*, and
have the operation use *that* database only (or fail if it can't). The
configured multi-database routing is no use for that.

The "ModelManager.db_router" feature almost does the trick, letting me
specify which database the manager will talk to. If I can create a Model
instance using that, the ModelForm will use it correctly, I think.

But that gets the order wrong; I don't have the right field values for
instantiating the Model, and so it will fail. That's what the ModelForm
is doing to begin with: validating the fields, transforming them, and
creating the Model instance for me.

So I still need to have the ModelForm connect to the database I specify
at run-time.

-- 
 \ "People are very open-minded about new things, as long as |
  `\ they're exactly like the old ones." -Charles F. Kettering |
_o__) |
Ben Finney

From ben+python at benfinney.id.au Tue Aug 9 00:50:21 2016
From: ben+python at benfinney.id.au (Ben Finney)
Date: Tue, 09 Aug 2016 14:50:21 +1000
Subject: [melbourne-pug] Django: Best practice for importing data (was: Specify the database for a Django ModelForm instance)
References: <85shue22v8.fsf@benfinney.id.au> <85oa5220ca.fsf@benfinney.id.au> <85inva1xfp.fsf@benfinney.id.au>
Message-ID: <85a8gm1r76.fsf_-_@benfinney.id.au>

Mike Dewhirst writes:

> Can I ask why you are using manage.py to import csv data instead of a
> migration?
There are different scenarios when each makes sense.

* When the data import is conceptually a defining feature of the
  database (e.g. a collection of status values), and should exist from
  the first creation of the database, storing the data as a fixture is
  the best way.

  Use an initial data fixture when your database is just starting, and
  you know this data should exist in every new instance of the
  application.

* When the change in data is reflective of a change in the application
  behaviour (e.g. the application was only offering Country and State,
  but now needs to also offer City values), populating the new
  information with a migration is the right way: the data should be
  there whenever the application behaviour aligns with that database
  state, and should not be there otherwise.

  Use a migration when the change in data is required for every instance
  of the application which exists at that state of the program code.

* When the change in data does not represent any specific change in
  application behaviour (e.g. the application offered 135
  manually-entered products for sale, and now the owner wants to import
  another 6 000 products), a migration is too much hassle because
  there's no implied different behaviour of the application.

  For small data additions, that do not need to be applied in every
  instance of the application, you could just use the application's user
  interface (or the Admin) to create them. But if the amount is large,
  a custom management command is best for automating it.

> I ask because I'm just starting to think about an upcoming csv (xlsx
> actually) import task of my own.

I hope that helps.

-- 
 \ "I was in the first submarine. Instead of a periscope, they had |
  `\ a kaleidoscope. 'We're surrounded.'"
-Steven Wright |
_o__) |
Ben Finney

From anthony.briggs at gmail.com Tue Aug 9 00:52:41 2016
From: anthony.briggs at gmail.com (Anthony Briggs)
Date: Tue, 9 Aug 2016 14:52:41 +1000
Subject: [melbourne-pug] Specify the database for a Django ModelForm instance
In-Reply-To: <85eg5y1s5e.fsf@benfinney.id.au>
References: <85shue22v8.fsf@benfinney.id.au> <85oa5220ca.fsf@benfinney.id.au> <85inva1xfp.fsf@benfinney.id.au> <85eg5y1s5e.fsf@benfinney.id.au>
Message-ID:

On 9 August 2016 at 14:29, Ben Finney wrote:

> Anthony Briggs writes:
>
> > Or you can do it the Right Way(tm) and write a custom database router
>
> I appreciate the advice, but no, a database router is *not* right for
> this. Specifying the database routing policy in a configuration file is
> no use to me; the database is not known at configuration time, and I'm
> not talking about multiple simultaneously-connected databases. That is
> not the use case.
>
> Rather, I'm wanting to specify *exactly one* database at *run-time*, and
> have the operation use *that* database only (or fail if it can't). The
> configured multi-database routing is no use for that.

You seem to be very focused on one specific, tactical thing which cuts
across the grain of how Django works. You can get what you want, but it
involves hollowing out ModelForm and replacing most of the code to couple
it to the database.

OTOH, the router is just a class whose methods return strings, so you can
make it do whatever you want. About the only thing I'm not 100% sure on is
how to find the router instance at run time, but once you have that, you
can swap your db_alias='foo' for router.set_database('foo').

ConnectionRouter or django.db.router is possibly the place to start
looking (https://github.com/django/django/blob/master/django/db/utils.py#L237
or https://github.com/django/django/blob/master/django/db/__init__.py#L18)

> --
>  \ "People are very open-minded about new things, as long as |
>   `\ they're exactly like the old ones." -Charles F.
Kettering |
> _o__) |
> Ben Finney

Also, your sig is pretty amusing in context :)

From miked at dewhirst.com.au Tue Aug 9 01:04:34 2016
From: miked at dewhirst.com.au (Mike Dewhirst)
Date: Tue, 9 Aug 2016 15:04:34 +1000
Subject: [melbourne-pug] Django: Best practice for importing data
In-Reply-To: <85a8gm1r76.fsf_-_@benfinney.id.au>
References: <85shue22v8.fsf@benfinney.id.au> <85oa5220ca.fsf@benfinney.id.au> <85inva1xfp.fsf@benfinney.id.au> <85a8gm1r76.fsf_-_@benfinney.id.au>
Message-ID:

Thanks Ben

That confirms it for me. I'll go with migrations.

Mike

From ben+python at benfinney.id.au Tue Aug 9 01:07:02 2016
From: ben+python at benfinney.id.au (Ben Finney)
Date: Tue, 09 Aug 2016 15:07:02 +1000
Subject: [melbourne-pug] Specify the database for a Django ModelForm instance
References: <85shue22v8.fsf@benfinney.id.au> <85oa5220ca.fsf@benfinney.id.au> <85inva1xfp.fsf@benfinney.id.au> <85eg5y1s5e.fsf@benfinney.id.au>
Message-ID: <8560ra1qfd.fsf@benfinney.id.au>

Anthony Briggs writes:

> On 9 August 2016 at 14:29, Ben Finney wrote:
>
> > I'm wanting to specify *exactly one* database at *run-time*, and
> > have the operation use *that* database only (or fail if it can't).
> > The configured multi-database routing is no use for that.
>
> You seem to be very focused on one specific, tactical thing which cuts
> across the grain of how Django works.

That would be true, if not for the fact that I've provided numerous
examples where Django explicitly provides the ability to specify at
run-time exactly which database to use.

So in that light, it's already something Django is evidently comfortable
doing.

I'm open to the idea that Django's ModelForm can't do this; but it seems
perverse to say that a feature provided explicitly and officially in
many other closely-related parts is "cutting against the grain of how
Django works".
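
For readers weighing the two positions, the router idea proposed earlier in
the thread ("just a class whose methods return strings") can be sketched in
plain Python. This is only an illustration under assumptions, not code from
either poster: the class name `RuntimeRouter` and its `set_database` /
`clear_database` helpers are hypothetical, and it assumes the class would be
listed in Django's `DATABASE_ROUTERS` setting. Only the `db_for_read` and
`db_for_write` method names and the "return None for no opinion" convention
come from Django's documented router interface.

```python
import threading


class RuntimeRouter(object):
    """Hypothetical router whose answer is chosen at run time, per thread."""

    # Class-level, thread-local state: callers can set the alias without
    # having to locate the router instance that Django creates itself.
    _state = threading.local()

    @classmethod
    def set_database(cls, alias):
        """Route subsequent reads and writes in this thread to `alias`."""
        cls._state.alias = alias

    @classmethod
    def clear_database(cls):
        """Go back to having no opinion about routing."""
        cls._state.alias = None

    def _alias(self):
        # None means "no opinion": Django then asks the next router,
        # and finally falls back to the 'default' database.
        return getattr(self._state, "alias", None)

    def db_for_read(self, model, **hints):
        return self._alias()

    def db_for_write(self, model, **hints):
        return self._alias()
```

Note that this sketch does exactly what the objection above describes: when
no alias has been set it falls through to 'default' rather than failing, so
it does not by itself satisfy the "use that database only, or fail"
requirement.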
> You can get what you want, but it involves hollowing out ModelForm and
> replacing most of the code to couple it to the database.

So, rather than use a ModelForm for validating field values and creating
the model instance (very much in line with what the ModelForm is meant
to do) ...

> OTOH, the router is just a class whose methods return strings, so you can
> make it do whatever you want.

... you're instead advising me to cut across the grain of how a database
router works (i.e. I don't want it to choose based on a configured
policy; I don't want it to fall back to the "default"; I need to specify
it at run time; I need it not for whole classes of operations but only
at the point of creating a Model instance; etc.) and hollow most of it
out, to do a specific, tactical thing that isn't part of what it's meant
for.

I'm open to being convinced otherwise, but so far the evidence is
heavily *against* using that tool for the requirements I've described.

-- 
 \ "... one of the main causes of the fall of the Roman Empire was |
  `\ that, lacking zero, they had no way to indicate successful |
_o__) termination of their C programs." -Robert Firth |
Ben Finney

From anthony.briggs at gmail.com Tue Aug 9 01:19:46 2016
From: anthony.briggs at gmail.com (Anthony Briggs)
Date: Tue, 9 Aug 2016 15:19:46 +1000
Subject: [melbourne-pug] Specify the database for a Django ModelForm instance
In-Reply-To: <8560ra1qfd.fsf@benfinney.id.au>
References: <85shue22v8.fsf@benfinney.id.au> <85oa5220ca.fsf@benfinney.id.au> <85inva1xfp.fsf@benfinney.id.au> <85eg5y1s5e.fsf@benfinney.id.au> <8560ra1qfd.fsf@benfinney.id.au>
Message-ID:

Database routers are for routing database queries. Forms and ModelForms
are for validating user input. If you want to route an update to a
specific database using a ModelForm, then you're going to have a bad
time, ie.
you'll need to write a lot of code for the specific API that you want,
as opposed to writing a handful of lines for a custom router (that *will*
let you specify a database at runtime).

Not sure I can put it more plainly than that ¯\_(ツ)_/¯

From miked at dewhirst.com.au Mon Aug 15 21:01:30 2016
From: miked at dewhirst.com.au (Mike Dewhirst)
Date: Tue, 16 Aug 2016 11:01:30 +1000
Subject: [melbourne-pug] Unicode for windows dummies
Message-ID: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au>

If anyone can point me to the appropriate advice for resolving the error
below I would be most appreciative. Really very appreciative.

I think I understand Unicode in theory and have reread a lot of articles
including ...

* https://docs.python.org/3/library/codecs.html#encodings-and-unicode
* https://pythonconquerstheuniverse.wordpress.com/2010/05/30/unicode-beginners-introduction-for-dummies-made-simple/
* https://pythonconquerstheuniverse.wordpress.com/2010/06/04/unicode-for-dummies-just-use-utf-8/
* https://en.wikipedia.org/wiki/UTF-8

This is the error which has stumped me ...
(xxex3) C:\Users\mike\env\xxex3\ssds>python substance/data_imports/map_csv.py
Traceback (most recent call last):
  File "substance/data_imports/map_csv.py", line 139, in <module>
    csvdata = CsvImport(csvfile, company, start, finish)
  File "substance/data_imports/map_csv.py", line 127, in __init__
    print("%s" % cells)
  File "C:\Users\mike\env\xxex3\lib\encodings\cp850.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2030' in position 7452: character maps to <undefined>

I have saved the csv file involved as utf-8 using LibreOffice 5 on
Windows 8.1 from the original Microsoft Excel spreadsheet.

This is in Python 3.5 on Windows but it also needs to run in Python 2.7
on Ubuntu 14.04 server (no gui).

map_csv.py [1] is the beginning of a module I want to develop into a
generic data import facility. I'm starting with a specific csv file I
need to import (not mine and its contents are private) and all it does
at the moment is read in the file and print the lines to stdout.

I have tried utf-8 encoding each line and that gets past the error but
just produces a set of chars, a snippet of which is below [2]. Decoding
that as utf-8 reproduces the error as might be expected. I have also
tried decoding as utf-16 and encoding it as utf-8 but that didn't work
either.

Thanks for reading this far

Mike

[1] ...
    from __future__ import unicode_literals
    import os


    class CsvImport(object):
        """ Imports a csv file and converts it into a list of lists """

        def __init__(self, csvfile, company, start, finish):
            self.company = company
            self.rows = list()
            with open(csvfile, "r") as csv:
                i = 0
                self.rows = csv.readlines()
                for line in self.rows:
                    i += 1
                    cells = list(line)
                    if i >= start:
                        print("%s" % cells)
                    if i > finish:
                        break


    if __name__ == "__main__":
        company = "Calia Pty Ltd"
        dirname = "{0}/csv".format(company.split()[0].lower())
        filename = "{0}1.csv".format(company.split()[0].lower())
        start = 105
        finish = 404
        currdir = os.path.realpath(os.path.dirname(__file__)).replace('\\', '/')
        csvfile = os.path.join(currdir, dirname, filename)
        csvdata = CsvImport(csvfile, company, start, finish)

[2] ...
, 48, 48, 48, 48, 37, 44, 34, 34, 44, 44, 34, 34, 44, 34, 34, 44,
34, 65, 99, 117, 116, 101, 32, 72, 97, 122, 97, 114, 100, 32, 84, 111, 32,
84, 104, 101, 32, 65, 113, 117, 97, 116, 105, 99, 32, 69, 110, 118, 105,
114, 111, 110, 109, 101, 110, 116, 46, 34, 44, 44, 44, 44, 44, 44, 44, 48,
46, 48, 48, 48, 48, 48, 37, 44, 34, 34, 44, 44, 34, 34, 44, 34, 34, 44, 34,
34, 44, 34, 34, 44, 34, 67, 104, 114, 111, 110, 105, 99, 32, 72, 97, 122,
97, 114, 100, 32, 84, 111, 32, 84, 104, 101, 32, 65, 113, 117, 97, 116,
105, 99, 32, 69, 110, 118, 105, 114, 111, 110, 109, 101, 110, 116, 46, 34,
44, 50, 44, 34, 78, 47, 65, 34, 44, 34, 71, 72, 83, 48, 57, 34, 44, 34, 72,
52, 49, 49, 34, 44, 44, 44, 48, 46, 48, 48, 48, 48, 48, 37, 44, 34, 34, 44,
44, 34, 34, 44, 34, 34, 44, 34, 34, 44, 34, 34, 44, 34, 34, 44, 34, 72, 97,
122, 97, 114, 100, 111, 117, 115, 32, 84, 111, 32, 84, 104, 101, 32, 79,
122, 111, 110, 101, 32, 76, 97, 121, 101, 114, 46, 34, 44, 44, 44, 44, 44,
48, 46, 48, 48, 48, 48, 48, 37, 44, 34, 34, 44, 34, 34, 44, 34, 65, 100,
100, 105, 116, 105, 111, 110, 97, 108, 32, 78, 111, 110, 45, 71, 72, 83,
32, 72, 97, 122, 97, 114, 100, 32, 83, 116, 97, 116, 101, 109, 101, 110,
116, 34, 44, 34, 65, 85, 72, 48, 54,
54, 34, 44, 48, 46, 48, 48, 48, 48, 48, 37, 44, 34, 34, 10]

From william.leslie.ttg at gmail.com Mon Aug 15 21:27:33 2016
From: william.leslie.ttg at gmail.com (William ML Leslie)
Date: Tue, 16 Aug 2016 11:27:33 +1000
Subject: [melbourne-pug] Unicode for windows dummies
In-Reply-To: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au>
References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au>
Message-ID:

On 16 August 2016 at 11:01, Mike Dewhirst wrote:
> If anyone can point me to the appropriate advice for resolving the error
> below I would be most appreciative. Really very appreciative.
>
> I think I understand Unicode in theory and have reread a lot of articles
> including ...

The article we recommend for getting a good grasp is
http://nedbatchelder.com/text/unipain.html or Joel Spolsky's article
on the subject.

> print("%s" % cells)

Bit of a red flag. print("%s" % x) seems a funny way to write print(x).

> cells = list(line)

This is a list of characters, not a list of comma-separated values.
Perhaps you wanted cells = line.split(',') or the CSV module which
will handle quoted values too.

> I have tried utf-8 encoding each line and that gets past the error but just
> produces a set of chars a snippet of which below [2].
> Decoding that as utf-8
> reproduces the error as might be expected. I have also tried decoding as
> utf-16 and encoding it as utf-8 but that didn't work either.

As for the encode error, this tells you that it is trying to convert
some text into bytes. I'm not sure if it is sys.stdout.write that is
failing to do this, or the %. What is the value of
sys.stdout.encoding at this point?

-- 
William Leslie

Notice:
Likely much of this email is, by the nature of copyright, covered
under copyright law. You absolutely MAY reproduce any part of it in
accordance with the copyright law of the nation you are reading this
in.
Any attempt to DENY YOU THOSE RIGHTS would be illegal without
prior contractual agreement.

From news02 at metrak.com Mon Aug 15 21:37:42 2016
From: news02 at metrak.com (paul sorenson)
Date: Mon, 15 Aug 2016 18:37:42 -0700
Subject: [melbourne-pug] Unicode for windows dummies
In-Reply-To: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au>
References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au>
Message-ID:

On 08/15/2016 06:01 PM, Mike Dewhirst wrote:
>
> map_csv.py [1] is the beginning of a module I want to develop into a
> generic data import facility. ...

Might not help your encode problem but have you looked at csvkit?

https://csvkit.readthedocs.io/en/0.9.1/

cheers

From miked at dewhirst.com.au Mon Aug 15 23:57:26 2016
From: miked at dewhirst.com.au (Mike Dewhirst)
Date: Tue, 16 Aug 2016 13:57:26 +1000
Subject: [melbourne-pug] Unicode for windows dummies
In-Reply-To:
References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au>
Message-ID:

On 16/08/2016 11:27 AM, William ML Leslie wrote:
> On 16 August 2016 at 11:01, Mike Dewhirst wrote:
>> If anyone can point me to the appropriate advice for resolving the error
>> below I would be most appreciative. Really very appreciative.
>>
>> I think I understand Unicode in theory and have reread a lot of articles
>> including ...
>
> The article we recommend for getting a good grasp is
> http://nedbatchelder.com/text/unipain.html or Joel Spolsky's article
> on the subject.
>
>> print("%s" % cells)
>
> Bit of a red flag. print("%s" % x) seems a funny way to write print(x).

You're right. It is just a habit I picked up somewhere. I'll rejig that.
As it happens print(x) made no difference.

>> cells = list(line)
>
> This is a list of characters, not a list of comma-separated values.
> Perhaps you wanted cells = line.split(',')

I did but the output didn't change.

> or the CSV module which
> will handle quoted values too.
I'll have a look at that.

>> I have tried utf-8 encoding each line and that gets past the error but just
>> produces a set of chars a snippet of which below [2].
>> Decoding that as utf-8
>> reproduces the error as might be expected. I have also tried decoding as
>> utf-16 and encoding it as utf-8 but that didn't work either.
>
> As for the encode error, this tells you that it is trying to convert
> some text into bytes. I'm not sure if it is sys.stdout.write that is
> failing to do this, or the %.

It looks like both but I'm sure it is Windows stuffing me up with that
cp850 in the traceback.

> What is the value of sys.stdout.encoding at this point?

It is just a waypoint. I just wrote the class init and wanted to prove it
produced data I can use before writing the necessary data mapping and
import methods. Once it is working I'll probably put a conditional in
there so it only produces stdout output when unit testing.

Thanks William, I'll look at the csv module after trying out Paul's
suggestion to look at csvkit

Cheers

Mike

From william.leslie.ttg at gmail.com Mon Aug 15 23:59:42 2016
From: william.leslie.ttg at gmail.com (William ML Leslie)
Date: Tue, 16 Aug 2016 13:59:42 +1000
Subject: [melbourne-pug] Unicode for windows dummies
In-Reply-To:
References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au>
Message-ID:

On 16 August 2016 at 13:57, Mike Dewhirst wrote:
> On 16/08/2016 11:27 AM, William ML Leslie wrote:
>> What is the value of sys.stdout.encoding at this point?
>
> It is just a waypoint. I just wrote the class init and wanted to prove it
> produced data I can use before writing the necessary data mapping and import
> methods. Once it is working I'll probably put a conditional in there so it
> only produces stdout output when unit testing.

I meant literally, what is its value.

    import sys
    print(sys.stdout.encoding)

-- 
William Leslie

Notice:
Likely much of this email is, by the nature of copyright, covered
under copyright law.
You absolutely MAY reproduce any part of it in
accordance with the copyright law of the nation you are reading this
in. Any attempt to DENY YOU THOSE RIGHTS would be illegal without
prior contractual agreement.

From miked at dewhirst.com.au Tue Aug 16 00:23:47 2016
From: miked at dewhirst.com.au (Mike Dewhirst)
Date: Tue, 16 Aug 2016 14:23:47 +1000
Subject: [melbourne-pug] Unicode for windows dummies
In-Reply-To:
References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au>
Message-ID: <8998e5e1-260e-a2f2-04dd-2243cda3c009@dewhirst.com.au>

On 16/08/2016 1:59 PM, William ML Leslie wrote:
> On 16 August 2016 at 13:57, Mike Dewhirst wrote:
>> On 16/08/2016 11:27 AM, William ML Leslie wrote:
>>> What is the value of sys.stdout.encoding at this point?
>>
>> It is just a waypoint. I just wrote the class init and wanted to prove it
>> produced data I can use before writing the necessary data mapping and import
>> methods. Once it is working I'll probably put a conditional in there so it
>> only produces stdout output when unit testing.
>
> I meant literally, what is its value.
>
> import sys
> print(sys.stdout.encoding)

cp850

There is no cp850 in the env vars so it must come from somewhere else.

From anthony.briggs at gmail.com Tue Aug 16 00:24:35 2016
From: anthony.briggs at gmail.com (Anthony Briggs)
Date: Tue, 16 Aug 2016 14:24:35 +1000
Subject: [melbourne-pug] Unicode for windows dummies
In-Reply-To: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au>
References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au>
Message-ID:

Hi Mike,

I was just trying to solve a similar problem at the PyconAU sprints :)

The error is that there are some things / Unicode strings which don't
translate to Windows 'charmap' characters, and can't be printed to the
terminal. You can replicate it with this code:

    print("M?? h??v??r??r??ft ???? f????l ??f ????l??".encode("cp1252"))

The solution depends on what you're trying to do:

- If it's a one-off thing, you can find and eliminate the utf-8
  characters in the csv file.
- Failing that, you can encode to 'cp1252' and replace or ignore the
  unicode characters that don't map.
  https://docs.python.org/3/howto/unicode.html#the-string-type has more
  details, but something like line.encode("cp1252", "replace") on the
  lines that you're reading from the csv file should work (ie. convert
  to windows encoding).
- Thirdly, there's a package called unicodecsv, which is a drop-in utf-8
  version of the csv module, and might fix your unicode errors.

Anthony
From william.leslie.ttg at gmail.com Tue Aug 16 00:30:06 2016
From: william.leslie.ttg at gmail.com (William ML Leslie)
Date: Tue, 16 Aug 2016 14:30:06 +1000
Subject: [melbourne-pug] Unicode for windows dummies
In-Reply-To:
References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au>
Message-ID:

On 16 August 2016 at 14:24, Anthony Briggs wrote:
> Hi Mike,
>
> I was just trying to solve a similar problem at the PyconAU sprints :)
>
> The error is that there are some things / Unicode strings which don't
> translate to Windows 'charmap' characters, and can't be printed to the
> terminal. You can replicate it with this code:
>
> print("M?? h??v??r??r??ft ???? f????l ??f ????l??".encode("cp1252"))

The print() is redundant here, the function call is never reached;
this is different to the example, where it is the print itself that is
failing.

    print("M?? h??v??r??r??ft ???? f????l ??f ????l??")

will also reproduce the error, but for a different reason: stdout now
fails to encode.

-- 
William Leslie

From william.leslie.ttg at gmail.com Tue Aug 16 00:34:50 2016
From: william.leslie.ttg at gmail.com (William ML Leslie)
Date: Tue, 16 Aug 2016 14:34:50 +1000
Subject: [melbourne-pug] Unicode for windows dummies
In-Reply-To: <8998e5e1-260e-a2f2-04dd-2243cda3c009@dewhirst.com.au>
References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> <8998e5e1-260e-a2f2-04dd-2243cda3c009@dewhirst.com.au>
Message-ID:

On 16 August 2016 at 14:23, Mike Dewhirst wrote:
> On 16/08/2016 1:59 PM, William ML Leslie wrote:
>> import sys
>> print(sys.stdout.encoding)
>
> cp850
>
> There is no cp850 in the env vars so it must come from somewhere else.
>

You could set it to something else if you trust that the terminal will actually display those characters.

sys.stdout.encoding = 'UTF-16'

Alternatively, since you're printing the list for representation purposes, you could print(', '.join(repr(cell) for cell in cells)).

--
William Leslie

From anthony.briggs at gmail.com Tue Aug 16 00:40:13 2016
From: anthony.briggs at gmail.com (Anthony Briggs)
Date: Tue, 16 Aug 2016 14:40:13 +1000
Subject: [melbourne-pug] Unicode for windows dummies
In-Reply-To:
References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au>
Message-ID:

On 16 August 2016 at 14:30, William ML Leslie wrote:
> On 16 August 2016 at 14:24, Anthony Briggs wrote:
> > Hi Mike,
> >
> > I was just trying to solve a similar problem at the PyconAU sprints :)
> >
> > The error is that there are some things / Unicode strings which don't
> > translate to Windows 'charmap' characters, and can't be printed to the
> > terminal. You can replicate it with this code:
> >
> > print("M?? h??v??r??r??ft ???? f????l ??f ????l??".encode("cp1252"))
>
> The print() is redundant here, the function call is never reached;
> this is different to the example, where it is the print itself that is
> failing.
>
> print("M?? h??v??r??r??ft ???? f????l ??f ????l??")
>
> will also reproduce the error, but for a different reason: stdout now
> fails to encode.

print("M?? h??v??r??r??ft ???? f????l ??f ????l??")

works just fine for me, since you're just printing an internal Python string. The problem is from trying to print a binary string (which is what you get from .encode()) as an internal Python string. If you specify an encoding, the error goes away:

print("M??
h??v??r??r??ft ???? f????l ??f ????l??".encode("utf-8").decode("cp1252", "replace"))

which is not what you would do in practice, but the 'replace' will drive over anything not displayable on a terminal, so ¯\_(ツ)_/¯

I also forgot the fourth option, which is to specify the encoding when opening the file, e.g. open('blah.csv', 'w', encoding='utf-8')

Anthony

From william.leslie.ttg at gmail.com Tue Aug 16 00:57:14 2016
From: william.leslie.ttg at gmail.com (William ML Leslie)
Date: Tue, 16 Aug 2016 14:57:14 +1000
Subject: [melbourne-pug] Unicode for windows dummies
In-Reply-To:
References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au>
Message-ID:

On 16 August 2016 at 14:40, Anthony Briggs wrote:
> print("M?? h??v??r??r??ft ???? f????l ??f ????l??")
>
> works just fine for me, since you're just printing an internal Python
> string.

It will work fine unless you're on Mike's machine - if sys.stdout.encoding is cp850 and you've got unicode_literals imported (or are using python3), it won't.

> The problem is from trying to print a binary string (which is what
> you get from .encode()) as an internal Python string. If you specify an
> encoding, the error goes away:
>
> print("M?? h??v??r??r??ft ???? f????l ??f
> ????l??".encode("utf-8").decode("cp1252", "replace"))

The only reason to encode to utf-8 and then decode from cp1252 is to fix incorrect input.

I think you mean .encode("cp1252", "replace").decode("cp1252")

--
William Leslie
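
The difference between the two round trips discussed above can be seen in a small sketch. The sample string here is an assumption chosen for illustration (it is not the string from the thread): U+00E9 exists in cp1252, while U+2113 and U+2081 do not.

```python
# Hypothetical sample text, not taken from the thread.
sample = "caf\u00e9 \u2113\u2081"

# Encode to the target code page with "replace", then decode again:
# each character cp1252 cannot represent becomes exactly one "?".
safe = sample.encode("cp1252", "replace").decode("cp1252")

# Encode as UTF-8 but decode as cp1252: every non-ASCII character turns
# into two or three unrelated characters. The byte 0x81 is unassigned in
# Python's cp1252 codec, so "replace" turns it into U+FFFD here.
mojibake = sample.encode("utf-8").decode("cp1252", "replace")

# ascii() keeps this demo printable on any console encoding.
print(ascii(safe))
print(ascii(mojibake))
```

The first form keeps the length and structure of the text; the second only makes sense when repairing input that was decoded with the wrong codec in the first place.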
From miked at dewhirst.com.au Tue Aug 16 01:09:30 2016 From: miked at dewhirst.com.au (Mike Dewhirst) Date: Tue, 16 Aug 2016 15:09:30 +1000 Subject: [melbourne-pug] Unicode for windows dummies In-Reply-To: References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> Message-ID: <62ec7f50-3bc7-077a-cd50-71c9d9896021@dewhirst.com.au> On 16/08/2016 2:57 PM, William ML Leslie wrote: > On 16 August 2016 at 14:40, Anthony Briggs wrote: >> print("M?????? h??????v?????r?????r?????ft ?????????? f???????????l ?????f ??????????l?????") >> >> works just fine for me, since you're just printing an internal Python >> string. > > It will work fine unless you're on Mike's machine - if > sys.stdout.encoding is cp850 and you've got unicode_literals imported > (or are using python3), it won't. Ok. I'm on Mike's machine and I'm using Python 3.5 ... But I just discovered a command >chcp 1252 which switches the active code page and sys.stdout respects that. > >> The problem is from trying to print a binary string (which is what >> you get from .encode()) as an internal Python string. If you specify an >> encoding, the error goes away: >> >> print("M?????? h??????v?????r?????r?????ft ?????????? f???????????l ?????f >> ??????????l?????".encode("utf-8").decode("cp1252", "replace")) > > The only reason to encode to utf-8 and then decode from cp1252 is to > fix incorrect input. > > I think you mean .encode("cp1252", "replace").decode("cp1252") > From anthony.briggs at gmail.com Tue Aug 16 01:28:47 2016 From: anthony.briggs at gmail.com (Anthony Briggs) Date: Tue, 16 Aug 2016 15:28:47 +1000 Subject: [melbourne-pug] Unicode for windows dummies In-Reply-To: References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> Message-ID: On 16 August 2016 at 14:57, William ML Leslie wrote: > On 16 August 2016 at 14:40, Anthony Briggs > wrote: > > print("M?? h??v??r??r??ft ???? f????l ??f ????l??") > > > > works just fine for me, since you're just printing an internal Python > > string. 
> > It will work fine unless you're on Mike's machine - if > sys.stdout.encoding is cp850 and you've got unicode_literals imported > (or are using python3), it won't. > That string is translated to a cp1252 character set, so I'd be surprised if it didn't work. OTOH, try utf-8 characters in a Windows Python REPL, and you don't even make it to the end of the string :) print("M? h?v?r?r?ft ?? f?ll ?f ??ls") >The problem is from trying to print a binary string (which is what > > you get from .encode()) as an internal Python string. If you specify an > > encoding, the error goes away: > > > > print("M?? h??v??r??r??ft ???? f????l ??f > > ????l??".encode("utf-8").decode("cp1252", "replace")) > > The only reason to encode to utf-8 and then decode from cp1252 is to > fix incorrect input. > > I think you mean .encode("cp1252", "replace").decode("cp1252") > No - the point was to get a binary string that doesn't translate nicely into cp1252, otherwise you don't need the 'replace' parameter. This is Mike's core problem - he's reading bytes from a utf-8 file, and trying to print that to the terminal. Anthony -------------- next part -------------- An HTML attachment was scrubbed... URL: From miked at dewhirst.com.au Tue Aug 16 01:51:11 2016 From: miked at dewhirst.com.au (Mike Dewhirst) Date: Tue, 16 Aug 2016 15:51:11 +1000 Subject: [melbourne-pug] Unicode for windows dummies In-Reply-To: References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> Message-ID: <22bd44c3-2938-9d2a-e585-ceed6cb837b5@dewhirst.com.au> First of all, thank you all very much for this support. I was seriously contemplating a change of career to something easy like becoming an Olympic gymnast and giving up unicode forever ... but anyway ... From the top, some examples which might shed some light ... 
class CsvImport(object):
    """ Imports a csv file and converts it into a list of lists """

    def __init__(self, csvfile, company, start, finish):
        self.company = company
        self.rows = list()
        with open(csvfile, "r") as csv:
            i = 0
            self.rows = csv.readlines()
            for line in self.rows:
                line = line.encode("utf-8").decode("cp1252", "replace")
                #line = line.encode("cp1252").decode("cp1252", "replace")
                #line = line.encode("cp1252")
                #line = line.encode("utf-8")
                i += 1
                cells = list(line)
                # this requires a bytes-like object not 'str'
                #cells = line.split(",")
                if i >= start:
                    # as expected this includes the [] brackets around each row
                    #print(cells)
                    # this omits the [] brackets but otherwise output is identical
                    print(', '.join(repr(cell) for cell in cells))
                if i > finish:
                    break

Output with different code page settings ...

line = line.encode("utf-8").decode("cp1252", "replace")

, ',', '\x00', ',', '\x00', ',', '\x00', ',', '\x00', '0', '\x00', '.', '\x00', '0', '\x00',
'0', '\x00', '0', '\x00', '0', '\x00', '0', '\x00', '%', '\x00', ',', '\x00', '"', '\x00',
'"', '\x00', ',', '\x00', '"', '\x00', '"', '\x00', ',', '\x00', '"', '\x00', 'A', '\x00',
'd', '\x00', 'd', '\x00', 'i', '\x00', 't', '\x00', 'i', '\x00', 'o', '\x00', 'n', '\x00',
'a', '\x00', 'l', '\x00', ' ', '\x00', 'N', '\x00', 'o', '\x00', 'n', '\x00', '-', '\x00',
'G', '\x00', 'H', '\x00', 'S', '\x00', ' ', '\x00', 'H', '\x00', 'a', '\x00', 'z', '\x00',
'a', '\x00', 'r', '\x00', 'd', '\x00', ' ', '\x00', 'S', '\x00', 't', '\x00', 'a', '\x00',
't', '\x00', 'e', '\x00', 'm', '\x00', 'e', '\x00', 'n', '\x00', 't', '\x00', '"', '\x00',
',', '\x00', ',', '\x00', '0', '\x00', '.', '\x00', '0', '\x00', '0', '\x00', '0', '\x00',
'0', '\x00', '0', '\x00', '%', '\x00', ',', '\x00', '"', '\x00', '"', '\x00', '\n'

line = line.encode("cp1252").decode("cp1252", "replace")

, ',', '\x00', ',', '\x00', ',', '\x00', ',', '\x00', '0', '\x00', '.', '\x00', '0', '\x00',
'0', '\x00', '0', '\x00', '0', '\x00', '0', '\x00', '%', '\x00', ',',
'\x00', '"', '\x00', '"', '\x00', ',', '\x00', '"', '\x00', '"', '\x00', ',', '\x00', '"', '\x00', 'A', '\x00', ' d', '\x00', 'd', '\x00', 'i', '\x00', 't', '\x00', 'i', '\x00', 'o', '\x00', 'n', '\x00', 'a ', '\x00', 'l', '\x00', ' ', '\x00', 'N', '\x00', 'o', '\x00', 'n', '\x00', '-', '\x00', 'G' , '\x00', 'H', '\x00', 'S', '\x00', ' ', '\x00', 'H', '\x00', 'a', '\x00', 'z', '\x00', 'a', '\x00', 'r', '\x00', 'd', '\x00', ' ', '\x00', 'S', '\x00', 't', '\x00', 'a', '\x00', 't', '\x00', 'e', '\x00', 'm', '\x00', 'e', '\x00', 'n', '\x00', 't', '\x00', '"', '\x00', ',', ' \x00', ',', '\x00', '0', '\x00', '.', '\x00', '0', '\x00', '0', '\x00', '0', '\x00', '0', '\ x00', '0', '\x00', '%', '\x00', ',', '\x00', '"', '\x00', '"', '\x00', '\n' Both of which are identical ... so now line = line.encode("cp1252") , 44, 0, 34, 0, 72, 0, 52, 0, 49, 0, 49, 0, 34, 0, 44, 0, 44, 0, 44, 0, 48, 0, 46, 0, 48, 0, 48, 0, 48, 0, 48, 0, 48, 0, 37, 0, 44, 0, 34, 0, 34, 0, 44, 0, 44, 0, 34, 0, 34, 0, 44, 0, 34, 0, 34, 0, 44, 0, 34, 0, 34, 0, 44, 0, 34, 0, 34, 0, 44, 0, 34, 0, 34, 0, 44, 0, 34, 0, 7 2, 0, 97, 0, 122, 0, 97, 0, 114, 0, 100, 0, 111, 0, 117, 0, 115, 0, 32, 0, 84, 0, 111, 0, 32 , 0, 84, 0, 104, 0, 101, 0, 32, 0, 79, 0, 122, 0, 111, 0, 110, 0, 101, 0, 32, 0, 76, 0, 97, 0, 121, 0, 101, 0, 114, 0, 46, 0, 34, 0, 44, 0, 44, 0, 44, 0, 44, 0, 44, 0, 48, 0, 46, 0, 48 , 0, 48, 0, 48, 0, 48, 0, 48, 0, 37, 0, 44, 0, 34, 0, 34, 0, 44, 0, 34, 0, 34, 0, 44, 0, 34, 0, 65, 0, 100, 0, 100, 0, 105, 0, 116, 0, 105, 0, 111, 0, 110, 0, 97, 0, 108, 0, 32, 0, 78, 0, 111, 0, 110, 0, 45, 0, 71, 0, 72, 0, 83, 0, 32, 0, 72, 0, 97, 0, 122, 0, 97, 0, 114, 0, 100, 0, 32, 0, 83, 0, 116, 0, 97, 0, 116, 0, 101, 0, 109, 0, 101, 0, 110, 0, 116, 0, 34, 0, 44, 0, 44, 0, 48, 0, 46, 0, 48, 0, 48, 0, 48, 0, 48, 0, 48, 0, 37, 0, 44, 0, 34, 0, 34, 0, 1 0 line = line.encode("cp1252") , 44, 0, 34, 0, 72, 0, 52, 0, 49, 0, 49, 0, 34, 0, 44, 0, 44, 0, 44, 0, 48, 0, 46, 0, 48, 0, 48, 0, 48, 0, 48, 0, 48, 0, 37, 0, 44, 0, 34, 
0, 34, 0, 44, 0, 44, 0, 34, 0, 34, 0, 44, 0, 34, 0, 34, 0, 44, 0, 34, 0, 34, 0, 44, 0, 34, 0, 34, 0, 44, 0, 34, 0, 34, 0, 44, 0, 34, 0, 7 2, 0, 97, 0, 122, 0, 97, 0, 114, 0, 100, 0, 111, 0, 117, 0, 115, 0, 32, 0, 84, 0, 111, 0, 32 , 0, 84, 0, 104, 0, 101, 0, 32, 0, 79, 0, 122, 0, 111, 0, 110, 0, 101, 0, 32, 0, 76, 0, 97, 0, 121, 0, 101, 0, 114, 0, 46, 0, 34, 0, 44, 0, 44, 0, 44, 0, 44, 0, 44, 0, 48, 0, 46, 0, 48 , 0, 48, 0, 48, 0, 48, 0, 48, 0, 37, 0, 44, 0, 34, 0, 34, 0, 44, 0, 34, 0, 34, 0, 44, 0, 34, 0, 65, 0, 100, 0, 100, 0, 105, 0, 116, 0, 105, 0, 111, 0, 110, 0, 97, 0, 108, 0, 32, 0, 78, 0, 111, 0, 110, 0, 45, 0, 71, 0, 72, 0, 83, 0, 32, 0, 72, 0, 97, 0, 122, 0, 97, 0, 114, 0, 100, 0, 32, 0, 83, 0, 116, 0, 97, 0, 116, 0, 101, 0, 109, 0, 101, 0, 110, 0, 116, 0, 34, 0, 44, 0, 44, 0, 48, 0, 46, 0, 48, 0, 48, 0, 48, 0, 48, 0, 48, 0, 37, 0, 44, 0, 34, 0, 34, 0, 1 0 And both of these are identical. And for completeness ... (xxex3) C:\Users\mike\env\xxex3\ssds>python Python 3.5.1 (v3.5.1:37a07cee5969, Dec 6 2015, 01:54:25) [MSC v.1900 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> import sys >>> sys.stdout.encoding 'cp1252' >>> On 16/08/2016 3:28 PM, Anthony Briggs wrote: > > > On 16 August 2016 at 14:57, William ML Leslie > > > wrote: > > On 16 August 2016 at 14:40, Anthony Briggs > > wrote: > > print("M?????? h??????v?????r?????r?????ft ?????????? > f???????????l ?????f ??????????l?????") > > > > works just fine for me, since you're just printing an internal > Python > > string. > > It will work fine unless you're on Mike's machine - if > sys.stdout.encoding is cp850 and you've got unicode_literals imported > (or are using python3), it won't. > > > That string is translated to a cp1252 character set, so I'd be > surprised if it didn't work. > > OTOH, try utf-8 characters in a Windows Python REPL, and you don't > even make it to the end of the string :) > > print("M?? h??v??r??r? ft ???? 
f??ll ??f ????ls") > > >The problem is from trying to print a binary string (which is what > > you get from .encode()) as an internal Python string. If you > specify an > > encoding, the error goes away: > > > > print("M?????? h??????v?????r?????r?????ft ?????????? > f???????????l ?????f > > ??????????l?????".encode("utf-8").decode("cp1252", "replace")) > > The only reason to encode to utf-8 and then decode from cp1252 is to > fix incorrect input. > > I think you mean .encode("cp1252", "replace").decode("cp1252") > > > No - the point was to get a binary string that doesn't translate > nicely into cp1252, otherwise you don't need the 'replace' parameter. > This is Mike's core problem - he's reading bytes from a utf-8 file, > and trying to print that to the terminal. > > Anthony > > > _______________________________________________ > melbourne-pug mailing list > melbourne-pug at python.org > https://mail.python.org/mailman/listinfo/melbourne-pug -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben+python at benfinney.id.au Tue Aug 16 02:08:02 2016 From: ben+python at benfinney.id.au (Ben Finney) Date: Tue, 16 Aug 2016 16:08:02 +1000 Subject: [melbourne-pug] Unicode for windows dummies References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> <22bd44c3-2938-9d2a-e585-ceed6cb837b5@dewhirst.com.au> Message-ID: <85y43xxn3x.fsf@benfinney.id.au> Mike Dewhirst writes: > First of all, thank you all very much for this support. I was > seriously contemplating a change of career to something easy like > becoming an Olympic gymnast and giving up unicode forever Unicode isn't a career. It's a calling. -- \ ?Never express yourself more clearly than you are able to | `\ think.? 
- Niels Bohr |
_o__) |
Ben Finney

From miked at dewhirst.com.au Tue Aug 16 03:06:20 2016
From: miked at dewhirst.com.au (Mike Dewhirst)
Date: Tue, 16 Aug 2016 17:06:20 +1000
Subject: [melbourne-pug] Unicode for windows dummies
In-Reply-To: <85y43xxn3x.fsf@benfinney.id.au>
References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> <22bd44c3-2938-9d2a-e585-ceed6cb837b5@dewhirst.com.au> <85y43xxn3x.fsf@benfinney.id.au>
Message-ID: <204356e2-929d-5de9-2921-3602259631a5@dewhirst.com.au>

On 16/08/2016 4:08 PM, Ben Finney wrote:
> Unicode isn't a career. It's a calling.

Must be easier without Windows

From william.leslie.ttg at gmail.com Tue Aug 16 03:12:40 2016
From: william.leslie.ttg at gmail.com (William ML Leslie)
Date: Tue, 16 Aug 2016 17:12:40 +1000
Subject: [melbourne-pug] Unicode for windows dummies
In-Reply-To:
References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au>
Message-ID:

On 16 August 2016 at 15:28, Anthony Briggs wrote:
> On 16 August 2016 at 14:57, William ML Leslie <william.leslie.ttg at gmail.com> wrote:
>
>> On 16 August 2016 at 14:40, Anthony Briggs wrote:
>> > print("M?? h??v??r??r??ft ???? f????l ??f ????l??")
>> >
>> > works just fine for me, since you're just printing an internal Python
>> > string.
>>
>> It will work fine unless you're on Mike's machine - if
>> sys.stdout.encoding is cp850 and you've got unicode_literals imported
>> (or are using python3), it won't.
>
> That string is translated to a cp1252 character set, so I'd be surprised
> if it didn't work.
>
> OTOH, try utf-8 characters in a Windows Python REPL, and you don't even
> make it to the end of the string :)
>
> print("Mÿ hôvèrçràft îß fûll öf éêls")

All of those characters are represented in cp1252
and can print on a windows terminal, but I think we're confusing two things here, so lets try them both: >>> s = b'M\xc3\xbf h\xc3\xb4v\xc3\xa8r\xc3\xa7r\xc3\xa0ft \xc3\xae\xc3\x9f f\xc3\xbbll \xc3\xb6f \xc3\xa9\xc3\xaals' Here, s is the text you sent in utf-8, in case my mail client gets confused. >>> ?print(s.decode('utf-8'))? | M? h?v?r?r?ft ?? f?ll ?f ??ls This works, because s.decode('utf-8') is a valid text string, mappable by cp1252. >>> print(s.decode('cp1252')) | M?? h??v??r??r? ft ???? f??ll ??f ????ls This succeeds, as all possible bytes are mapped by cp1252. However, it prints nonsense. This case is different, though. In python3, reading from an open file will give us text, and it happens that the text (from the default encoding) is not representable in cp1252. For an example, >>> t = u'given \u2113\u2081 = 7' >>> print(t) This will not work on a machine with cp1252 as the codec. >>> print(t.encode("cp1252", "replace").decode("cp1252")) | given ?? = 7 Will replace the correct number of characters with qmarks, preserving the structure of the text. >>> print(t.encode("utf-8").decode("cp1252", "replace")) | given ?????? = 7 gives nonsense. > > >The problem is from trying to print a binary string (which is what >> > you get from .encode()) as an internal Python string. If you specify an >> > encoding, the error goes away: >> > >> > print("M?? h??v??r??r??ft ???? f????l ??f >> > ????l??".encode("utf-8").decode("cp1252", "replace")) >> >> The only reason to encode to utf-8 and then decode from cp1252 is to >> fix incorrect input. >> >> I think you mean .encode("cp1252", "replace").decode("cp1252") >> > > No - the point was to get a binary string that doesn't translate nicely > into cp1252, otherwise you don't need the 'replace' parameter. This is > Mike's core problem - he's reading bytes from a utf-8 file, and trying to > print that to the terminal. > ?First things first - mike isn't reading bytes. 
open() in python 3.5 gives text; but the text he gets is not representable in cp850.

Nearly all bytestrings "translate nicely" into cp1252 when you .decode("cp1252") them, where by nicely I presume you mean not raising an exception, not actually making sense. .decode("cp1252") can fail only on the five byte values that Python's cp1252 codec leaves unassigned (0x81, 0x8D, 0x8F, 0x90 and 0x9D), so the "replace" is almost always redundant, and a result decoded without replacements can never fail to encode to cp1252.

--
William Leslie

From william.leslie.ttg at gmail.com Tue Aug 16 03:17:46 2016
From: william.leslie.ttg at gmail.com (William ML Leslie)
Date: Tue, 16 Aug 2016 17:17:46 +1000
Subject: [melbourne-pug] Unicode for windows dummies
In-Reply-To: <22bd44c3-2938-9d2a-e585-ceed6cb837b5@dewhirst.com.au>
References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> <22bd44c3-2938-9d2a-e585-ceed6cb837b5@dewhirst.com.au>
Message-ID:

On 16 August 2016 at 15:51, Mike Dewhirst wrote:
> Output with different code page settings ...
> > line = line.encode("utf-8").decode("cp1252", "replace") > > , ',', '\x00', ',', '\x00', ',', '\x00', ',', '\x00', '0', '\x00', '.', > '\x00', '0', '\x00', > '0', '\x00', '0', '\x00', '0', '\x00', '0', '\x00', '%', '\x00', ',', > '\x00', '"', '\x00', > '"', '\x00', ',', '\x00', '"', '\x00', '"', '\x00', ',', '\x00', '"', > '\x00', 'A', '\x00', ' > d', '\x00', 'd', '\x00', 'i', '\x00', 't', '\x00', 'i', '\x00', 'o', > '\x00', 'n', '\x00', 'a > ', '\x00', 'l', '\x00', ' ', '\x00', 'N', '\x00', 'o', '\x00', 'n', > '\x00', '-', '\x00', 'G' > , '\x00', 'H', '\x00', 'S', '\x00', ' ', '\x00', 'H', '\x00', 'a', '\x00', > 'z', '\x00', 'a', > '\x00', 'r', '\x00', 'd', '\x00', ' ', '\x00', 'S', '\x00', 't', '\x00', > 'a', '\x00', 't', > '\x00', 'e', '\x00', 'm', '\x00', 'e', '\x00', 'n', '\x00', 't', '\x00', > '"', '\x00', ',', ' > \x00', ',', '\x00', '0', '\x00', '.', '\x00', '0', '\x00', '0', '\x00', > '0', '\x00', '0', '\ > x00', '0', '\x00', '%', '\x00', ',', '\x00', '"', '\x00', '"', '\x00', '\n' > ?Illustrative: you're opening a UTF-16 file with the default encoding of utf-8. ?open(csvfile, 'r', encoding='UTF-16')? -- William Leslie Notice: Likely much of this email is, by the nature of copyright, covered under copyright law. You absolutely MAY reproduce any part of it in accordance with the copyright law of the nation you are reading this in. Any attempt to DENY YOU THOSE RIGHTS would be illegal without prior contractual agreement. -------------- next part -------------- An HTML attachment was scrubbed... 
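
The diagnosis above (a UTF-16 file read with a single-byte default encoding) can be reproduced with a short sketch. The sample row is an assumption modelled on the output quoted in the thread, and the temporary file stands in for the real csv file.

```python
import os
import tempfile

# Hypothetical row, modelled on the quoted output; not the real data file.
row = '"Additional Non-GHS Hazard Statement",,0.00000%\n'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.csv")

    # Save the row as UTF-16, as an editor or spreadsheet might.
    with open(path, "w", encoding="utf-16", newline="") as f:
        f.write(row)

    # Reading with a single-byte codec leaves a NUL next to every character.
    with open(path, "r", encoding="cp1252", newline="") as f:
        wrong = f.read()

    # Naming the real encoding recovers the text.
    with open(path, "r", encoding="utf-16", newline="") as f:
        right = f.read()

print(repr(wrong[:12]))
print(repr(right[:12]))
```

The interleaved '\x00' characters in the first read are the same pattern as in the quoted output, which is what points to UTF-16.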
From william.leslie.ttg at gmail.com Tue Aug 16 03:22:05 2016
From: william.leslie.ttg at gmail.com (William ML Leslie)
Date: Tue, 16 Aug 2016 17:22:05 +1000
Subject: [melbourne-pug] Unicode for windows dummies
In-Reply-To: <22bd44c3-2938-9d2a-e585-ceed6cb837b5@dewhirst.com.au>
References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> <22bd44c3-2938-9d2a-e585-ceed6cb837b5@dewhirst.com.au>
Message-ID:

On 16 August 2016 at 15:51, Mike Dewhirst wrote:
> # this requires a bytes-like object not 'str'
> #cells = line.split(",")

You got that exception because you had one of the .encode steps above. Don't do any .encode before the split.

--
William Leslie

From miked at dewhirst.com.au Tue Aug 16 03:31:16 2016
From: miked at dewhirst.com.au (Mike Dewhirst)
Date: Tue, 16 Aug 2016 17:31:16 +1000
Subject: [melbourne-pug] Unicode for windows dummies
In-Reply-To:
References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> <22bd44c3-2938-9d2a-e585-ceed6cb837b5@dewhirst.com.au>
Message-ID: <72ad077e-8943-2128-2127-d0313a075a21@dewhirst.com.au>

On 16/08/2016 5:17 PM, William ML Leslie wrote:
> , encoding='UTF-16'

Output snippet shown below ...
with open(csvfile, "r", encoding='utf-16') as csv: i = 0 self.rows = csv.readlines() for line in self.rows: #line = line.encode("utf-8").decode("cp1252", "replace") #line = line.encode("cp1252").decode("cp1252", "replace") #line = line.encode("cp1252") line = line.encode("utf-8") i += 1 cells = list(line) # this requires a bytes-like object not 'str' #cells = line.split(",") if i >= start: # as expected this includes the [] brackets around each row print(cells) # this omits the [] brackets but otherwise output is identical #print(', '.join(repr(cell) for cell in cells)) if i > finish: break ... 118, 105, 114, 111, 110, 109, 101, 110, 116, 46, 34, 44, 44, 44, 44, 44, 44, 44, 48, 46, 48, 48, 48, 48, 48, 37, 44, 34, 34, 44, 44, 34, 34, 44, 34, 34, 44, 34, 34, 44, 34, 34, 44, 34, 67, 104, 114, 111, 110, 105, 99, 32, 72, 97, 122, 97, 114, 100, 32, 84, 111, 32, 84, 104, 1 01, 32, 65, 113, 117, 97, 116, 105, 99, 32, 69, 110, 118, 105, 114, 111, 110, 109, 101, 110, 116, 46, 34, 44, 50, 44, 34, 78, 47, 65, 34, 44, 34, 71, 72, 83, 48, 57, 34, 44, 34, 72, 52 , 49, 49, 34, 44, 44, 44, 48, 46, 48, 48, 48, 48, 48, 37, 44, 34, 34, 44, 44, 34, 34, 44, 34 , 34, 44, 34, 34, 44, 34, 34, 44, 34, 34, 44, 34, 72, 97, 122, 97, 114, 100, 111, 117, 115, 32, 84, 111, 32, 84, 104, 101, 32, 79, 122, 111, 110, 101, 32, 76, 97, 121, 101, 114, 46, 34 , 44, 44, 44, 44, 44, 48, 46, 48, 48, 48, 48, 48, 37, 44, 34, 34, 44, 34, 34, 44, 34, 65, 10 0, 100, 105, 116, 105, 111, 110, 97, 108, 32, 78, 111, 110, 45, 71, 72, 83, 32, 72, 97, 122, 97, 114, 100, 32, 83, 116, 97, 116, 101, 109, 101, 110, 116, 34, 44, 34, 65, 85, 72, 48, 54 , 54, 34, 44, 48, 46, 48, 48, 48, 48, 48, 37, 44, 34, 34, 10] -------------- next part -------------- An HTML attachment was scrubbed... 
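
Why the output above is a list of integers, and what splitting the decoded text gives instead, can be sketched as follows. The row is decoded from the integers in the quoted output; using the standard-library csv module is a suggestion added here, not something proposed in the thread.

```python
import csv
import io

# One row as text, reconstructed from the integer output quoted above.
line = '"Chronic Hazard To The Aquatic Environment.",2,"N/A","GHS09","H411"\n'

# Iterating over bytes yields integers, which is what list(line) printed
# once line had been through .encode("utf-8").
as_ints = list(line.encode("utf-8"))

# Splitting the str on commas gives cells, but keeps the quote characters
# and would break on a comma inside a quoted field.
naive_cells = line.rstrip("\n").split(",")

# csv.reader understands the quoting rules.
parsed_cells = next(csv.reader(io.StringIO(line)))

print(as_ints[:6])
print(naive_cells)
print(parsed_cells)
```

For a real file, csv.reader can be given the file object returned by open(csvfile, "r", encoding=..., newline="") directly.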
From miked at dewhirst.com.au Tue Aug 16 03:33:19 2016
From: miked at dewhirst.com.au (Mike Dewhirst)
Date: Tue, 16 Aug 2016 17:33:19 +1000
Subject: [melbourne-pug] Unicode for windows dummies
In-Reply-To:
References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> <22bd44c3-2938-9d2a-e585-ceed6cb837b5@dewhirst.com.au>
Message-ID:

On 16/08/2016 5:22 PM, William ML Leslie wrote:
> On 16 August 2016 at 15:51, Mike Dewhirst wrote:
>
> > # this requires a bytes-like object not 'str'
> > #cells = line.split(",")
>
> You got that exception because you had one of the .encode steps
> above. Don't do any .encode before the split.

I won't - that line remains commented out. See my email of a minute ago. The comment is there to remind me.

Mike

From william.leslie.ttg at gmail.com Tue Aug 16 03:36:42 2016
From: william.leslie.ttg at gmail.com (William ML Leslie)
Date: Tue, 16 Aug 2016 17:36:42 +1000
Subject: [melbourne-pug] Unicode for windows dummies
In-Reply-To: <72ad077e-8943-2128-2127-d0313a075a21@dewhirst.com.au>
References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> <22bd44c3-2938-9d2a-e585-ceed6cb837b5@dewhirst.com.au> <72ad077e-8943-2128-2127-d0313a075a21@dewhirst.com.au>
Message-ID:

What does something like this do for you?
with open(csvfile, "r", encoding='utf-16') as csv:
    self.rows = csv.readlines()
    for i, line in enumerate(self.rows):
        cells = line.split(",")
        if i >= start:
            print(', '.join(cells).encode('cp1252', 'replace').decode('cp1252'))
        if i > finish:
            break

--
William Leslie

From miked at dewhirst.com.au Tue Aug 16 03:40:14 2016
From: miked at dewhirst.com.au (Mike Dewhirst)
Date: Tue, 16 Aug 2016 17:40:14 +1000
Subject: [melbourne-pug] Unicode for windows dummies
In-Reply-To:
References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au>
Message-ID:

On 16/08/2016 11:37 AM, paul sorenson wrote:
> On 08/15/2016 06:01 PM, Mike Dewhirst wrote:
>>
>> map_csv.py [1] is the beginning of a module I want to develop into a
>> generic data import facility. ...
>
> Might not help your encode problem but have you looked at csvkit?
>
> https://csvkit.readthedocs.io/en/0.9.1/

On the grounds that "Might not" isn't really boolean I'll give it a try just in case it helps :-)

Thanks Paul

Mike

> cheers
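
As an alternative to re-encoding every string before print(), the output stream itself can be told to replace what it cannot encode. This is a minimal sketch under stated assumptions: the helper name tolerant_writer is hypothetical, and an in-memory buffer stands in for a cp850 console so the result can be inspected. Note that in Python 3 sys.stdout.encoding is a read-only attribute, so it cannot simply be assigned to.

```python
import io

def tolerant_writer(raw, encoding):
    """Wrap a binary stream so unencodable characters are written as '?'.

    Hypothetical helper for illustration; newline="\n" keeps the bytes
    identical on every platform.
    """
    return io.TextIOWrapper(raw, encoding=encoding, errors="replace", newline="\n")

# Stand-in for a console that only understands cp850.
fake_console = io.BytesIO()
out = tolerant_writer(fake_console, "cp850")

# U+2113 and U+2081 are not in cp850, so each is written as "?".
print("given \u2113\u2081 = 7", file=out)
out.flush()

written = fake_console.getvalue()
print(written)
```

On a real console the same idea would be applied once at start-up, for example sys.stdout = tolerant_writer(sys.stdout.buffer, sys.stdout.encoding), after which plain print() calls no longer raise UnicodeEncodeError.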
From miked at dewhirst.com.au Tue Aug 16 04:20:33 2016
From: miked at dewhirst.com.au (Mike Dewhirst)
Date: Tue, 16 Aug 2016 18:20:33 +1000
Subject: [melbourne-pug] Unicode for windows dummies
In-Reply-To:
References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> <22bd44c3-2938-9d2a-e585-ceed6cb837b5@dewhirst.com.au> <72ad077e-8943-2128-2127-d0313a075a21@dewhirst.com.au>
Message-ID: <50b269b9-1449-e16c-31b2-c0116d9bbffc@dewhirst.com.au>

On 16/08/2016 5:36 PM, William ML Leslie wrote:
> What does something like this do for you?

Before responding, here is something which I think spells curtains for my Windows laptop.

http://stackoverflow.com/questions/878972/windows-cmd-encoding-change-causes-python-crash

I had a look at the csv file and it had somehow converted itself to utf-16. I suppose it was me but I didn't notice. I do remember saving as a few different encodings and I must have forgotten. It is now definitely utf-8 and shall remain so.

I'll be back

M

> with open(csvfile, "r", encoding='utf-16') as csv:
>     self.rows = csv.readlines()
>     for i, line in enumerate(self.rows):
>         cells = line.split(",")
>         if i >= start:
>             print(', '.join(cells).encode('cp1252', 'replace').decode('cp1252'))
>         if i > finish:
>             break
>
> --
> William Leslie
>
> Notice:
> Likely much of this email is, by the nature of copyright, covered
> under copyright law. You absolutely MAY reproduce any part of it in
> accordance with the copyright law of the nation you are reading this
> in. Any attempt to DENY YOU THOSE RIGHTS would be illegal without
> prior contractual agreement.
> > > _______________________________________________ > melbourne-pug mailing list > melbourne-pug at python.org > https://mail.python.org/mailman/listinfo/melbourne-pug -------------- next part -------------- An HTML attachment was scrubbed... URL: From miked at dewhirst.com.au Tue Aug 16 04:35:52 2016 From: miked at dewhirst.com.au (Mike Dewhirst) Date: Tue, 16 Aug 2016 18:35:52 +1000 Subject: [melbourne-pug] Unicode for windows dummies In-Reply-To: References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> <22bd44c3-2938-9d2a-e585-ceed6cb837b5@dewhirst.com.au> <72ad077e-8943-2128-2127-d0313a075a21@dewhirst.com.au> Message-ID: <6401af07-7d24-dd47-7941-33de424a63d0@dewhirst.com.au> On 16/08/2016 5:36 PM, William ML Leslie wrote: > ???What does something like this do for you???? That is an official, gold plated win! Thanks for your persistence William. Everything is utf-8 now. I found a Windows registry hack to convert the codepage to utf-8 or actually cp65001 as Microsoft prefer to call it. No more cp850 or cp1252. Then I adjusted your ... with open(csvfile, "r", encoding='utf-16') as csv: ... to utf-8 and we have readable output. I'm cooking again. Apart from beer next time we meet and a glowing credit in the project contribution list you have my sincere gratitude. Fantastic Mike > > with open(csvfile, "r", encoding='utf-16') as csv: > > ? ? ? self.rows = csv.readlines() > > ? ? ? for > ???i, ??? > line in > ???enumerate(??? > self.rows > ???)??? > : > > ? ? ? ? ? ? ? cells = line.split(",") > > ? ? ? ? ? ? ? if i >= start: > > ? ? ? ? ? ? ? ? ? ? ? print(', '.join(cells) > ???.encode('cp1252', 'replace').decode('cp1252')??? > ) > > ? ? ? ? ? ? ? if i > finish: > > ? ? ? ? ? ? ? ? ? ? ? break > > > > -- > William Leslie > > Notice: > Likely much of this email is, by the nature of copyright, covered > under copyright law.? You absolutely MAY reproduce any part of it in > accordance with the copyright law of the nation you are reading this > in.? 
Any attempt to DENY YOU THOSE RIGHTS would be illegal without > prior contractual agreement. > > > _______________________________________________ > melbourne-pug mailing list > melbourne-pug at python.org > https://mail.python.org/mailman/listinfo/melbourne-pug From miked at dewhirst.com.au Tue Aug 16 04:47:10 2016 From: miked at dewhirst.com.au (Mike Dewhirst) Date: Tue, 16 Aug 2016 18:47:10 +1000 Subject: [melbourne-pug] Unicode for windows dummies In-Reply-To: <6401af07-7d24-dd47-7941-33de424a63d0@dewhirst.com.au> References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> <22bd44c3-2938-9d2a-e585-ceed6cb837b5@dewhirst.com.au> <72ad077e-8943-2128-2127-d0313a075a21@dewhirst.com.au> <6401af07-7d24-dd47-7941-33de424a63d0@dewhirst.com.au> Message-ID: <25358ca7-4fd6-8ebe-1e63-38f38b847a1f@dewhirst.com.au> Postscript ... Just went back into the csv file and it had switched *itself* back to utf-16. This does not compute. I need to lift my understanding somewhat. Mike On 16/08/2016 6:35 PM, Mike Dewhirst wrote: > On 16/08/2016 5:36 PM, William ML Leslie wrote: >> ????????What does something like this do for you????????? > > That is an official, gold plated win! > > Thanks for your persistence William. Everything is utf-8 now. I found > a Windows registry hack to convert the codepage to utf-8 or actually > cp65001 as Microsoft prefer to call it. No more cp850 or cp1252. Then > I adjusted your ... > > with open(csvfile, "r", encoding='utf-16') as csv: > > ... to utf-8 and we have readable output. I'm cooking again. > > Apart from beer next time we meet and a glowing credit in the project > contribution list you have my sincere gratitude. > > Fantastic > > Mike > >> >> with open(csvfile, "r", encoding='utf-16') as csv: >> >> ?? ?? ?? self.rows = csv.readlines() >> >> ?? ?? ?? for >> ????????i, ???????? >> line in >> ????????enumerate(???????? >> self.rows >> ????????)???????? >> : >> >> ?? ?? ?? ?? ?? ?? ?? cells = line.split(",") >> >> ?? ?? ?? ?? ?? ?? ?? 
if i >= start: >> >> ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? print(', '.join(cells) >> ????????.encode('cp1252', 'replace').decode('cp1252')???????? >> ) >> >> ?? ?? ?? ?? ?? ?? ?? if i > finish: >> >> ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? ?? break >> >> >> >> -- >> William Leslie >> >> Notice: >> Likely much of this email is, by the nature of copyright, covered >> under copyright law.?? You absolutely MAY reproduce any part of it >> in accordance with the copyright law of the nation you are reading >> this in.?? Any attempt to DENY YOU THOSE RIGHTS would be illegal >> without prior contractual agreement. >> >> >> _______________________________________________ >> melbourne-pug mailing list >> melbourne-pug at python.org >> https://mail.python.org/mailman/listinfo/melbourne-pug > > _______________________________________________ > melbourne-pug mailing list > melbourne-pug at python.org > https://mail.python.org/mailman/listinfo/melbourne-pug From william.leslie.ttg at gmail.com Tue Aug 16 06:04:52 2016 From: william.leslie.ttg at gmail.com (William ML Leslie) Date: Tue, 16 Aug 2016 20:04:52 +1000 Subject: [melbourne-pug] Unicode for windows dummies In-Reply-To: <25358ca7-4fd6-8ebe-1e63-38f38b847a1f@dewhirst.com.au> References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> <22bd44c3-2938-9d2a-e585-ceed6cb837b5@dewhirst.com.au> <72ad077e-8943-2128-2127-d0313a075a21@dewhirst.com.au> <6401af07-7d24-dd47-7941-33de424a63d0@dewhirst.com.au> <25358ca7-4fd6-8ebe-1e63-38f38b847a1f@dewhirst.com.au> Message-ID: On 16 August 2016 at 18:47, Mike Dewhirst wrote: > Postscript ... > > Just went back into the csv file and it had switched *itself* back to > utf-16. This does not compute. I need to lift my understanding somewhat. > ?LibreOffice asks what encoding you want to save in, but other programs, including text editors, may not.? On 16 August 2016 at 18:35, Mike Dewhirst wrote: > On 16/08/2016 5:36 PM, William ML Leslie wrote: > >> ???What does something like this do for you???? 
>>
>
> That is an official, gold plated win!
>
> Thanks for your persistence William. Everything is utf-8 now. I found a
> Windows registry hack to convert the codepage to utf-8 or actually cp65001
> as Microsoft prefer to call it. No more cp850 or cp1252. Then I adjusted
> your ...
>
> with open(csvfile, "r", encoding='utf-16') as csv:
>
> ... to utf-8 and we have readable output. I'm cooking again.
>

Cool, I learned a few things too. I think the main points to take away
for me are:

* You can get errors="replace" behaviour for select stdout lines by
  encoding to and then decoding from that encoding, with the replace
  handler. I wonder how difficult it would be for print() and
  TextIOBase.write() to support errors= as a keyword argument.

* Check if things are really UTF-8 - they often aren't.

* Also, repr(str) no longer returns an ASCII-clean string in python3.

> Apart from beer next time we meet and a glowing credit in the project
> contribution list you have my sincere gratitude.
>

Great, looking forward to it! I hope csvkit helps, too.

Offhand, I'm sorry if I was overly assertive at any point on this list. I
tend to get twitchy when dealing with Unicode, but I don't think that's
uncommon.

--
William Leslie

Notice:
Likely much of this email is, by the nature of copyright, covered
under copyright law. You absolutely MAY reproduce any part of it in
accordance with the copyright law of the nation you are reading this
in. Any attempt to DENY YOU THOSE RIGHTS would be illegal without
prior contractual agreement.
-------------- next part --------------
An HTML attachment was scrubbed...
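
The first takeaway above (encode to the terminal's encoding with the replace handler, then decode again) can be wrapped in a small function. This is a rough sketch rather than anything posted in the thread: print_replacing is a made-up name, and the fallback to ASCII when a stream reports no encoding is an assumption.

```python
import sys


def print_replacing(text, stream=None):
    """Write text plus a newline, with '?' for anything the stream cannot encode."""
    if stream is None:
        stream = sys.stdout
    # Assumption: fall back to ASCII if the stream does not report an encoding.
    encoding = getattr(stream, "encoding", None) or "ascii"
    # Round trip through the target encoding using the 'replace' error handler,
    # so unencodable characters become '?' instead of raising UnicodeEncodeError.
    safe = text.encode(encoding, "replace").decode(encoding)
    stream.write(safe + "\n")
```

On a cp850 console, print_replacing("0.5\u2030") should come out as "0.5?" rather than a traceback. On Python 3.7 and later, sys.stdout.reconfigure(errors="replace") gives the same behaviour for every print() call, which is close to what the wish for an errors= keyword was after.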
URL: From anthony.briggs at gmail.com Tue Aug 16 06:55:39 2016 From: anthony.briggs at gmail.com (Anthony Briggs) Date: Tue, 16 Aug 2016 20:55:39 +1000 Subject: [melbourne-pug] Unicode for windows dummies In-Reply-To: References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> Message-ID: On 16 August 2016 at 17:12, William ML Leslie wrote: > > On 16 August 2016 at 15:28, Anthony Briggs > wrote: >> >> That string is translated to a cp1252 character set, so I'd be surprised >> if it didn't work. >> >> OTOH, try utf-8 characters in a Windows Python REPL, and you don't even >> make it to the end of the string :) >> >> print("M? h?v?r?r?ft ?? f?ll ?f ??ls") >> > > ?All of those characters are represented in cp1252? and can print on a > windows terminal, > *No, they can't! * Hint: I'm sitting in front of a Windows terminal. PS C:\Users\Anthony> python Python 3.4.3 (v3.4.3:9b73f1c3e601, Feb 24 2015, 22:44:40) [MSC v.1600 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> print("M?? h??v??r??r??ft ???? f????l ??f ????l??") M?? h??v??r??r??ft ???? f????l ??f ????l?? >>> print("M? h?v?r?r?ft ?? f?ll ?f ??ls") File "", line 0 ^ SyntaxError: 'utf-8' codec can't decode byte 0x98 in position 8: invalid start byte >>> print("M? h?v?r?r?ft ?? f?ll ?f ??ls".decode('utf-8')) File "", line 0 ^ SyntaxError: 'utf-8' codec can't decode byte 0x98 in position 8: invalid start byte >>> print("M? h?v?r?r?ft ?? f?ll ?f ??ls".decode('cp1252')) File "", line 0 ^ SyntaxError: 'utf-8' codec can't decode byte 0x98 in position 8: invalid start byte >>> print("M? h?v?r?r?ft ?? 
f?ll ?f ??ls".decode('cp1252', "replace")) File "", line 0 ^ SyntaxError: 'utf-8' codec can't decode byte 0x98 in position 8: invalid start byte >>> And the results for your test code: PS C:\Users\Anthony> python Python 3.4.3 (v3.4.3:9b73f1c3e601, Feb 24 2015, 22:44:40) [MSC v.1600 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> s = b'M\xc3\xbf h\xc3\xb4v\xc3\xa8r\xc3\xa7r\xc3\xa0ft \xc3\xae\xc3\x9f f\xc3\xbbll \xc3\xb6f \xc3\xa9\xc3\xaals' >>> print(s.decode('utf-8')) M?? h??v??r??r??ft ???? f??ll ??f ????ls >>> t = u'given \u2113\u2081 = 7' >>> print(t) given ?????? = 7 >>> print(t.encode("cp1252", "replace").decode("cp1252")) given ?? = 7 >>> print(t.encode("utf-8").decode("cp1252", "replace")) given ???????????????? = 7 >>> Totally different. If you're not actually testing that what you're saying works, please stop confusing the issue. Anthony -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave at montagesoftware.com.au Mon Aug 15 21:16:51 2016 From: dave at montagesoftware.com.au (David Micallef) Date: Tue, 16 Aug 2016 01:16:51 +0000 Subject: [melbourne-pug] Unicode for windows dummies In-Reply-To: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> Message-ID: Hi Mike I use the linux program iconv before importing to csv for these issues. Even though the csv's are supposed to be UTF-8 I find systems sometimes slip in something that is not. 
The -c argument ignores errors and moves right along, removing problem
parts from the output.

The following function converts from UTF-8 to UTF-8, which seems
pointless, though it works because of the -c:

    # Assumes "from os import system" and a module-level logger named LOG.
    def _convert_to_utf8(self):
        old_file_path = self.file_path
        # Note: this replaces every '.' in the path, not just the extension.
        self.file_path = old_file_path.replace('.', '-utf8.')
        LOG.info('Converting to UTF8, new file: %s' % self.file_path)
        # iconv -c omits characters that cannot be converted.
        cmd = ' '.join(['iconv', '-f', 'UTF-8', '-t', 'UTF-8', '-c',
                        old_file_path, '>', self.file_path])
        LOG.info(cmd)
        system(cmd)

On Tue, 16 Aug 2016 at 11:04 Mike Dewhirst wrote:

> If anyone can point me to the appropriate advice for resolving the error
> below I would be most appreciative. Really very appreciative.
>
> I think I understand Unicode in theory and have reread a lot of articles
> including ...
>
> * https://docs.python.org/3/library/codecs.html#encodings-and-unicode
> * https://pythonconquerstheuniverse.wordpress.com/2010/05/30/unicode-beginners-introduction-for-dummies-made-simple/
> * https://pythonconquerstheuniverse.wordpress.com/2010/06/04/unicode-for-dummies-just-use-utf-8/
> * https://en.wikipedia.org/wiki/UTF-8
>
> This is the error which has stumped me ...
>
> (xxex3) C:\Users\mike\env\xxex3\ssds>python substance/data_imports/map_csv.py
> Traceback (most recent call last):
>   File "substance/data_imports/map_csv.py", line 139, in <module>
>     csvdata = CsvImport(csvfile, company, start, finish)
>   File "substance/data_imports/map_csv.py", line 127, in __init__
>     print("%s" % cells)
>   File "C:\Users\mike\env\xxex3\lib\encodings\cp850.py", line 19, in encode
>     return codecs.charmap_encode(input,self.errors,encoding_map)[0]
> UnicodeEncodeError: 'charmap' codec can't encode character '\u2030' in position 7452: character maps to <undefined>
>
> I have saved the csv file involved as utf-8 using LibreOffice 5 on Windows
> 8.1. from the original Microsoft Excel spreadsheet.
>
> This is in Python 3.5 on Windows but it also needs to run in Python 2.7 on
> Ubuntu 14.04 server (no gui).
> > map_csv.py [1] is the beginning of a module I want to develop into a > generic data import facility. I'm starting with a specific csv file I need > to import (not mine and its contents are private) and all it does at the > moment is read in the file and print the lines to stdout. > > I have tried utf-8 encoding each line and that gets past the error but > just produces a set of chars a snippet of which below [2]. Decoding that as > utf-8 reproduces the error as might be expected. I have also tried decoding > as utf-16 and encoding it as utf-8 but that didn't work either. > > Thanks for reading this far > > Mike > > [1] ... > > from __future__ import unicode_literals > > import os > > class CsvImport(object): > > """ Imports a csv file and converts it into a list of lists """ > > def __init__(self, csvfile, company, start, finish): > > self.company = company > > self.rows = list() > > with open(csvfile, "r") as csv: > > i = 0 > > self.rows = csv.readlines() > > for line in self.rows: > > i += 1 > > cells = list(line) > > if i >= start: > > print("%s" % cells) > > if i > finish: > > break > > if __name__ == "__main__": > > company = "Calia Pty Ltd" > > dirname = "{0}/csv".format(company.split()[0].lower()) > > filename = "{0}1.csv".format(company.split()[0].lower()) > > start = 105 > > finish = 404 > > currdir = os.path.realpath(os.path.dirname(__file__)).replace('\\', '/') > > csvfile = os.path.join(currdir, dirname, filename) > > csvdata = CsvImport(csvfile, company, start, finish) > > [1] ... 
, 48, 48, 48, 48, 37, 44, 34, 34, 44, 44, 34, 34, 44, 34, 34, 44, > 34, 65, 99, 117, 116, 101, 32, 72, 97, 122, 97, 114, 100, 32, 84, 111, 32, > 84, 104, 101, 32, 65, 113, 117, 97, 116, 105, 99, 32, 69, 110, 118, 105, > 114, 111, 110, 109, 101, 110, 116, 46, 34, 44, 44, 44, 44, 44, 44, 44, 48, > 46, 48, 48, 48, 48, 48, 37, 44, 34, 34, 44, 44, 34, 34, 44, 34, 34, 44, 34, > 34, 44, 34, 34, 44, 34, 67, 104, 114, 111, 110, 105, 99, 32, 72, 97, 122, > 97, 114, 100, 32, 84, 111, 32, 84, 104, 101, 32, 65, 113, 117, 97, 116, > 105, 99, 32, 69, 110, 118, 105, 114, 111, 110, 109, 101, 110, 116, 46, 34, > 44, 50, 44, 34, 78, 47, 65, 34, 44, 34, 71, 72, 83, 48, 57, 34, 44, 34, 72, > 52, 49, 49, 34, 44, 44, 44, 48, 46, 48, 48, 48, 48, 48, 37, 44, 34, 34, 44, > 44, 34, 34, 44, 34, 34, 44, 34, 34, 44, 34, 34, 44, 34, 34, 44, 34, 72, 97, > 122, 97, 114, 100, 111, 117, 115, 32, 84, 111, 32, 84, 104, 101, 32, 79, > 122, 111, 110, 101, 32, 76, 97, 121, 101, 114, 46, 34, 44, 44, 44, 44, 44, > 48, 46, 48, 48, 48, 48, 48, 37, 44, 34, 34, 44, 34, 34, 44, 34, 65, 100, > 100, 105, 116, 105, 111, 110, 97, 108, 32, 78, 111, 110, 45, 71, 72, 83, > 32, 72, 97, 122, 97, 114, 100, 32, 83, 116, 97, 116, 101, 109, 101, 110, > 116, 34, 44, 34, 65, 85, 72, 48, 54, 54, 34, 44, 48, 46, 48, 48, 48, 48, > 48, 37, 44, 34, 34, 10] > > > > > _______________________________________________ > melbourne-pug mailing list > melbourne-pug at python.org > https://mail.python.org/mailman/listinfo/melbourne-pug > -------------- next part -------------- An HTML attachment was scrubbed... URL: From onward.edkim at gmail.com Mon Aug 15 22:08:05 2016 From: onward.edkim at gmail.com (Edward Kim) Date: Tue, 16 Aug 2016 02:08:05 +0000 Subject: [melbourne-pug] Unicode for windows dummies In-Reply-To: References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> Message-ID: This is not an exact answer but... 
CSV files with non-ASCII text always cause trouble, because a csv file
doesn't carry any encoding information. Alternatively, you can create an
xls file easily using well-made libs such as xlwt, openpyxl, xlsxwriter.

On Tue, 16 Aug 2016 at 11:44 AM, paul sorenson wrote:

> On 08/15/2016 06:01 PM, Mike Dewhirst wrote:
>
> > map_csv.py [1] is the beginning of a module I want to develop into a
> > generic data import facility. ...
>
> Might not help your encode problem but have you looked at csvkit?
>
> https://csvkit.readthedocs.io/en/0.9.1/
>
> cheers
> _______________________________________________
> melbourne-pug mailing list
> melbourne-pug at python.org
> https://mail.python.org/mailman/listinfo/melbourne-pug
>

From william.leslie.ttg at gmail.com Tue Aug 16 07:51:21 2016
From: william.leslie.ttg at gmail.com (William ML Leslie)
Date: Tue, 16 Aug 2016 21:51:21 +1000
Subject: [melbourne-pug] Unicode for windows dummies
In-Reply-To:
References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au>
Message-ID:

On 16 August 2016 at 20:55, Anthony Briggs wrote:
>
> On 16 August 2016 at 17:12, William ML Leslie wrote:
>>
>> On 16 August 2016 at 15:28, Anthony Briggs wrote:
>>>
>>> That string is translated to a cp1252 character set, so I'd be surprised
>>> if it didn't work.
>>>
>>> OTOH, try utf-8 characters in a Windows Python REPL, and you don't even
>>> make it to the end of the string :)
>>>
>>> print("Mÿ hôvèrçràft îß fûll öf éêls")
>>
>> All of those characters are represented in cp1252 and can print on a
>> windows terminal,
>
> No, they can't!

http://imgur.com/a/JTqOV

--
William Leslie

Notice:
Likely much of this email is, by the nature of copyright, covered
under copyright law. You absolutely MAY reproduce any part of it in
accordance with the copyright law of the nation you are reading this
in.
Any attempt to DENY YOU THOSE RIGHTS would be illegal without prior contractual agreement. From miked at dewhirst.com.au Tue Aug 16 20:24:13 2016 From: miked at dewhirst.com.au (Mike Dewhirst) Date: Wed, 17 Aug 2016 10:24:13 +1000 Subject: [melbourne-pug] Unicode for windows dummies In-Reply-To: References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> Message-ID: <1d3a8fe6-a3d6-13fc-7e48-89046f963693@dewhirst.com.au> On 16/08/2016 11:16 AM, David Micallef wrote: > Hi Mike > > I use the linux program iconv before importing to csv for these > issues. Even though the csv's are supposed to be UTF-8 I find systems > sometimes slip in something that is not. the -c in arguments ignores > errors and moves right along removing problem parts from the output > > The following function is from UTF-8 to UTF-8 seems pointless though > it works because the -c Thanks David. I'll need something like this for deploying on Linux ... Cheers mike > > def _convert_to_utf8(self): > old_file_path =self.file_path > self.file_path = old_file_path.replace('.','-utf8.') > LOG.info('Converting to UTF8, new file: %s' %self.file_path) > cmd =' '.join(['iconv','-f','UTF-8','-t','UTF-8','-c', > old_file_path,'>',self.file_path]) > LOG.info(cmd) > system(cmd) > > On Tue, 16 Aug 2016 at 11:04 Mike Dewhirst > wrote: > > If anyone can point me to the appropriate advice for resolving the > error below I would be most appreciative. Really very appreciative. > > I think I understand Unicode in theory and have reread a lot of > articles including ... > > * https://docs.python.org/3/library/codecs.html#encodings-and-unicode > * > https://pythonconquerstheuniverse.wordpress.com/2010/05/30/unicode-beginners-introduction-for-dummies-made-simple/ > * > https://pythonconquerstheuniverse.wordpress.com/2010/06/04/unicode-for-dummies-just-use-utf-8/ > * https://en.wikipedia.org/wiki/UTF-8 > > This is the error which has stumped me ... 
> > (xxex3) C:\Users\mike\env\xxex3\ssds>python > substance/data_imports/map_csv.py > > Traceback (most recent call last): > > ? File "substance/data_imports/map_csv.py", line 139, in > > ? ? ? csvdata = CsvImport(csvfile, company, start, finish) > > ? File "substance/data_imports/map_csv.py", line 127, in __init__ > > ? ? ? print("%s" % cells) > > ? File "C:\Users\mike\env\xxex3\lib\encodings\cp850.py", line 19, > in encode > > ? ? ? return codecs.charmap_encode(input,self.errors,encoding_map)[0] > > UnicodeEncodeError: 'charmap' codec can't encode character > '\u2030' in position 7452: character maps to > > > I have saved the csv file involved as utf-8 using LibreOffice 5 on > Windows 8.1. from the original Microsoft Excel spreadsheet. > > This is in Python 3.5 on Windows but it also needs to run in > Python 2.7 on Ubuntu 14.04 server (no gui). > > map_csv.py [1] is the beginning of a module I want to develop into > a generic data import facility. I'm starting with a specific csv > file I need to import (not mine and its contents are private) and > all it does at the moment is read in the file and print the lines > to stdout. > > I have tried utf-8 encoding each line and that gets past the error > but just produces a set of chars a snippet of which below [2]. > Decoding that as utf-8 reproduces the error as might be expected. > I have also tried decoding as utf-16 and encoding it as utf-8 but > that didn't work either. > > Thanks for reading this far > > Mike > > [1] ... > > from __future__ import unicode_literals > > import os > > class CsvImport(object): > > ? ? ? """ Imports a csv file and converts it into a list of lists """ > > ? ? ? def __init__(self, csvfile, company, start, finish): > > ? ? ? ? ? ? ? self.company = company > > ? ? ? ? ? ? ? self.rows = list() > > ? ? ? ? ? ? ? with open(csvfile, "r") as csv: > > ? ? ? ? ? ? ? ? ? ? ? i = 0 > > ? ? ? ? ? ? ? ? ? ? ? self.rows = csv.readlines() > > ? ? ? ? ? ? ? ? ? ? ? for line in self.rows: > > ? ? ? ? ? 
? ? ? ? ? ? ? ? ? ? i += 1 > > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? cells = list(line) > > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? if i >= start: > > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? print("%s" % cells) > > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? if i > finish: > > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? break > > if __name__ == "__main__": > > ? ? ? company = "Calia Pty Ltd" > > ? ? ? dirname = "{0}/csv".format(company.split()[0].lower()) > > ? ? ? filename = "{0}1.csv".format(company.split()[0].lower()) > > ? ? ? start = 105 > > ? ? ? finish = 404 > > ? ? ? currdir = > os.path.realpath(os.path.dirname(__file__)).replace('\\', '/') > > ? ? ? csvfile = os.path.join(currdir, dirname, filename) > > ? ? ? csvdata = CsvImport(csvfile, company, start, finish) > > [1] ... , 48, 48, 48, 48, 37, 44, 34, 34, 44, 44, 34, 34, 44, 34, > 34, 44, 34, 65, 99, 117, 116, 101, 32, 72, 97, 122, 97, 114, 100, > 32, 84, 111, 32, 84, 104, 101, 32, 65, 113, 117, 97, 116, 105, 99, > 32, 69, 110, 118, 105, 114, 111, 110, 109, 101, 110, 116, 46, 34, > 44, 44, 44, 44, 44, 44, 44, 48, 46, 48, 48, 48, 48, 48, 37, 44, > 34, 34, 44, 44, 34, 34, 44, 34, 34, 44, 34, 34, 44, 34, 34, 44, > 34, 67, 104, 114, 111, 110, 105, 99, 32, 72, 97, 122, 97, 114, > 100, 32, 84, 111, 32, 84, 104, 101, 32, 65, 113, 117, 97, 116, > 105, 99, 32, 69, 110, 118, 105, 114, 111,? 
110, 109, 101, 110, > 116, 46, 34, 44, 50, 44, 34, 78, 47, 65, 34, 44, 34, 71, 72, 83, > 48, 57, 34, 44, 34, 72, 52, 49, 49, 34, 44, 44, 44, 48, 46, 48, > 48, 48, 48, 48, 37, 44, 34, 34, 44, 44, 34, 34, 44, 34, 34, 44, > 34, 34, 44, 34, 34, 44, 34, 34, 44, 34, 72, 97, 122, 97, 114, 100, > 111, 117, 115, 32, 84, 111, 32, 84, 104, 101, 32, 79, 122, 111, > 110, 101, 32, 76, 97, 121, 101, 114, 46, 34, 44, 44, 44, 44, 44, > 48, 46, 48, 48, 48, 48, 48, 37, 44, 34, 34, 44, 34, 34, 44, 34, > 65, 100, 100, 105, 116, 105, 111, 110, 97, 108, 32, 78, 111, 110, > 45, 71, 72, 83, 32, 72, 97, 122, 97, 114, 100, 32, 83, 116, 97, > 116, 101, 109, 101, 110, 116, 34, 44, 34, 65, 85, 72, 48, 54, 54, > 34, 44, 48, 46, 48, 48, 48, 48, 48, 37, 44, 34, 34, 10] > > > > > _______________________________________________ > melbourne-pug mailing list > melbourne-pug at python.org > https://mail.python.org/mailman/listinfo/melbourne-pug > > > > _______________________________________________ > melbourne-pug mailing list > melbourne-pug at python.org > https://mail.python.org/mailman/listinfo/melbourne-pug From miked at dewhirst.com.au Tue Aug 16 20:26:38 2016 From: miked at dewhirst.com.au (Mike Dewhirst) Date: Wed, 17 Aug 2016 10:26:38 +1000 Subject: [melbourne-pug] Unicode for windows dummies In-Reply-To: References: <1a2fe8b5-c6f0-a8f5-9fa5-2f5f477bdfb7@dewhirst.com.au> Message-ID: <12cc0a7a-8aa2-2257-1c2b-993bce43115a@dewhirst.com.au> On 16/08/2016 12:08 PM, Edward Kim wrote: > This is not an exact answer but... CSV with encoding is always having > trouble all around beacuse csv doesn't have any encoding information > in the file. Alternatively, you can create xls file easily using > well-made libs such as xlwt, openpyxl, xlsxwriter. Thanks Edward. I'll check them out. Interesting that emails from you and David didn't get here until late last night. Must be a blockage somewhere. 
Cheers Mike > > On Tue, 16 Aug 2016 at 11:44 AM, paul sorenson > wrote: > > On 08/15/2016 06:01 PM, Mike Dewhirst wrote: >> >> map_csv.py [1] is the beginning of a module I want to develop >> into a generic data import facility. ... > Might not help your encode problem but have you looked at csvkit? > > https://csvkit.readthedocs.io/en/0.9.1/ > > cheers > _______________________________________________ > melbourne-pug mailing list > melbourne-pug at python.org > https://mail.python.org/mailman/listinfo/melbourne-pug > > > > _______________________________________________ > melbourne-pug mailing list > melbourne-pug at python.org > https://mail.python.org/mailman/listinfo/melbourne-pug From tleeuwenburg at gmail.com Tue Aug 16 22:34:20 2016 From: tleeuwenburg at gmail.com (Tennessee Leeuwenburg) Date: Wed, 17 Aug 2016 12:34:20 +1000 Subject: [melbourne-pug] Job opportunity doing scientific programming / data sciencey stuff Message-ID: Hey, So there's a position open here at BoM: https://tinyurl.com/juo4my7. The job title looks a little prosaic, but the work is really engaging and we have a good team in place. Feel free to ping me with any questions. The job ad contains all the information, on $$$, job function, duration, expectations etc. It's all exactly as you see it, so I don't think I need to repeat that content here. The position is for a mid to senior developer. Thanks, -Tennessee -------------- next part -------------- An HTML attachment was scrubbed... URL: From ed at pythoncharmers.com Mon Aug 22 19:58:23 2016 From: ed at pythoncharmers.com (Ed Schofield) Date: Tue, 23 Aug 2016 09:58:23 +1000 Subject: [melbourne-pug] Python meeting - Monday 5 September Message-ID: <21B08E9B-239C-46F7-9843-ADCD284EED73@pythoncharmers.com> Hi everyone! We're looking forward to our next monthly Python meeting! 
When: 5.45pm for 6pm, Monday 5th September Where: VLSCI Seminar Room, Ground Floor, 700 Swanston Street, Carlton How to get there: Take a tram 5-10 mins north from Melbourne Central station. What: 1. ImageXD; summary of SciPy 2016 / PyCon AU: Juan Nunez-Iglesias (30 mins) 2. Creating custom styled reports from Jupyter notebooks: Ed Schofield (20 mins) 3. Open slot: email me or the list if you'd like to give a talk! (20 mins) 4. Lightning talks! (5 x 2 minutes) As in July, we'll have space for a few lightning talks (2 mins each). If you've been thinking about giving a talk but hesitating, this is your chance! After the talks we'll again head to a restaurant on Lygon Street for dinner / drinks. We hope to see you there! :-) Cheers, Ed -- Dr. Edward Schofield Python Charmers http://pythoncharmers.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From hap at unimelb.edu.au Mon Aug 29 20:39:36 2016 From: hap at unimelb.edu.au (Phil Harris) Date: Tue, 30 Aug 2016 00:39:36 +0000 Subject: [melbourne-pug] python code for Emotiv EEG Message-ID: Hi, Any list members with an interest in EEG? I'd like to develop some tools to visualise and manipulate live EEG signals generated by the Emotiv headset (emotiv.com) If you are interested in exploring please drop me a line. Regards, Phil Dr Philip Harris Honorary Fellow, Faculty of Business and Economics The University of Melbourne 198 Berkeley Street, Parkville, Vic 3010 Australia T: + 61 3 8344 1884 | F: + 61 3 9348 1921 | E: hap at unimelb.edu.au W: http://www.managementmarketing.unimelb.edu.au -------------- next part -------------- An HTML attachment was scrubbed... URL: