[CentralOH] Django Management Command Memory Usage

Eric Floehr eric at intellovations.com
Mon Jun 4 17:08:56 CEST 2012


Running this as raw SQL will save memory (and possibly time), since you
aren't loading each row from the database into Python and writing it back.
The drawback is that you would be tied to whatever GIS backend you are
using (GeoDjango supports several).  Here is the equivalent PostGIS SQL:

update <Places DB table> set point=ST_SetSRID(ST_Point(x/1000.0, y/1000.0),
<SRID>) where x is not NULL and y is not NULL;
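
If you want to run that statement from inside the management command, here
is a minimal sketch, not a tested implementation against your schema.  The
function names and the "myapp_places" table name are made up for
illustration, and the srid default of 4326 is only an assumption; use
whatever your point column was created with.  It takes any DB-API cursor
that uses %s placeholders, which is what django.db.connection.cursor()
gives you on PostgreSQL.

```python
import re


def build_point_update_sql(table):
    """Build the UPDATE statement for a given table name.

    Table names cannot be passed as query parameters, so the name is
    checked against a strict identifier pattern before being interpolated.
    """
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError("unexpected table name: %r" % (table,))
    return (
        "UPDATE {0} "
        "SET point = ST_SetSRID(ST_Point(x / 1000.0, y / 1000.0), %s) "
        "WHERE x IS NOT NULL AND y IS NOT NULL"
    ).format(table)


def update_points(cursor, table, srid=4326):
    """Run the update on a DB-API cursor and return the affected row count.

    The SRID is passed as a bound parameter; only rows with both x and y
    present are touched, exactly as in the SQL above.
    """
    cursor.execute(build_point_update_sql(table), [srid])
    return cursor.rowcount
```

Inside handle() you would call something like
update_points(django.db.connection.cursor(), "myapp_places").  Depending on
your Django version and transaction settings you may also need to commit
explicitly afterwards.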

The default <SRID> for GeoDjango geometry fields is 4326 (WGS 84), so
that's likely what your point column will expect.

A couple of notes:

ST_Point and most other PostGIS functions expect coordinates in
(longitude, latitude) order, so in this case x would be longitude and y
latitude.

Also, in your Python implementation, "if record.x and record.y" is false
not only when x or y is None but also when either is 0, which is a valid
coordinate.  This has tripped me up in the past :-).  So better would be
"if record.x is not None and record.y is not None".
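
To make the two notes above concrete, here is a small sketch in plain
Python.  The helper names are hypothetical, and it returns a plain
(longitude, latitude) tuple rather than a GeoDjango Point so that it runs
without Django installed.

```python
def truthy_check(x, y):
    """The original test: treats 0 the same as None."""
    return bool(x and y)


def to_lon_lat(x, y):
    """Return (longitude, latitude), or None if either value is missing.

    x and y are divided by 1000 as in the original code.  The explicit
    "is not None" test keeps a coordinate of 0 from being discarded.
    """
    if x is not None and y is not None:
        return (float(x) / 1000.0, float(y) / 1000.0)
    return None


# A place on the prime meridian: x == 0 is a perfectly good longitude.
print(truthy_check(0, 40000))   # the truthiness test rejects it
print(to_lon_lat(0, 40000))     # the explicit test keeps it
print(to_lon_lat(None, 40000))  # genuinely missing data still gives None
```

The tuple is in the same x-then-y order that Point(x, y) and ST_Point(x, y)
take, i.e. longitude first.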

Cheers,
Eric




On Mon, Jun 4, 2012 at 10:39 AM, Kurtis Mullins <kurtis.mullins at gmail.com> wrote:

> Hey,
>
> It looks like you've got the fat trimmed off this one about as much as you
> can. I'd say filter out your results but even then you're still using every
> result. I'd recommend modifying this to run as raw SQL and I'm sure your
> memory usage will go down significantly. Pulling in this many records as
> Python Objects and conversely creating new Python objects on top of that
> (your Point objects) is most likely the cause of this memory usage.
>
>
> On Mon, Jun 4, 2012 at 10:28 AM, <jep200404 at columbus.rr.com> wrote:
>
>> How can I reduce the memory usage in a Django management command?
>> I have some Django code like follows in a management program:
>>
>> class Command(BaseCommand):
>> ...
>>    def handle(self, *args, **options):
>>        for record in Places.objects.all():
>>            if record.x and record.y:
>>                record.point = (
>>                    Point(float(record.x)/1000.,
>>                    float(record.y)/1000.))
>>            else:
>>                record.point = None
>>            record.save()
>>        django.db.connection.close()
>>
>> In the settings.py file I have:
>>
>> DEBUG = False
>>
>> Places has millions of rows.
>> top reveals that the program is using 18.6 Gigabytes of memory.
>> How can I reduce that memory usage?
>> Am I neglecting to close or release something?
>>
>> The only dox I'm finding about memory use related to query sets
>> advise to use iterators instead of converting to a list.
>> I'm already following that advice, but I'm not finding
>> further guidance about memory use about record modification.
>>
>> Since DEBUG is False, I'm already heeding the following.
>>
>>
>> https://docs.djangoproject.com/en/dev/faq/models/#why-is-django-leaking-memory
>>
>> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>>
>> I have found that the following might be nice,
>> but doubt it addresses the memory issue.
>>
>>            record.save(update_fields=['point'])
>>
>> _______________________________________________
>> CentralOH mailing list
>> CentralOH at python.org
>> http://mail.python.org/mailman/listinfo/centraloh
>>