Pandas or Numpy

Avi Gross avigross at verizon.net
Sun Jan 23 13:29:09 EST 2022


It definitely sounds like you may end up using both. Quite a bit of what people do with DataFrame objects involves working on copies of individual columns, which are pandas Series backed by numpy arrays or the like, and in the other direction such arrays can be used to create or amend a pandas DataFrame. Plus, many operations used to select some subset of rows use a data structure holding the integer indexes you want, or a boolean one where True means take the row and False means ignore it.
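To make that concrete, here is a minimal sketch of moving between the two and of both kinds of row selection. The column names (start, dur, amp) and the values are made up for illustration and are not taken from any real Csound score.

```python
import numpy as np
import pandas as pd

# Hypothetical score-like table; the column names are illustrative only.
df = pd.DataFrame({
    "start": [0.0, 1.0, 2.0, 3.0],
    "dur":   [0.5, 0.5, 1.0, 2.0],
    "amp":   [0.2, 0.8, 0.4, 0.6],
})

# A single column is a pandas Series; to_numpy() hands back its values
# as a numpy array (copy=True so the DataFrame itself is not touched).
amp = df["amp"].to_numpy(copy=True)

# Work on the copy with plain numpy, then use it to amend the DataFrame.
df["amp_db"] = 20.0 * np.log10(amp)

# Boolean selection: True means take the row, False means ignore it.
loud = df[df["amp"] > 0.5]

# Integer selection: take rows by their position.
picked = df.iloc[[0, 2]]
```

Here loud keeps only the rows whose amp exceeds 0.5, and picked keeps the first and third rows.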
Many real-life applications simply import both numpy as np and pandas as pd, and sometimes also use other Python functionality such as lists, or matrices, which in plain Python are unfortunately generally just a list of lists.
Python was built with a different philosophy than some languages like R, in which much of what numpy and pandas provide is built in, albeit in other ways, and has been extended by packages. Python was built mostly on flexible lists, so the modules you are asking about provide faster and perhaps better versions. And, to be fair, Python has lots of nifty features that R largely lacks and that had to be added there externally.
Both, used properly, can do a nice job on the kind of things you want, but with a warning. Your description suggests some of the data you will be using or making can get quite large. So make sure you look into the dtypes for parts of your data, so that small integers are stored not in full-sized integers but in signed or unsigned bytes. Data with only two possible values might be stored as boolean. And note that many operations can be done in place, rather than creating a new object. If you are worried about space usage, or time spent on garbage collection, then as in any programming language there are recognized ideas about how you might tighten up your code using existing paradigms. I do admit some tricks have costs, and it takes real testing to see if it even matters to try. But in general, much of numpy and pandas is already implemented in optimized compiled code, so using these rather than Python lists and other data structures can already be a big plus.
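As a rough sketch of the dtype and in-place points, assuming made-up column names (instr, muted, amp) and random data rather than anything from your actual files:

```python
import numpy as np
import pandas as pd

n = 100_000
rng = np.random.default_rng(0)

# Hypothetical columns; names and value ranges are assumptions for illustration.
df = pd.DataFrame({
    "instr": rng.integers(1, 10, size=n),  # small integers, stored as int64
    "muted": rng.integers(0, 2, size=n),   # only two possible values
    "amp":   rng.random(n),                # float64
})
before = df.memory_usage(deep=True).sum()

# Pick narrower dtypes where the range of values allows it.
df["instr"] = df["instr"].astype(np.uint8)  # unsigned byte
df["muted"] = df["muted"].astype(bool)      # boolean
df["amp"] = df["amp"].astype(np.float32)    # half the space, less precision
after = df.memory_usage(deep=True).sum()

# In-place numpy operations reuse the existing buffer instead of
# allocating a new array for each intermediate result.
values = rng.random(n)
values *= 2.0                # scales in place
np.sqrt(values, out=values)  # writes the result back into the same array
```

Comparing before and after shows how much the narrower dtypes save. Whether float32 is precise enough is something only testing on your own data can tell you.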
Good luck.

-----Original Message-----
From: Julius Hamilton <juliushamilton100 at gmail.com>
To: Chris Angelico <rosuav at gmail.com>
Cc: python-list at python.org
Sent: Sun, Jan 23, 2022 1:05 pm
Subject: Re: Pandas or Numpy

Hey,


I don’t know but in case you don’t get other good answers, I’m pretty sure
Numpy is more of a mathematical library and Pandas is definitely for
handling spreadsheet data.


So maybe both.


Julius

On Sun 23. Jan 2022 at 18:28, Chris Angelico <rosuav at gmail.com> wrote:

> On Mon, 24 Jan 2022 at 04:10, Tobiah <toby at tobiah.org> wrote:
> >
> > I know very little about either.  I need to handle score input files
> > for Csound.  Each line is a list of floating point values where each
> > column has a particular meaning to the program.
> >
> > I need to compose large (hundreds, thousands, maybe millions) lists
> > and be able to do math on, or possibly sort by various columns, among other
> > operations.  A common requirement would be to do the same math operation
> > on each value in a column, or redistribute the values according to an
> > exponential curve, etc.
> >
> > One wrinkle is that the first column of a Csound score is actually a
> > single character.  I was thinking if the data types all had to be the
> > same, then I'd make a translation table or just use the ascii value
> > of the character, but if I could mix types that might be a smidge better.
> >
> > It seems like both libraries are possible choices.  Would one
> > be the obvious choice for me?
> >
>
> I'm not an expert, but that sounds like a job for Pandas to me. It's
> excellent at handling tabular data, and yes, it's fine with a mixture
> of types. Everything else you've described should work fine (not sure
> how to redistribute on an exponential curve, but I'm sure it's not
> hard).
>
> BTW, Pandas is built on top of Numpy, so it's kinda "both".
>
> ChrisA
> --
> https://mail.python.org/mailman/listinfo/python-list
>

