From rth.yurchak at gmail.com  Thu Feb  2 10:09:08 2023
From: rth.yurchak at gmail.com (Roman Yurchak)
Date: Thu, 2 Feb 2023 16:09:08 +0100
Subject: [Pandas-dev] pandas-vet
Message-ID: <CAMXUfRxSvHmjY9NXJiscob3s8qOUCem+6P-n_a+b012Jd5j6Tw@mail.gmail.com>

Hi,

There was interesting work done in https://github.com/deppen8/pandas-vet
for enforcing automated checks on pandas code.

I was wondering whether the core team has opinions on the enforced rules
and could comment on how much consensus there is on them, and on whether
they are consistent with what's recommended in the pandas docs, particularly
for things like pivot_table vs unstack, .array vs .values, and melt vs
stack.
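
For readers unfamiliar with the reshaping pairs named above, here is a rough sketch of how they relate. The DataFrame and its column names are made up for illustration, and the snippet takes no position on which spelling a linter should prefer.

```python
import pandas as pd

# Hypothetical long-format data: one row per (city, year) observation.
long_df = pd.DataFrame(
    {
        "city": ["Paris", "Paris", "Oslo", "Oslo"],
        "year": [2021, 2022, 2021, 2022],
        "sales": [10, 12, 7, 9],
    }
)

# Long -> wide, index-based: build a MultiIndex, then unstack one level.
wide_unstack = long_df.set_index(["city", "year"])["sales"].unstack("year")

# Long -> wide, column-based: pivot_table names each role explicitly
# (it aggregates duplicates, using the mean by default).
wide_pivot = long_df.pivot_table(index="city", columns="year", values="sales")

# Wide -> long, index-based: stack moves the column labels into the index.
long_stack = wide_unstack.stack()

# Wide -> long, column-based: melt returns ordinary columns.
long_melt = wide_unstack.reset_index().melt(
    id_vars="city", var_name="year", value_name="sales"
)
```

In each pair the index-based and column-based calls carry the same information, which is presumably why a linter can suggest one spelling over the other.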

I'm currently working on a largish legacy codebase with lots of pandas code,
so IMO something like pyupgrade for pandas would be really great. Also, now
that pandas-vet is implemented in ruff, I feel it has the potential to become
mainstream in a few years. I'm just checking whether there is some consensus
on what could / should be enforced for pandas linting.

For the rule "'inplace = True' should be avoided; it has inconsistent
behavior": if there is an issue, this could be fixed in some future major
release, right?
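
To make the rule concrete, here is a small sketch of the two calling styles. The data is made up, and it shows only the return-value difference, not every inconsistency the rule may have in mind.

```python
import pandas as pd

df = pd.DataFrame({"b": [3, 1, 2]})

# With inplace=True the method mutates df and returns None,
# so the call cannot be chained.
result = df.sort_values("b", inplace=True)

# Without inplace the method returns a new object and leaves its input alone.
renamed = df.rename(columns={"b": "score"})
```

Rewriting `df.sort_values("b", inplace=True)` as `df = df.sort_values("b")` is the kind of mechanical change a pyupgrade-style tool could apply.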

Thanks,

Roman

From garcia.marc at gmail.com  Thu Feb  2 11:59:16 2023
From: garcia.marc at gmail.com (Marc Garcia)
Date: Thu, 2 Feb 2023 17:59:16 +0100
Subject: [Pandas-dev] pandas-vet
In-Reply-To: <CAMXUfRxSvHmjY9NXJiscob3s8qOUCem+6P-n_a+b012Jd5j6Tw@mail.gmail.com>
References: <CAMXUfRxSvHmjY9NXJiscob3s8qOUCem+6P-n_a+b012Jd5j6Tw@mail.gmail.com>
Message-ID: <CAEk5N5srqDZSt_TW64GCfkLUWxYQYT=ZHH8LQw+H0WUDMVLYwQ@mail.gmail.com>

Thanks for starting this discussion. I think each of the points needs an
independent discussion. In general I think the solution would be to
deprecate things in pandas.

For the inplace keyword, there is consensus to deprecate it. There was
already consensus before pandas 1.0, along with plans to remove it everywhere
before that release, and we almost removed it for pandas 2.0 (in the end that
did not happen), but there are still a few details to discuss. I guess a
linter can help before we start raising FutureWarnings.

For isna/isnull, the initial plan and obvious solution was to also
deprecate isnull, but it was decided
<https://github.com/pandas-dev/pandas/pull/16972#issuecomment-317216086> that
it was too common. It seems like deprecating it is a better option than a
linter, but I guess a linter could help if it is popular enough. I'd
personally just deprecate things we don't want users to use (instead of
encouraging a linter), but where there is no consensus to deprecate, maybe
there will be in the future, and the linter can help in the meantime. Some
things are trickier, but I guess in general we could end up deprecating
things like Series.values in favor of .array or .to_numpy()...
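
As a quick illustration of the two cases above (a sketch with made-up data, not a statement of what will be deprecated):

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, None, 3.0])

# isnull is an alias of isna, so both produce the same boolean mask.
same_mask = s.isna().equals(s.isnull())

# .values returns a NumPy array for some dtypes and an extension array
# for others, which is where the ambiguity comes from.
cat = pd.Series(["a", "b", "a"], dtype="category")
values_is_ndarray = isinstance(cat.values, np.ndarray)

# .to_numpy() always returns a NumPy array; .array always returns an
# ExtensionArray, whatever the dtype.
as_numpy = cat.to_numpy()
as_array = s.array
```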

Personally -1 on `import pandas as pd`. If we had to rewrite things I guess
the numpy module would simply be named np, so no aliasing would be needed.
The pandas module namespace is much smaller and not used as frequently, and
shortening it to pd has almost no impact on code verbosity. I never alias the
pandas module name, and while consistency across projects can be nice, it
seems odd to have a linter recommend something that is more a tradition
than a good practice. At least that's my opinion.


From lthomas at enthought.com  Fri Feb  3 07:52:46 2023
From: lthomas at enthought.com (lthomas at enthought.com)
Date: Fri, 03 Feb 2023 04:52:46 -0800 (PST)
Subject: [Pandas-dev] SciPy 2023 Call for Proposals
Message-ID: <63dd039e.050a0220.1818.30ab@mx.google.com>

SciPy 2023 Call for Proposals

From jorisvandenbossche at gmail.com  Wed Feb  8 10:04:23 2023
From: jorisvandenbossche at gmail.com (Joris Van den Bossche)
Date: Wed, 8 Feb 2023 16:04:23 +0100
Subject: [Pandas-dev] EA Naming Conventions
In-Reply-To: <CAKf8g9Q+=M8mM2C4Gyc10rhc=7FZnQ578mMm9YQ0Wy2xRWskbQ@mail.gmail.com>
References: <CAKf8g9Q+=M8mM2C4Gyc10rhc=7FZnQ578mMm9YQ0Wy2xRWskbQ@mail.gmail.com>
Message-ID: <CALQtMBYQZokbitJp0f7Pyt4XwX9L8f0otJD7_21PG6KrLyvWew@mail.gmail.com>

On Thu, 26 Jan 2023 at 23:30, Brock Mendel <jbrockmendel at gmail.com> wrote:

> For historical reasons we've built up an EA namespace without much
> internal logic in terms of what is public/private.  While this isn't _that_
> big of a deal, it'd be nice to make this more coherent.  I see two useful
> options:
>

In my opinion (and recollection), at the start when ExtensionArrays were
introduced, the rule was quite clear: *everything* on the base class is
considered as public for developers (EA implementors can (or need to)
override those), and then whether the actual name is public vs private
(i.e. leading underscore or not) depends on whether it should be public for
end users (not implementors).

And we use documentation / comments to indicate to developers (EA
implementors) which parts are required to implement or are optional to
implement.


>
> 1) Use the traditional "an underscore means this should only be called
> from within self".  Very few methods on the base class satisfy that
> characteristic, including the constructor _from_sequence.  One benefit of
> moving to this is it would make "official" that we shouldn't be using
> _values_for_foo from outside EA methods.
>

We don't want to make all those "private" functions for EAs to implement
public to end-users, so I don't think this is an option.
Also, there *are* some valid cases to call the _values_for_.. methods
outside of other EA methods, so this is not a general rule.


> 2) Use underscores to signal to 3rd party authors whether or not there
> exists a working (not necessarily performant) implementation on the base
> class.  In this scenario authors would _have_ to implement private methods,
> while implementing public methods would be optional.
>

That would make some of the currently private (and not useful for
end-users) methods public, and some public methods private (if we do that
for existing methods, and not as a rule for new methods). But what is the
main goal you want to achieve here? That it is clearer for EA implementors
what they need to implement? (currently we use AbstractMethodError for that
which seems already clear to me, and we have base tests that you can
inherit that should cover those basic things you need to implement)
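
A minimal sketch of the current convention, for reference. IncompleteArray is a hypothetical subclass used only for illustration; the rest is the public ExtensionArray base class and pandas.errors.AbstractMethodError.

```python
from pandas.api.extensions import ExtensionArray
from pandas.errors import AbstractMethodError


class IncompleteArray(ExtensionArray):
    """Hypothetical EA subclass that overrides nothing (illustration only)."""


# Required hooks on the base class raise AbstractMethodError until the
# implementor overrides them, even when the name starts with an underscore.
try:
    IncompleteArray._from_sequence([1, 2, 3])
    constructor_is_abstract = False
except AbstractMethodError:
    constructor_is_abstract = True

# Underscore-prefixed hooks that implementors may override but that are
# not meant for end users.
hooks = sorted(
    name for name in dir(ExtensionArray) if name.startswith("_values_for_")
)
```

Here the underscore says "not for end users", while the AbstractMethodError is what tells an implementor that the method is required.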

Joris


> Thoughts?

From jorisvandenbossche at gmail.com  Wed Feb  8 11:40:45 2023
From: jorisvandenbossche at gmail.com (Joris Van den Bossche)
Date: Wed, 8 Feb 2023 17:40:45 +0100
Subject: [Pandas-dev] February 2023 bi-monthly community meeting (Wednesday
 February 8, UTC 18:00)
Message-ID: <CALQtMBYT-ZOUsP5GKxuPDJXaX_g_KwP7joN3Pv-hOYxPcBAFqg@mail.gmail.com>

Hi all,

A late reminder that the next bi-monthly (twice a month) dev call is today
in a bit more than 1 hour (Wednesday, February 8) at 18:00 UTC. Our
calendar is at
https://pandas.pydata.org/docs/development/meeting.html#calendar to check
your local time.

The pandas Community Meeting is a regular sync meeting for the project's
maintainers which is open to the community. All are welcome to attend!

Video Call:
https://us06web.zoom.us/j/84484803210?pwd=TjUxNmcyNHcvcG9SNGJvbE53Y21GZz09
Meeting notes:
https://docs.google.com/document/u/1/d/1tGbTiYORHiSPgVMXawiweGJlBw5dOkVJLY-licoBmBU/edit?ouid=102771015311436394588&usp=docs_home&ths=true

Joris

From erotemic at gmail.com  Sat Feb 11 14:42:11 2023
From: erotemic at gmail.com (Jonathan Crall)
Date: Sat, 11 Feb 2023 14:42:11 -0500
Subject: [Pandas-dev] DataFrame.pivot positional deprecations
Message-ID: <CAAj5Zt=vrj7GoHZdu_mNbKQYcydzE1+OkPdZf2d4xbEWqQ_PsA@mail.gmail.com>

Hi all,

Please tell me if there is a better place to raise this, but I'm seeing a
lot of:

FutureWarning: In a future version of pandas all arguments of
DataFrame.pivot will be keyword-only.

I'm wondering: what is the rationale behind removing the positional
arguments here? They seem perfectly natural to me. I'd like to put my two
cents in and suggest that maybe moving to keyword only is not the best idea
in this case because pivot is very useful in interactive sessions, but
keyword only args will make it more cumbersome to type and access.

If there is a very good reason for removing positional arguments I'm open
to updating my code, but I'd like to see what that rationale and discussion
was. If the rationale does not have a solid foundation then my suggestion
is perhaps this change should be removed from the roadmap.
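
For concreteness, here is a sketch of the keyword form that the warning asks for, plus a small positional wrapper for interactive sessions. The data and the helper name `pv` are made up for illustration; `pv` is not part of pandas.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "date": ["2023-01-01", "2023-01-01", "2023-01-02", "2023-01-02"],
        "variable": ["a", "b", "a", "b"],
        "value": [1, 2, 3, 4],
    }
)

# Keyword form: does not trigger the keyword-only FutureWarning.
wide = df.pivot(index="date", columns="variable", values="value")


def pv(frame, index, columns, values):
    """Hypothetical shorthand that forwards positional args as keywords."""
    return frame.pivot(index=index, columns=columns, values=values)


wide_short = pv(df, "date", "variable", "value")
```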


-- 
-Dr. Jon Crall (him)

From garcia.marc at gmail.com  Tue Feb 21 16:55:36 2023
From: garcia.marc at gmail.com (Marc Garcia)
Date: Tue, 21 Feb 2023 22:55:36 +0100
Subject: [Pandas-dev] ANN: pandas 2.0.0 RC0
Message-ID: <CAEk5N5u7yasacCvMV=3pANy27iXwcXnz3EuTLgvqntN9s1uB=g@mail.gmail.com>

We are happy to announce the *release candidate* of pandas 2.0.0.

It can be installed from our conda-forge and PyPI packages via mamba, conda
and pip, for example:

mamba install -c conda-forge/label/pandas_rc pandas==2.0.0rc0
python -m pip install --upgrade --pre pandas==2.0.0rc0

Users having pandas code in production and maintainers of libraries with
pandas as a dependency are *strongly* recommended to run their test suites
with the release candidate, and report any breaking change to our issue
tracker <https://github.com/pandas-dev/pandas/issues/new/choose> before the
official 2.0.0 release.

You can find the documentation of pandas 2.0.0 here
<https://pandas.pydata.org/pandas-docs/version/2.0/index.html>, and the
list of changes in 2.0.0, in the release notes page
<https://pandas.pydata.org/pandas-docs/version/2.0/whatsnew/v2.0.0.html>.

We expect to release the final version of pandas 2.0.0 in around two weeks,
but the final date will depend on the issues reported to the release
candidate.

From suryasriram2950 at gmail.com  Wed Feb 15 00:55:32 2023
From: suryasriram2950 at gmail.com (Surya Sriram)
Date: Wed, 15 Feb 2023 05:55:32 -0000
Subject: [Pandas-dev] Unable to install pandas on my work computer
Message-ID: <CAEtsEfSoKXYCy85fbMXpZOEC7UXDKbqarnSCNm_PvJykBL-pKA@mail.gmail.com>

Hi,

I was trying to install the pandas Python package from PyCharm on my work
computer, but I'm unable to install it. I'm attaching the error message
below. I don't have administrator rights on my computer, and I can't open CMD
on it either. Is there any other way I could install packages?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: IMG_20230215_112222086~2.jpg
Type: image/jpeg
Size: 2608897 bytes
Desc: not available
URL: <https://mail.python.org/pipermail/pandas-dev/attachments/20230215/ac44a82d/attachment-0001.jpg>