[Pandas-dev] Chained filtering with lazy evaluation ("where")

Phillip Cloud cpcloud at gmail.com
Thu Mar 22 11:24:11 EDT 2018


If you feel like being evil, you can use a so-called “frame hack” + a
context manager:

In [1]: import pandas as pd
   ...: import numpy as np
   ...: import sys
   ...:
   ...:
   ...: class ctx:
   ...:     def __init__(self, df):
   ...:         self.df = df
   ...:         current_frame = sys._getframe(0)
   ...:         self.locals = current_frame.f_back.f_locals
   ...:         self.existing_values = {
   ...:             k: self.locals[k] for k in df.columns
   ...:             if k in self.locals
   ...:         }
   ...:         self.new_values = {
   ...:             k for k in df.columns if k not in self.locals
   ...:         }
   ...:
   ...:     def __enter__(self):
   ...:         for k in self.df.columns:
   ...:             self.locals[k] = self.df[k]
   ...:         return
   ...:
   ...:     def __exit__(self, *exc):
   ...:         self.locals.update(self.existing_values)
   ...:         for k in self.new_values:
   ...:             del self.locals[k]
   ...:

In [2]: df = pd.DataFrame({'a': np.array([1, 2], dtype='float32')})

In [3]: try:
   ...:     a + 1
   ...: except NameError:
   ...:     print("'a' doesn't exist yet!")
   ...:
'a' doesn't exist yet!

In [4]: with ctx(df):
   ...:     print(df[a == 1])
   ...:
     a
0  1.0

In [5]: try:
   ...:     a + 1
   ...: except NameError:
   ...:     print("'a' doesn't exist yet!")
   ...:
'a' doesn't exist yet!
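
The session above works at the prompt because the caller's f_locals there is
the module namespace, which is an ordinary writable dict. Inside a function,
CPython decides at compile time whether a name is a local or a global, so
names injected through f_locals are generally not visible to the function
body. A minimal sketch of the same idea that takes the namespace explicitly
instead of inspecting frames; inject_names is a hypothetical helper and the
plain dict "columns" only stands in for the DataFrame's columns:

```python
import contextlib


@contextlib.contextmanager
def inject_names(namespace, values):
    """Temporarily bind each key of `values` as a name in `namespace`."""
    missing = object()  # sentinel: the name was not bound before entering
    saved = {k: namespace.get(k, missing) for k in values}
    namespace.update(values)
    try:
        yield
    finally:
        # Restore shadowed names and remove the ones we introduced.
        for k, old in saved.items():
            if old is missing:
                namespace.pop(k, None)
            else:
                namespace[k] = old


columns = {"a": [1.0, 2.0]}  # stand-in for df's columns (assumption)

with inject_names(globals(), columns):
    inside = [x == 1 for x in a]  # `a` resolves through the module globals

after = "a" in globals()  # the injected name is gone again
```

Passing globals() explicitly keeps the trick honest about which namespace it
mutates, at the cost of one extra argument.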

On Thu, Mar 22, 2018 at 10:35 AM Pietro Battiston <ml at pietrobattiston.it>
wrote:

> On Thu, 15/03/2018 at 15:10 -0400, Justin Lewis wrote:
> > I might be missing the point but can you use .pipe()?
>
> Indeed, this is something else I had not considered.
>
> However, I don't like it too much. Compare
>
> .loc[W]
>
> with
>
> .pipe(lambda df : df[df])
>
> By the way,
>
> .loc[lambda df : df[df]]
>
> is equivalent but cleaner to me (after all, we are selecting).
>
> This said, the solutions proposed by you and Chris are indeed more
> robust than mine. For instance,
>
> .loc[W + 1 > 2]
>
> works but
>
> .loc[2 < 1 + W]
>
> doesn't, and I don't even know if a fix is possible.
>
> Pietro
>
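
On the "2 < 1 + W" case: I have not seen how W is implemented, so this is
only a guess, but one plausible cause is that the placeholder lacks the
reflected operators Python falls back to when the left operand is a plain
int. A minimal sketch, assuming a hypothetical Deferred class (not Pietro's
W) that records an expression and applies it later:

```python
import operator


class Deferred:
    """Hypothetical placeholder that records an expression to apply later."""

    def __init__(self, func=lambda obj: obj):
        self._func = func

    def _binary(self, op, other, reflected=False):
        if reflected:
            return Deferred(lambda obj: op(other, self._func(obj)))
        return Deferred(lambda obj: op(self._func(obj), other))

    def __add__(self, other):
        return self._binary(operator.add, other)

    def __radd__(self, other):  # handles `1 + W`
        return self._binary(operator.add, other, reflected=True)

    def __gt__(self, other):  # also serves as the reflection of `2 < W`
        return self._binary(operator.gt, other)

    def __lt__(self, other):  # also serves as the reflection of `2 > W`
        return self._binary(operator.lt, other)

    def __call__(self, obj):
        return self._func(obj)


W = Deferred()
left = (W + 1 > 2)(5)   # evaluates 5 + 1 > 2
right = (2 < 1 + W)(5)  # int gives up on both operators, so Deferred handles them
```

Because the result is callable, it could in principle be handed to .loc,
which accepts a callable indexer; only two arithmetic and comparison
operators are covered here, so a real version would need the rest.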