Copying column values up based on other column values

Sun Jan 3 13:16:55 EST 2021

On Sunday, January 3, 2021 at 7:08:49 PM UTC+2, Jason Friedman wrote:
> > 
> > import numpy as np 
> > import pandas as pd 
> > from numpy.random import randn 
> > df=pd.DataFrame(randn(5,4),['A','B','C','D','E'],['W','X','Y','Z']) 
> > 
> >           W         X         Y         Z
> > A -0.183141 -0.398652  0.909746  0.332105
> > B -0.587611 -2.046930  1.446886  0.167606
> > C  1.142661 -0.861617 -0.180631  1.650463
> > D  1.174805 -0.957653  1.854577  0.335818
> > E -0.680611 -1.051793  1.448004 -0.490869
> > 
> > Is there a way to create a column S that copies column Y values up if the
> > values in column Y are above 1, and otherwise returns a new value above
> > zero? I made this manually:
> > 
> > S: 
> > 
> > A 1.446886 
> > B 1.446886 
> > C 1.854577 
> > D 1.854577 
> > E 1.448004 
> >
> Here's one solution. No consideration to performance.
> import numpy as np 
> import pandas as pd 
> from numpy.random import randn 
> df=pd.DataFrame(randn(5,4),['A','B','C','D','E'],['W','X','Y','Z'])
> print(df) 
> 
> y_series = df["Y"].copy()
> for i in range(len(y_series)):
>     if i == len(y_series) - 1:
>         # Last one, nothing to copy
>         break
>     # Use .iloc for positional access; the index labels are strings.
>     if y_series.iloc[i + 1] > 1:
>         y_series.iloc[i] = y_series.iloc[i + 1]
>
> df["Y"] = y_series
> print(df)

Thank you, Jason, for this lovely for loop. Is there a way to do this with a pandas Series or NumPy arrays, for maximum speed?
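
A possible loop-free version, offered as a sketch rather than a definitive answer. It hard-codes the Y values from the sample frame so the result is reproducible, and the column names S and S_loop are only illustrative. Two readings of the request are shown, because they differ on row D: the hand-made S column in the question is what you get by blanking out values that are not above 1 and back-filling, while the for loop copies the next row's value up whenever that next value is above 1.

```python
import pandas as pd

# Y values copied from the sample frame in the question.
df = pd.DataFrame(
    {"Y": [0.909746, 1.446886, -0.180631, 1.854577, 1.448004]},
    index=["A", "B", "C", "D", "E"],
)
y = df["Y"]

# Reading 1 (matches the hand-made S column): keep values above 1,
# turn the rest into NaN, then pull the next kept value upward.
df["S"] = y.where(y > 1).bfill()

# Reading 2 (matches the for loop): take the next row's value when it
# is above 1, otherwise keep the current row's value.
nxt = y.shift(-1)
df["S_loop"] = y.mask(nxt > 1, nxt)

print(df)
```

With reading 1, trailing rows that have no later value above 1 come out as NaN; if that is not wanted, something like .fillna(y) after the back-fill would keep the original value there instead.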