Update a specific element in all a list of N lists

Friedrich Rentsch anthra.norell at bluewin.ch
Sun Dec 19 07:41:34 EST 2021



On 12/16/21 3:00 PM, hanan lamaazi wrote:
> Dear All,
>
> I really need your assistance,
>
> I have a dataset with 1005000 rows and 25 columns,
>
> The main columns that I repeatedly use are Time, ID, and Reputation
>
> First I sliced the data based on the time, and I appended the slices to
> a list called "df_list". So I get 201 lists with 25 columns
>
> The main code starts here:
>
> for elem in df_list:
>
> {do something.....}
>
> {Here I'm trying to calculate the outliers}
>
> Out.append(outliers)
>
> Now my problem is that I need to locate those outliers in the df_list and
> then update another column, which is the "Reputation"
>
> Note that there are duplicated IDs, but at different time slots
>
> For example, if ID = 1 is an outlier, I need to select all ID = 1 in the
> list and update their reputation column
>
> I tried those solutions:
> 1)
>
> grp = data11.groupby(['ID'])
> for i in GlobalNotOutliers.ID:
>     data11.loc[grp.get_group(i).index, 'Reput'] += 1
>
> for j in GlobalOutliers.ID:
>     data11.loc[grp.get_group(j).index, 'Reput'] -= 1
>
>
> It works for a dataframe but not for a list
>
> 2)
>
> for elem in df_list:
>
> elem.loc[elem['ID'].isin(Outlier['ID'])]
>
>
> It doesn't select the right IDs, it gives the whole values in elem
>
> 3) Here I set the index using IDs:
>
> for i in Outlier.index:
>      for elem in df_list:
>          print(elem.Reput)
>          if i in elem.index:
> #             elem.loc[elem[i] , 'Reput'] += 1
>              m = elem.iloc[i, :]
>              print(m)
>
>
> It gives this error:
>
> IndexError: single positional indexer is out-of-bounds
>
>
> I'm greatly thankful to anyone who can help me,

I'd suggest grouping your records by date in a dict whose key is the
date. As you collect each record into its group, store alongside it the
index of that record in the original list. Then go through all your
groups, record by record, looking for outliers. The index stored with
each record identifies the record in the original list that you want to
update. Something like this:

     # DATE_INDEX, has_outlier and update_reputation are placeholders
     # for your own column position and functions.
     dictionary = {}
     for i, record in enumerate(original_list):
         date = record[DATE_INDEX]
         # Keep the original list index alongside each record.
         dictionary.setdefault(date, []).append((record, i))

     reputation_indexes = set()
     for date, records in dictionary.items():
         for record, i in records:
             if has_outlier(record):
                 reputation_indexes.add(i)

     for i in reputation_indexes:
         update_reputation(original_list[i])
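
Since the question asks for every record sharing an outlier's ID to be updated, whatever its time slot, a variation on the grouping idea is to collect the flagged IDs first and then make one pass over all records. What follows is only a minimal runnable sketch under assumptions: the record layout (time, id, value, reputation), the `is_outlier` rule, and the function name `update_by_outlier_id` are all hypothetical stand-ins, and the +1/-1 adjustment mirrors the first attempt quoted in the question.

```python
from collections import defaultdict

# Hypothetical record layout: [time, id, value, reputation]
TIME, ID, VALUE, REPUTATION = range(4)


def is_outlier(record, group):
    """Placeholder rule (an assumption): a value above twice the
    group mean counts as an outlier. Substitute your own test."""
    mean = sum(r[VALUE] for r in group) / len(group)
    return record[VALUE] > 2 * mean


def update_by_outlier_id(records):
    """Flag outlier IDs per time slot, then adjust the reputation of
    every record in place. Returns the set of flagged IDs."""
    # Group the records by time slot.
    groups = defaultdict(list)
    for record in records:
        groups[record[TIME]].append(record)

    # Collect the IDs flagged as outliers in any slot.
    outlier_ids = set()
    for group in groups.values():
        for record in group:
            if is_outlier(record, group):
                outlier_ids.add(record[ID])

    # Update every record carrying a flagged ID, whatever its slot;
    # all other records gain one point.
    for record in records:
        record[REPUTATION] += -1 if record[ID] in outlier_ids else 1
    return outlier_ids


# Made-up sample data: ID 1 is an outlier in slot 0 only.
records = [
    [0, 1, 100, 10],
    [0, 2, 10, 10],
    [0, 3, 10, 10],
    [1, 1, 10, 10],
    [1, 2, 12, 10],
]
flagged = update_by_outlier_id(records)
```

With this sample data, ID 1 is flagged in the first slot, so both of its records lose a point, including the one in the second slot where it was not an outlier.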

Frederic




