Subsetting a dictionary

Emile van Sebille emile at fenx.com
Sun Mar 18 21:24:24 EST 2001


Rather than copying the whole dictionary and then deleting from the copy,
try building a new dictionary and adding only the items that qualify. That
way you never copy an entry you are about to throw away.

Something like this (untested):

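def subset(self, attr, value):
    # Copy semantics: return a new TrainingData, leave self untouched
    td = TrainingData()
    td.data = {}    # fresh dict, so the class-level default is not shared
    for ky, val in self.data.items():
        # keep only the examples whose attribute matches
        if ky[attr] == value:
            td.data[ky] = val
    return td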
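
For anyone on a Python recent enough to have dict comprehensions, the same
idea can be written more compactly. This is only a sketch: the TrainingData
constructor taking a dict, the tuple keys, and the sample values are all my
assumptions for illustration, not the original poster's actual data.

```python
# Hypothetical stand-in for the poster's class. Keys are assumed to be
# tuples that can be indexed by attribute position.
class TrainingData:
    def __init__(self, data=None):
        # Per-instance dict, so subsets never share storage by accident
        self.data = {} if data is None else data

    def subset(self, attr, value):
        # Build the result directly from the qualifying items;
        # nothing is copied and then deleted.
        kept = {ky: val for ky, val in self.data.items() if ky[attr] == value}
        return TrainingData(kept)


# Made-up examples, purely for illustration
examples = TrainingData({
    ("sunny", "hot"): "no",
    ("sunny", "mild"): "yes",
    ("rainy", "mild"): "yes",
})
sunny = examples.subset(0, "sunny")
print(sorted(sunny.data))
```

The original object is left as it was, which is the non-destructive
behaviour asked for. Whether it is fast enough for ID3 is something you
would have to time on your own data.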


"Will Newton" <will at nospam.misconception.org.uk> wrote in message
news:993dkm$ttr$1 at news6.svr.pol.co.uk...
>
> I need to get a subset of a dictionary:
>
> D = Dictionary of objects (in this case, training examples for a decision
> tree)
>
> D2 = {D | d in D, d under some constraint}
>
> I need to be able to do this, but as it is temporary I do not want to
> destructively remove examples, so:
>
> D.subset(c)
> will return a new dictionary object that is equivalent to D under
> constraint c.
>
> I have code that does this, but it's very slow, consuming typically 75% of
> program runtime - it is called many times, which seems to be unavoidable
> without modifying the algorithm (ID3) too far.
>
> I currently have:
>
> class TrainingData:
>         data = {}
>         def subset(self, attr, value):
>                 # Copy semantics
>
>                 td = TrainingData()
>
>                 td.data = self.data.copy()
>
>                 for item in td.data.keys():
>                          if item[attr] != value:
>                                    del td.data[item]
>
>                 return td
>
> Can anyone see how I can speed this up?
>
> I could conceivably do this with a list and use filter, I'll see how that
> works.
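
The list-and-filter idea could look roughly like the sketch below. It
assumes the examples are kept as a plain list of indexable items; the name
subset_keys and the sample tuples are hypothetical, and I have not measured
whether this beats the dictionary version.

```python
def subset_keys(keys, attr, value):
    # filter() keeps only the examples whose attribute matches;
    # list() is needed on Python 3, where filter() returns an iterator.
    return list(filter(lambda item: item[attr] == value, keys))


# Made-up examples, purely for illustration
examples = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild")]
mild = subset_keys(examples, 1, "mild")
print(mild)
```

The input list is not modified, so this is non-destructive as well.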




