What is the best way to delete strings in a string list that match a certain pattern?

Sat Nov 7 11:13:52 EST 2009

On Fri, Nov 6, 2009 at 5:57 PM, Dave Angel <davea at ieee.org> wrote:
>
>
> Peng Yu wrote:
>>
>> On Fri, Nov 6, 2009 at 10:42 AM, Robert P. J. Day <rpjday at crashcourse.ca>
>> wrote:
>>
>>>
>>> On Fri, 6 Nov 2009, Peng Yu wrote:
>>>
>>>
>>>>
>>>> On Fri, Nov 6, 2009 at 3:05 AM, Diez B. Roggisch <deets at nospam.web.de>
>>>> wrote:
>>>>
>>>>>
>>>>> Peng Yu schrieb:
>>>>>
>>>>>>
>>>>>> Suppose I have a list of strings, A. I want to compute the list (call
>>>>>> it B) of strings that are elements of A but don't match a regex. I
>>>>>> could use a for loop to do so. In a functional language, there is a way
>>>>>> to do so without using the for loop.
>>>>>>
>>>>>
>>>>> Nonsense. For processing over each element, you have to loop over them,
>>>>> either with or without growing a call-stack at the same time.
>>>>>
>>>>> FP languages can optimize away the stack-frame growth (tail recursion),
>>>>> but this doesn't reduce complexity in any way.
>>>>>
>>>>> So use a loop, either directly, or using a list-comprehension.
>>>>>
>>>>
>>>> What is a list-comprehension?
>>>>
>>>> I tried the following code. The list 'l' will be ['a','b','c'] rather
>>>> than ['b','c'], which is what I want. It seems 'remove' will disrupt
>>>> the iterator, right? I am wondering how to make the code correct.
>>>>
>>>> l = ['a', 'a', 'b', 'c']
>>>> for x in l:
>>>>     if x == 'a':
>>>>         l.remove(x)
>>>>
>>>> print(l)
>>>>
>>>
>>>  list comprehension seems to be what you want:
>>>
>>>  l = [i for i in l if i != 'a']
>>>
>>
>> My problem comes from the context of using os.walk(). Please see the
>> description on the following webpage. Somehow I have to modify the
>> list in place. I have already tried 'dirs = [i for i in dirs if i != 'a']',
>> but it doesn't seem to "prune the search". So I need to modify the
>> list in place.
>>
>> http://docs.python.org/library/os.html
>>
>> When topdown is True, the caller can modify the dirnames list in-place
>> (perhaps using del or slice assignment), and walk() will only recurse
>> into the subdirectories whose names remain in dirnames; this can be
>> used to prune the search, impose a specific order of visiting, or even
>> to inform walk() about directories the caller creates or renames
>> before it resumes walk() again. Modifying dirnames when topdown is
>> False is ineffective, because in bottom-up mode the directories in
>> dirnames are generated before dirpath itself is generated.
>>
>>
>
> The context is quite important in this case.  The os.walk() iterator gives
> you a tuple of three values, and one of them is a list.  You do indeed want
> to modify that list, but you usually don't want to do it "in-place."   I'll
> show you the in-place version first, then show you the slice approach.
>
> If all you wanted to do was to remove one or two specific items from the
> list, then the remove method would be good.  So in your example, you don't
> need a loop.  Just say:
>    if 'a' in dirs:
>        dirs.remove('a')
>
> But if you have an expression you want to match each dir against, the list
> comprehension is the best answer.  And the trick to stuffing that new list
> into the original list object is to use slicing on the left side.  The [:]
> notation is a default slice that means the whole list.
>
>    dirs[:] = [item for item in dirs if bool_expression_on_item]
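
To see why the slice on the left side matters, here is a minimal sketch that compares slice assignment with plain rebinding. The helper name drop_matching and the variable names are made up for illustration; only the re module and ordinary list behaviour are assumed.

```python
import re

def drop_matching(names, pattern):
    """Remove, in place, every string in `names` that matches `pattern`."""
    regex = re.compile(pattern)
    # Slice assignment replaces the contents of the existing list object,
    # so every other reference to that list sees the change.
    names[:] = [name for name in names if not regex.search(name)]

dirs = ['a', 'a', 'b', 'c']
alias = dirs                    # second reference to the same list object
drop_matching(dirs, r'^a$')
print(dirs)                     # ['b', 'c']
print(alias is dirs)            # True

# Plain assignment only rebinds the name to a new list;
# the original object (still referenced by `other`) is untouched.
rebound = ['a', 'a', 'b', 'c']
other = rebound
rebound = [x for x in rebound if x != 'a']
print(other)                    # ['a', 'a', 'b', 'c']
```

This is presumably why the plain 'dirs = [...]' attempt did not prune anything: os.walk() keeps its own reference to the original list, and rebinding the name leaves that list unchanged.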

I suggest adding this example to the documentation of os.walk() to make
other users' lives easier.
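
For the archive, a rough sketch of how the slice assignment might be combined with a regex inside an os.walk() loop. The function name walk_pruned, the pattern, and the temporary directory layout are all hypothetical; the only library behaviour relied on is that os.walk() with topdown=True honours in-place changes to the dirnames list.

```python
import os
import re
import tempfile

def walk_pruned(top, skip_pattern):
    """Walk `top` top-down, skipping directories whose names match `skip_pattern`."""
    skip = re.compile(skip_pattern)
    for dirpath, dirnames, filenames in os.walk(top, topdown=True):
        # Modify the list that os.walk() handed out, so that it does not
        # recurse into the directories that were filtered away.
        dirnames[:] = [d for d in dirnames if not skip.search(d)]
        yield dirpath, filenames

# Build a small throwaway tree: keep/inner and .hidden/deep (names are made up).
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'keep', 'inner'))
os.makedirs(os.path.join(root, '.hidden', 'deep'))

# Skip every directory whose name starts with a dot.
visited = sorted(os.path.relpath(path, root)
                 for path, _ in walk_pruned(root, r'^\.'))
print(visited)
```

With this layout the walk should visit the root, keep, and keep/inner, and never enter .hidden or anything below it.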