[Tutor] Script to search in string of values from file A in file B

aduarte aduarte at itqb.unl.pt
Wed May 9 16:26:51 CEST 2012


On 2012-05-09 15:22, BRAGA, Bruno wrote:
> On Thursday, May 10, 2012, Afonso Duarte <aduarte at itqb.unl.pt>
> wrote:
>> Dear All,
>>
>> I’m new to Python and started to use it to search text strings in
>> big (>500 MB) txt files.
>>
>> I have a list in a text file (e.g. A.txt) that I want to use as a key
>> to search another file (e.g. B.txt), organized in the following way:
>>
>> A.txt:
>>
>> Aaa
>> Bbb
>> Ccc
>> Ddd
>> .
>> .
>> .
>>
>> B.txt:
>>
>> Bbb
>> 1234
>> Xxx
>> 234
>>
>> I want to use A.txt to search in B.txt and have as output the
>> original search entry (e.g. Bbb) followed by the line that follows it
>> in B.txt (e.g. Bbb / 1234).
>>
>> I wrote the following script:
>>
>> object = open('B.txt', 'r')
>> lista = open('A.txt', 'r')
>> searches = lista.readlines()
>>
>> for line in object.readlines():
>>     for word in searches:
>>         if word in line:
>>             print line+'\n'
>>
>> But from here I only get the search entry and not the line
>> afterwards. I tried to google it but I got lost and didn’t manage to
>> do it.
>>
>> Any ideas? I guess that this is basic scripting but I just started.
>

> Not sure I understood the question... But:
> - are you trying to "grep" the text file? (simpler than programming
> in python, IMO)

> - if you have multiple matches of any of the keys from file A in a
> single line of file B, the script above will print it multiple times

True, I did not mention it, but the entries in A.txt only appear once
in B.txt.
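
Should a key ever turn up more than once on the same line, the repeated printing mentioned above could probably be avoided along these lines. This is only a sketch: `matching_lines` and the sample data are made-up names for illustration, not part of the original script, and it is written so that it should run on Python 3 as well.

```python
def matching_lines(keys, lines):
    """Yield each line that contains at least one key, exactly once."""
    # Strip the trailing newlines that readlines() keeps, and skip empty keys.
    words = [k.strip() for k in keys if k.strip()]
    for line in lines:
        # any() stops at the first key found, so a line is reported only once
        # even when several keys match it.
        if any(word in line for word in words):
            yield line.rstrip('\n')


# Hypothetical sample data: both keys are substrings of the line 'Bbb'.
sample_keys = ['Bbb\n', 'bb\n']
sample_lines = ['Bbb\n', '1234\n']
result = list(matching_lines(sample_keys, sample_lines))
print(result)
```

With the nested loops of the original script, the line `Bbb` would be printed once per matching key; here it is reported a single time.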


>  - you need not add new line (\n) in the print statement, unless you
> want it to print a blank line between results

true
>
> Based on the example you gave, the matching Bbb value in B and A are
> the same, so actually line is being printed, but it is just the same
> as word...


Exactly! But what I want is that plus the value that follows that line
in B.txt, i.e.

Bbb
1234
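
For illustration, the behaviour described here could be sketched roughly as follows. It rests on a few assumptions that go beyond the original script: each key occupies a whole line of B.txt (as in the example), so lines are compared exactly after stripping whitespace instead of with `word in line`; the line after a match is consumed as its value and is not itself checked against the keys; and `match_with_next_line` and the sample lists are hypothetical names.

```python
def match_with_next_line(keys, lines):
    """Yield (entry, following_line) for every line of `lines` that is a key."""
    # A set gives fast membership tests; strip() removes the trailing newline.
    wanted = set(k.strip() for k in keys if k.strip())
    it = iter(lines)
    for line in it:
        entry = line.strip()
        if entry in wanted:
            # Pull the line right after the match from the same iterator;
            # the default '' covers a match on the very last line.
            following = next(it, '')
            yield entry, following.strip()


# Sample data mirroring the A.txt / B.txt example from the question.
a_lines = ['Aaa\n', 'Bbb\n', 'Ccc\n', 'Ddd\n']
b_lines = ['Bbb\n', '1234\n', 'Xxx\n', '234\n']
pairs = list(match_with_next_line(a_lines, b_lines))
for key, value in pairs:
    print(key + ' / ' + value)  # prints: Bbb / 1234
```

Open file objects can be passed in place of the lists, e.g. `match_with_next_line(open('A.txt'), open('B.txt'))`. Iterating over the file object directly, instead of calling `readlines()`, reads B.txt one line at a time, which matters for files of several hundred megabytes.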

Best

Afonso
>
>> Best
>>
>> Afonso

