[Tutor] Script to search in string of values from file A in file B
aduarte
aduarte at itqb.unl.pt
Wed May 9 16:26:51 CEST 2012
On 2012-05-09 15:22, BRAGA, Bruno wrote:
> On Thursday, May 10, 2012, Afonso Duarte <aduarte at itqb.unl.pt>
> wrote:
>> Dear All,
>>
>>
>>
>> I’m new to Python and started to use it to search for text strings in
>> big (>500 MB) txt files.
>>
>> I have a list in a text file (e.g. A.txt) that I want to use as keys
>> to search another file (e.g. B.txt), organized in the following way:
>>
>>
>>
>> A.txt:
>>
>>
>>
>> Aaa
>>
>> Bbb
>>
>> Ccc
>>
>> Ddd
>>
>> .
>>
>> .
>>
>> .
>>
>>
>>
>> B.txt
>>
>>
>>
>> Bbb
>>
>> 1234
>>
>> Xxx
>>
>> 234
>>
>>
>>
>>
>>
>> I want to use A.txt to search in B.txt and have as output the
>> original search entry (e.g. Bbb) followed by the line that comes after
>> it in B.txt (e.g. Bbb / 1234).
>>
>> I wrote the following script:
>>
>>
>>
>>
>>
>> # Python 2 syntax (print statement)
>> object = open('B.txt', 'r')   # file to search
>> lista = open('A.txt', 'r')    # file holding the search keys
>> searches = lista.readlines()
>> for line in object.readlines():
>>     for word in searches:
>>         if word in line:
>>             print line+'\n'
>>
>>
>>
>>
>>
>>
>>
>> But from here I only get the search entry and not the line
>> afterwards. I tried to google it but I got lost and didn’t manage to
>> do it.
>>
>> Any ideas? I guess that this is basic scripting but I just started.
>
> Not sure I understood the question... But:
> - are you trying to "grep" the text file? (simpler than programming
> in python, IMO)
> - if you have multiple matches of any of the keys from the A file in a
> single line of the B file, the script above will print it multiple times
True, I did not mention it, but the entries in file A.txt only appear once
in B.txt.
> - you need not add a newline (\n) in the print statement, unless you
> want it to print a blank line between results
True.
>
> Based on the example you gave, the matching Bbb value in B and A is
> the same, so line is actually being printed, but it is just the same
> as word...
Exactly! But what I want is that plus the value that follows that line
in B.txt, i.e.
Bbb
1234
Best
Afonso
>
>>
>>
>>
>> Best
>>
>>
>>
>> Afonso
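
A minimal sketch of one way to print each matching entry together with the
line that follows it in B.txt. It rests on a few assumptions that are not
stated in the thread: each key in A.txt matches a whole line of B.txt once
surrounding whitespace is stripped (the script above uses a substring test
instead), blank lines carry no meaning and are skipped, and the line after a
match is a value rather than another key. The names find_following_lines and
print_matches are made up for illustration, and the code uses the Python 3
print() function rather than the Python 2 print statement quoted above.

```python
def find_following_lines(keys, lines):
    """Yield (entry, following_line) for every line that equals a key.

    `keys` and `lines` are any iterables of strings, e.g. open file objects.
    """
    # Keys are small enough to hold in memory; a set gives fast lookups.
    wanted = set(key.strip() for key in keys if key.strip())

    # Strip whitespace and drop blank lines lazily, so a large file is
    # streamed instead of being loaded all at once with readlines().
    stripped = (line.strip() for line in lines)
    entries = (entry for entry in stripped if entry)

    for entry in entries:
        if entry in wanted:
            # Pull the next non-blank line from the same iterator;
            # fall back to '' if the match was the last line.
            yield entry, next(entries, '')


def print_matches(keys_path, data_path):
    """Print each matching entry followed by the line after it."""
    with open(keys_path) as keys_file, open(data_path) as data_file:
        for entry, following in find_following_lines(keys_file, data_file):
            print(entry)
            print(following)
```

Calling print_matches('A.txt', 'B.txt') on the example files would print Bbb
and then 1234. Because the following line is consumed together with its
match, it is never itself tested against the keys.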