[Tutor] Building dictionary from large txt file

Dennis Lee Bieber wlfraed at ix.netcom.com
Tue Jul 26 23:11:02 EDT 2022


On Tue, 26 Jul 2022 22:58:06 +0200, bobx ander <bobxander87 at gmail.com>
declaimed the following:

>
>Atomic Number = 1
>    Atomic Symbol = H
>    Mass Number = 1
>    Relative Atomic Mass = 1.00782503223(9)
>    Isotopic Composition = 0.999885(70)
>    Standard Atomic Weight = [1.00784,1.00811]
>    Notes = m
>--------
>
>My goal is to extract the content into a dictionary that displays each
>unique triplet as indicated below
>{'H1': {'Z': 1,'A': 1,'m': 1.00782503223},
>              'D2': {'Z': 1,'A': 2,'m': 2.01410177812}
>               ...} etc

	First thing I'd want to know is how each entry in your source data MAPS
to each item in your desired dictionary.

>My code that I have attempted is as follows:
>
>filename='ex.txt'
>
>afile=open(filename,'r') #opens the file
>content=afile.readlines()
>afile.close()

	I'd probably run a loop inside the open/close section, collecting the
items for ONE entry. I presume "Atomic Number" starts each entry. Then,
when the next "Atomic Number" line is reached you process the collected
lines to make your dictionary entry.
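
	A rough sketch of that collect-then-process loop, assuming every entry
really does start with an "Atomic Number" line. The name iter_entries and
the in-memory sample are made up for illustration; with the real data you
would pass the open file object instead.

```python
import io

def iter_entries(lines):
    """Yield one list of stripped, non-empty lines per entry.

    Assumes each entry begins with an "Atomic Number" line.
    """
    entry = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank separator lines
        if line.startswith("Atomic Number") and entry:
            yield entry  # the previous entry is complete
            entry = []
        entry.append(line)
    if entry:
        yield entry  # the last entry has no following "Atomic Number" line

# Stand-in for the real file; with real data: with open("ex.txt") as afile: ...
sample = io.StringIO("""\
Atomic Number = 1
    Atomic Symbol = H
    Mass Number = 1

Atomic Number = 1
    Atomic Symbol = D
    Mass Number = 2
""")

entries = list(iter_entries(sample))
print(len(entries))
```

Iterating over the file object directly also avoids holding the whole file
in memory the way readlines() does.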

>isotope_data={'Z':0,'A':0,'m':0}#start to create subdictionary for
>each case of atoms with its unique keys and values

	Usually not needed as addressing a key to add a value doesn't need
predefined keys or values. The only reason to initialize is if you expect
to have blocks that DON'T define all key/value pairs.

>for line in content:
>    data=line.strip().split()
>

	Drop the .split() at this level, IF you don't mind some loss in
processing speed, so that data stays a string and the tests below can look
at the whole line...

>    if len(data)<1:

	if not data: #empty string
		pass

see:
>>> str1 = "Atomic Number = 1"
>>> str2 = " "
>>> bool(str1)
True
>>> bool(str2)
True
>>> bool(str1.strip())
True
>>> bool(str2.strip())		<<<<
False
>>> 


>        pass
>    elif data[0]=="Atomic" and data[1]=="Number":
>        atomic_number=data[3]
>

	elif data.startswith("Atomic Number"):
		atomic_number = data.split()[-1]


>
>     elif data[0]=="Mass" and data[1]=="Number":
>        mass_number=data[3]
>
>
>
>    elif data[0]=="Relative" and data[1]=="Atomic" and data[2]=="Mass":
>        relative_atomic_mass=data[4]
>

	Ditto for all those.
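
	One possible way to avoid repeating that test for every field is to
split each line on the "=" once. This is only a sketch: parse_line and
to_float are hypothetical helper names, and stripping the "(9)" uncertainty
is my guess at what you want, based on the values in your desired output.

```python
def parse_line(line):
    """Split 'Name = value' into a (name, value) pair of stripped strings."""
    name, _, value = line.strip().partition("=")
    return name.strip(), value.strip()

def to_float(text):
    """Drop a trailing uncertainty such as '(9)' before converting."""
    return float(text.split("(")[0])

name, value = parse_line("    Relative Atomic Mass = 1.00782503223(9)")
print(name, to_float(value))
```
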
>
>isotope_data['Z']=atomic_number
>isotope_data['A']=mass_number
>isotope_data['A']=relative_atomic_mass

	This REPLACES any previous value of the key "A". To store multiple
values for a single key you need to put the values into a list. Presuming
you will always have both "mass_number" and "relative_atomic_mass":

	isotope_data["A"] = [mass_number, relative_atomic_mass]


	You don't show the outer dictionary in the example (the same list
concern may apply); you may need to do something like

	dict["key"] = []

	if term_1:
		dict["key"].append(term_1_value)
	if term_2:
		dict["key"].append(term_2_value)

etc.
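
	If the layout at the top of your post is really what you are after, an
alternative to the list is one distinct key per value ("Z", "A", "m") and a
fresh inner dictionary for every entry. A minimal sketch, ASSUMING the outer
key is the atomic symbol followed by the mass number (which is how I read
'H1' and 'D2'); the parsed list below is hand-written sample input, not
output from your file.

```python
isotopes = {}  # outer dictionary, keyed by symbol + mass number (assumed)

# Field values as they might look after parsing two entries (illustrative)
parsed = [
    {"Atomic Number": "1", "Atomic Symbol": "H", "Mass Number": "1",
     "Relative Atomic Mass": "1.00782503223(9)"},
    {"Atomic Number": "1", "Atomic Symbol": "D", "Mass Number": "2",
     "Relative Atomic Mass": "2.01410177812"},
]

for fields in parsed:
    key = fields["Atomic Symbol"] + fields["Mass Number"]
    # A new inner dict per entry, so entries never share or overwrite data
    isotopes[key] = {
        "Z": int(fields["Atomic Number"]),
        "A": int(fields["Mass Number"]),
        # drop any "(...)" uncertainty before converting
        "m": float(fields["Relative Atomic Mass"].split("(")[0]),
    }

print(isotopes)
```

Creating the inner dictionary inside the loop matters: reusing one
isotope_data object for every entry would leave all the outer keys pointing
at the same (last) set of values.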



-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
	wlfraed at ix.netcom.com    http://wlfraed.microdiversity.freeddns.org/


