Extract lines from file, add to new files

avi.e.gross at gmail.com avi.e.gross at gmail.com
Sat Feb 3 14:43:48 EST 2024


We substantially agree with that, Thomas.  In the best of all possible
worlds, someone who gets stuck will sit down and try to carefully spell out
things in ways like you mention and, incidentally, may often catch the error
or figure out how to do it and not even send in a request! LOL!

I think a main thing that the OP and others can do is to not just be
abstract but supply a small example, including what output they expect,
what they got instead, and any error messages.

Rich had not yet tried doing what he wanted in python. I don't know if he
had done any other parts yet. I think he was still in a somewhat abstract
design stage, hoping someone would point him at a resource so he could
continue, and he did accept suggestions on what now seem somewhat random
things to read as people guessed.

In a sense, I and others can take some blame for the way we widened the
problem while trying to look at it our own way.

But looking back, I think Rich asked close to what he wanted. An example
might have helped such as:

'I have information about multiple clients including an email address and a
human name such as:

user at domain.com Sally
fido at bark.com Barky

I want to save the info in a file or maybe two in such a way that when I
write a program and ask it to send an email to "fido at bark.com" then it finds
an entry that matches the email address and then uses that to find the
matching name and sends a specified message with a specific salutation in
front like "Dear Barky,".

I was thinking of having two files with one having email address after
another and the other having the corresponding names. I want to use python
to search one file and then return the name in the other. Are there other
and perhaps better ways commonly used to associate one keyword with a
value?'

Back to me. The above is not polished but might still have engendered a
discussion such as how to keep the files synchronized. Yes, the OP could
have added endless clauses saying that they are not asking how to create the
data files or keep them synchronized but just how to match them. The reply
in this case possibly could have suggested they count the lines they have
read until the match and, assuming no blank lines are used, read a second
file till the nth line. We might have been done quickly and THEN had a long
discussion about other ways!
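
A minimal sketch of that count-the-lines idea, assuming two hypothetical
files, emails.txt and names.txt, that are kept line-for-line in step with no
blank lines. The function names are illustrative only and not from the thread:

```python
import tempfile
from pathlib import Path


def line_number_of(path, target):
    """Return the 1-based number of the first line equal to target, else None."""
    with open(path, encoding="utf-8") as f:
        for n, line in enumerate(f, start=1):
            if line.strip() == target:
                return n  # stop reading as soon as the match is found
    return None


def nth_line(path, n):
    """Return line n (1-based) of the file without its newline, else None."""
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f, start=1):
            if i == n:
                return line.rstrip("\n")
    return None


def lookup_name(email, emails_path, names_path):
    """Find email in the first file, then return the same-numbered name."""
    n = line_number_of(emails_path, email)
    return None if n is None else nth_line(names_path, n)


# Demo using the two sample records above, written to a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    emails_file = Path(tmp) / "emails.txt"
    names_file = Path(tmp) / "names.txt"
    emails_file.write_text("user@domain.com\nfido@bark.com\n", encoding="utf-8")
    names_file.write_text("Sally\nBarky\n", encoding="utf-8")
    name = lookup_name("fido@bark.com", emails_file, names_file)
    print(f"Dear {name},")  # prints: Dear Barky,
```

This only works while the two files stay synchronized, which is exactly the
weakness a single file with both fields per line would avoid.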

I have participated, like you, in another forum designed for tutoring and I
think the rules and expectations there may be a bit different. Over here, it
is fairer to expect people to take a bit of time and ask clearer questions. 

We live and we learn and then Alzheimer's ...



-----Original Message-----
From: Python-list <python-list-bounces+avi.e.gross=gmail.com at python.org> On
Behalf Of Thomas Passin via Python-list
Sent: Saturday, February 3, 2024 12:59 PM
To: python-list at python.org
Subject: Re: Extract lines from file, add to new files

In my view this whole thread became murky and complicated because the OP 
did not write down the requirements for the program.  Requirements are 
needed to communicate with other people.  An individual may not need to 
actually write down the requirements - depending on their complexity - 
but they always exist even if only vaguely in a person's mind.  The 
requirements may include what tools or languages the person wants to use 
and why.

If you are asking for help, you need to communicate the requirements to 
the people you are asking for help from.

The OP may have thought the original post(s) contained enough of the 
requirements but as we know by now, they didn't.

The person asking for help may not realize they don't know enough to 
write down all the requirements; an effort to do so may bring that lack 
to visibility.

Mailing lists like these have a drawback that it's hard to impossible 
for someone not involved in a thread to learn anything general from it. 
We can write over and over again to please state clearly what you want 
to do and where the sticking points are, but newcomers post new 
questions without ever reading these pleas.  Then good-hearted people 
who want to be helpful end up spending a lot of time trying to guess 
what is actually being asked for, and maybe never find out with enough 
clarity.  Others take a guess and then spend time working up a solution 
that may or may not be on target.

So please! before posting a request for help, write down the 
requirements as best you can figure them out, and then make sure that 
they are expressed such that the readers can understand.

On 2/3/2024 11:33 AM, avi.e.gross at gmail.com wrote:
> Thomas,
> 
> I have been thinking about the concept of being stingy with information as
> this is a fairly common occurrence when people ask for help. They often ask
> for what they think they want while people like us keep asking why they want
> that and perhaps offer guidance on how to get closer to what they NEED or a
> better way.
> 
> In retrospect, Rich did give all the info he thought he needed. It boiled
> down to saying that he wants to distribute data into two files in such a way
> that finding an item in file A then lets him find the corresponding item in
> file B. He was not worried about how to make the files or what to do with
> the info afterward. He had those covered and was missing what he considered
> a central piece. And, it seems he programs in multiple languages and
> environments as needed and is not exactly a newbie. He just wanted a way to
> implement his overall design.
> 
> We threw many solutions and ideas at him but some of us (like me) also got
> frustrated as some ideas were not received due to one objection or another
> that had not been mentioned earlier when it was not seen as important.
> 
> I particularly notice a disconnect some of us had. Was this supposed to be a
> search that read only as much as needed to find something and stopped
> reading, or a sort of filter that returned zero or more matches and went to
> the end, or perhaps something that read entire files and swallowed them into
> data structures in memory and then searched and found corresponding entries,
> or maybe something else?
> 
> All the above approaches could work but some designs not so much. For
> example, some files are too large. We, as programmers, often consciously or
> unconsciously look at many factors to try to zoom in on what approaches we
> might use. To be given minimal amounts of info can be frustrating. We worry
> about making a silly design. But the OP may want something minimal and not
> worry as long as it is fairly easy to program and works.
> 
> We could have suggested something very simple like:
> 
> Open both files A and B
> In a loop get a line from each. If the line from A is a match, do something
> with the current line from B.
> If you are getting only one, exit the loop.
> 
> Or, if willing, we could have suggested any other file format, such as a
> CSV, in which the algorithm is similar but different as in:
> 
> Open file A
> Read a line in a loop
> Split it in parts
> If the party of the first part matches something, use the party of the
> second part
> 
> Or, of course, suggest they read the entire file into a list of lines or a
> data.frame and use some tools that search all of it and produce results.
> 
> I find I personally now often lean toward the latter approach but ages ago
> when memory and CPU were considerations and maybe garbage collection was not
> automatic, ...
> 
> 
> -----Original Message-----
> From: Python-list <python-list-bounces+avi.e.gross=gmail.com at python.org> On
> Behalf Of Thomas Passin via Python-list
> Sent: Wednesday, January 31, 2024 7:25 AM
> To: python-list at python.org
> Subject: Re: Extract lines from file, add to new files
> 
> On 1/30/2024 11:25 PM, avi.e.gross at gmail.com wrote:
>> Thomas, on some points we may see it differently.
> 
> I'm mostly going by what the OP originally asked for back on Jan 11.
> He's been too stingy with information since then to be worth spending
> much time on, IMHO.
> 
>> Some formats can be done simply but are maybe better done in somewhat
>> standard ways.
>>
>> Some of what the OP has is already tables in a database and that can
>> trivially be exported into a CSV file or other formats like your TSV file
>> and more. They can also import from there. As I mentioned, many spreadsheets
>> and all kinds of statistical programs tend to support some formats making it
>> quite flexible.
>>
>> Python has all kinds of functionality, such as in the pandas module, to read
>> in a CSV or write it out. And once you have the data structure in memory, all
>> kinds of queries and changes can be made fairly straightforwardly. As one
>> example, Rich has mentioned wanting finer control in selecting who gets some
>> version of the email based on concepts like market segmentation. He already
>> may have info like the STATE (as in Arizona) in his database. He might at
>> some point enlarge his schema so each entry is placed in one or more
>> categories and thus his CSV, once imported, can do the usual tasks of
>> selecting various rows and columns or doing joins or whatever.
>>
>> Mind you, another architecture could place quite a bit of work completely on
>> the back end and he could send SQL queries to the database from python and
>> get back his results into python which would then make the email messages
>> and pass them on to other functionality to deliver. This would remove any
>> need for files and just rely on the DB.
>>
>> There are, as usual, too many choices and not necessarily one best answer. Of
>> course if this was a major product that would be heavily used, sure, you
>> could tweak and optimize. As it is, Rich is getting a chance to improve his
>> python skills no matter which way he goes.
>>
>>
>>
>> -----Original Message-----
>> From: Python-list <python-list-bounces+avi.e.gross=gmail.com at python.org> On
>> Behalf Of Thomas Passin via Python-list
>> Sent: Tuesday, January 30, 2024 10:37 PM
>> To: python-list at python.org
>> Subject: Re: Extract lines from file, add to new files
>>
>> On 1/30/2024 12:21 PM, Rich Shepard via Python-list wrote:
>>> On Tue, 30 Jan 2024, Thomas Passin via Python-list wrote:
>>>
>>>> Fine, my toy example will still be applicable. But, you know, you haven't
>>>> told us enough to give you help. Do you want to replace text from values
>>>> in a file? That's been covered. Do you want to send the messages using
>>>> those libraries? You haven't said what you don't know how to do. Something
>>>> else? What is it that you want to do that you don't know how?
>>>
>>> Thomas,
>>>
>>> For 30 years I've used a bash script using mailx to send messages to a list
>>> of recipients. They have no salutation to personalize each one. Since I want
>>> to add that personalized salutation I decided to write a python script to
>>> replace the bash script.
>>>
>>> I have collected 11 docs explaining the smtplib and email modules and
>>> providing example scripts to apply them to send multiple individual messages
>>> with salutations and attachments.
>>
>> If I had a script that's been working for 30 years, I'd probably just
>> use Python to do the personalizing and let the rest of the bash script
>> do the rest, like it always has.  The Python program would pipe or send
>> the personalized messages to the rest of the bash program. Something in
>> that ballpark, anyway.
>>
>>> Today I'm going to be reading these. They each recommend using .csv input
>>> files for names and addresses. My first search is learning whether I can
>>> write a single .csv file such as:
>>> "name1","address1"
>>> "name2","address2"
>>> which I believe will work; and by inserting at the top of the message block
>>> Hi, {yourname}
>>> the name in the .csv file will replace the bracketed place holder
>>
>> If the file contents are going to be people's names and email addresses,
>> I would just tab separate them and split each line on the tab.  Names
>> aren't going to include tabs so that would be safe.  Email addresses
>> might theoretically include a tab inside a quoted name but that would be
>> extremely obscure and unlikely.  No need for CSV, it would just add
>> complexity.
>>
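>> data = f.readlines()  # f is assumed to be an already-open text file
>> for line in data:
>>     # Blank lines yield empty strings instead of failing to unpack
>>     name, addr = line.rstrip('\n').split('\t') if line.strip() else ('', '')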
>>
>>> Still much to learn and the batch of downloaded PDF files should educate
>>> me.
>>>
>>> Regards,
>>>
>>> Rich
>>
> 

-- 
https://mail.python.org/mailman/listinfo/python-list


