print header for output

Cathy James nambo4jb at gmail.com
Sun Jun 19 00:57:03 EDT 2011


I managed to get output from my function; thank you very much for your
direction. I really appreciate the hints. I have now tried placing
the statement print("Length \t" + "Count\n") in different places in
my code so that the function prints the header only once, in
this manner:

Length  Count
4 7
8 1
12 2


Code so far:
import string

def fileProcess(filename = open('declaration.txt', 'r')):

    """Call the program with an argument,
    it should treat the argument as a filename,
    splitting it up into words, and computes the length of each word.
    print a table showing the word count for each of the word lengths
    that has been encountered."""

    freq = {} #empty dict to accumulate word count and word length
    print ("Length \t" + "Count\n")
    for line in filename:
        punc = string.punctuation + string.whitespace #use Python's built-in punctuation and whitespace
        for word in (line.replace (punc, "").lower().split()):
            if word in freq:
                freq[word] +=1 #increment current count if word already in dict

            else:
                freq[word] = 1 #first time this word is seen, start its count at 1
        #print ("Length \t" + "Count\n")#print header for all numbers.
        for word, count in freq.items():
            print(len(word), count)

fileProcess()
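
A minimal sketch of one arrangement that prints the header only once. It assumes the goal is one row per word length (not one row per word), and the names count_word_lengths and print_length_table, like the sample lines, are made up for illustration. The counting is kept apart from the printing, so the header sits outside every loop:

```python
import string

def count_word_lengths(lines):
    """Return a dict mapping word length -> number of words of that length."""
    freq = {}
    for line in lines:
        for raw in line.lower().split():
            word = raw.strip(string.punctuation)  # trim punctuation at the ends only
            if not word:
                continue  # the token was nothing but punctuation
            freq[len(word)] = freq.get(len(word), 0) + 1
    return freq

def print_length_table(freq):
    """Print the header once, then one row per word length."""
    print("Length\tCount")
    for length in sorted(freq):
        print("%d\t%d" % (length, freq[length]))

# Made-up sample lines; an open file object can be passed the same way.
sample = ["When in the Course of human events,", "it becomes necessary."]
print_length_table(count_word_lengths(sample))
```

Because the table is printed only after all lines have been counted, it also appears once instead of once per line of the file.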

On Sat, Jun 18, 2011 at 7:09 PM,  <python-list-request at python.org> wrote:
> Today's Topics:
>
>   1. Re: How do you copy files from one location to another?
>      (Terry Reedy)
>   2. Re: Strategy to Verify Python Program is POST'ing to a web
>      server. (Paul Rubin)
>   3. Re: Strategy to Verify Python Program is POST'ing to a web
>      server. (Terry Reedy)
>   4. Re: debugging https connections with urllib2? (Roy Smith)
>   5. Re: Improper creating of logger instances or a Memory Leak?
>      (Chris Torek)
>   6. Re: Strategy to Verify Python Program is POST'ing to a web
>      server. (Chris Angelico)
>   7. NEED HELP-process words in a text file (Cathy James)
>   8. Re: NEED HELP-process words in a text file (Chris Rebert)
>   9. Re: NEED HELP-process words in a text file (Tim Chase)
>
>
> ---------- Forwarded message ----------
> From: Terry Reedy <tjreedy at udel.edu>
> To: python-list at python.org
> Date: Sat, 18 Jun 2011 16:52:26 -0400
> Subject: Re: How do you copy files from one location to another?
> On 6/18/2011 1:13 PM, Michael Hrivnak wrote:
>>
>> Python is great for automating sysadmin tasks, but perhaps you should
>> just use rsync for this.  It comes with the benefit of only copying
>> the changes instead of every file every time.
>>
>> "rsync -a C:\source E:\destination" and you're done.
>
> Perhaps 'synctree' would be a candidate for addition to shutil.
>
> If copytree did not prohibit an existing directory as destination, it could be used for synching with an 'ignore' function.
>
> --
> Terry Jan Reedy
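
Later Python versions (3.8 and up) relaxed this restriction: shutil.copytree accepts dirs_exist_ok=True, so an existing destination is allowed and the 'ignore' function still applies. A rough sketch, with temporary directories and file names made up for the demo. Note that it copies every non-ignored file on each run, so it is not an incremental sync in the rsync sense:

```python
import os
import shutil
import tempfile

# Hypothetical source and destination trees, created just for this demo.
src = tempfile.mkdtemp()
dst = tempfile.mkdtemp()  # already exists, which older copytree rejected

with open(os.path.join(src, "keep.txt"), "w") as f:
    f.write("data\n")
with open(os.path.join(src, "skip.tmp"), "w") as f:
    f.write("scratch\n")

# dirs_exist_ok=True (Python 3.8+) permits an existing destination;
# ignore_patterns builds the 'ignore' function from glob patterns.
shutil.copytree(src, dst,
                ignore=shutil.ignore_patterns("*.tmp"),
                dirs_exist_ok=True)

copied = sorted(os.listdir(dst))
print(copied)
```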
>
>
>
>
> ---------- Forwarded message ----------
> From: Paul Rubin <no.email at nospam.invalid>
> To: python-list at python.org
> Date: Sat, 18 Jun 2011 14:03:19 -0700
> Subject: Re: Strategy to Verify Python Program is POST'ing to a web server.
> "mzagursk at gmail.com" <mzagursk at gmail.com> writes:
>> For example, if I create a website that tracks some sort of
>> statistical information and don't ensure that my program is the one
>> that is uploading it, the statistics can be thrown off by people
>> entering false POST data onto the data upload page.  Any remedy?
>
> If you're concerned about unauthorized users posting random crap, the
> obvious solution is configure your web server to put password protection
> on the page.
>
> If you're saying AUTHORIZED users (those allowed to use the program to
> post stuff) aren't trusted to not bypass the program, you've basically
> got a DRM problem, especially if you think the users might
> reverse-engineer the program to figure out the protocol.  The most
> effective approaches generally involve delivering the program in the
> form of a hardware product that's difficult to tamper with.  That's what
> cable TV boxes amount to, for example.
>
> What is the application, if you can say?  That might help get better
> answers.
>
>
>
> ---------- Forwarded message ----------
> From: Terry Reedy <tjreedy at udel.edu>
> To: python-list at python.org
> Date: Sat, 18 Jun 2011 17:17:09 -0400
> Subject: Re: Strategy to Verify Python Program is POST'ing to a web server.
> On 6/18/2011 7:34 AM, mzagursk at gmail.com wrote:
>>
>> Hello Folks,
>>
>> I am wondering what your strategies are for ensuring that data
>> transmitted to a website via a python program is indeed from that
>> program, and not from someone submitting POST data using some other
>> means.  I find it likely that there is no solution, in which case what
>> is the best solution for sending data to a remote server from a python
>> program and ensuring that it is from that program?
>>
>> For example, if I create a website that tracks some sort of
>> statistical information and don't ensure that my program is the one
>> that is uploading it, the statistics can be thrown off by people
>> entering false POST data onto the data upload page.  Any remedy?
>
> You have not specified all the parameters of the problem. Are there a limited number of copies of your program, or are they distributed freely? What about multiple votes from one program?
>
> Corporate proxy votes (which are a legally important type of statistical information) work as follows. Each shareholder is mailed or emailed a 'control number'. The shareholder can then attend the stockholder meeting in person, mail a proxy vote, or log in from any browser with the control number. Repeat votes with the same control number supersede the previous vote. There should be a 'thank you for voting' response for each vote. I suspect the IP address is recorded with the vote too. I have not heard of specific problems with electronic proxy voting.
>
> --
> Terry Jan Reedy
>
>
>
>
> ---------- Forwarded message ----------
> From: Roy Smith <roy at panix.com>
> To: python-list at python.org
> Date: Sat, 18 Jun 2011 17:45:42 -0400
> Subject: Re: debugging https connections with urllib2?
> In article <4dfcff48$0$49184$e4fe514c at news.xs4all.nl>,
>  Irmen de Jong <irmen.NOSPAM at xs4all.nl> wrote:
>
>> On 18-6-2011 20:57, Roy Smith wrote:
>> > We've got a REST call that we're making to a service provider over https
>> > using urllib2.urlopen().  Is there any way to see exactly what's getting
>> > sent and received over the network (i.e. all the HTTP headers) in plain
>> > text?
>>
>> Put a proxy between the https-service endpoint and your client app.
>> Let the proxy talk https and let your client talk http to the proxy.
>
> Clever.  I like.  Thanks.
>
>
>
> ---------- Forwarded message ----------
> From: Chris Torek <nospam at torek.net>
> To: python-list at python.org
> Date: 18 Jun 2011 22:28:39 GMT
> Subject: Re: Improper creating of logger instances or a Memory Leak?
> In article <ebafe7b6-aa93-4847-81d6-12d396a4ff3c at j28g2000vbp.googlegroups.com>
> foobar  <wjshipman at gmail.com> wrote:
>>I've run across a memory leak in a long-running process, and I can't
>>determine whether it's my issue or the logger's.
>
> You do not say what version of python you are using, but on the
> other hand I do not know how much the logger code has evolved
> over time anyway. :-)
>
>> Each application thread gets a logger instance in its init() method
>>via:
>>
>>        self.logger = logging.getLogger('ivr-'+str(self.rand))
>>
>>where self.rand is a suitably large random number to avoid collisions
>>of the log file's name.
>
> This instance will "live forever" (since the thread shares the
> main logging manager with all other threads).
> ---------
> class Manager:
>    """
>    There is [under normal circumstances] just one Manager instance, which
>    holds the hierarchy of loggers.
>    """
>    def __init__(self, rootnode):
>        """
>        Initialize the manager with the root node of the logger hierarchy.
>        """
>        [snip]
>        self.loggerDict = {}
>
>    def getLogger(self, name):
>        """
>        Get a logger with the specified name (channel name), creating it
>        if it doesn't yet exist. This name is a dot-separated hierarchical
>        name, such as "a", "a.b", "a.b.c" or similar.
>
>        If a PlaceHolder existed for the specified name [i.e. the logger
>        didn't exist but a child of it did], replace it with the created
>        logger and fix up the parent/child references which pointed to the
>        placeholder to now point to the logger.
>        """
>        [snip]
>                    self.loggerDict[name] = rv
>        [snip]
> [snip]
> Logger.manager = Manager(Logger.root)
> ---------
>
> So you will find all the various ivr-* loggers in
> logging.Logger.manager.loggerDict[].
>
>>finally the last statements in the run() method are:
>>
>>        filehandler.close()
>>        self.logger.removeHandler(filehandler)
>>        del self.logger #this was added to try and force a clean up of the logger instances.
>
> There appears to be no __del__ handler and nothing that allows
> removing a logger instance from the manager's loggerDict.  Of
> course you could do this "manually", e.g.:
>
>        ...
>        self.logger.removeHandler(filehandler)
>        del logging.Logger.manager.loggerDict[self.logger.name]
>        del self.logger # optional
>
> I am curious as to why you create a new logger for each thread.
> The logging module has thread synchronization in it, so that you
> can share one log (or several logs) amongst all threads, which is
> more typically what one wants.
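
A small sketch of the shared-logger arrangement described above. The logger name 'ivr', the in-memory stream, and the thread names are placeholders chosen for this example, not details from the original application. Every thread asks for the same named logger, so the manager's loggerDict gains one entry rather than one per thread, and %(threadName)s in the format string still tells the threads apart:

```python
import io
import logging
import threading

# One shared handler writing to an in-memory stream (a file handler
# would be configured the same way).
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(threadName)s %(message)s"))

logger = logging.getLogger("ivr")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

def worker():
    # getLogger returns the existing 'ivr' logger; nothing new is created.
    logging.getLogger("ivr").info("call handled")

threads = [threading.Thread(target=worker, name="worker-%d" % i)
           for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

lines = sorted(stream.getvalue().splitlines())
print(lines)
```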
> --
> In-Real-Life: Chris Torek, Wind River Systems
> Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W)  +1 801 277 2603
> email: gmail (figure it out)      http://web.torek.net/torek/index.html
>
>
>
> ---------- Forwarded message ----------
> From: Chris Angelico <rosuav at gmail.com>
> To: python-list at python.org
> Date: Sun, 19 Jun 2011 09:12:13 +1000
> Subject: Re: Strategy to Verify Python Program is POST'ing to a web server.
> On Sun, Jun 19, 2011 at 6:40 AM, Michael Hrivnak <mhrivnak at hrivnak.org> wrote:
>> On Sat, Jun 18, 2011 at 1:26 PM, Chris Angelico <rosuav at gmail.com> wrote:
>>> SSL certificates are good, but they can be stolen (very easily if the
>>> client is open source). Anything algorithmic suffers from the same
>>> issue.
>>
>> This is only true if you distribute your app with one built-in
>> certificate, which does indeed seem like a bad idea.  When you know
>> your user base though, especially if this is a situation with a small
>> number of deployments, then you can distribute a unique certificate to
>> each client, signed by your CA.
>
> That changes it from verifying the program to verifying the user. It's
> a somewhat different beast, but it still leaves the possibility of
> snagging the cert and using it in another program. Same with IP
> address checks. You can't prove that the other end is a particular
> program.
>
>>> You could go a long way toward it, though, by
>>> using something ridiculously complex, such as:
>>> ...
>>
>> An authentication process that involves the client executing code
>> supplied by the server opens up one single point of failure (server is
>> compromised or man-in-the-middle attack is happening) by which
>> arbitrary code could get executed on the client.  Yikes!
>
> Yeah, hence the part of verifying the server's cert too. That one is a
> bit safer though; nobody but you will have that certificate, so it's
> not as easy to take and put into another program. But this whole
> scheme was meant from the start to be ridiculous.
>
>> If ...
>> then you'll have to accept that you cannot trust the submitted data
>> 100%, and just take measures to mitigate abuse.
>
> I still stand by my original point, namely that the "if" on here is
> superfluous, and the "then" is unconditional. But the measures you
> describe _do_ reduce the likelihood significantly.
>
> ChrisA
>
>
>
> ---------- Forwarded message ----------
> From: Cathy James <nambo4jb at gmail.com>
> To: python-list at python.org
> Date: Sat, 18 Jun 2011 18:21:55 -0500
> Subject: NEED HELP-process words in a text file
> Dear Python Experts,
>
> First, I'd like to convey my appreciation to you all for your support
> and contributions.  I am a Python newborn and need help with my
> function. I commented on my program as to what it should do, but
> nothing is printing. I know I am off, but not sure where. Please
> help:(
>
> import string
> def fileProcess(filename):
>    """Call the program with an argument,
>    it should treat the argument as a filename,
>    splitting it up into words, and computes the length of each word.
>    print a table showing the word count for each of the word lengths
>    that has been encountered.
>    Example:
>    Length Count
>    1 16
>    2 267
>    3 267
>    4 169
>    >>>"&"
>    Length    Count
>    0    0
>    >>>
>    >>>"right."
>    Length    Count
>    5    10
>    """
>    freq = [] #empty dict to accumulate words and word length
>    filename=open('declaration.txt, r')
>    for line in filename:
>        punc = string.punctuation + string.whitespace #use Python's built-in punctuation and whitespace
>        for i, word in enumerate (line.replace (punc, "").lower().split()):
>            if word in freq:
>                freq[word] +=1 #increment current count if word already in dict
>
>            else:
>                freq[word] = 0 #if punctuation encountered, frequency=0 word length = 0
>        for word in freq.items():
>            print("Length /t"+"Count/n"+ freq[word],+'/t' +
> len(word))#print word count and length of word separated by a tab
>
>
>
>
>    #Thanks in advance,
> CJ.
>
>
>
> ---------- Forwarded message ----------
> From: Chris Rebert <clp2 at rebertia.com>
> To: Cathy James <nambo4jb at gmail.com>
> Date: Sat, 18 Jun 2011 16:30:00 -0700
> Subject: Re: NEED HELP-process words in a text file
> On Sat, Jun 18, 2011 at 4:21 PM, Cathy James <nambo4jb at gmail.com> wrote:
>> Subject: NEED HELP-process words in a text file
>>
>> Dear Python Experts,
>>
>> First, I'd like to convey my appreciation to you all for your support
>> and contributions.  I am a Python newborn and need help with my
>> function. I commented on my program as to what it should do, but
>> nothing is printing. I know I am off, but not sure where. Please
>> help:(
>
> Netiquette comment: Please avoid SHOUTING and including unnecessary
> entreaties in your subject lines in the future.
>
> Cheers,
> Chris
>
>
>
> ---------- Forwarded message ----------
> From: Tim Chase <python.list at tim.thechases.com>
> To: Cathy James <nambo4jb at gmail.com>
> Date: Sat, 18 Jun 2011 19:09:18 -0500
> Subject: Re: NEED HELP-process words in a text file
> On 06/18/2011 06:21 PM, Cathy James wrote:
>
>>     freq = [] #empty dict to accumulate words and word length
>
> While you say you create an empty dict, using "[]" creates an empty *list*, not a dict.  Either your comment is wrong or your code is wrong. :)  Given your usage, I presume you want a dict, not a list.
>
>>     for line in filename:
>>         punc = string.punctuation + string.whitespace #use Python's built-in punctuation and whitespace
>
> Since you don't change "punc" in your loop, you'd get better performance by hoisting this outside of the loop so it's only evaluated once.  Not that it should matter *that* greatly, but it's just a bad-code-smell.
>
>>         for i, word in enumerate (line.replace (punc, "").lower().split()):
>
> .replace() doesn't operate on sets of characters, but rather strings.  So unless your line contains the exact text in "punc" (unlikely), that replacement is a NOP.  There are a couple ways to go about removing unwanted characters:
>
> - make a set of those characters and produce a resulting string from things not in that set:
>
>  punc_set = set(punc)
>  line = ''.join(c for c in line if c not in punc_set)
>
> - use a regexp to strip them out...something like
>
>  punc_re = re.compile("[" + re.escape(punc) + "]")
>  ...
>  line = punc_re.sub('', line)
>
> - use string translations.  I'm not as familiar with these, but the following seemed to work for me, abusing the 2nd "deletechars" parameter for your particular use-case:
>
>  line = line.translate(None, punc)
>
> I don't see .translate(None) documented anywhere.  My random effort seemed to work in 2.6, but fails in 2.5 and prior.  YMMV.
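
On Python 3 the two-argument form line.translate(None, punc) raises a TypeError, because str.translate takes a single table argument. A minimal sketch of the equivalent, using the three-argument form of str.maketrans. It deletes punctuation only and leaves the whitespace for split() to handle; the sample line is made up:

```python
import string

# The third argument to str.maketrans lists the characters to delete.
delete_punc = str.maketrans("", "", string.punctuation)

line = "We hold these truths to be self-evident, that all men..."
words = line.translate(delete_punc).lower().split()
print(words)
```

Deleting the hyphen joins "self-evident" into a single word, which may or may not be the wanted behaviour for a word-length table.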
>
>>             if word in freq:
>>                 freq[word] +=1 #increment current count if word already in dict
>>
>>             else:
>>                 freq[word] = 0 #if punctuation encountered, frequency=0 word length = 0
>
> Again, your 2nd comment disagrees with your code.  As an aside, if you're using 2.5 or greater, I'd use collections.defaultdict(int) as the accumulator:
>
>  freq = collections.defaultdict(int)
>  ...
>  freq[word] += 1
>  # no need to check presence
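
A self-contained version of the defaultdict idea above, as a sketch only; the sample sentence is made up:

```python
import collections

freq = collections.defaultdict(int)
for word in "the quick brown fox jumps over the lazy dog".split():
    freq[word] += 1  # a missing key starts at int(), which is 0

print(freq["the"], freq["fox"], len(freq))
```

One thing to watch: merely reading a missing key, such as freq["cat"], inserts it with a count of 0, so a membership test with "in" or freq.get() is safer when only looking a word up.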
>
>>         for word in freq.items():
>>             print("Length /t"+"Count/n"+ freq[word],+'/t' +
>> len(word))#print word count and length of word separated by a tab
>
> Where to begin:
>
> - Your escapes are using "/" instead of "\" for <tab> and <newline> which I expect will mess up the formatting.
>
> - You're also labeling them "Length/Count" but printing "count/length".
>
> - you're iterating over freq.items() but that should be written as
>
>  for word, count in freq.items():
>
> or
>
>  for word in freq:
>
> -  Additionally, adding the bits together makes it somewhat hard to understand.
>
> I'd use something like
>
>  for word, count in freq.items():
>    print("Word \tLength \tCount\n%s \t%i \t%i" % (
>      word, len(word), count))
>
> -tkc
>
>
>
>
