Step further with filebasedMessages

Steven D'Aprano steve+comp.lang.python at pearwood.info
Tue May 5 06:41:51 EDT 2015


On Tuesday 05 May 2015 18:52, Cecil Westerhof wrote:

> I now defined get_message_slice:
>     ### Add step
>     def get_message_slice(message_filename, start, end):
>         """
>         Get a slice of messages, where 0 is the first message
>         Works with negative indexes
>         The values can be ascending and descending
>         """

What's a message in this context? Just a line of text?


>         message_list    = []
>         real_file       = expanduser(message_filename)
>         nr_of_messages  = get_nr_of_messages(real_file)
>         if start < 0:
>             start += nr_of_messages
>         if end < 0:
>             end += nr_of_messages
>         assert (start >= 0) and (start < nr_of_messages)
>         assert (end   >= 0) and (end   < nr_of_messages)

You can write that as:

    assert 0 <= start < nr_of_messages

except you probably shouldn't, because that's not a good use for assert: 
start and end are user-supplied parameters, not internal invariants.

You might find this useful for understanding when to use assert:

http://import-that.dreamwidth.org/676.html



>         if start > end:
>             tmp             = start
>             start           = end
>             end             = tmp
>             need_reverse    = True

You can swap two variables like this:

    start, end = end, start

The language guarantees that the right hand side will be evaluated before 
the assignments are done, so it will automatically do the right thing.


>         else:
>             need_reverse    = False

Your behaviour when start and end are in opposite order does not match the 
standard slicing behaviour:

py> "abcdef"[3:5]
'de'
py> "abcdef"[5:3]
''

That doesn't mean your behaviour is wrong, but it will surprise anyone who 
expects your slicing to be like the slicing they are used to.


>         with open(real_file, 'r') as f:
>             for message in islice(f, start, end + 1):
>                 message_list.append(message.rstrip())
>         if need_reverse:
>             message_list.reverse()
>         return message_list
> 
> Is that a good way?


I think what I would do is:

def get_message_slice(message_filename, start=0, end=None, step=1):
    real_file = expanduser(message_filename)
    with open(real_file, 'r') as f:
        messages = f.readlines()
    return messages[start:end:step]


until such time that I could prove that I needed something more 
sophisticated. Then, and only then, would I consider your approach, except 
using a slice object:

# Untested.
def get_message_slice(message_filename, start=0, end=None, step=1):
    real_file = expanduser(message_filename)
    messages = []
    # FIXME: I assume this is expensive. Can we avoid it?
    nr_of_messages = get_nr_of_messages(real_file)
    the_slice = slice(start, end, step)
    # Calculate the indexes in the given slice, e.g.
    # start=1, stop=7, step=2 gives [1,3,5].
    indices = range(*(the_slice.indices(nr_of_messages)))
    with open(real_file, 'r') as f:
        for i, message in enumerate(f):
            if i in indices:
                messages.append(message)
    return messages


There is still room for optimization: e.g. if the slice is empty, don't 
bother iterating over the file. I leave that to you.



-- 
Steve





More information about the Python-list mailing list