Newbie: Check first two non-whitespace characters

Steven D'Aprano steve at pearwood.info
Thu Dec 31 20:23:14 EST 2015


On Fri, 1 Jan 2016 05:18 am, otaksoftspamtrap at gmail.com wrote:

> I need to check a string over which I have no control for the first 2
> non-white space characters (which should be '[{').
> 
> The string would ideally be: '[{...' but could also be something like
> '  [  {  ....'.
> 
> Best to use re and how? Something else?

This should work, and be very fast, for moderately-sized strings:


def starts_with_brackets(the_string):
    # Remove spaces so that '  [  {' collapses to '[{'.
    the_string = the_string.replace(" ", "")
    return the_string.startswith("[{")


It might be a bit slow for huge strings (tens of millions of characters),
since replace() builds a new copy of the whole string, but for short
strings it will be fine.
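
If the copy made by replace() is a worry, here is a rough sketch that only
looks at the leading characters. It assumes that only spaces are skipped,
as in the function above; the name starts_with_brackets_scan is made up
for illustration.

```python
def starts_with_brackets_scan(the_string):
    """Return True if the first two non-space characters are '[' then '{'."""
    found = []
    for ch in the_string:
        if ch == " ":
            continue  # skip spaces only, like the replace() version
        found.append(ch)
        if len(found) == 2:
            break  # stop as soon as two non-space characters are seen
    return found == ["[", "{"]
```

For well-formed input this stops after a handful of characters, however
long the string is.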

Alternatively, use a regex:


import re
regex = re.compile(r' *\[ *\{')

if regex.match(the_string):
    print("string starts with [{ as expected")
else:
    raise ValueError("invalid string")


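This will probably be slower for small strings, but faster for HUGE strings
(tens of millions of characters), because the regex stops after the first
two non-space characters instead of copying the whole string. Either way,
I expect it will be fast enough.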

It is simple enough to skip tabs as well as spaces. The easiest way is to
match on any whitespace, which is \s (not \w, which matches word
characters):

regex = re.compile(r'\s*\[\s*\{')
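
As a quick sanity check, a whitespace-skipping pattern can be tried against
a few inputs. This is only a sketch: the helper name check and the sample
strings are invented for illustration.

```python
import re

# \s matches any whitespace character (space, tab, newline, ...).
regex = re.compile(r'\s*\[\s*\{')

def check(the_string):
    # match() anchors at the start of the string, so only the prefix matters.
    return regex.match(the_string) is not None

samples = ['[{...', '  [  {  ....', '\t[\n{"a": 1}]', 'x[{', '[ }']
for s in samples:
    print(repr(s), check(s))
```

The first three samples should be accepted and the last two rejected.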




-- 
Steven



