Abuse of Big Oh notation

Paul Rubin no.email at nospam.invalid
Mon Aug 20 12:01:21 EDT 2012


Oscar Benjamin <oscar.j.benjamin at gmail.com> writes:
> No it doen't. It is still O(k). The point of big O notation is to
> understand the asymptotic behaviour of one variable as it becomes
> large because of changes in other variables. 

Actually, two separate problems got mixed together late at night.  In
neither case is k an independent variable that ranges over all possible
values.  In both cases it is selected or observed by measurement (i.e.
it is a dependent variable determined by something that is itself not
independent).

1) Access in a rope: here, k is basically determined by the pointer size
of the computer, which in CPython (the implementation we're discussing)
the pointer size is 4 or 8 bytes (constants) in all instances AFAIK.  k
should be a big enough that the pointer and allocation overhead is small
compared to bloating the strings with UCS-2 or UCS-4, and small enough
to not add much scan time.  It seems realistic to say k<=128 for this
(several times smaller is probably fine).  128 is of course a constant
and not a variable.  We are not concerned about hypothetical computers
with billion bit pointers.

2) Parsing tokens: here, k is drawn from a fixed probability
distribution such that its expectation value is again a small constant
(this is an assertion about the data looks like in typical parsing
applications in the real world, not what is mathematically possible).
The average-case complexity (I said "amortized" earlier but should have
said "average-case") is proportional to that constant, which is to say,
O(1).  Of course there is still more wiggle room about what is
"typical".

Analogy: how big a box is required to hold a pair of shoes?  In a purely
theoretical sense we might say O(S) where S is the shoe size, treating
shoe size as an arbitrary independent variable.  But in the real world,
shoe size is controlled by the size of human feet, which is bounded by a
constant for biological reasons.  You don't have to consider shoes the
size of Jupiter.  So it is O(1).



More information about the Python-list mailing list