which is better, string concatenation or substitution?

Ted tedlandis at gmail.com
Mon May 8 11:44:33 EDT 2006


Thank you Roy.

It seems if you lurk here long enough you eventually get all your
questions answered without even asking!
;-)

Roy Smith wrote:
> John Salerno  <johnjsal at NOSPAMgmail.com> wrote:
> >Roy Smith wrote:
> >
> >> One may be marginally faster, but they both require copying the source
> >> string, and are thus both O(n).
> >
> >Sorry, I'm not familiar with the O(n) notation.
>
> OK, here's a quick tutorial on "big-oh" notation.  It's a way of
> measuring algorithmic complexity.  The O stands for "on the Order
> of".
>
> For any algorithm, if you process n things (in the case of the strings
> we're talking about, n would be the total number of characters in all
> the strings), you can compute the number of steps it takes to complete
> all the processing as some function of n.
>
> For example, let's say a given algorithm took 4n^2 + 5n + 17 steps to
> complete.  It doesn't take much experimentation to prove to yourself
> that for all but the very smallest values of n, the constant 17 is
> completely insignificant.  In fact, n doesn't have to get very big
> before the 5n term becomes insignificant too.  To a reasonable
> approximation, you could say that the algorithm takes 4n^2 steps and
> be close enough.  For small values of n, this will be a bad
> approximation, but it doesn't really matter for small values of n.
> For large values of n (think hundreds, thousands or even millions),
> it's just fine.
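>
> You can check that with a couple of lines of Python (a quick sketch,
> just to show the arithmetic):
>
> 	for n in (1, 10, 100, 1000):
> 	    exact = 4*n**2 + 5*n + 17
> 	    # the 5n + 17 terms fade away as n grows
> 	    print('n=%4d  ratio=%.4f' % (n, 4.0*n**2 / exact))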
>
> So, the first rule of O() notation is that when looking at how fast an
> algorithm runs, you need only consider the highest order term
> (i.e. the one with the biggest exponent).
>
> But, it gets even better.  Let's compare two algorithms, the one
> above, which takes (approximately) 4n^2 steps, and another one which
> takes 10n steps, for various values of n:
>
>             n     10n    4n^2
>            ---  ------  ------
>             1     10       4
>             2     20      16
>             3     30      36
>             4     40      64
>             5     50     100
>             6     60     144
>             7     70     196
>             8     80     256
>             9     90     324
>            10    100     400
>            11    110     484
>            12    120     576
>            13    130     676
>            14    140     784
>            15    150     900
>            16    160    1024
>            17    170    1156
>            18    180    1296
>            19    190    1444
>            20    200    1600
>
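> (If you want to play with the numbers yourself, a short loop will
> reproduce the table:)
>
> 	print(' n     10n    4n^2')
> 	print('---  ------  ------')
> 	for n in range(1, 21):
> 	    # the 4n^2 column overtakes the 10n column around n = 3
> 	    print('%3d  %6d  %6d' % (n, 10*n, 4*n**2))
>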
> Notice that it doesn't take long for the fact that n^2 grows a lot
> faster than n to swamp the fact that 10 is bigger than 4.  So the next
> step in simplification is to say that not only don't the lower-order
> terms matter, but the coefficient on the highest order term doesn't
> even matter.  For a large enough n, all that matters is the exponent
> (for the moment, I'm making a further simplification that all
> algorithms can be described by polynomials with integer exponents).
>
> So, we get things like:
>
> O(n^0), which is almost always written as O(1).  This is a "constant
> time" algorithm, one which takes the same amount of steps to execute
> no matter how big the input is.  For example, in python, you can
> write, "x = 'foo'".  That assignment statement takes the same amount
> of time no matter how long the string is.  All of these execute in the
> same number of steps:
>
> 	x = ''
> 	x = 'foo'
> 	x = 'a very long string with lots and lots of characters'
>
> We can say that "assignment is constant time", or "assignment is
> O(1)".
>
> The next step up is O(n).  This is a "linear time" algorithm.  It takes
> a number of steps which is directly proportional to the size of the
> input.  A good example might be "if 'x' in 'bar'".  The obvious way to
> implement this (and, I assume, the way it is implemented in Python)
> is to loop over the string, comparing 'x' to each character in 'bar',
> one by one.  This takes as many steps as there are characters in the
> string.
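>
> In pure Python, that loop would look something like this (a sketch
> of the idea, not the actual implementation):
>
> 	def char_in_string(ch, s):
> 	    # one comparison per character: len(s) steps in the
> 	    # worst case, which is what O(n) means
> 	    for c in s:
> 	        if c == ch:
> 	            return True
> 	    return False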
>
> Next comes O(n^2), also known as "quadratic time".  This means if your
> input is of size n, the algorithm takes on the order of n^2 steps to run.
> Quadratic algorithms are generally considered very slow, and to be
> avoided if at all possible.
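>
> The classic Python example is building a big string by repeated
> concatenation (a sketch of the pattern to avoid, and the usual
> linear fix):
>
> 	def build_quadratic(pieces):
> 	    # each concatenation copies everything accumulated so
> 	    # far, so the total work grows roughly quadratically
> 	    result = ''
> 	    for piece in pieces:
> 	        result = result + piece
> 	    return result
>
> 	def build_linear(pieces):
> 	    # join copies each character exactly once: linear time
> 	    return ''.join(pieces)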
>
> Once you get used to thinking like this, it's easy to look at a piece
> of code and say, "oh, that's quadratic", or "that's linear", and
> instantly know which is faster for some large input.  And once
> you've started thinking like that, you've made one of the big jumps
> from thinking like a coder to thinking like a computer scientist (or,
> if you prefer, like a software engineer).
>
> Google "algorithmic complexity" or "big o notation" (perhaps spelled
> "big oh") for more info.  I would normally recommend Wikipedia, but I
> just took a quick look at their "Big O notation" article, and noticed
> it's full of formal mathematical gobbledygook which just serves to
> obscure what is really a fairly easy concept.



