Perl is worse!

Fri Jul 28 22:50:37 EDT 2000

Steve Lamb <grey at despair.rpglink.com> wrote:
> On 28 Jul 2000 20:11:42 GMT, Martijn Faassen <m.faassen at vet.uu.nl> wrote:
[snip]
>     Don't do much re work?  ;)

Nope; though I understand the basic concepts.

>>I'd probably solve that in a more verbose way myself. 

>     My perl background shines through, eh?

Perl code definitely seems to be regular expression happy. In many 
cases they're pretty handy, though in lots of cases where Perl uses regular
expressions Python does it with the string module.

[snip]
>>As your program proceeds and for instance throws the dice, it'll try to use
>>those variables as numbers, but since they're other strings, you'll
>>eventually get strange answers (such as 0). And then presumably you'd have to
>>search what part of your program is wrong. In Python you'd know pretty
>>quickly that you got strings instead of numbers, and so you know what to fix.

>     I know even faster in Perl.  When testing RE I take a sample, toss it in
> an isolated script and build the regex until the expected results come forth.

Oh, sure, testing is always useful, of course. I was presuming we had tested
but some little error slipped in at some point anyway.

> When that is done I test it against a much larger data set for any cases I may
> have missed.  If errant data gets in after that time in Perl you have a script
> which runs but has some problems.  In Python it pukes.

Yes, in Python you'd need to extend it with a few int()s if you want
integers, or build lists or objects or whatever more complicated thing you'd
like.

>  In either case it
> happens outside your hands.  When it comes to regex a properly defined regex
> will not let through aberrant data.  Esp. not when it comes to something as
> simple as digits \d versus words \w.

Right -- that's why I didn't understand why you complained about the
extensive exception checking or other checks you need in Python; if you
*know* a string can be safely converted to an integer, int() does the
job just fine by itself.

>>That wasn't the assumption. I said the exact opposite, you need to 
>>remember what type your data is *anyway*. In order to do that you need
>>to check the data if it is input.

>     On input.  But for data generated internally this is not the case since
> you defined it from start to end.

Right, though often your internally generated data is dependent on some
input, and if you slipped in a string instead of a number suddenly your
big formula will return a mysterious 0 or something. :) But apart from
that, if you are dealing with data internally, generally your data is
already of the right type, so no excessive conversions are necessary.
(You mentioned one for your hash key which is using a number; I don't
think a little str() somewhere, or perhaps a ("%s" % n), takes up so
much space.)

>>This is a common misconception. *everything* in Python is always, always,
>>references. It is just that certain data types (integers, string, tuples) are 
>>immutable; you cannot change them. So the semantics are the same as 
>>copy semantics.

[snip example]

>     Forget, for a moment, what is happening internally.  What looks like here
> is that in the first case there is an assignment going on and in the latter a
> reference.

Yes, as I said; in the case of immutable objects reference semantics are
the same as copying semantics. But once your object is mutable (such as
a list), you'll see the reference semantics.

>  Nevermind that what is happening in the first case is a references
> objint[1] and b references objint[1] and when we change a it points to a new
> reference objint[2].  It is the difference in behavior of an identical
> operator that is ambiguous.  I'm sure even though I am aware that it happens I
> am going to be nailed by it a few times before I get used to it.

It's usually surprisingly easy to handle; I don't recall getting nailed
by it often. And the behavior *is* consistent; there's no ambiguity 
about it. In your first example, you use an immutable object. You could
also have used a mutable object and not changed it, and get the same
'copy semantics' effect. In the second case, you used a mutable object
and then changed it, and so you'll see the reference semantics in the
clear. But it's always there; no ambiguity about it.
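
The snipped example is not reproduced here, so the following is only a
minimal sketch of the same point, with made-up names (a, b, x, y, p, q)
and assuming a current Python 3 interpreter. Assignment always binds a
name to an object; the difference only shows once a shared object is
changed in place.

```python
# Immutable object: "changing" a really rebinds it to a new object.
a = 1
b = a
a = a + 1        # a now refers to a new int; b still refers to 1

# Mutable object, changed in place: both names see the change.
x = [1, 2]
y = x
x.append(3)      # mutates the single list that x and y both refer to

# Mutable object, not changed in place: looks like copy semantics again.
p = [1, 2]
q = p
p = p + [3]      # builds a new list and rebinds p; q is untouched
```

In all three cases the assignment itself does the same thing; only
x.append(3) modifies an object that another name also refers to.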

>>Isn't that worse than the program giving up? 

>     No.  The program giving up in the customer's hands because of an unchecked
> except is a far cry worse than a nigglet but having it continue on.  As a
> customer which would you rather have, given the choice of these two:

> a: A program that crashes, stopping everything that you're doing with no
> chance of recovery

> b: A program that behaves oddly but allows for chance of recovery.

Ahum, the first! Especially if it deals with critical data, or *money*.
Imagine I get a string by accident (there's a bug in the program) instead
of a number, and the string evaluates to 0, and suddenly what I'm selling
in my e-commerce store is *free*, I'd be rather upset. I'd rather have the
software not working in case of such input!

Also, if you *know* you want to continue in the face of weird behavior
and recover, in Python there's an excellent exception handling system
for you to use. So Python offers the error recovery, if you explicitly
enable it.
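
As a rough illustration, assuming a current Python 3 interpreter and
two hypothetical helpers (order_total and safe_order_total are invented
for this sketch), the default is to stop on bad data, and recovery is
something the caller opts into:

```python
def order_total(prices):
    """Sum a list of prices; a stray string stops the calculation."""
    total = 0
    for price in prices:
        total = total + price   # raises TypeError if price is a str
    return total


def safe_order_total(prices):
    """Same sum, but the caller has explicitly chosen to recover."""
    try:
        return order_total(prices)
    except TypeError:
        # Recovery policy is an assumption of this sketch: signal
        # "no valid total" instead of silently treating bad data as 0.
        return None
```

With this, safe_order_total([5, "ten"]) hands back None for the caller
to deal with, while order_total([5, "ten"]) refuses to produce a total
at all.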

[snip]
>>But you haven't; you haven't told the system it's a number yet. :) That's
>>the _only_ extra step you need to take. The system is too stupid to figure
>>it out very well on its own anyway, so why not tell the system explicitly?

>     No, in my case I needed to check to make sure it was a type that can be
> converted to int and then do the actual conversion.  That is two steps more
> than I deem needed considering the strictly checked input in the first place.

This depends. If you know, like in Perl, that your data *has* to be a number,
you just need a single int() and no further checks. If you turned out to
be wrong, you'll find this out as well.

>>Interactive mode (I think it explains in the tutorial) gives you the output
>>of expressions as a result always. 

>     Right, which is worthless in a script.

Which is why it isn't happening in a script. In a script, the results of
expression statements are discarded. Allowing expression statements at
all is useful in the case of function calls.

>  I'd personally have the Python
> check what mode it is in and toss an exception when it is in interactive mode
> and gets a meaningless statement.

It's not a meaningless statement. Sometimes it's very handy to get the
representation of some Python object. For instance:

>>> def foo(): pass
>>> [many complicated steps later]
...ah, but what is bar now?
>>> bar
<function foo at 80c8f68>

>  It does so on every other possible
> ambiguity, why not this one when it is even less grey than a lot of the ones I
> am tossing out?  Is there a purpose for having such informative statements in
> non-interactive mode?

In non-interactive mode it isn't useful, I think (I'd love to hear some
cases outside the function call example) so we could propose a patch to
Python that trips over this (in the case of non-calls).

>>No, my entire point was that you need to know what your data is *anyway*.

>     Right, I know what the data is, I don't need the language to ask me at
> every operation, "Are you sure?  I mean, really, really, REALLY sure!?"

No, but this is hardly the case in Python either. :)

>>You need to know that something is an integer anyway, so why not simply
>>tell the computer what's going on? I'm *not* saying the language should
>>do the checking. It can't; even though Perl tries.

>     Because what if I want to do something that the computer thinks is
> non-sensical?  

Then don't depend on the computer trying to figure out your nonsensical
thing anyway. Because you'll likely get something nonsensical, or something
you can't predict, unless you learned what would happen already.

Tell the computer explicitly what you want in a clear
way. This way other humans reading your code can understand what's going
on too. Or you yourself the next day, month, year.

>>checking. If you want the system to shut up about your mistakes,
>>you can always do this in Python:

>>try:
>>   a = a + 1
>>except:
>>   pass

>     Great, now I get to litter my code on a per operation basis with tries
> instead of being able to set it on certain variables.  That would make more
> sense.

Hm, it's hardly a recommended way to handle things in general. I certainly
don't use this type of thing in my code; I don't see the necessity and 
I don't feel your pain. I understand that you do, though. Just pointing
out that this behavior is designed and does have a reason.

>>This works. I didn't need to tell the system that 'a' is a cow or a chicken
>>or an animal that can make sounds. 

>     Of course not, they are types.  The variable gets it type at declaration,
> IE, assignment.

Huh? What do you mean, they are types? I was treating two objects that
have nothing to do with each other in the code as the same thing, and it
worked. Wasn't that exactly what you wanted? 

>>You can do that in Python too. I tend to prefer that as well.

>     I also prefer the computer to trust me than mistrust me.

That statement sounds good as rhetoric, but in practice I'd rather
have the computer give up on things if I'm trying to do things that
don't seem to make any sense immediately. That way I need to remember
less.

>>>     Funny, we have implicit typing in OO code that causes problems later in
>>> the code.  So yes in OO code.

>>What implicit typing? I don't understand what you mean at all.

>    A variable gets its type upon assignment.  That is implicit typing.

You mean dynamic typing? The objects get the instance type in Python.
But they may be instances of any class, if you call .foo() on an instance
and their class can handle foo() in some way, it'll work.
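
The cow and chicken example was snipped earlier in the thread, so this
is a reconstruction of the idea rather than the original code; the
class and function names are invented, and a current Python 3
interpreter is assumed.

```python
class Cow:
    def sound(self):
        return "moo"


class Chicken:
    def sound(self):
        return "cluck"


def speak(animal):
    # No declaration says what 'animal' is; any object with a
    # sound() method will do.
    return animal.sound()


sounds = [speak(creature) for creature in (Cow(), Chicken())]
```

Cow and Chicken share no base class and nothing declares a common
type, yet speak() handles both.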

>>Yes, but if I'm not doing it properly at least the language will complain
>>so I get a broad hint I do need to do it properly. :)

>     As well as if you're doing it properly the language will still complain
> long and loud.

I don't consider trying to add two things that cannot be added as 'proper'.
But we already had that debate. :)

>>Huh? Where in Python did you find your declarations? I'm confused.

> a = None
> int(a)

>     a is declared as None at assignment.

No, the name 'a' doesn't have any type. The object a is referring to has
NoneType. I'm not sure what you're trying to accomplish with 'int(a)'.
(and since Python isn't sure either, it complains :).
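
A small sketch of that distinction, assuming a current Python 3
interpreter (the variable names first, second and outcome are only
there for illustration):

```python
a = None
first = type(a).__name__     # type of the object a refers to right now

a = 42                       # same name, rebound to a different object
second = type(a).__name__

try:
    int(None)                # None cannot be turned into an integer
    outcome = "converted"
except TypeError:
    outcome = "TypeError"
```

Here first is "NoneType" and second is "int": the type belongs to the
object, not to the name. Note that in current Python int(None) raises
TypeError, whereas a non-numeric string such as int("abc") raises
ValueError.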

>>Well, because it doesn't really make a lot of sense if you want to convert
>>something to an integer when Python can't figure it out. If you're so
>>sure of yourself, you can always do:
>>
>>try:
>>   a = int(b)
>>except ValueError:
>>   a = 0

>     Right, which is basically what I had to do except I left None as None.
> Just seems like a pain in the butt to do after I have already made sure,
> beyond all doubt, that those variables contain numbers in the first place.

If you already made sure beyond doubt, you only need to do this:

a = int(b)

Unless you made it possible that b contained something that couldn't be
turned into a number, such as None. If you made that possible you aren't
sure beyond all doubt that the variable contains a number, right?

>>All right; this just doesn't happen very often in my code. Generally because
>>I can expect a certain regularity in my input. If my input is less regular
>>than I expected, I get exceptions. That gives me a nice indicator that my
>>expectations were off.

>     Which is why for the base parsing routine I use re.  If it doesn't pass re
> it doesn't get executed, period.  People can try all they want to get around
> the regular expressions but it is really hard to do so.  I don't see a reason
> to write my own test parsing routines when re does it for me.

Sure, but it is the same in Python, right? Except that in the end you have
to convert your matched strings to ints or whatever other thing you were
expecting from your input.

>>In that case you don't need to do any checking, obviously. You have a 
>>regularity in your data so you just do something like:

>>a = int(a)

>>Though I may be missing something about regular expressions here and
>>I'm goofing up?
>  
>     A regex could return a string or None, only one of which is convertible to
> int.  You must litter your code with such checks.

Ah, okay. I can see that this could be a minor pain, but it also makes
sense to me that you need to check whether something matched *anyway*,
just saying None is 0 isn't the right thing, for instance. In throwaway
scripts it could be a pain, though.
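
For what it's worth, the check can stay in one place. The sketch below
assumes a current Python 3 interpreter and a hypothetical dice notation
like "3d6" (the actual input format was not shown in this thread);
parse_dice and DICE are names made up for the example.

```python
import re

# Hypothetical notation: <count>d<sides>, digits only.
DICE = re.compile(r"(\d+)d(\d+)")


def parse_dice(text):
    """Return (count, sides) as ints, or None if text does not match."""
    match = DICE.fullmatch(text)
    if match is None:
        # The one place where "did it match at all?" is decided.
        return None
    count, sides = match.groups()
    # Both groups matched \d+, so a plain int() is enough here.
    return int(count), int(sides)
```

Once the match has been checked, the conversions need no try/except of
their own, so the rest of the program only ever sees integers.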

>>Hm, why would you do that? Depends on the structure of your keys. You can
>>use numeric keys in your Python dictionary just fine.

>     Text for information, numericals to prevent duplication.

In Python you'd use a tuple. That's what tuples are designed for;
an immutable combination of things that can be used as a key, for
instance. So you'd say something like:

key = mystring, myvalue

instead of making it all a string.
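
A short sketch of what that looks like in a dictionary, with invented
item names and assuming a current Python 3 interpreter:

```python
inventory = {}

# Text for information, a number to keep otherwise equal keys apart.
inventory[("sword", 1)] = "rusty"
inventory[("sword", 2)] = "sharp"

mystring, myvalue = "sword", 2
key = mystring, myvalue          # the parentheses are optional
condition = inventory[key]

# The parts of the key keep their own types; no string parsing needed.
numbers = sorted(number for (name, number) in inventory if name == "sword")
```

The number stays a number inside the key, so there is no str() on the
way in and no int() on the way out.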

>>But an integer can be seen as an object with methods as well. A string
>>too (Python 2.0 will actually allow you to use string methods). In fact,
>>I can make a new type of object in Python that behaves much like your
>>Perl scalars:

>     Touche'.  ;)

Thanks! I'm glad my effort wasn't wasted. ;)

Regards,

Martijn
-- 
History of the 20th Century: WW1, WW2, WW3?
No, WWW -- Could we be going in the right direction?