Prothon should not borrow Python strings!
Paul Prescod
paul at prescod.net
Mon May 24 13:46:38 EDT 2004
I skimmed the tutorial and something alarmed me.
"Strings are a powerful data type in Prothon. Unlike many languages,
they can be of unlimited size (constrained only by memory size) and can
hold any arbitrary data, even binary data such as photos and movies. They
are of course also good for their traditional role of storing and
manipulating text."
This view of strings is about a decade out of date with modern
programming practice. From the programmer's point of view, a string
should be a list of characters. Characters are logical objects that have
properties defined by Unicode. This is the model used by Java,
JavaScript, XML and C#.
Characters are an extremely important logical concept for human beings
(computers are supposed to serve human beings!) and they need
first-class representation. It is an accident of history that the
language you grew up with has so few characters that they can have a
one-to-one correspondence with bytes.
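To make that concrete, here is a minimal sketch in present-day Python 3, which postdates this post and is used purely as an illustration. The sample word and the choice of UTF-8 are assumptions made for the example.

```python
# Illustration only: in Python 3, str is a sequence of Unicode code points
# and bytes is a separate type holding raw 8-bit values.
text = "na\u00efve"              # "naive" with a diaeresis on the i: 5 characters
encoded = text.encode("utf-8")   # the same text as bytes in one particular encoding

char_count = len(text)           # counts characters
byte_count = len(encoded)        # counts bytes; the accented letter takes two

print(char_count, byte_count)
```

The one-to-one correspondence holds only while every character happens to fit in a single byte of the chosen encoding.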
I can understand why you might be afraid to tackle all of Unicode for
version 1.0. Don't bother. All you need to do today to avoid the dead
end is DO NOT ALLOW BINARY DATA IN STRINGS. Have a binary data type.
Have a character string type. Give them a common "prototype" if you
wish. Let them share methods. But keep them separate in your code. The
result of reading a file is a binary data string. The result of parsing
an XML file is a character string. These are as different as the bits
that represent an integer in a particular file format and a logical integer.
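As a rough sketch of the separation being asked for, Python 3's bytes and str types (again a later design, not anything Prothon has today) behave roughly this way. The temporary file, its contents, and the UTF-8 encoding are all hypothetical choices for the example.

```python
import os
import tempfile

# Write a small file so the example is self-contained (hypothetical content).
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write("<greeting>h\u00e9llo</greeting>".encode("utf-8"))
    path = f.name

# The result of reading a file is binary data.
with open(path, "rb") as f:
    raw = f.read()
os.remove(path)

# Decoding is the explicit step that turns bytes into characters.
text = raw.decode("utf-8")

# The two types share many methods ...
assert raw.startswith(b"<greeting>")
assert text.startswith("<greeting>")

# ... but they cannot be mixed by accident.
try:
    raw + text
    mixed = True
except TypeError:
    mixed = False

# The integer analogy: the bytes stored in a file format vs. the logical value.
stored = (1000).to_bytes(4, "big")     # four bytes, big-endian
value = int.from_bytes(stored, "big")  # the logical integer again
```

The point of the design is that crossing from one type to the other always goes through a named encoding or byte order, so the assumption is visible in the code.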
Even if your character data type is today limited to characters between
0 and 255, you can easily extend that later. But once you have megabytes
of code that makes no distinction between characters and bytes it will
be too late. It would be like trying to tease apart integers and floats
after having treated them as indistinguishable (which brings me to my
next post).
Paul Prescod