RE Module Performance

Devyn Collier Johnson devyncjohnson at gmail.com
Thu Jul 25 09:44:38 EDT 2013


On 07/25/2013 09:36 AM, Jeremy Sanders wrote:
> wxjmfauth at gmail.com wrote:
>
>> Short example. Writing an editor with something like the
>> FSR is simply impossible (properly).
> http://www.gnu.org/software/emacs/manual/html_node/elisp/Text-Representations.html#Text-Representations
>
> "To conserve memory, Emacs does not hold fixed-length 22-bit numbers that are
> codepoints of text characters within buffers and strings. Rather, Emacs uses a
> variable-length internal representation of characters, that stores each
> character as a sequence of 1 to 5 8-bit bytes, depending on the magnitude of
> its codepoint[1]. For example, any ASCII character takes up only 1 byte, a
> Latin-1 character takes up 2 bytes, etc. We call this representation of text
> multibyte.
>
> ...
>
> [1] This internal representation is based on one of the encodings defined by
> the Unicode Standard, called UTF-8, for representing any Unicode codepoint, but
> Emacs extends UTF-8 to represent the additional codepoints it uses for raw
> 8-bit bytes and characters not unified with Unicode.
>
> "
>
> Jeremy
>
>
Wow! The thread that I started has changed a lot and lived a long time. 
I look forward to its first birthday (^u^).

Devyn Collier Johnson


