translating Python to Assembler

over at thepond.com
Sun Jan 27 06:53:25 EST 2008


>
>> That's not the point, however. I'm trying to say that a processor
>> cannot read a Python script, and since the Python interpreter as
>> stored on disk is essentially an assembler file, 
>
>It isn't; it's an executable.

I appreciated the intelligent response I received from you earlier,
but now we're splitting hairs.  :-)  Assembler, like any other higher
level language, is written as a source file and is compiled to a
binary. An executable is one form of a binary, as is a DLL. When you
view the disassembly of a binary, there is a distinct difference
between C, C++, Delphi, Visual Basic, DOS, or even between the
different file types like PE, NE, MZ, etc. But they all disassemble to
assembler. 

While they are in the binary format, they are exactly that...binary.
Who would want to interpret a long string of 1's and 0's? Binaries are
not stored in hexadecimal on disk, nor are they in hexadecimal in
memory; hexadecimal is only a convenient way to display them. But all
the 1's and 0's form codes when they represent instructions or ASCII
strings. No other high level language has the one-to-one relationship
that assembler has to machine code, the actual language of the
computer. 
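
To make the point about representations concrete, here is a minimal
sketch that shows the same few bytes as bits, as hexadecimal, and as
ASCII. The sample bytes and the helper name views() are made up for
illustration; nothing here comes from a real binary.

```python
def views(data: bytes) -> dict:
    """Return three textual views of the same underlying bytes."""
    return {
        # Each byte as eight binary digits.
        "binary": " ".join(f"{b:08b}" for b in data),
        # Each byte as two hexadecimal digits, as a hex editor shows it.
        "hex": " ".join(f"{b:02x}" for b in data),
        # Printable ASCII as-is, everything else as a dot.
        "ascii": "".join(chr(b) if 32 <= b < 127 else "." for b in data),
    }


sample = b"Hi\n"  # hypothetical sample bytes
for name, text in views(sample).items():
    print(f"{name:>6}: {text}")
```

The bytes themselves never change; only the way they are displayed
does.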

Disassemblers can easily convert a binary to assembler due to the
one-to-one relationship between them. That can't be said for any other
higher level language. Converting back to C or Python would be a
nightmare, although it's becoming a reality. Viewing a compiled
binary as hexadecimal is basically a matter of displaying each byte
in hexadecimal, as in a hex editor. There are exceptions to
that, of course, especially with compound assembler statements that
use extensions to differentiate between registers. 


>
>> any Python script must sooner or later be converted to
>> assembler form in order to be read by its own interpreter.
>
>This "assembler form" is commonly referred to as "Python byte code".
>
Thanks for pointing that out. It led me to this page:

http://docs.python.org/lib/module-dis.html

where it is explained that the opcodes are in Include/opcode.h. I'll
take a look at that. 
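
For anyone who wants to see the byte code without opening a hex
editor, here is a minimal sketch using the dis module from that page.
The function greet() is a made-up example, and the exact opcode names
printed will vary between Python versions.

```python
import dis


def greet(name):
    # Hypothetical function used only as something to disassemble.
    return "hello " + name


# Collect the opcode names found in greet()'s compiled byte code.
names = [ins.opname for ins in dis.get_instructions(greet)]
print(names)

# Print the human-readable listing, one instruction per line.
dis.dis(greet)
```

The listing plays the same role for Python byte code that a
disassembler listing plays for machine code.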

The light goes on. From opcode.h:

#define PRINT_NEWLINE_TO 74

All the ASCII strings end with 0x74 in the disassembly. I have noted
that Python uses a newline as a line feed/carriage return. Now I'm
getting it. It could all be disassembled with a hex editor, but a
disassembler is better for getting things in order. 
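
A small sanity check on the number bases may help when matching
opcode.h against a hex dump. The sketch below relies only on the fact
that a C #define written without a 0x prefix, like the one above, is a
decimal value; it does not inspect any real .pyc file.

```python
opcode_value = 74  # decimal, as written in the #define above

# The same value as it would appear in a hex dump.
as_hex = hex(opcode_value)
print(as_hex)

# The ASCII character stored as the byte 0x74.
as_char = chr(0x74)
print(as_char)
```

So decimal 74 shows up in a dump as 0x4a, while a 0x74 byte is the
letter t, which is worth keeping in mind when reading the bytes that
follow each string.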

OK. So the pyc files use those defs...that's cool. 
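
As a rough sketch of how a .pyc file can be taken apart from Python
itself: the example below compiles a throwaway module, then reads the
file back with marshal. The 16-byte header size is an assumption that
holds for CPython 3.7 and later (older versions use a shorter header),
and demo.py and load_pyc_code() are hypothetical names.

```python
import importlib.util
import marshal
import os
import py_compile
import tempfile

HEADER_SIZE = 16  # assumption: CPython 3.7+ layout; older versions differ


def load_pyc_code(pyc_path):
    """Read a .pyc file and return (magic bytes, code object)."""
    with open(pyc_path, "rb") as f:
        data = f.read()
    magic = data[:4]                          # identifies the byte code version
    code = marshal.loads(data[HEADER_SIZE:])  # the rest is a marshalled code object
    return magic, code


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "demo.py")   # hypothetical module
    pyc = os.path.join(tmp, "demo.pyc")
    with open(src, "w") as f:
        f.write("x = 1 + 2\n")
    py_compile.compile(src, cfile=pyc, doraise=True)
    magic, code = load_pyc_code(pyc)

# True when the file was written by the running interpreter.
print(magic == importlib.util.MAGIC_NUMBER)
# The raw byte code, shown in hexadecimal.
print(code.co_code.hex())
```

The bytes in code.co_code are the values that the opcode definitions
give names to.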


