Writing an emulator in Python - implementation questions (for performance)

Thu Nov 12 07:35:07 EST 2009

 Hi.

 I'm trying to port (just for fun) my old Sinclair Spectrum emulator,
ASpectrum, from C to Python + pygame.

 Although the Sinclair Spectrum has a simple 8-bit Z80 microprocessor
running at 3.5 MHz, and no additional hardware (excluding the +2/+3
models' AY sound chip), I'm not sure if my beloved scripting language,
Python, will be fast enough to emulate the Sinclair Spectrum at 100%
speed. There are Java Spectrum emulators available, so it should be
possible.

 Anyway, the point of this message is to get the DESIGN right so that
the code can run as fast as possible. I want to know the best way to
handle some emulation tasks before I write a single line of code.

 My questions:

 GLOBAL VARIABLES VS OBJECTS:
==================================

 I need the emulation to be as fast as possible. In my C program I
have a struct called "Z80_Processor" that contains all the registers,
memory structures, and so on. I pass that struct to functions such as
Z80Decode(), Z80Execute() or Z80Dissassemble(). This allows me (in my
C emulator) to emulate multiple Z80 processors if I want.

 As Python is interpreted and I imagine that the emulation will be
slower than the emulator written in C, I've thought of not creating a
Z80_Processor *object* and instead declaring global variables such as
reg_A, reg_B, reg_PC, an array main_memory[] and so on, and letting
the Z80 functions access those global variables directly.

 I'm doing this to avoid OOP's extra processing... but this makes the
program less modular. Do you think using processor "objects" would
make execution slower, or am I right to use global variables and
avoid objects in this type of program?

 Should I start writing all the code with a Z80CPU object and, if
performance is low, just remove the "object" layer and declare
everything as globals, or should I go directly for globals?
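
 To make the "object" option concrete, here is a rough sketch of what
I have in mind. The names Z80CPU and fetch_many are made up for this
example, and I have not benchmarked it. It relies on one thing I do
know about CPython: local variable lookups are cheaper than global or
attribute lookups, so the hot loop copies the state into locals once
and writes it back at the end.

```python
class Z80CPU(object):
    """Hypothetical container for the emulated state (names are made up)."""
    __slots__ = ("reg_A", "reg_B", "reg_PC", "main_memory")

    def __init__(self):
        self.reg_A = 0
        self.reg_B = 0
        self.reg_PC = 0
        self.main_memory = bytearray(65536)   # 64KB, all zeros


def fetch_many(cpu, count):
    """Fetch `count` bytes starting at reg_PC and return their sum."""
    memory = cpu.main_memory    # copy attributes into locals once...
    pc = cpu.reg_PC
    total = 0
    for _ in range(count):
        total += memory[pc]
        pc = (pc + 1) & 0xFFFF  # wrap around at 64KB
    cpu.reg_PC = pc             # ...and write the state back at the end
    return total
```

 With this layout, two Z80CPU instances stay independent, which is
what the struct gives me in the C version.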

 HIGH AND LOW PART OF REGISTERS:
=================================

 In C, I have the following union for emulating registers:

typedef union
{
  struct
  {
    unsigned char l, h;
  } B;

  unsigned short W;
} eword;

eword reg_A;

 This means that reg_A is a 16-bit "variable" that I can access
directly with reg_A.W = value, and I can also access the LOW and HIGH
bytes with reg_A.B.l and reg_A.B.h. More importantly, changing W
modifies l and h, and changing l or h modifies W.

 How can I implement this in Python? I mean, how can I define a 16-bit
variable so that the high and low bytes can be accessed separately and
changing W, H or L affects the entire variable? I would like to avoid
using bit masks to get or change the HIGH or LOW part of a variable,
and instead let the compiled code do it by itself.

 I know I can write an "object" with set and get methods that
implement that (that could be done in C too), but for emulation, I
need the FASTEST implementation possible (something like C's union
trick).
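
 The closest thing to the C union that I know of in the standard
library is ctypes.Union. The sketch below mirrors the eword layout
above; the class names EWord and _Bytes and the register name reg_HL
are made up for this example. The field order is chosen from the host
byte order so that "l" is always the low byte. I have not measured
whether ctypes attribute access is fast enough for an emulator loop,
so take this only as an illustration of the aliasing behaviour.

```python
import ctypes
import sys


class _Bytes(ctypes.Structure):
    # Order the two bytes so that `l` aliases the low byte of W
    # whatever the byte order of the host machine is.
    if sys.byteorder == "little":
        _fields_ = [("l", ctypes.c_uint8), ("h", ctypes.c_uint8)]
    else:
        _fields_ = [("h", ctypes.c_uint8), ("l", ctypes.c_uint8)]


class EWord(ctypes.Union):
    # B and W share the same two bytes of storage, as in the C union.
    _fields_ = [("B", _Bytes), ("W", ctypes.c_uint16)]


reg_HL = EWord()       # hypothetical 16-bit register pair
reg_HL.W = 0x1234      # writing W changes both bytes
high = reg_HL.B.h      # high byte of 0x1234
reg_HL.B.l = 0xFF      # writing a byte changes W
```

 After these statements, high holds 0x12 and reg_HL.W reads back as
0x12FF, with no explicit bit masks in my own code.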

 MEMORY (ARRAYS):
===========================

 To emulate the Spectrum's memory in C, I have a 64KB array like:
unsigned char memory_array[65536]. Then I can access
memory_array[reg_PC] to fetch the next opcode (or data for an opcode)
and just increment reg_PC.
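
 In Python, the equivalent I have in mind would use the array module,
roughly as sketched here. The helper names make_memory and fetch are
made up for this example, and I have not timed it against other
containers such as bytearray.

```python
from array import array


def make_memory():
    """Return 64KB of zeroed memory; typecode "B" means unsigned bytes."""
    return array("B", [0]) * 65536


def fetch(memory, pc):
    """Return (byte at pc, next pc), wrapping the address at 64KB."""
    return memory[pc], (pc + 1) & 0xFFFF


memory = make_memory()
memory[0x8000] = 0x3E              # values outside 0..255 raise OverflowError
opcode, reg_PC = fetch(memory, 0x8000)
```

 Here opcode ends up as 0x3E and reg_PC as 0x8001.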

 Is Python's array module the best (and fastest) implementation to
"emulate" the memory?

 MEMORY (PAGES):
=============================

 The Sinclair Spectrum 8-bit computer can address 64KB of memory, and
that memory is divided into 16KB pages (so it can see 4 pages
simultaneously, where page 0 is always ROM). Programs can change
"pages" to point to additional 16KB pages in 128KB memory models.

 I don't know how to emulate paging in Python...

 My first approach would be to have eight 16KB arrays, and "memcpy()"
memory to and from the main 64KB array when a program requests a page
swap. I mean (C + pseudocode):

 unsigned char main_memory[65536];
 unsigned char memory_blocks[8][16384];

 // Initial settings
 int current_pages[4] = {0, 1, 2, 3};

 // User swaps last memory page (3) to block 7, so I:
 int page_to_swap_from = 3;
 int page_to_map = 7;

 // Save the contents of the current page (page being unmapped).
 // memcpy() takes (destination, source, size):
 memcpy( memory_blocks[current_pages[page_to_swap_from]],  // To
         main_memory + 16384*page_to_swap_from,            // From
         16384 );                                          // 16K

 // Now map block 7 into page 3:
 memcpy( main_memory + 16384*page_to_swap_from,            // To
         memory_blocks[page_to_map],                       // From
         16384 );                                          // 16K
 current_pages[page_to_swap_from] = page_to_map;

 memcpy() is very fast in C, but I don't know if doing the same in
Python with arrays would be fast enough, or if there is a better
approach to simulate paging of 16KB blocks in a 64KB memory window (4
mappable memory blocks).
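
 For reference, a direct Python translation of the copy-based idea
could use slice assignment in place of memcpy(). This is only a
sketch: PAGE_SIZE and swap_page are names made up for this example, I
use bytearray for the storage, and I have not measured how long a
16KB slice copy takes.

```python
PAGE_SIZE = 16384

main_memory = bytearray(4 * PAGE_SIZE)                    # 64KB window
memory_blocks = [bytearray(PAGE_SIZE) for _ in range(8)]  # eight 16KB blocks
current_pages = [0, 1, 2, 3]                              # block seen in each page


def swap_page(page, block):
    """Save the block currently mapped at `page`, then copy in `block`."""
    start = page * PAGE_SIZE
    end = start + PAGE_SIZE
    # Save the contents of the page being unmapped.
    memory_blocks[current_pages[page]][:] = main_memory[start:end]
    # Copy the newly mapped block into the 64KB window.
    main_memory[start:end] = memory_blocks[block]
    current_pages[page] = block
```

 Calling swap_page(3, 7) reproduces the example above: page 3 is
saved back to its block and block 7 becomes visible at 0xC000.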

 Maybe another approach based on pointers or something like that?
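
 The nearest thing to pointers that I can think of in Python is a
list of references: keep the eight 16KB blocks as separate objects
and let a 4-entry table say which block is visible in each page, so
that swapping a page copies nothing. In this sketch the names banks,
page_table, read_byte, write_byte and map_page are all made up, and
treating writes to page 0 as ignored is my own assumption about how
the ROM should behave. The price is a shift and a mask on every
memory access, and I don't know yet whether that costs more than the
copies it saves.

```python
PAGE_SIZE = 16384          # 16KB, so the page number is addr >> 14

banks = [bytearray(PAGE_SIZE) for _ in range(8)]
page_table = [banks[0], banks[1], banks[2], banks[3]]


def read_byte(addr):
    """Read through the page table: the top 2 bits select the page."""
    return page_table[addr >> 14][addr & 0x3FFF]


def write_byte(addr, value):
    """Write through the page table, ignoring writes to page 0 (ROM)."""
    if addr >= PAGE_SIZE:
        page_table[addr >> 14][addr & 0x3FFF] = value & 0xFF


def map_page(page, bank):
    """Make `bank` visible at `page`; only a reference changes."""
    page_table[page] = banks[bank]
```

 With this, map_page(3, 7) replaces both memcpy() calls: the old
block keeps its contents because it was never copied anywhere.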