Python "byte code" description 

Hi,

To the best of my understanding, the Python interpreter executes Python
code that has been compiled to Python byte code.

Is this byte code specified anywhere? I can't seem to find anything,
except this very cursory overview:

http://python.org/doc/current/lib/bytecodes.html

Is there anything more comprehensive, apart from the Python
implementation itself? I need not only the actual file format but
also the run-time assumptions (stack and memory characteristics).

Thanks in advance,
D.



Wed, 25 May 2005 15:07:27 GMT  


Quote:
> To the best of my understanding, the Python interpreter executes
> Python code that has been compiled to a Python byte code.

This is true for the CPython interpreter and only for that
interpreter.  Jython compiles to Java bytecode.  Another (experimental)
implementation translated most of Python to another language.  The
point is that PyCode is an implementation detail of one
implementation.  It is not a part of the language itself or its
definition.  You can understand almost all of Python without knowing
that it is compiled for efficiency.

Quote:
> Is this byte code specified anywhere? I can't seem to find anything,
> except this very cursory overview:

> http://python.org/doc/current/lib/bytecodes.html

Aside from the C code itself, that *is* the documentation -- just so
you can understand the output of the dis module.
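
For anyone who has not used it, here is a minimal sketch of what the dis
module gives you. The function add() is only an illustration, and the
opcode names you see depend on which CPython version you run, so no
particular output is promised here.

```python
import dis

def add(a, b):
    return a + b

# Collect the opcode names CPython generated for add(). The exact
# names differ between CPython versions.
opnames = [ins.opname for ins in dis.get_instructions(add)]
print(opnames)

# dis.dis() prints the same information as a human-readable listing.
dis.dis(add)
```

Comparing the listing against the bytecodes page is a reasonable way to
learn what each instruction does to the stack.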

Quote:
> Is there anything more comprehensive, apart from the Python
> implementation itself? Not only do I need the actual file format,

Why?  The internal .pyc format is an internal implementation detail
subject to change with each version.  The doc for this is the C code
which writes and reads it.   It consists of a magic number, PyCode, and
marshaled versions of literals.
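
A rough sketch of those pieces, using only the standard library and
assuming a reasonably recent CPython 3 (importlib.util.MAGIC_NUMBER is
not available in old versions). It does not parse a real .pyc file,
because the header fields that follow the magic number have changed
between versions; it only shows the magic number and a marshal round
trip of a code object. The source string and the "<example>" filename
are made up for the illustration.

```python
import importlib.util
import marshal

source = "answer = 6 * 7\n"
code = compile(source, "<example>", "exec")

# The magic number this interpreter writes at the start of its .pyc files.
magic = importlib.util.MAGIC_NUMBER

# marshal is the serializer used for the code object stored in a .pyc.
payload = marshal.dumps(code)
restored = marshal.loads(payload)

# Running the restored code object behaves like running the original.
namespace = {}
exec(restored, namespace)
print(magic, len(payload), namespace["answer"])
```

Note that marshal data is only guaranteed to load in the same Python
version that wrote it, which is the "subject to change" point above.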

Quote:
> but also the run time assumptions (stack and memory characteristics)
> would also be necessary.

Python grabs what it needs as long as the OS will give it.  Specifics
depend on your OS and hardware.  It 'assumes' that the system
resources are sufficient for the task you give it.

Perhaps you can restate your question to be more specific, and give a
bit of context.

Terry J. Reedy



Wed, 25 May 2005 15:53:51 GMT  

Quote:
>> Is there anything more comprehensive, apart from the Python
>> implementation itself? Not only do I need the actual file format,

>Why?  The internal .pyc format is an internal implementation detail
>subject to change with each version.  The doc for this is the c code
>which write and reads it.   It consists of a magic number, PyCode, and
>marshaled versions of literals.

Probably in the past I would have found developer documentation on the
latest _implementation detail_ of Python's bytecode to be interesting.

C//



Wed, 25 May 2005 17:18:59 GMT  

Quote:

> It is not a part of the language itself or its
> definition.

I think you could argue that it would be useful in some situations if it
were well defined. How else can you be sure other interpreters,
debuggers, or other tools that operate on the byte code (or virtual
machine) are correct?

Or, what if I wanted to convert Python byte codes to JVM byte codes
directly? Or Parrot byte codes? Or [insert VM here] byte codes?

Quote:
> You can understand almost all of Python without knowing
> about compilation for efficiency.

True. But my question wasn't about understanding Python.

Quote:

> >Is there anything more comprehensive, apart from the Python
> >implementation itself? Not only do I need the actual file format,

> The internal .pyc format is an internal implementation detail
> subject to change with each version.  

Maybe, but that fact *itself* isn't documented. Also, that doesn't
always have to be true ...  there may be advantages in defining it
concretely and limiting change, or at least managing it.

A trivial example: how do you know if the current behaviour of the C
implementation for a particular case is a bug, or the way it's supposed
to be, unless it's well defined, or you are Guido? ;)

Quote:

> Python grabs what it needs as long as the OS will give it.  Specifics
> depend on your OS and hardware.  It 'assumes' that the system
> resources are sufficient for the task you give it.

That's not what I mean. I'm asking what assumptions the byte code makes
about its environment (or, its virtual machine/interpreter). For
example, some of the operators assume the existence of a co_varnames
variable. Others assume a stack.
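
To make that concrete, here is a small sketch of where those assumptions
are visible from Python itself. In CPython, co_varnames and co_stacksize
are attributes of a function's code object; the function scale() is just
an illustration, and the stack size reported depends on the CPython
version.

```python
def scale(x, factor):
    result = x * factor
    return result

code = scale.__code__

# Names of the arguments and local variables, in the order the
# local-variable instructions index them.
print(code.co_varnames)

# How deep the value stack can get while this code object runs.
print(code.co_stacksize)

# Constants referenced by index from the bytecode.
print(code.co_consts)
```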

Quote:

> Perhaps you can restate your question to be more specific, and give a
> bit of context.

What if I wanted to implement an interpreter? I'd need to know what the
properties of the stack are, and other environmental assumptions that
the byte code instruction set makes.

But, I think this answers my question anyway. I'm just going to have to
reverse engineer it for myself :(

Thanks anyhow!

--
D.



Wed, 25 May 2005 18:24:37 GMT  

Quote:


> > It is not a part of the language itself or its
> > definition.

> I think you could argue that it would be useful in some situations if
> it were well defined. How else can you be sure other
> interpreters, debuggers, or other tools that operate on the byte code
> (or virtual machine) are correct?

You can't.  But there are really very few tools that operate on
bytecode.  You can expect them to break with every major release of
Python.  I don't personally think that avoiding this is a worthwhile
pursuit.

Quote:
> Or, what if I wanted to convert Python byte codes to JVM byte codes
> directly? Or Parrot byte codes? Or [insert VM here] byte codes?

This strikes me as a silly thing to do: surely more sensible would be
compiling Python source to these alternative bytecodes.

Quote:
> > >Is there anything more comprehensive, apart from the Python
> > >implementation itself? Not only do I need the actual file format,

> > The internal .pyc format is an internal implementation detail
> > subject to change with each version.

> Maybe, but that fact *itself* isn't documented.

I'm slightly surprised by that.  Patches welcome :)

Quote:
> Also, that doesn't always have to be true ...  there may be
> advantages in defining it concretely and limiting change, or at
> least managing it.

There may be.  I personally don't think so.  For instance, people have
made occasional noises about rewriting the core VM to be a register
machine, not a stack machine.  If someone ever gets round to finishing
this and it turns out to be significantly faster, or significantly
more comprehensible code, I don't think we should disallow it on the
grounds that bytecode inspecting tools will break.
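
For readers unfamiliar with the term, a stack machine takes the operands
of each instruction from a value stack and pushes the result back, rather
than naming registers. The toy below is only a sketch of that idea: the
opcode names PUSH, ADD and MUL and the run() function are made up for
the illustration and are not CPython's instruction set.

```python
def run(program):
    """Execute a list of (opcode, argument) pairs on a value stack."""
    stack = []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {op}")
    # The result of the whole program is whatever is left on top.
    return stack.pop()

# Computes (2 + 3) * 4.
program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]
result = run(program)
print(result)
```

A register-based design would instead have each instruction name its
source and destination slots, which is why tools written against one
design would not survive a switch to the other.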

Quote:
> A trivial example: how do you know if the current behaviour of the C
> implementation for a particular case is a bug, or the way it's
> supposed to be, unless it's well defined, or you are Guido? ;)

You ask Guido?  (Not an *entirely* facetious answer...)

Quote:
> > Python grabs what it needs as long as the OS will give it.  Specifics
> > depend on your OS and hardware.  It 'assumes' that the system
> > resources are sufficient for the task you give it.

> That's not what I mean. I'm asking what assumptions the byte code
> makes about its environment (or, its virtual
> machine/interpreter). For example, some of the operators assume the
> existence of a co_varnames variable. Others assume a stack.

The only way you can find this out is to read the code.

Quote:
> > Perhaps you can restate your question to be more specific, and give a
> > bit of context.

> What if I wanted to implement an interpreter?

Why would you want to use the same bytecode as CPython currently does?
Duplicating the effort to that point seems, well, pointless.

Quote:
> I'd need to know what the properties of the stack are, and other
> environmental assumptions that the byte code instruction set makes.

> But, I think this answers my question anyway. I'm just going to have
> to reverse engineer it for myself :(

Yes, but

a) between the docs you've found and the code, it's not that hard.
b) if it were documented, you'd probably still have to read bits of
   code to understand what the docs meant, and to check their accuracy
c) there are plenty of far more important things which are lacking
   documentation (new-style classes and descriptors and so on spring
   to mind).

Good luck!

Cheers,
M.

--
  The ultimate laziness is not using Perl.  That saves you so much
  work you wouldn't believe it if you had never tried it.
                                        -- Erik Naggum, comp.lang.lisp



Wed, 25 May 2005 23:46:39 GMT  

Quote:

> You can't.  But there are really very few tools that operate on
> bytecode.  You can expect them to break with every major release of
> Python.  I don't personally think that avoiding this is a worthwhile
> pursuit.

>> Or, what if I wanted to convert Python byte codes to JVM byte codes
>> directly? Or Parrot byte codes? Or [insert VM here] byte codes?

IMO, a more interesting (from a geek/CS POV) project would be to write
a compiler for a different language that generates bytecode
that would be executed by the Python VM and inter-operate with "Python"
bytecode. One of the big strengths of Python is the library.  If you
generated Python bytecodes and used the same calling interface, then you
could implement a new language that could also take advantage of the
existing libraries.

The fact that JVM is documented allowed JPython to do exactly this.

If the Python VM definition does not exist or is not stable, it prohibits
other compilers from taking advantage of the VM and existing code base.

Quote:
> This strikes me as a silly thing to do: surely more sensible would be
> compiling Python source to these alternative bytecodes.

Or compiling alternative languages to Python bytecodes.

Quote:
>> Also, that doesn't always have to be true ...  there may be advantages in
>> defining it concretely and limiting change, or at least managing it.

> There may be.  I personally don't think so.

If I wanted to compile a different language into Python bytecode I _would_
think so.  [But I don't, so this is all academic.]

Quote:
>> What if I wanted to implement an interpreter?

> Why would you want to use the same bytecode as CPython currently does?
> Duplicating the effort to that point seems, well, pointless.

Perhaps somebody wants to implement a hardware-assisted interpreter.  It's
been done for JVM.  If PVM isn't defined/stable, then doing HW acceleration
is going to be a bit difficult.

--
Grant Edwards                   grante             Yow!  MERYL STREEP is my
                                  at               obstetrician!
                               visi.com            



Thu, 26 May 2005 00:18:45 GMT  

Quote:


> > You can't.  But there are really very few tools that operate on
> > bytecode.  You can expect them to break with every major release of
> > Python.  I don't personally think that avoiding this is a worthwhile
> > pursuit.

> >> Or, what if I wanted to convert Python byte codes to JVM byte codes
> >> directly? Or Parrot byte codes? Or [insert VM here] byte codes?

> IMO, a more interesting (from a geek/CS POV) project would be somebody who
> wants to write a compiler for a different language that generates bytecode
> that would be executed by the Python VM and inter-operate with "Python"
> bytecode.

OK, that's something I hadn't thought of.  Such a task doesn't sound
like fun, for a bunch of reasons.

Quote:
> One of the big strengths of Python is the library.  If you
> generated Python bytecodes and used the same calling interface, then you
> could implement a new language that could also take advantage of the
> existing libraries.

Yeah, but you can do that already through the C API, shurely?

[...]

Quote:
> Perhaps somebody wants to implement a hardware-assisted interpreter.  It's
> been done for JVM.  If PVM isn't defined/stable, then doing HW acceleration
> is going to be a bit difficult.

I think it's going to be pretty hard in all circumstances.  But you
make a valid point I hadn't thought of.

Cheers,
M.

--
  MGM will not get your whites whiter or your colors brighter.
  It will, however, sit there and look spiffy while sucking down
  a major honking wad of RAM.              -- http://www.xiph.org/mgm/



Thu, 26 May 2005 00:28:25 GMT  

    Derek> Is this byte code specified anywhere? I can't seem to find
    Derek> anything, except this very cursory overview:

    Derek> http://python.org/doc/current/lib/bytecodes.html

The source really is the best place to look; however, I started to cobble
some stuff together last summer and placed it in a Wiki:

    http://manatee.mojam.com/pyvmwiki

Feel free to add/edit to your heart's content as you discover bits that are
missing (there are many missing bits).

--

http://www.mojam.com/
http://www.musi-cal.com/



Thu, 26 May 2005 00:05:59 GMT  

Quote:

>> IMO, a more interesting (from a geek/CS POV) project would be somebody who
>> wants to write a compiler for a different language that generates bytecode
>> that would be executed by the Python VM and inter-operate with "Python"
>> bytecode.

> OK, that's something I hadn't thought of.  Such a task doesn't sound like
> fun, for a bunch of reasons.

Fun is subjective. ;)

Quote:
>> One of the big strengths of Python is the library.  If you generated Python
>> bytecodes and used the same calling interface, then you could implement a
>> new language that could also take advantage of the existing libraries.

> Yeah, but you can do that already through the C API, shurely?

I suppose, but then you'd have to have two virtual machines running, one for
the new language and one for Python libs, with C-language glue in-between.

Quote:
> [...]
>> Perhaps somebody wants to implement a hardware-assisted interpreter.  It's
>> been done for JVM.  If PVM isn't defined/stable, then doing HW acceleration
>> is going to be a bit difficult.

> I think it's going to be pretty hard in all circumstances.  But you make a
> valid point I hadn't thought of.

It would certainly be hard, but having an undefined VM makes it even harder.

--
Grant Edwards                   grante             Yow!  It's a lot of fun
                                  at               being alive... I wonder if
                               visi.com            my bed is made?!?



Thu, 26 May 2005 02:30:10 GMT  

Quote:

> IMO, a more interesting (from a geek/CS POV) project would be somebody who
> wants to write a compiler for a different language that generates bytecode
> that would be executed by the Python VM

You'd probably find it easier to translate the other language
into Python source and compile it with the Python compiler.

That solution would also have the advantage of not breaking
the next time the bytecode format changes.
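
As a rough sketch of that approach: the "let NAME be EXPR" toy language
and the translate() function below are invented for the example. The
point is only that the translator emits Python source text and leaves
bytecode generation to the built-in compile().

```python
def translate(line):
    """Translate a hypothetical 'let NAME be EXPR' statement into Python source."""
    words = line.split()
    if len(words) >= 4 and words[0] == "let" and words[2] == "be":
        return f"{words[1]} = {' '.join(words[3:])}"
    raise SyntaxError(f"cannot translate: {line!r}")

toy_program = ["let x be 6", "let y be x * 7"]
python_source = "\n".join(translate(line) for line in toy_program) + "\n"

# The Python compiler produces whatever bytecode this version uses.
code = compile(python_source, "<toy>", "exec")
namespace = {}
exec(code, namespace)
print(namespace["y"])
```

Nothing here depends on opcode numbers or the .pyc layout, so the
translator keeps working when those change.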

--
Greg Ewing, Computer Science Dept,
University of Canterbury,      
Christchurch, New Zealand
http://www.cosc.canterbury.ac.nz/~greg



Fri, 27 May 2005 10:28:50 GMT  
 