CPython - What's Under the Hood?

(These notes are still quite a work-in-progress but I doubt I'll be visiting this anytime soon ¯\_(ツ)_/¯ )

Python is an interpreted language. What does that mean? Let's say we have a text file something.py. This is our source code. We pass this file to a program called "python" and that program returns an output based on the contents of something.py. That middleware program is the Python interpreter. This articles aims to unpack some aspects of how the interpreter works. There are many interpreters out there such as PyPy (written in Python), Jython (written in Java). For our purposes, we will only be looking at CPython (written in C), the most widely-adapted Python interpreter of them all.

People say that python is an interpreted language. That is kind of misleading because every language has a compiler. All a compiler is, in a general sense, is a program that converts a program from one format to another.

So, the first stage of the interpretation is a compiler which converts python to python bytecode. And the interpreter actually reads the bytecode and returns the output.

So we are going to focus more on bytecode -> output, i.e the interpreter part.

Reading the CPython source code

There are two main sub-directories -

  1. ./Include - This has all the header files. All the interfaces for the objects. .h files.
  2. ./Objects - Each file inside here represents apython object

Other interesting directories -

  1. ./Python - This is the main runtime
  2. ./Modules - Each language comes with a standard library. Not really core to the interpreter.
  3. ./Lib - The standard library functions which are written in Python itself.

The main interpreter loop is in ./Python/ceval.c

Python ByteCode and the Code Object

We are going to look at two files - Include/opcode.h and Python/ceval.c. Let's create a new file test.py and add the following program to it -

x = 1
y = 2
z = x + y
print(z)

Now, let's open an Interactive Python shell using ipython and type c = compile(open('test.py').read(), 'test.py', 'exec')

>>> c = compile(open('test.py').read(), 'test.py', 'exec')
>>> c
<code object <module> at 0x10aca6190, file "test.py", line 1>

For some more information on the in-built compile function, help(compile) gives the following output -

compile(source, filename, mode, flags=0, dont_inherit=False, optimize=-1, *, _feature_version=-1)
    Compile source into a code object that can be executed by exec() or eval().

    The source code may represent a Python module, statement or expression.
    The filename will be used for run-time error messages.
    The mode must be 'exec' to compile a module, 'single' to compile a
    single (interactive) statement, or 'eval' to compile an expression.
    The flags argument, if present, controls which future statements influence
    the compilation of the code.
    The dont_inherit argument, if true, stops the compilation inheriting
    the effects of any future statements in effect in the code calling
    compile; if absent or false these statements do influence the compilation,
    in addition to any features explicitly specified.

So when running this, we can pass whatever for filename. It is to be used to report errors. The first argument is important. exec will compile a module. Conceptually every python file is a module.

# List the attributes of the code object
>>> code_attributes = { attr: getattr(c, attr)  for attr in dir(c) }
>>> pprint(code_attributes)
...
'co_argcount': 0,
'co_cellvars': (),
'co_code': b'e\x00j\x01j\x01\x01\x00d\x00S\x00',
'co_consts': (None,),
'co_filename': 'test.py',
'co_firstlineno': 1,
'co_flags': 64,
'co_freevars': (),
'co_kwonlyargcount': 0,
'co_lnotab': b'',
'co_name': '<module>',
'co_names': ('test', 'py'),
'co_nlocals': 0,
'co_posonlyargcount': 0,
'co_stacksize': 1,
'co_varnames': (),
...

  1. co_code: Is the bytecode of our source code. type(c.co_code) is bytes.
>>> [byte for byte in c.co_code]
[101, 0, 106, 1, 106, 1, 1, 0, 100, 0, 83, 0]

To look at the human-readable version of the bytecode, run python -m dis test.py. Bytecode is a series of instructions. Each instruction consists of two bytes: one for an opcode and one for an argument. -

  1           0 LOAD_CONST               0 (1)
              2 STORE_NAME               0 (x)

  2           4 LOAD_CONST               1 (2)
              6 STORE_NAME               1 (y)

  3           8 LOAD_NAME                0 (x)
             10 LOAD_NAME                1 (y)
             12 BINARY_ADD
             14 STORE_NAME               2 (z)

  4          16 LOAD_NAME                3 (print)
             18 LOAD_NAME                2 (z)
             20 CALL_FUNCTION            1
             22 POP_TOP
             24 LOAD_CONST               2 (None)
             26 RETURN_VALUE

The first number here is the corresponding line number in test.py. The second number is the byte offset (i.e the index of the instruction). Before Python 3.6 (TODO: double check this), byte offsets could be odd. Now for simplicity, its even. Even if an opcode does not need an argument, it is given an argument The third item is the opcode of the current instruction

Python VM

CPython is a stack-oriented virtual machine. Each function pushes a new entry - a frame - onto the call stack. When a function returns, its frame is popped off the stack.

Inside each call frame of the call stack, there are two more stacks -

  1. evaluation stack (a.k.a value stack) - Used for computation within the frame. Most of the computation is done by working on the top of this stack.
  2. block stack - Tracks the "blocks" (loops, try/except, with etc.). To handle statements like break and continue, Python needs to know which block are we currently inside of. Every time you go into a construct like this, it pushes the block onto the stack. Once the block has finished executing, it pops the block off the stack.

Let's change test.py to be the following program -

def fib(n):
    if n < 2:
        return n

    current, next = 0, 1
    while n:
        current, next = next, current + next
        n -= 1

    return current

co_consts

A Tuple which contains all the literal or constant values that were referenced in the body of our function.

>>> fib.__code__.co_consts
(None, 2, (0, 1), 1)

There is a None there because if Python finishes executing the function without reaching an explicit return statement, it will return None

co_varnames

A Tuple which contains all names of the local variables in the function.

>>> fib.__code__.co_varnames
('n', 'current', 'next')

co_names

A Tuple which contains all the non local names which are references in the function.

>>> fib.__code__.co_names
()

co_code

The bytecode of the fib function. In Python 3 this is a bytes object. Previously, this used to be a string

fib.__code__.co_code
>>> b'|\x00d\x01k\x00r\x0c|\x00S\x00d\x02\\\x02}\x01}\x02|\x00r0|\x02|\x01|\x02\x17\x00\x02\x00}\x01}\x02|\x00d\x038\x00}\x00q\x14|\x01S\x00'

Disassembling test.py

  1           0 LOAD_CONST               0 (<code object fib at 0x105e47030, file "test.py", line 1>)
              2 LOAD_CONST               1 ('fib')
              4 MAKE_FUNCTION            0
              6 STORE_NAME               0 (fib)
              8 LOAD_CONST               2 (None)
             10 RETURN_VALUE

Disassembly of <code object fib at 0x105e47030, file "test.py", line 1>:
  2           0 LOAD_FAST                0 (n)
              2 LOAD_CONST               1 (2)
              4 COMPARE_OP               0 (<)
              6 POP_JUMP_IF_FALSE       12

  3           8 LOAD_FAST                0 (n)
             10 RETURN_VALUE

  5     >>   12 LOAD_CONST               2 ((0, 1))
             14 UNPACK_SEQUENCE          2
             16 STORE_FAST               1 (current)
             18 STORE_FAST               2 (next)

  6     >>   20 LOAD_FAST                0 (n)
             22 POP_JUMP_IF_FALSE       48

  7          24 LOAD_FAST                2 (next)
             26 LOAD_FAST                1 (current)
             28 LOAD_FAST                2 (next)
             30 BINARY_ADD
             32 ROT_TWO
             34 STORE_FAST               1 (current)
             36 STORE_FAST               2 (next)

  8          38 LOAD_FAST                0 (n)
             40 LOAD_CONST               3 (1)
             42 INPLACE_SUBTRACT
             44 STORE_FAST               0 (n)
             46 JUMP_ABSOLUTE           20

 10     >>   48 LOAD_FAST                1 (current)
             50 RETURN_VALUE

The >> means that that line is a potential jump target of another instruction.

Let's disassemble the call to this function fib:

 dis.dis('fib(8)')
  1           0 LOAD_NAME                0 (fib)
              2 LOAD_CONST               0 (8)
              4 CALL_FUNCTION            1
              6 RETURN_VALUE
  • 0 LOAD_NAME Loads the fibonacci function object onto the evaluation stack of the current call frame.
  • 2 LOAD_CONST Loads the number 8 onto the evaluation stack of the current call frame.
  • 4 CALL_FUNCTION 1 Calls the function. The argument 1 says that there is exactly one positional argument to call this function with. So, if the argument was k, it would first pop the first k elements from the evaluation stack. It is known that after the k elements are removed, the next item is the function object. It will pop that off next and call the function. The output is then pushed onto the evaluation stack of the current call frame.
  • RETURN_VALUE Pops the top element of the stack and returns to the caller.

Local names are faster than global ones LOAD_CONST > LOAD_FAST > LOAD_NAME or LOAD_GLOBAL

Execution of a frame

In ceval.c, the function PyObject *PyEval_EvalFrameEx(PyFrameObject *f, int throwflag) executes one frame at a time.

Get's the code from the frame object

co = f->f_code;
names = co->co_names;
consts = co->co_consts;
fastlocals = f->f_localsplus;
freevars = f->f_localsplus + co->co_nlocals;
first_instr = (unsigned char*) PyString_AS_STRING(co->co_code);

Let's make a foobar.py with the contents -

x = 10

def foo(x):
    y = x * 2
    return bar(y)

def bar(x):
    y = x / 2
    return y

z = foo(x)
➜  Python-2.7.8 python -m dis foobar.py 
  1           0 LOAD_CONST               0 (10)
              2 STORE_NAME               0 (x)

  3           4 LOAD_CONST               1 (<code object foo at 0x107c90240, file "foobar.py", line 3>)
              6 LOAD_CONST               2 ('foo')
              8 MAKE_FUNCTION            0
             10 STORE_NAME               1 (foo)

  7          12 LOAD_CONST               3 (<code object bar at 0x107c902f0, file "foobar.py", line 7>)
             14 LOAD_CONST               4 ('bar')
             16 MAKE_FUNCTION            0
             18 STORE_NAME               2 (bar)

 11          20 LOAD_NAME                1 (foo)
             22 LOAD_NAME                0 (x)
             24 CALL_FUNCTION            1
             26 STORE_NAME               3 (z)
             28 LOAD_CONST               5 (None)
             30 RETURN_VALUE

Disassembly of <code object foo at 0x107c90240, file "foobar.py", line 3>:
  4           0 LOAD_FAST                0 (x)
              2 LOAD_CONST               1 (2)
              4 BINARY_MULTIPLY
              6 STORE_FAST               1 (y)

  5           8 LOAD_GLOBAL              0 (bar)
             10 LOAD_FAST                1 (y)
             12 CALL_FUNCTION            1
             14 RETURN_VALUE

Disassembly of <code object bar at 0x107c902f0, file "foobar.py", line 7>:
  8           0 LOAD_FAST                0 (x)
              2 LOAD_CONST               1 (2)
              4 BINARY_TRUE_DIVIDE
              6 STORE_FAST               1 (y)

  9           8 LOAD_FAST                1 (y)
             10 RETURN_VALUE

We see a MAKE_FUNCTION here. The function body has already been compiled into a code object. But we still need this extra step because a function object is more than a code object, it is a closure. If the function uses a global variable, it needs to recursively look up the heirarcy of namespaces to resolve it. A code object does not do that. So, MAKE_FUNCTION is important for that reason. It creates a function object which has a code object attribute.

./Include/code.h contains information about the code object

/* Bytecode object */
typedef struct {
    PyObject_HEAD
    int co_argcount;        /* #arguments, except *args */
    int co_nlocals;         /* #local variables */
    int co_stacksize;       /* #entries needed for evaluation stack */
    int co_flags;           /* CO_..., see below */
    PyObject *co_code;      /* instruction opcodes */
    PyObject *co_consts;    /* list (constants used) */
    PyObject *co_names;     /* list of strings (names used) */
    PyObject *co_varnames;  /* tuple of strings (local variable names) */
    PyObject *co_freevars;  /* tuple of strings (free variable names) */
    PyObject *co_cellvars;      /* tuple of strings (cell variable names) */
    /* The rest doesn't count for hash/cmp */
    PyObject *co_filename;  /* string (where it was loaded from) */
    PyObject *co_name;      /* string (name, for reference) */
    int co_firstlineno;     /* first source line number */
    PyObject *co_lnotab;    /* string (encoding addr<->lineno mapping) See
                   Objects/lnotab_notes.txt for details. */
    void *co_zombieframe;       /* for optimization only (see frameobject.c) */
    PyObject *co_weakreflist;   /* to support weakrefs to code objects */
} PyCodeObject;

./Include/frameobject.h contains information about the frame object

typedef struct _frame {
    PyObject_VAR_HEAD
    struct _frame *f_back;    /* previous frame, or NULL */
    PyCodeObject *f_code;     /* code segment */
    PyObject *f_builtins;     /* builtin symbol table (PyDictObject) */
    PyObject *f_globals;      /* global symbol table (PyDictObject) */
    PyObject *f_locals;       /* local symbol table (any mapping) */
    PyObject **f_valuestack;  /* points after the last local */

    // Next free slot in f_valuestack.  Frame creation sets to f_valuestack.
    // Frame evaluation usually NULLs it, but a frame that yields sets it
    // to the current stack top.
    PyObject **f_stacktop;

    PyObject *f_trace;      /* Trace function */

    // If an exception is raised in this frame, the next three are used to
    // record the exception info (if any) originally in the thread state.
    PyObject *f_exc_type, *f_exc_value, *f_exc_traceback;

    PyThreadState *f_tstate;
    int f_lasti;        /* Last instruction if called */
    int f_lineno;       /* Current line number */
    int f_iblock;       /* index in f_blockstack */
    PyTryBlock f_blockstack[CO_MAXBLOCKS]; /* for try and loop blocks */
    PyObject *f_localsplus[1];  /* locals+stack, dynamically sized */
} PyFrameObject;

  • f_back points to the encapsulating frame, so the previous frame in the call frame stack.
  • f_valuestack Each frame has a value stack.
  • f_localsplus The value stack and the local variables are stored in the same structure for optimization but conceptually are two different lists.

CALL_FUNCTION

[Refer to later part of [Philip Guo's lecture](https://www.youtube.com/watch?v=dJD9mLPCkuc&list=PLzV58Zm8FuBL6OAv1Yu6AwXZrnsFbbR0S&index=3)]

Difference between PyFrameObject, PyFunctionObject and PyCodeObject

Clarification from functionobject.h -

Function objects and code objects should not be confused with each other: Function objects are created by the execution of the 'def' statement. They reference a code object in their func_code attribute, which is a purely syntactic object, i.e. nothing more than a compiled version of some source code lines. There is one code object per source code "fragment", but each code object can be referenced by zero or many function objects depending only on how many times the 'def' statement in the source was executed so far.

typedef struct {
    PyObject_HEAD
    PyObject *func_code;         /* A code object */
    PyObject *func_globals;      /* A dictionary (other mappings won't do) */
    PyObject *func_defaults;     /* NULL or a tuple */
    PyObject *func_closure;      /* NULL or a tuple of cell objects */
    PyObject *func_doc;          /* The __doc__ attribute, can be anything */
    PyObject *func_name;         /* The __name__ attribute, a string object */
    PyObject *func_dict;         /* The __dict__ attribute, a dict or NULL */
    PyObject *func_weakreflist;  /* List of weak references */
    PyObject *func_module;       /* The __module__ attribute, can be anything */
} PyFunctionObject;

PyObject

Everything in Python is basically a result of passing a bunch of PyObject pointers around.

https://youtu.be/naZTXNBbcLc?t=1054

Let's create a if-else.py with the following program -

if x < y:
    z = x
elif True:
    z = y
else:
    z = 100
➜  Python-2.7.8 python -m dis if-else.py 
  1           0 LOAD_NAME                0 (x)
              2 LOAD_NAME                1 (y)
              4 COMPARE_OP               0 (<)
              6 POP_JUMP_IF_FALSE       14

  2           8 LOAD_NAME                0 (x)
             10 STORE_NAME               2 (z)
             12 JUMP_FORWARD             4 (to 18)

  4     >>   14 LOAD_NAME                1 (y)
             16 STORE_NAME               2 (z)

  6     >>   18 LOAD_CONST               0 (None)
             20 RETURN_VALUE

PyObject is defined in ./Include/object.h as -

#define _PyObject_HEAD_EXTRA

/* PyObject_HEAD defines the initial segment of every PyObject. */
#define PyObject_HEAD                   \
    _PyObject_HEAD_EXTRA                \
    Py_ssize_t ob_refcnt;               \
    struct _typeobject *ob_type;

/* Nothing is actually declared to be a PyObject, but every pointer to
 * a Python object can be cast to a PyObject*.  This is inheritance built
 * by hand.  Similarly every pointer to a variable-size Python object can,
 * in addition, be cast to PyVarObject*.
 */
typedef struct _object {
    PyObject_HEAD
} PyObject;

Under usual execution circumstances (unless some special flag is used), _PyObject_HEAD_EXTRA is empty.

Every object in Python is wrapped as a PyObject. A PyObject has two main things - a type which is a pointer to a PyTypeObject and a reference count. Every object in python is an instance of a subtype of PyObject. The source code uses "Structural Subtyping". Since C does not have proper OOP constructs, we just make sure that every struct that "inherits" from PyObject has all the attributes of a PyObject along with any additional fields. So for example, PyIntObject is defined as -

typedef struct {
    PyObject_HEAD
    long ob_ival;
} PyIntObject;

As we saw earlier, PyObject_HEAD is basically a macro for two attributes - the type and the reference count. So, PyIntObject has those two and an additional long ob_ival attribute.

PyObject Case Studies

In the previous section we looked at PyObject. In this part, we will explore objects such as lists, strings and tuples.

Apart from PyObject, there is also a PyVarObject. The main difference between a PyObject and a PyVarObject is that the latter has a Py_ssize_t ob_size field. So, for objects which can have a variable size such as lists, they inherit from PyVarObject

PyStringObject

In C, strings are just a sequence of k bytes where each byte corresponds to a character and the last element of the sequence is a 0x00. So in C, a string of k characters takes k+1 bytes in memory.

Let's read the following snippet from stringobject.h

Type PyStringObject represents a character string. An extra zero byte is reserved at the end to ensure it is zero-terminated, but a size is present so strings with null bytes in them can be represented. This is an immutable object type.

typedef struct {
    PyObject_VAR_HEAD
    long ob_shash;
    int ob_sstate;
    char ob_sval[1];
} PyStringObject;

#define SSTATE_NOT_INTERNED 0
#define SSTATE_INTERNED_MORTAL 1
#define SSTATE_INTERNED_IMMORTAL 2
  • ob_sval Although it seems to be defined to be of size 1, when a string is created, it will be allocated to be of size ob_size + 1 and ob_sval[ob_size] == 0 since the last element is a 0x00.
  • ob_shash is the hash of the string or -1 if not computed yet.
  • ob_sstate keeps track of whether a string is interened. ob_sstate != 0 if and only if the string object is in stringobject.c's interned dictionary; in this case the two references from interned to this object are not counted in ob_refcnt.

PyDictObject

Python is built around dictionaries such as globals, locals, module dictionaries, class dictionaries, instance dictionaries etc. Interestingly, globals is a dictioanry but locals is not a dictionary until the point that you ask to see it. Internally, it is implemented in a more optimized form. Classes and instances are thin wrappers around dictionaries.

References

Lecture Series

  1. CPython internals: A ten-hour codewalk through the Python interpreter source code - Philip Guo

Books

  1. Inside The Python Virtual Machine - Obi Ike-Nwosu

Articles

  1. Ten Thousand Meters - Victor Skvortsov
  2. Python Internals - Eli Bendersky
  3. Extending and Embedding the Python Interpreter - Multiple Contributors (list)

Talks

  1. A Bit about Bytes: Understanding Python Bytecode - James Bennett
  2. Modern Python Dictionaries A confluence of a dozen great ideas - Raymond Hettinger (slides)


    Published on 03 August 2021