mirror of https://github.com/python/cpython.git (synced 2025-10-31 13:41:24 +00:00)

Text moved to PEP 339.

parent 5118517c16
commit 3909ff69e2

1 changed files with 0 additions and 507 deletions

@ -1,507 +0,0 @@
| Developer Notes for Python Compiler |  | ||||||
| =================================== |  | ||||||
| 
 |  | ||||||
| Table of Contents |  | ||||||
| ----------------- |  | ||||||
| 
 |  | ||||||
| - Scope |  | ||||||
|     Defines the limits of the change |  | ||||||
| - Parse Trees |  | ||||||
|     Describes the local (Python) concept |  | ||||||
| - Abstract Syntax Trees (AST) |  | ||||||
|     Describes the AST technology used |  | ||||||
| - Parse Tree to AST |  | ||||||
|     Defines the transform approach |  | ||||||
| - Control Flow Graphs |  | ||||||
|     Defines the creation of "basic blocks" |  | ||||||
| - AST to CFG to Bytecode |  | ||||||
|     Tracks the flow from AST to bytecode |  | ||||||
| - Code Objects |  | ||||||
|     Pointer to making bytecode "executable" |  | ||||||
| - Modified Files |  | ||||||
|     Files added/modified/removed from CPython compiler |  | ||||||
| - ToDo |  | ||||||
|     Work yet remaining (before complete) |  | ||||||
| - References |  | ||||||
|     Academic and technical references to technology used. |  | ||||||
| 
 |  | ||||||
| 
 |  | ||||||
| Scope |  | ||||||
| ----- |  | ||||||
| 
 |  | ||||||
| Historically (through 2.4), compilation from source code to bytecode |  | ||||||
| involved two steps: |  | ||||||
| 
 |  | ||||||
| 1. Parse the source code into a parse tree (Parser/pgen.c) |  | ||||||
| 2. Emit bytecode based on the parse tree (Python/compile.c) |  | ||||||
| 
 |  | ||||||
| This is not how a standard compiler works.  The usual |  | ||||||
| steps for compilation are: |  | ||||||
| 
 |  | ||||||
| 1. Parse source code into a parse tree (Parser/pgen.c) |  | ||||||
| 2. Transform parse tree into an Abstract Syntax Tree (Python/ast.c) |  | ||||||
| 3. Transform AST into a Control Flow Graph (Python/newcompile.c) |  | ||||||
| 4. Emit bytecode based on the Control Flow Graph (Python/newcompile.c) |  | ||||||
| 
 |  | ||||||
| Starting with Python 2.5, the above steps are now used.  This change |  | ||||||
| was made to simplify compilation by breaking bytecode generation into |  | ||||||
| three smaller steps.  The purpose of this document is to outline how |  | ||||||
| the latter three steps of the process work. |  | ||||||
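
The later stages can be observed from Python itself. The sketch below assumes a modern CPython, where the standard ``ast`` module and the built-in ``compile()`` expose the AST and the resulting code object; the parse tree and CFG stages are internal and are not visible this way. The names ``source`` and ``namespace`` are illustrative only.

```python
import ast

source = "x = 6 * 7\n"

# Source -> AST (the parse-tree stage is not exposed separately here).
tree = ast.parse(source, filename="<demo>", mode="exec")

# AST -> code object; symbol table, CFG, and assembly happen inside compile().
code = compile(tree, filename="<demo>", mode="exec")

# Run the resulting bytecode in a fresh namespace.
namespace = {}
exec(code, namespace)
print(type(tree).__name__, type(code).__name__, namespace["x"])
```

Passing the AST object to ``compile()`` rather than the source string makes the boundary between the AST stage and the later stages explicit.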
| 
 |  | ||||||
| This document does not touch on how parsing works beyond what is needed |  | ||||||
| to explain compilation.  It is also not exhaustive in terms of how the |  | ||||||
| entire system works.  You will most likely need to read some source to |  | ||||||
| have an exact understanding of all details. |  | ||||||
| 
 |  | ||||||
| 
 |  | ||||||
| Parse Trees |  | ||||||
| ----------- |  | ||||||
| 
 |  | ||||||
| Python's parser is an LL(1) parser mostly based off of the |  | ||||||
| implementation laid out in the Dragon Book [Aho86]_. |  | ||||||
| 
 |  | ||||||
| The grammar file for Python can be found in Grammar/Grammar, with the |  | ||||||
| numeric values of grammar rules stored in Include/graminit.h.  The |  | ||||||
| numeric values for types of tokens (literal tokens, such as ``:``, |  | ||||||
| numbers, etc.) are kept in Include/token.h.  The parse tree is made up |  | ||||||
| of ``node *`` structs (as defined in Include/node.h). |  | ||||||
| 
 |  | ||||||
| Querying data from the node structs can be done with the following |  | ||||||
| macros (defined in Include/node.h unless noted otherwise): |  | ||||||
| 
 |  | ||||||
| - ``CHILD(node *, int)`` |  | ||||||
| 	Returns the nth child of the node using zero-offset indexing |  | ||||||
| - ``RCHILD(node *, int)`` |  | ||||||
| 	Returns the nth child of the node from the right side; use |  | ||||||
| 	negative numbers! |  | ||||||
| - ``NCH(node *)`` |  | ||||||
| 	Number of children the node has |  | ||||||
| - ``STR(node *)`` |  | ||||||
| 	String representation of the node; e.g., will return ``:`` for a |  | ||||||
| 	COLON token |  | ||||||
| - ``TYPE(node *)`` |  | ||||||
| 	The type of node as specified in ``Include/graminit.h`` |  | ||||||
| - ``REQ(node *, TYPE)`` |  | ||||||
| 	Assert that the node is the type that is expected |  | ||||||
| - ``LINENO(node *)`` |  | ||||||
| 	Retrieve the line number of the source code that led to the |  | ||||||
| 	creation of the parse rule; defined in Python/ast.c |  | ||||||
| 
 |  | ||||||
| To tie all of this together with an example, consider the rule for 'while':: |  | ||||||
| 
 |  | ||||||
|   while_stmt: 'while' test ':' suite ['else' ':' suite] |  | ||||||
| 
 |  | ||||||
| The node representing this will have ``TYPE(node) == while_stmt`` and |  | ||||||
| the number of children can be 4 or 7 depending on whether there is an |  | ||||||
| 'else' clause.  To access what should be the first ':' and require it |  | ||||||
| to be an actual ':' token, use ``REQ(CHILD(node, 2), COLON)``. |  | ||||||
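
The access pattern can be modelled outside of C. The following is a toy Python model, not the real implementation: the ``Node`` class, the helper functions mirroring the macro names, and the numeric constants are all hypothetical stand-ins for what Include/node.h, Include/graminit.h, and Include/token.h provide.

```python
from dataclasses import dataclass, field
from typing import List

# Arbitrary placeholder values; the real ones come from the generated headers.
NAME, COLON, test, suite, while_stmt = 1, 11, 300, 301, 302


@dataclass
class Node:
    type: int
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def NCH(n):
    return len(n.children)


def CHILD(n, i):
    return n.children[i]


def TYPE(n):
    return n.type


def STR(n):
    return n.text


def REQ(n, expected):
    # Assert that the node has the expected type.
    assert TYPE(n) == expected, "unexpected node type"


# while_stmt: 'while' test ':' suite   (no 'else' part, so 4 children)
node = Node(while_stmt, children=[
    Node(NAME, "while"), Node(test), Node(COLON, ":"), Node(suite),
])
REQ(CHILD(node, 2), COLON)
print(NCH(node), STR(CHILD(node, 2)))
```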
| 
 |  | ||||||
| 
 |  | ||||||
| Abstract Syntax Trees (AST) |  | ||||||
| --------------------------- |  | ||||||
| 
 |  | ||||||
| The abstract syntax tree (AST) is a high-level representation of the |  | ||||||
| program structure without the necessity of containing the source code; |  | ||||||
| it can be thought of as an abstract representation of the source code.  The |  | ||||||
| AST nodes are specified using the Zephyr Abstract Syntax Definition |  | ||||||
| Language (ASDL) [Wang97]_. |  | ||||||
| 
 |  | ||||||
| The definition of the AST nodes for Python is found in the file |  | ||||||
| Parser/Python.asdl . |  | ||||||
| 
 |  | ||||||
| Each AST node (representing statements, expressions, and several |  | ||||||
| specialized types, like list comprehensions and exception handlers) is |  | ||||||
| defined by the ASDL.  Most definitions in the AST correspond to a |  | ||||||
| particular source construct, such as an 'if' statement or an attribute |  | ||||||
| lookup.  The definition is independent of its realization in any |  | ||||||
| particular programming language. |  | ||||||
| 
 |  | ||||||
| The following fragment of the Python ASDL construct demonstrates the |  | ||||||
| approach and syntax:: |  | ||||||
| 
 |  | ||||||
|   module Python |  | ||||||
|   { |  | ||||||
| 	stmt = FunctionDef(identifier name, arguments args, stmt* body, |  | ||||||
| 			    expr* decorators) |  | ||||||
| 	      | Return(expr? value) | Yield(expr value) |  | ||||||
| 	      attributes (int lineno) |  | ||||||
|   } |  | ||||||
| 
 |  | ||||||
| The preceding example describes three different kinds of statements: |  | ||||||
| function definitions, return statements, and yield statements.  All |  | ||||||
| three kinds are considered of type stmt as shown by '|' separating the |  | ||||||
| various kinds.  They all take arguments of various kinds and amounts. |  | ||||||
| 
 |  | ||||||
| Modifiers on the argument type specify the number of values needed; '?' |  | ||||||
| means it is optional, '*' means 0 or more, no modifier means only one |  | ||||||
| value for the argument and it is required.  FunctionDef, for instance, |  | ||||||
| takes an identifier for the name, 'arguments' for args, zero or more |  | ||||||
| stmt arguments for 'body', and zero or more expr arguments for |  | ||||||
| 'decorators'. |  | ||||||
| 
 |  | ||||||
| Do notice that something like 'arguments', which is a node type, is |  | ||||||
| represented as a single AST node and not as a sequence of nodes as with |  | ||||||
| stmt as one might expect.   |  | ||||||
| 
 |  | ||||||
| All three kinds also have an 'attributes' argument; this is shown by the |  | ||||||
| fact that 'attributes' lacks a '|' before it. |  | ||||||
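
The effect of the modifiers can be seen through the standard ``ast`` module. This sketch assumes a modern CPython; field names there may differ from the ASDL fragment above (for example, the decorator field has a different name), so only ``body``, ``args``, ``value``, and ``lineno`` are used.

```python
import ast

tree = ast.parse("def f(a):\n    return\n")
func = tree.body[0]      # a FunctionDef node
ret = func.body[0]       # a Return node

# 'stmt* body' becomes a sequence; 'expr? value' may be absent (None).
print(type(func.body).__name__, ret.value)

# 'arguments args' is a single node, not a sequence.
print(type(func.args).__name__)

# The 'attributes (int lineno)' entry is available on every stmt.
print(func.lineno, ret.lineno)
```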
| 
 |  | ||||||
| The statement definitions above generate the following C structure type:: |  | ||||||
| 
 |  | ||||||
|   typedef struct _stmt *stmt_ty; |  | ||||||
| 
 |  | ||||||
|   struct _stmt { |  | ||||||
|         enum { FunctionDef_kind=1, Return_kind=2, Yield_kind=3 } kind; |  | ||||||
|         union { |  | ||||||
|                 struct { |  | ||||||
|                         identifier name; |  | ||||||
|                         arguments_ty args; |  | ||||||
|                         asdl_seq *body; |  | ||||||
|                         asdl_seq *decorators; |  | ||||||
|                 } FunctionDef; |  | ||||||
|                  |  | ||||||
|                 struct { |  | ||||||
|                         expr_ty value; |  | ||||||
|                 } Return; |  | ||||||
|                  |  | ||||||
|                 struct { |  | ||||||
|                         expr_ty value; |  | ||||||
|                 } Yield; |  | ||||||
|         } v; |  | ||||||
|         int lineno; |  | ||||||
|   }; |  | ||||||
| 
 |  | ||||||
| Also generated are a series of constructor functions that allocate (in |  | ||||||
| this case) a stmt_ty struct with the appropriate initialization.  The |  | ||||||
| 'kind' field specifies which component of the union is initialized.  The |  | ||||||
| FunctionDef() constructor function sets 'kind' to FunctionDef_kind and |  | ||||||
| initializes the 'name', 'args', 'body', and 'attributes' fields. |  | ||||||
| 
 |  | ||||||
| *** NOTE: if you make a change here that can affect the output of bytecode that |  | ||||||
| is already in existence, make sure to delete your old .py(c|o) files!  Running |  | ||||||
| ``find . -name '*.py[co]' -exec rm -f {} ';'`` should do the trick. |  | ||||||
| 
 |  | ||||||
| 
 |  | ||||||
| Parse Tree to AST |  | ||||||
| ----------------- |  | ||||||
| 
 |  | ||||||
| The AST is generated from the parse tree (see Python/ast.c) using the |  | ||||||
| function:: |  | ||||||
| 
 |  | ||||||
|   mod_ty PyAST_FromNode(const node *n); |  | ||||||
| 
 |  | ||||||
| The function begins a tree walk of the parse tree, creating various AST |  | ||||||
| nodes as it goes along.  It does this by allocating all the new nodes it |  | ||||||
| needs, calling the proper AST node creation functions for any required |  | ||||||
| supporting nodes, and connecting them as needed. |  | ||||||
| 
 |  | ||||||
| Do realize that there is no automated nor symbolic connection between |  | ||||||
| the grammar specification and the nodes in the parse tree.  For |  | ||||||
| instance, one must keep track of which node in the parse tree one is |  | ||||||
| working with (e.g., if you are working with an 'if' statement you need |  | ||||||
| to watch out for the ':' token to find the end of the conditional).  No |  | ||||||
| help is directly provided by the parse tree as in yacc. |  | ||||||
| 
 |  | ||||||
| The functions called to generate AST nodes from the parse tree all have |  | ||||||
| the name ast_for_xx where xx is the grammar rule that the function |  | ||||||
| handles (alias_for_import_name is the exception to this).  These in turn |  | ||||||
| call the constructor functions as defined by the ASDL grammar and |  | ||||||
| contained in Python/Python-ast.c (which was generated by |  | ||||||
| Parser/asdl_c.py) to create the nodes of the AST.  This all leads to a |  | ||||||
| sequence of AST nodes stored in asdl_seq structs. |  | ||||||
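
As a rough illustration of what an ast_for_xx function does, here is a toy Python version for the 'while' rule. ``PNode`` and the ``While`` tuple are hypothetical stand-ins for the parse-tree node and the generated constructor; the real function is written in C and converts each child with further ast_for_xx calls rather than passing it through unchanged.

```python
from collections import namedtuple

# Toy parse-tree node: 'kind' is a rule name, 'kids' its children.
PNode = namedtuple("PNode", "kind kids")
# Toy AST node standing in for the result of a While constructor.
While = namedtuple("While", "test body orelse")


def ast_for_while_stmt(n):
    """while_stmt: 'while' test ':' suite ['else' ':' suite]"""
    assert n.kind == "while_stmt"
    # Child positions are tracked by hand: 1 is the test, 3 is the body.
    test, body = n.kids[1], n.kids[3]
    if len(n.kids) == 4:
        return While(test, body, None)
    if len(n.kids) == 7:
        return While(test, body, n.kids[6])
    raise SyntaxError("wrong number of children for while_stmt")


tree = PNode("while_stmt", ["while", "x", ":", "body", "else", ":", "tail"])
node = ast_for_while_stmt(tree)
print(node)
```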
| 
 |  | ||||||
| 
 |  | ||||||
| Functions and macros for creating and using ``asdl_seq *`` types are |  | ||||||
| found in Python/asdl.c and Include/asdl.h: |  | ||||||
| 
 |  | ||||||
| - ``asdl_seq_new(int)`` |  | ||||||
| 	Allocate memory for an asdl_seq for length 'size' |  | ||||||
| - ``asdl_seq_free(asdl_seq *)`` |  | ||||||
| 	Free asdl_seq struct |  | ||||||
| - ``asdl_seq_GET(asdl_seq *seq, int pos)`` |  | ||||||
| 	Get item held at 'pos' |  | ||||||
| - ``asdl_seq_SET(asdl_seq *seq, int pos, void *val)`` |  | ||||||
| 	Set 'pos' in 'seq' to 'val' |  | ||||||
| - ``asdl_seq_APPEND(asdl_seq *seq, void *val)`` |  | ||||||
| 	Set the end of 'seq' to 'val' |  | ||||||
| - ``asdl_seq_LEN(asdl_seq *)`` |  | ||||||
| 	Return the length of 'seq' |  | ||||||
| 
 |  | ||||||
| If you are working with statements, you must also worry about keeping |  | ||||||
| track of what line number generated the statement.  Currently the line |  | ||||||
| number is passed as the last parameter to each stmt_ty function. |  | ||||||
| 
 |  | ||||||
| 
 |  | ||||||
| Control Flow Graphs |  | ||||||
| ------------------- |  | ||||||
| 
 |  | ||||||
| A control flow graph (often referenced by its acronym, CFG) is a |  | ||||||
| directed graph that models the flow of a program using basic blocks that |  | ||||||
| contain the intermediate representation (abbreviated "IR", and in this |  | ||||||
| case is Python bytecode) within the blocks.  Basic blocks themselves are |  | ||||||
| a block of IR that has a single entry point but possibly multiple exit |  | ||||||
| points.  The single entry point is the key to basic blocks; it all has |  | ||||||
| to do with jumps.  An entry point is the target of something that |  | ||||||
| changes control flow (such as a function call or a jump) while exit |  | ||||||
| points are instructions that would change the flow of the program (such |  | ||||||
| as jumps and 'return' statements).  What this means is that a basic |  | ||||||
| block is a chunk of code that starts at the entry point and runs to an |  | ||||||
| exit point or the end of the block. |  | ||||||
|    |  | ||||||
| As an example, consider an 'if' statement with an 'else' block.  The |  | ||||||
| guard on the 'if' is a basic block which is pointed to by the basic |  | ||||||
| block containing the code leading to the 'if' statement.  The 'if' |  | ||||||
| statement block contains jumps (which are exit points) to the true body |  | ||||||
| of the 'if' and the 'else' body (which may be NULL), each of which is |  | ||||||
| its own basic block.  Both of those blocks in turn point to the |  | ||||||
| basic block representing the code following the entire 'if' statement. |  | ||||||
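
The 'if'/'else' example can be written down as a small graph. This is a toy model only; the ``BasicBlock`` class and the block names are hypothetical and do not reflect the C structures used by the compiler.

```python
class BasicBlock:
    """A toy basic block: straight-line instructions plus outgoing edges."""

    def __init__(self, name):
        self.name = name
        self.instrs = []   # IR held by the block
        self.succs = []    # blocks this block may transfer control to

    def jump_to(self, *targets):
        self.succs.extend(targets)


entry = BasicBlock("before_if")
guard = BasicBlock("guard")
then_body = BasicBlock("then")
else_body = BasicBlock("else")
after = BasicBlock("after_if")

entry.jump_to(guard)
guard.jump_to(then_body, else_body)   # the two exits of the guard
then_body.jump_to(after)
else_body.jump_to(after)

for block in (entry, guard, then_body, else_body, after):
    print(block.name, "->", [s.name for s in block.succs])
```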
| 
 |  | ||||||
| CFGs are usually one step away from final code output.  Code is directly |  | ||||||
| generated from the basic blocks (with jump targets adjusted based on the |  | ||||||
| output order) by doing a post-order depth-first search on the CFG |  | ||||||
| following the edges. |  | ||||||
| 
 |  | ||||||
| 
 |  | ||||||
| AST to CFG to Bytecode |  | ||||||
| ---------------------- |  | ||||||
| 
 |  | ||||||
| With the AST created, the next step is to create the CFG. The first step |  | ||||||
| is to convert the AST to Python bytecode without having jump targets |  | ||||||
| resolved to specific offsets (this is calculated when the CFG goes to |  | ||||||
| final bytecode). Essentially, this transforms the AST into Python |  | ||||||
| bytecode with control flow represented by the edges of the CFG. |  | ||||||
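
Once offsets are resolved, the edges of the CFG survive only as jump instructions. The sketch below assumes a modern CPython and uses the standard ``dis`` module to list them for a small 'if'/'else'; exact opcode names and offsets vary between versions, so none are shown as expected output.

```python
import dis

source = "if flag:\n    a = 1\nelse:\n    a = 2\n"
code = compile(source, "<demo>", "exec")

# Jump instructions are where the edges between basic blocks end up.
jumps = [ins for ins in dis.get_instructions(code) if "JUMP" in ins.opname]
for ins in jumps:
    print(ins.offset, ins.opname, ins.argval)
```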
| 
 |  | ||||||
| Conversion is done in two passes.  The first creates the namespace |  | ||||||
| (variables can be classified as local, free/cell for closures, or |  | ||||||
| global).  With that done, the second pass essentially flattens the CFG |  | ||||||
| into a list and calculates jump offsets for final output of bytecode. |  | ||||||
| 
 |  | ||||||
| The conversion process is initiated by a call to the function in |  | ||||||
| Python/newcompile.c:: |  | ||||||
| 
 |  | ||||||
|   PyCodeObject * PyAST_Compile(mod_ty, const char *, PyCompilerFlags); |  | ||||||
| 
 |  | ||||||
| This function does both the conversion of the AST to a CFG and |  | ||||||
| outputting final bytecode from the CFG.  The AST to CFG step is handled |  | ||||||
| mostly by the two functions called by PyAST_Compile():: |  | ||||||
| 
 |  | ||||||
|   struct symtable * PySymtable_Build(mod_ty, const char *, |  | ||||||
| 					PyFutureFeatures); |  | ||||||
|   PyCodeObject * compiler_mod(struct compiler *, mod_ty); |  | ||||||
| 
 |  | ||||||
| The former is in Python/symtable.c while the latter is in |  | ||||||
| Python/newcompile.c . |  | ||||||
| 
 |  | ||||||
| PySymtable_Build() begins by entering the starting code block for the |  | ||||||
| AST (passed-in) and then calling the proper symtable_visit_xx function |  | ||||||
| (with xx being the AST node type).  Next, the AST is walked, with the |  | ||||||
| various code blocks that delineate the reach of a local variable being |  | ||||||
| entered and exited using:: |  | ||||||
| 
 |  | ||||||
|   static int symtable_enter_block(struct symtable *, identifier, |  | ||||||
| 				    block_ty, void *, int); |  | ||||||
|   static int symtable_exit_block(struct symtable *, void *); |  | ||||||
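
The outcome of the symbol table pass can be inspected from Python with the standard ``symtable`` module. This sketch assumes a modern CPython; it shows the local, free, and global classifications mentioned earlier, not the C-level API.

```python
import symtable

source = (
    "g = 1\n"
    "def outer():\n"
    "    a = 2\n"
    "    def inner():\n"
    "        return a + g\n"
    "    return inner\n"
)

top = symtable.symtable(source, "<demo>", "exec")
outer = top.get_children()[0]     # block entered for 'outer'
inner = outer.get_children()[0]   # nested block entered for 'inner'

print(outer.lookup("a").is_local())    # bound in 'outer'
print(inner.lookup("a").is_free())     # closed over from 'outer'
print(inner.lookup("g").is_global())   # resolved at module level
```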
| 
 |  | ||||||
| Once the symbol table is created, it is time for CFG creation, whose |  | ||||||
| code is in Python/newcompile.c .  This is handled by several functions |  | ||||||
| that break the task down by various AST node types.  The functions are |  | ||||||
| all named compiler_visit_xx where xx is the name of the node type (such |  | ||||||
| as stmt, expr, etc.).  Each function receives a ``struct compiler *`` |  | ||||||
| and xx_ty where xx is the AST node type.  Typically these functions |  | ||||||
| consist of a large 'switch' statement, branching based on the kind of |  | ||||||
| node type passed to it.  Simple things are handled inline in the |  | ||||||
| 'switch' statement with more complex transformations farmed out to other |  | ||||||
| functions named compiler_xx with xx being a descriptive name of what is |  | ||||||
| being handled. |  | ||||||
| 
 |  | ||||||
| When transforming an arbitrary AST node, use the VISIT macro:: |  | ||||||
| 
 |  | ||||||
|   VISIT(struct compiler *, <node type>, <AST node>); |  | ||||||
| 
 |  | ||||||
| The appropriate compiler_visit_xx function is called, based on the value |  | ||||||
| passed in for <node type> (so ``VISIT(c, expr, node)`` calls |  | ||||||
| ``compiler_visit_expr(c, node)``).  The VISIT_SEQ macro is very similar, |  | ||||||
| but is called on AST node sequences (those values that were created as |  | ||||||
| arguments to a node that used the '*' modifier).  There is also |  | ||||||
| VISIT_SLICE just for handling slices:: |  | ||||||
| 
 |  | ||||||
|   VISIT_SLICE(struct compiler *, slice_ty, expr_context_ty); |  | ||||||
| 
 |  | ||||||
| Emission of bytecode is handled by the following macros: |  | ||||||
| 
 |  | ||||||
| - ``ADDOP(struct compiler *c, int op)`` |  | ||||||
|     add 'op' as an opcode |  | ||||||
| - ``ADDOP_I(struct compiler *c, int op, int oparg)`` |  | ||||||
|     add 'op' with an 'oparg' argument |  | ||||||
| - ``ADDOP_O(struct compiler *c, int op, PyObject *type, PyObject *obj)`` |  | ||||||
|     add 'op' with the proper argument based on the position of obj in |  | ||||||
|     'type', but with no handling of mangled names; used for when you |  | ||||||
|     need to do named lookups of objects such as globals, consts, or |  | ||||||
|     parameters where name mangling is not possible and the scope of the |  | ||||||
|     name is known |  | ||||||
| - ``ADDOP_NAME(struct compiler *, int, PyObject *, PyObject *)`` |  | ||||||
|     just like ADDOP_O, but name mangling is also handled; used for |  | ||||||
|     attribute loading or importing based on name |  | ||||||
| - ``ADDOP_JABS(struct compiler *c, int op, basicblock b)`` |  | ||||||
|     create an absolute jump to the basic block 'b' |  | ||||||
| - ``ADDOP_JREL(struct compiler *c, int op, basicblock b)`` |  | ||||||
|     create a relative jump to the basic block 'b' |  | ||||||
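
The interplay between the visit functions and the emission macros can be sketched with a toy compiler. ``ToyCompiler`` and its methods are hypothetical; they only mimic the roles of VISIT, ADDOP, and ADDOP_I, and the opcode names are used as plain labels. The sketch assumes a modern CPython ``ast`` module for the input nodes.

```python
import ast


class ToyCompiler:
    def __init__(self):
        self.block = []   # instructions of the current basic block

    def addop(self, op):            # plays the role of ADDOP
        self.block.append((op, None))

    def addop_i(self, op, oparg):   # plays the role of ADDOP_I
        self.block.append((op, oparg))

    def visit(self, kind, node):    # plays the role of VISIT
        return getattr(self, "visit_" + kind)(node)

    def visit_expr(self, e):
        # The 'switch' on the kind of expression node.
        if isinstance(e, ast.Constant):
            # A real oparg would be an index into the constants table;
            # the value is stored directly here for brevity.
            self.addop_i("LOAD_CONST", e.value)
        elif isinstance(e, ast.BinOp) and isinstance(e.op, ast.Add):
            self.visit("expr", e.left)
            self.visit("expr", e.right)
            self.addop("BINARY_ADD")
        else:
            raise NotImplementedError(type(e).__name__)


c = ToyCompiler()
c.visit("expr", ast.parse("1 + 2", mode="eval").body)
print(c.block)
```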
| 
 |  | ||||||
| Several helper functions that emit bytecode are named compiler_xx(), |  | ||||||
| where xx is what the function helps with (list, boolop, etc.).  A |  | ||||||
| rather useful one is:: |  | ||||||
| 
 |  | ||||||
|   static int compiler_nameop(struct compiler *, identifier, |  | ||||||
| 				expr_context_ty); |  | ||||||
| 
 |  | ||||||
| This function looks up the scope of a variable and, based on the |  | ||||||
| expression context, emits the proper opcode to load, store, or delete |  | ||||||
| the variable. |  | ||||||
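
The idea behind this lookup can be shown as a small table. This is a simplified sketch: the ``OPS`` table, the scope labels, and the ``nameop`` function are hypothetical, and the real function handles more cases (for example, names at module or class level).

```python
# (scope, expression context) -> opcode name; a simplified, hypothetical table.
OPS = {
    ("local", "Load"): "LOAD_FAST",
    ("local", "Store"): "STORE_FAST",
    ("local", "Del"): "DELETE_FAST",
    ("global", "Load"): "LOAD_GLOBAL",
    ("global", "Store"): "STORE_GLOBAL",
    ("global", "Del"): "DELETE_GLOBAL",
    ("free", "Load"): "LOAD_DEREF",
    ("free", "Store"): "STORE_DEREF",
}


def nameop(scopes, name, ctx):
    """Pick an opcode from the name's scope and the expression context."""
    scope = scopes.get(name, "global")   # unknown names are treated as global
    return (OPS[(scope, ctx)], name)


scopes = {"a": "local", "b": "free"}
print(nameop(scopes, "a", "Store"))
print(nameop(scopes, "b", "Load"))
print(nameop(scopes, "len", "Load"))
```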
| 
 |  | ||||||
| Handling of the line number on which a statement is defined is done by |  | ||||||
| compiler_visit_stmt() and thus is not a worry. |  | ||||||
| 
 |  | ||||||
| In addition to emitting bytecode based on the AST node, handling the |  | ||||||
| creation of basic blocks must be done.  Below are the macros and |  | ||||||
| functions used for managing basic blocks: |  | ||||||
| 
 |  | ||||||
| - ``NEW_BLOCK(struct compiler *)`` |  | ||||||
|     create block and set it as current |  | ||||||
| - ``NEXT_BLOCK(struct compiler *)`` |  | ||||||
|     basically NEW_BLOCK() plus jump from current block |  | ||||||
| - ``compiler_new_block(struct compiler *)`` |  | ||||||
|     create a block but don't use it (used for generating jumps) |  | ||||||
| 
 |  | ||||||
| Once the CFG is created, it must be flattened and then final emission of |  | ||||||
| bytecode occurs.  Flattening is handled using a post-order depth-first |  | ||||||
| search.  Once flattened, jump offsets are backpatched based on the |  | ||||||
| flattening and then a PyCodeObject is created.  All of this is |  | ||||||
| handled by calling:: |  | ||||||
| 
 |  | ||||||
|   PyCodeObject * assemble(struct compiler *, int); |  | ||||||
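
Flattening and backpatching can be illustrated on a toy graph. In this sketch the ``CFG`` dictionary, the block names, and the instruction counts are made up, and emitting blocks in the reverse of the post-order is an assumption about how the search result is consumed; the text above only states that a post-order depth-first search is used.

```python
# Toy CFG: block name -> (instruction count, successor names).
CFG = {
    "entry": (2, ["guard"]),
    "guard": (3, ["then", "else"]),
    "then": (4, ["after"]),
    "else": (1, ["after"]),
    "after": (2, []),
}


def flatten(cfg, start):
    """Return block names in reverse post-order of a depth-first search."""
    seen, post = set(), []

    def dfs(name):
        if name in seen:
            return
        seen.add(name)
        for succ in cfg[name][1]:
            dfs(succ)
        post.append(name)   # appended only after all successors

    dfs(start)
    return post[::-1]


def block_offsets(cfg, order):
    """Assign each block its starting offset in the flattened output."""
    offsets, pos = {}, 0
    for name in order:
        offsets[name] = pos
        pos += cfg[name][0]
    return offsets


order = flatten(CFG, "entry")
offsets = block_offsets(CFG, order)
# Jump targets can now be backpatched from block names to offsets.
jumps = {name: [offsets[s] for s in succs] for name, (_, succs) in CFG.items()}
print(order)
print(offsets)
```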
| 
 |  | ||||||
| *** NOTE: if you make a change here that can affect the output of bytecode that |  | ||||||
| is already in existence, make sure to delete your old .py(c|o) files!  Running |  | ||||||
| ``find . -name '*.py[co]' -exec rm -f {} ';'`` should do the trick. |  | ||||||
| 
 |  | ||||||
| 
 |  | ||||||
| Code Objects |  | ||||||
| ------------ |  | ||||||
| 
 |  | ||||||
| In the end, one ends up with a PyCodeObject which is defined in |  | ||||||
| Include/code.h .  And with that you now have executable Python bytecode! |  | ||||||
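
From Python, the resulting code object can be examined and run directly. This sketch assumes a modern CPython; the attribute names shown are those of the Python-level code object, not the fields of the C struct.

```python
code = compile("total = 1 + 2\n", "<demo>", "exec")

# The raw bytecode and the tables it indexes into.
print(type(code.co_code).__name__, code.co_names, code.co_consts)

# Executing the code object binds 'total' in the given namespace.
namespace = {}
exec(code, namespace)
print(namespace["total"])
```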
| 
 |  | ||||||
| 
 |  | ||||||
| Modified Files |  | ||||||
| -------------- |  | ||||||
| 
 |  | ||||||
| + Parser/ |  | ||||||
| 
 |  | ||||||
|     - Python.asdl |  | ||||||
|         ASDL syntax file |  | ||||||
| 
 |  | ||||||
|     - asdl.py |  | ||||||
|         "An implementation of the Zephyr Abstract Syntax Definition |  | ||||||
|         Language."  Uses SPARK_ to parse the ASDL files. |  | ||||||
| 
 |  | ||||||
|     - asdl_c.py |  | ||||||
|         "Generate C code from an ASDL description."  Generates |  | ||||||
| 	../Python/Python-ast.c and ../Include/Python-ast.h . |  | ||||||
| 
 |  | ||||||
|     - spark.py |  | ||||||
|         SPARK_ parser generator |  | ||||||
| 
 |  | ||||||
| + Python/ |  | ||||||
| 
 |  | ||||||
|     - Python-ast.c |  | ||||||
|         Creates C structs corresponding to the ASDL types.  Also |  | ||||||
| 	contains code for marshaling AST nodes (core ASDL types have |  | ||||||
| 	marshaling code in asdl.c).  "File automatically generated by |  | ||||||
| 	../Parser/asdl_c.py". |  | ||||||
| 
 |  | ||||||
|     - asdl.c |  | ||||||
|         Contains code to handle the ASDL sequence type.  Also has code |  | ||||||
|         to handle marshaling the core ASDL types, such as number and |  | ||||||
|         identifier.  Used by Python-ast.c for marshaling AST nodes. |  | ||||||
| 
 |  | ||||||
|     - ast.c |  | ||||||
|         Converts Python's parse tree into the abstract syntax tree. |  | ||||||
| 
 |  | ||||||
|     - compile.txt |  | ||||||
|         This file. |  | ||||||
| 
 |  | ||||||
|     - newcompile.c |  | ||||||
|         New version of compile.c that handles the emitting of bytecode. |  | ||||||
| 
 |  | ||||||
|     - symtable.c |  | ||||||
| 	Generates symbol table from AST. |  | ||||||
| 	 |  | ||||||
| 
 |  | ||||||
| + Include/ |  | ||||||
| 
 |  | ||||||
|     - Python-ast.h |  | ||||||
|         Contains the actual definitions of the C structs as generated by |  | ||||||
|         ../Python/Python-ast.c . |  | ||||||
|         "Automatically generated by ../Parser/asdl_c.py". |  | ||||||
| 
 |  | ||||||
|     - asdl.h |  | ||||||
|         Header for the corresponding ../Python/asdl.c . |  | ||||||
| 
 |  | ||||||
|     - ast.h |  | ||||||
|         Declares PyAST_FromNode() external (from ../Python/ast.c). |  | ||||||
| 
 |  | ||||||
|     - code.h |  | ||||||
| 	Header file for ../Objects/codeobject.c; contains definition of |  | ||||||
| 	PyCodeObject. |  | ||||||
| 
 |  | ||||||
|     - symtable.h |  | ||||||
| 	Header for ../Python/symtable.c .  struct symtable and |  | ||||||
| 	PySTEntryObject are defined here. |  | ||||||
| 
 |  | ||||||
| + Objects/ |  | ||||||
| 
 |  | ||||||
|     - codeobject.c |  | ||||||
| 	Contains PyCodeObject-related code (originally in |  | ||||||
| 	../Python/compile.c). |  | ||||||
| 
 |  | ||||||
| 
 |  | ||||||
| ToDo |  | ||||||
| ---- |  | ||||||
| *** NOTE: all bugs and patches should be filed on SF under the group |  | ||||||
| 	    "AST" for easy searching.  It also does not hurt to put |  | ||||||
| 	    "[AST]" at the beginning of the subject line of the tracker |  | ||||||
| 	    item. |  | ||||||
| 
 |  | ||||||
| + Stdlib support |  | ||||||
|     - AST->Python access? |  | ||||||
|     - rewrite compiler package to mirror AST structure? |  | ||||||
| + Documentation |  | ||||||
|     - flesh out this doc |  | ||||||
| 	* byte stream output |  | ||||||
| 	* explanation of how the symbol table pass works |  | ||||||
| 	* code object (PyCodeObject) |  | ||||||
| + Universal |  | ||||||
|     - make sure entire test suite passes |  | ||||||
|     - fix memory leaks |  | ||||||
|     - make sure return types are properly checked for errors |  | ||||||
|     - no gcc warnings |  | ||||||
| 
 |  | ||||||
| References |  | ||||||
| ---------- |  | ||||||
| 
 |  | ||||||
| .. [Aho86] Alfred V. Aho, Ravi Sethi, Jeffrey D. Ullman. |  | ||||||
|    `Compilers: Principles, Techniques, and Tools`, |  | ||||||
|    http://www.amazon.com/exec/obidos/tg/detail/-/0201100886/104-0162389-6419108 |  | ||||||
| 
 |  | ||||||
| .. [Wang97]  Daniel C. Wang, Andrew W. Appel, Jeff L. Korn, and Chris |  | ||||||
|    S. Serra.  `The Zephyr Abstract Syntax Description Language.`_ |  | ||||||
|    In Proceedings of the Conference on Domain-Specific Languages, pp. |  | ||||||
|    213--227, 1997. |  | ||||||
| 
 |  | ||||||
| .. _The Zephyr Abstract Syntax Description Language.: |  | ||||||
|    http://www.cs.princeton.edu/~danwang/Papers/dsl97/dsl97.html |  | ||||||
| 
 |  | ||||||
| .. _SPARK: http://pages.cpsc.ucalgary.ca/~aycock/spark/ |  | ||||||
| 
 |  | ||||||
	 Brett Cannon