Document the compiler a little.
This commit is contained in:
parent
893c2bc46e
commit
67fd0ddbbe
@ -13,6 +13,173 @@ Hy Models
|
|||||||
Write this.
|
Write this.
|
||||||
|
|
||||||
|
|
||||||
|
Hy Internal Theory
|
||||||
|
==================
|
||||||
|
|
||||||
|
.. _overview:
|
||||||
|
|
||||||
|
Overview
|
||||||
|
--------
|
||||||
|
|
||||||
|
The Hy internals work by acting as a front-end to Python bytecode, so that
|
||||||
|
Hy it's self compiles down to Python Bytecode, allowing an unmodified Python
|
||||||
|
runtime to run Hy.
|
||||||
|
|
||||||
|
The way we do this is by translating Hy into Python AST, and building that AST
|
||||||
|
down into Python bytecode using standard internals, so that we don't have
|
||||||
|
to duplicate all the work of the Python internals for every single Python
|
||||||
|
release.
|
||||||
|
|
||||||
|
Hy works in four stages. The following sections will cover each step of Hy
|
||||||
|
from source to runtime.
|
||||||
|
|
||||||
|
.. _lexing:
|
||||||
|
|
||||||
|
Lexing / tokenizing
|
||||||
|
-------------------
|
||||||
|
|
||||||
|
The first stage of compiling hy is to lex the source into tokens that we can
|
||||||
|
deal with. We use a project called rply, which is a really nice (and fast)
|
||||||
|
parser, written in a subset of Python called rply.
|
||||||
|
|
||||||
|
The lexing code is all defined in ``hy.lex.lexer``. This code is mostly just
|
||||||
|
defining the Hy grammer, and all the actual hard parts are taken care of by
|
||||||
|
rply -- we just define "callbacks" for rply in ``hy.lex.parser``, which take
|
||||||
|
the tokens generated, and return the Hy models.
|
||||||
|
|
||||||
|
You can think of the Hy models as the "AST" for Hy, it's what Macros operate
|
||||||
|
on (directly), and it's what the compiler uses when it compiles Hy down.
|
||||||
|
|
||||||
|
Check the documentation for more information on the Hy models for more
|
||||||
|
information regarding the Hy models, and what they mean.
|
||||||
|
|
||||||
|
.. TODO: Uh, we should, like, document models.
|
||||||
|
|
||||||
|
|
||||||
|
.. _compiling:
|
||||||
|
|
||||||
|
Compiling
|
||||||
|
---------
|
||||||
|
|
||||||
|
This is where most of the magic in Hy happens. This is where we take Hy AST
|
||||||
|
(the models), and compile them into Python AST. A couple of funky things happen
|
||||||
|
here to work past a few problems in AST, and working in the compiler is some
|
||||||
|
of the most important work we do have.
|
||||||
|
|
||||||
|
The compiler is a bit complex, so don't feel bad if you don't grok it on the
|
||||||
|
first shot, it may take a bit of time to get right.
|
||||||
|
|
||||||
|
The main entry-point to the Compiler is ``HyASTCompiler.compile``. This method
|
||||||
|
is invoked, and the only real "public" method on the class (that is to say,
|
||||||
|
we don't really promise the API beyond that method).
|
||||||
|
|
||||||
|
In fact, even internally, we don't recurse directly hardly ever, we almost
|
||||||
|
always force the Hy tree through ``compile``, and will often do this with
|
||||||
|
sub-elements of an expression that we have. It's up to the Type-based dispatcher
|
||||||
|
to properly dispatch sub-elements.
|
||||||
|
|
||||||
|
All methods that preform a compilation are marked with the ``@builds()``
|
||||||
|
decorator. You can either pass the class of the Hy model that it compiles,
|
||||||
|
or you can use a string for expressions. I'll clear this up in a second.
|
||||||
|
|
||||||
|
First stage type-dispatch
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
Let's start in the ``compile`` method. The first thing we do is check the
|
||||||
|
Type of the thing we're building. We look up to see if we have a method that
|
||||||
|
can build the ``type()`` that we have, and dispatch to the method that can
|
||||||
|
handle it. If we don't have any methods that can build that type, we raise
|
||||||
|
an internal ``Exception``.
|
||||||
|
|
||||||
|
For instance, if we have a ``HyString``, we have an almost 1-to-1 mapping of
|
||||||
|
Hy AST to Python AST. The ``compile_string`` method takes the ``HyString``, and
|
||||||
|
returns an ``ast.Str()`` that's populated with the correct line-numbers and
|
||||||
|
content.
|
||||||
|
|
||||||
|
Macro-expand
|
||||||
|
~~~~~~~~~~~~
|
||||||
|
|
||||||
|
If we get a ``HyExpression``, we'll attempt to see if this is a known
|
||||||
|
Macro, and push to have it expanded by invoking ``hy.macros.macroexpand``, then
|
||||||
|
push the result back into ``HyASTCompiler.compile``.
|
||||||
|
|
||||||
|
Second stage expression-dispatch
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
The only special case is the ``HyExpression``, since we need to create different
|
||||||
|
AST depending on the special form in question. For instance, when we hit an
|
||||||
|
``(if true true false)``, we need to generate a ``ast.If``, and properly
|
||||||
|
compile the sub-nodes. This is where the ``@builds()`` with a String as an
|
||||||
|
argument comes in.
|
||||||
|
|
||||||
|
For the ``compile_expression`` (which is defined with an
|
||||||
|
``@builds(HyExpression)``) will dispatch based on the string of the first
|
||||||
|
argument. If, for some reason, the first argument is not a string, it will
|
||||||
|
properly handle that case as well (most likely by raising an ``Exception``).
|
||||||
|
|
||||||
|
If the String isn't known to Hy, it will default to create an ``ast.Call``,
|
||||||
|
which will try to do a runtime call (in Python, something like ``foo()``).
|
||||||
|
|
||||||
|
Issues hit with Python AST
|
||||||
|
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
Python AST is great; it's what's enabled us to write such a powerful project
|
||||||
|
on top of Python without having to fight Python too hard. Like anything, we've
|
||||||
|
had our fair share of issues, and here's a short list of the common ones you
|
||||||
|
might run into.
|
||||||
|
|
||||||
|
*Python differentiates between Statements and Expressions*.
|
||||||
|
|
||||||
|
This might not sound like a big deal -- in fact, to most Python programmers,
|
||||||
|
this will shortly become a "Well, yeah" moment.
|
||||||
|
|
||||||
|
In Python, doing something like:
|
||||||
|
|
||||||
|
``print for x in range(10): pass``, because ``print`` prints expressions, and
|
||||||
|
``for`` isn't an expression, it's a control flow statement. Things like
|
||||||
|
``1 + 1`` are Expressions, as is ``lambda x: 1 + x``, but other language
|
||||||
|
features, such as ``if``, ``for``, or ``while`` are statements.
|
||||||
|
|
||||||
|
Since they have no "value" to Python, this makes working in Hy hard, since
|
||||||
|
doing something like ``(print (if true true false))`` is not just common, it's
|
||||||
|
expected.
|
||||||
|
|
||||||
|
As a result, we auto-mangle things using a ``Result`` object, where we offer
|
||||||
|
up any ``ast.stmt`` that need to get run, and a single ``ast.expr`` that can
|
||||||
|
be used to get the value of whatever was just run. Hy does this by forcing
|
||||||
|
assignment to things while running.
|
||||||
|
|
||||||
|
As example, the Hy::
|
||||||
|
|
||||||
|
(print (if true true false))
|
||||||
|
|
||||||
|
Will turn into::
|
||||||
|
|
||||||
|
if True:
|
||||||
|
_mangled_name_here = True
|
||||||
|
else:
|
||||||
|
_mangled_name_here = False
|
||||||
|
|
||||||
|
print _mangled_name_here
|
||||||
|
|
||||||
|
|
||||||
|
OK, that was a bit of a lie, since we actually turn that statement
|
||||||
|
into::
|
||||||
|
|
||||||
|
print True if True else False
|
||||||
|
|
||||||
|
By forcing things into an ``ast.expr`` if we can, but the general idea holds.
|
||||||
|
|
||||||
|
|
||||||
|
Runtime
|
||||||
|
-------
|
||||||
|
|
||||||
|
After we have a Python AST tree that's complete, we can try and compile it to
|
||||||
|
Python bytecode by pushing it through ``eval``. From here on out, we're no
|
||||||
|
longer in control, and Python is taking care of everything. This is why things
|
||||||
|
like Python tracebacks, pdb and django apps work.
|
||||||
|
|
||||||
|
|
||||||
Hy Macros
|
Hy Macros
|
||||||
=========
|
=========
|
||||||
|
|
||||||
|
Loading…
Reference in New Issue
Block a user