Document mangling

Kodi Arfer 2018-03-04 16:39:54 -08:00
parent 4d77dd0d40
commit eda0b89f67
4 changed files with 92 additions and 68 deletions

@@ -699,6 +699,20 @@ Returns the single step macro expansion of *form*.
HySymbol('e'), HySymbol('e'),
HySymbol('f')])]) HySymbol('f')])])
.. _mangle-fn:
mangle
------
Usage: ``(mangle x)``
Stringify the input and translate it according to :ref:`Hy's mangling rules
<mangling>`.
.. code-block:: hylang
=> (mangle "foo-bar")
'foo_bar'
.. _merge-with-fn: .. _merge-with-fn:
@@ -1431,6 +1445,22 @@ Returns an iterator from *coll* as long as *pred* returns ``True``.
=> (list (take-while neg? [ 1 2 3 -4 5])) => (list (take-while neg? [ 1 2 3 -4 5]))
[] []
.. _unmangle-fn:
unmangle
--------
Usage: ``(unmangle x)``
Stringify the input and return a string that would :ref:`mangle <mangling>` to
it. Note that this isn't a one-to-one operation, and neither is ``mangle``, so
``mangle`` and ``unmangle`` don't always round-trip.
.. code-block:: hylang
=> (unmangle "foo_bar")
'foo-bar'
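As a rough illustration of why the two functions don't round-trip, the sketch below models only the hyphen rule in plain Python. ``mangle_hyphens`` and ``unmangle_hyphens`` are hypothetical helpers, not Hy's own functions, and mapping every underscore back to a hyphen is an assumption based on the example above.

```python
def mangle_hyphens(name):
    """Hyphen rule only: every ``-`` becomes ``_``."""
    return name.replace("-", "_")


def unmangle_hyphens(name):
    """Assumed inverse of the hyphen rule: every ``_`` becomes ``-``."""
    return name.replace("_", "-")


original = "foo_bar"
round_tripped = unmangle_hyphens(mangle_hyphens(original))
print(round_tripped)  # foo-bar, which differs from the original foo_bar
```

Because ``foo-bar`` and ``foo_bar`` both mangle to ``foo_bar``, any inverse has to pick one of them, so the other can never be recovered.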
Included itertools Included itertools
================== ==================

@@ -157,17 +157,8 @@ HySymbol
``hy.models.HySymbol`` is the model used to represent symbols ``hy.models.HySymbol`` is the model used to represent symbols
in the Hy language. It inherits :ref:`HyString`. in the Hy language. It inherits :ref:`HyString`.
``HySymbol`` objects are mangled in the parsing phase, to help Python Symbols are :ref:`mangled <mangling>` when they are compiled
interoperability: to Python variable names.
- Symbols surrounded by asterisks (``*``) are turned into uppercase;
- Dashes (``-``) are turned into underscores (``_``);
- One trailing question mark (``?``) is turned into a leading ``is_``.
Caveat: as the mangling is done during the parsing phase, it is possible
to programmatically generate HySymbols that can't be generated with Hy
source code. Such a mechanism is used by :ref:`gensym` to generate
"uninterned" symbols.
.. _hykeyword: .. _hykeyword:
@@ -340,7 +331,7 @@ Since they have no "value" to Python, this makes working in Hy hard, since
doing something like ``(print (if True True False))`` is not just common, it's doing something like ``(print (if True True False))`` is not just common, it's
expected. expected.
As a result, we auto-mangle things using a ``Result`` object, where we offer As a result, we reconfigure things using a ``Result`` object, where we offer
up any ``ast.stmt`` that need to get run, and a single ``ast.expr`` that can up any ``ast.stmt`` that need to get run, and a single ``ast.expr`` that can
be used to get the value of whatever was just run. Hy does this by forcing be used to get the value of whatever was just run. Hy does this by forcing
assignment to things while running. assignment to things while running.
@@ -352,11 +343,11 @@ As example, the Hy::
Will turn into:: Will turn into::
if True: if True:
_mangled_name_here = True _temp_name_here = True
else: else:
_mangled_name_here = False _temp_name_here = False
print _mangled_name_here print _temp_name_here
OK, that was a bit of a lie, since we actually turn that statement OK, that was a bit of a lie, since we actually turn that statement

@@ -8,6 +8,12 @@ Hy <-> Python interop
Despite being a Lisp, Hy aims to be fully compatible with Python. That means Despite being a Lisp, Hy aims to be fully compatible with Python. That means
every Python module or package can be imported in Hy code, and vice versa. every Python module or package can be imported in Hy code, and vice versa.
:ref:`Mangling <mangling>` allows variable names to be spelled differently in
Hy and Python. For example, Python's ``str.format_map`` can be written
``str.format-map`` in Hy, and a Hy function named ``valid?`` would be called
``is_valid`` in Python. In Python, you can import Hy's core functions
``mangle`` and ``unmangle`` directly from the ``hy`` package.
Using Python from Hy Using Python from Hy
==================== ====================
@@ -27,41 +33,6 @@ You can use it in Hy:
You can also import ``.pyc`` bytecode files, of course. You can also import ``.pyc`` bytecode files, of course.
A quick note about mangling
--------
In Python, snake_case is used by convention. Lisp dialects tend to use dashes
instead of underscores, so Hy does some magic to give you more pleasant names.
In the same way, ``UPPERCASE_NAMES`` from Python can be used ``*with-earmuffs*``
instead.
You can use either the original names or the new ones.
Imagine ``example.py``::
def function_with_a_long_name():
print(42)
FOO = "bar"
Then, in Hy:
.. code-block:: clj
(import example)
(.function-with-a-long-name example) ; prints "42"
(.function_with_a_long_name example) ; also prints "42"
(print (. example *foo*)) ; prints "bar"
(print (. example FOO)) ; also prints "bar"
.. warning::
Mangling isnt that simple; there is more to discuss about it, yet it doesnt
belong in this section.
.. TODO: link to mangling section, when it is done
Using Hy from Python Using Hy from Python
==================== ====================

@@ -2,25 +2,10 @@
Syntax Syntax
============== ==============
Hy maintains, over everything else, 100% compatibility in both directions identifiers
with Python itself. All Hy code follows a few simple rules. Memorize -----------
this, as it's going to come in handy.
These rules help ensure that Hy code is idiomatic and interfaceable in both An identifier consists of a nonempty sequence of Unicode characters that are neither whitespace nor any of the following: ``( ) [ ] { } ' "``. Hy first tries to parse each identifier into a numeric literal, then into a keyword if that fails, and finally into a symbol if that fails.
languages.
* Symbols in earmuffs will be translated to the upper-cased version of that
string. For example, ``foo`` will become ``FOO``.
* UTF-8 entities will be encoded using
`punycode <https://en.wikipedia.org/wiki/Punycode>`_ and prefixed with
``hy_``. For instance, ```` will become ``hy_w7h``, ```` will become
``hy_g6h``, and ``i♥u`` will become ``hy_iu_t0x``.
* Symbols that contain dashes will have them replaced with underscores. For
example, ``render-template`` will become ``render_template``. This means
that symbols with dashes will shadow their underscore equivalents, and vice
versa.
numeric literals numeric literals
---------------- ----------------
@@ -98,6 +83,53 @@ the error ``Keyword argument :foo needs a value``. To avoid this, you can quote
the keyword, as in ``(f ':foo)``, or use it as the value of another keyword the keyword, as in ``(f ':foo)``, or use it as the value of another keyword
argument, as in ``(f :arg :foo)``. argument, as in ``(f :arg :foo)``.
.. _mangling:
symbols
-------
Symbols are identifiers that are neither legal numeric literals nor legal
keywords. In most contexts, symbols are compiled to Python variable names. Some
example symbols are ``hello``, ``+++``, ``3fiddy``, ``$40``, ``just✈wrong``,
and ``🦑``.
Since the rules for Hy symbols are much more permissive than the rules for
Python identifiers, Hy uses a mangling algorithm to convert its own names to
Python-legal names. The rules are:
- Convert all hyphens (``-``) to underscores (``_``). Thus, ``foo-bar`` becomes
``foo_bar``.
- If the name ends with ``?``, remove it and prepend ``is_``. Thus, ``tasty?``
becomes ``is_tasty``.
- If the name still isn't Python-legal, make the following changes. A name
could be Python-illegal because it contains a character that's never legal in
a Python name, it contains a character that's illegal in that position, or
it's equal to a Python reserved word.
- Prepend ``hyx_`` to the name.
- Replace each illegal character with ``ΔfooΔ`` (or on Python 2, ``XfooX``),
where ``foo`` is the Unicode character name in lowercase, with spaces
replaced by underscores and hyphens replaced by ``H``. Replace ``Δ`` itself
(or on Python 2, ``X``) the same way. If the character doesn't have a name,
use ``U`` followed by its code point in lowercase hexadecimal.
Thus, ``green☘`` becomes ``hyx_greenΔshamrockΔ`` and ``if`` becomes
``hyx_if``.
- Finally, any added ``hyx_`` or ``is_`` is added after any leading
underscores, because leading underscores have special significance to Python.
Thus, ``_tasty?`` becomes ``_is_tasty`` instead of ``is__tasty``.
Mangling isn't something you should have to think about often, but you may see
mangled names in error messages, the output of ``hy2py``, etc. A catch to be
aware of is that mangling, as well as the inverse "unmangling" operation
offered by the ``unmangle`` function, isn't one-to-one. Two different symbols
can mangle to the same string and hence compile to the same Python variable.
The chief practical consequence of this is that ``-`` and ``_`` are
interchangeable in all symbol names, so you shouldn't assign to the
one-character name ``_``, or else you'll interfere with certain uses of
subtraction.
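The rules above can be approximated in plain Python. The following is a minimal sketch under stated assumptions, not Hy's actual implementation: ``mangle_sketch``, ``is_python_legal``, ``escape_char``, and ``MANGLE_DELIM`` are hypothetical names, only the Python 3 variant (``Δ``) is handled, and legality is checked with ``str.isidentifier`` plus the standard ``keyword`` module.

```python
import keyword
import unicodedata

MANGLE_DELIM = "Δ"  # hypothetical name; the Python 3 delimiter from the rules above


def is_python_legal(name):
    """True if name is a valid Python identifier and not a reserved word."""
    return name.isidentifier() and not keyword.iskeyword(name)


def escape_char(char):
    """Spell out one illegal character between delimiters."""
    # Lowercase Unicode name, hyphens -> "H", spaces -> "_".
    label = unicodedata.name(char, "").lower().replace("-", "H").replace(" ", "_")
    if not label:
        # Unnamed character: fall back to "U" plus the hex code point.
        label = "U{:x}".format(ord(char))
    return MANGLE_DELIM + label + MANGLE_DELIM


def mangle_sketch(symbol):
    """Approximate the documented mangling rules for a nonempty symbol."""
    if not symbol:
        raise ValueError("cannot mangle an empty name")
    name = symbol.replace("-", "_")
    # Set leading underscores aside so prefixes are added after them.
    body = name.lstrip("_")
    leading = name[: len(name) - len(body)]
    if body.endswith("?"):
        body = "is_" + body[:-1]
    if not is_python_legal(leading + body):
        # Prefixing "S" tests whether a character is legal somewhere
        # other than the first position of an identifier.
        body = "hyx_" + "".join(
            c if c != MANGLE_DELIM and ("S" + c).isidentifier() else escape_char(c)
            for c in body
        )
    return leading + body


for example in ["foo-bar", "tasty?", "_tasty?", "if", "green☘"]:
    print(example, "->", mangle_sketch(example))
```

In this sketch, ``mangle_sketch("foo-bar")`` and ``mangle_sketch("foo_bar")`` both return ``foo_bar``, which is the collision described above.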
discard prefix discard prefix
-------------- --------------