Document mangling

Kodi Arfer 2018-03-04 16:39:54 -08:00
parent 4d77dd0d40
commit eda0b89f67
4 changed files with 92 additions and 68 deletions


@ -699,6 +699,20 @@ Returns the single step macro expansion of *form*.
HySymbol('e'),
HySymbol('f')])])
.. _mangle-fn:
mangle
------
Usage: ``(mangle x)``
Stringify the input and translate it according to :ref:`Hy's mangling rules
<mangling>`.
.. code-block:: hylang
=> (mangle "foo-bar")
'foo_bar'
.. _merge-with-fn:
@ -1431,6 +1445,22 @@ Returns an iterator from *coll* as long as *pred* returns ``True``.
=> (list (take-while neg? [ 1 2 3 -4 5]))
[]
.. _unmangle-fn:
unmangle
--------
Usage: ``(unmangle x)``
Stringify the input and return a string that would :ref:`mangle <mangling>` to
it. Note that this isn't a one-to-one operation, and neither is ``mangle``, so
``mangle`` and ``unmangle`` don't always round-trip.
.. code-block:: hylang
=> (unmangle "foo_bar")
'foo-bar'
Included itertools
==================


@ -157,17 +157,8 @@ HySymbol
``hy.models.HySymbol`` is the model used to represent symbols
in the Hy language. It inherits :ref:`HyString`.
``HySymbol`` objects are mangled in the parsing phase, to help Python
interoperability:
- Symbols surrounded by asterisks (``*``) are turned into uppercase;
- Dashes (``-``) are turned into underscores (``_``);
- One trailing question mark (``?``) is turned into a leading ``is_``.
Caveat: as the mangling is done during the parsing phase, it is possible
to programmatically generate HySymbols that can't be generated with Hy
source code. Such a mechanism is used by :ref:`gensym` to generate
"uninterned" symbols.
Symbols are :ref:`mangled <mangling>` when they are compiled
to Python variable names.
.. _hykeyword:
@ -340,7 +331,7 @@ Since they have no "value" to Python, this makes working in Hy hard, since
doing something like ``(print (if True True False))`` is not just common, it's
expected.
As a result, we auto-mangle things using a ``Result`` object, where we offer
As a result, we reconfigure things using a ``Result`` object, where we offer
up any ``ast.stmt`` that need to get run, and a single ``ast.expr`` that can
be used to get the value of whatever was just run. Hy does this by forcing
assignment to things while running.
@ -352,11 +343,11 @@ As example, the Hy::
Will turn into::
if True:
_mangled_name_here = True
_temp_name_here = True
else:
_mangled_name_here = False
_temp_name_here = False
print _mangled_name_here
print _temp_name_here
OK, that was a bit of a lie, since we actually turn that statement


@ -8,6 +8,12 @@ Hy <-> Python interop
Despite being a Lisp, Hy aims to be fully compatible with Python. That means
every Python module or package can be imported in Hy code, and vice versa.
:ref:`Mangling <mangling>` allows variable names to be spelled differently in
Hy and Python. For example, Python's ``str.format_map`` can be written
``str.format-map`` in Hy, and a Hy function named ``valid?`` would be called
``is_valid`` in Python. In Python, you can import Hy's core functions
``mangle`` and ``unmangle`` directly from the ``hy`` package.
Using Python from Hy
====================
@ -27,41 +33,6 @@ You can use it in Hy:
You can also import ``.pyc`` bytecode files, of course.
A quick note about mangling
---------------------------
In Python, snake_case is used by convention. Lisp dialects tend to use dashes
instead of underscores, so Hy does some magic to give you more pleasant names.
In the same way, ``UPPERCASE_NAMES`` from Python can be used ``*with-earmuffs*``
instead.
You can use either the original names or the new ones.
Imagine ``example.py``::
def function_with_a_long_name():
print(42)
FOO = "bar"
Then, in Hy:
.. code-block:: clj
(import example)
(.function-with-a-long-name example) ; prints "42"
(.function_with_a_long_name example) ; also prints "42"
(print (. example *foo*)) ; prints "bar"
(print (. example FOO)) ; also prints "bar"
.. warning::
Mangling isn't that simple; there is more to discuss about it, yet it doesn't
belong in this section.
.. TODO: link to mangling section, when it is done
Using Hy from Python
====================


@ -2,25 +2,10 @@
Syntax
==============
Hy maintains, over everything else, 100% compatibility in both directions
with Python itself. All Hy code follows a few simple rules. Memorize
this, as it's going to come in handy.
identifiers
-----------
These rules help ensure that Hy code is idiomatic and interfaceable in both
languages.
* Symbols in earmuffs will be translated to the upper-cased version of that
string. For example, ``*foo*`` will become ``FOO``.
* UTF-8 entities will be encoded using
`punycode <https://en.wikipedia.org/wiki/Punycode>`_ and prefixed with
``hy_``. For instance, ``⚘`` will become ``hy_w7h``, ``♥`` will become
``hy_g6h``, and ``i♥u`` will become ``hy_iu_t0x``.
* Symbols that contain dashes will have them replaced with underscores. For
example, ``render-template`` will become ``render_template``. This means
that symbols with dashes will shadow their underscore equivalents, and vice
versa.
An identifier consists of a nonempty sequence of Unicode characters that are neither whitespace nor any of the following: ``( ) [ ] { } ' "``. Hy first tries to parse each identifier as a numeric literal, then as a keyword if that fails, and finally as a symbol if that fails.
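The fallback order described above can be sketched in Python. This is only an illustration, not Hy's reader: ``classify`` is a hypothetical helper, the numeric test borrows Python's own ``int``, ``float``, and ``complex`` constructors as a stand-in for Hy's numeric-literal grammar, and the keyword test assumes that keywords are the identifiers beginning with a colon, as in ``:foo``.

```python
def classify(identifier):
    """Classify an identifier by trying each interpretation in turn.

    Hypothetical helper: the numeric check is an assumption that leans
    on Python's constructors rather than Hy's actual literal rules.
    """
    # Require a digit so that words such as "inf" or "nan", which
    # float() would accept, are not mistaken for numbers.
    if any(ch.isdigit() for ch in identifier):
        for parse in (int, float, complex):
            try:
                parse(identifier)
                return "numeric literal"
            except ValueError:
                pass
    # Assumption: a keyword is an identifier with a leading colon.
    if identifier.startswith(":"):
        return "keyword"
    # Anything left over is a symbol.
    return "symbol"


for token in ["42", "3.14", ":foo", "3fiddy", "$40", "hello"]:
    print(token, "->", classify(token))
```

Under these assumptions, ``3fiddy`` and ``$40`` fall through both earlier checks and end up as symbols, which matches the symbol examples given in the docs.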
numeric literals
----------------
@ -98,6 +83,53 @@ the error ``Keyword argument :foo needs a value``. To avoid this, you can quote
the keyword, as in ``(f ':foo)``, or use it as the value of another keyword
argument, as in ``(f :arg :foo)``.
.. _mangling:
symbols
-------
Symbols are identifiers that are neither legal numeric literals nor legal
keywords. In most contexts, symbols are compiled to Python variable names. Some
example symbols are ``hello``, ``+++``, ``3fiddy``, ``$40``, ``just✈wrong``,
and ``🦑``.
Since the rules for Hy symbols are much more permissive than the rules for
Python identifiers, Hy uses a mangling algorithm to convert its own names to
Python-legal names. The rules are:
- Convert all hyphens (``-``) to underscores (``_``). Thus, ``foo-bar`` becomes
``foo_bar``.
- If the name ends with ``?``, remove it and prepend ``is_``. Thus, ``tasty?``
becomes ``is_tasty``.
- If the name still isn't Python-legal, make the following changes. A name
could be Python-illegal because it contains a character that's never legal in
a Python name, it contains a character that's illegal in that position, or
it's equal to a Python reserved word.
- Prepend ``hyx_`` to the name.
- Replace each illegal character with ``ΔfooΔ`` (or on Python 2, ``XfooX``),
where ``foo`` is the Unicode character name in lowercase, with spaces
replaced by underscores and hyphens replaced by ``H``. Replace ``Δ`` itself
(or on Python 2, ``X``) the same way. If the character doesn't have a name,
use ``U`` followed by its code point in lowercase hexadecimal.
Thus, ``green☘`` becomes ``hyx_greenΔshamrockΔ`` and ``if`` becomes
``hyx_if``.
- Finally, any added ``hyx_`` or ``is_`` is placed after any leading
underscores, because leading underscores have special significance to Python.
Thus, ``_tasty?`` becomes ``_is_tasty`` instead of ``is__tasty``.
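The rules above can be sketched in plain Python 3. This is a minimal illustration written from the prose rules, not Hy's actual implementation; the names ``mangle_sketch``, ``is_legal``, and ``spell_out`` are hypothetical, and only the Python 3 delimiter (``Δ``) is handled.

```python
import keyword
import unicodedata

DELIM = "\u0394"  # the character Δ, the Python 3 delimiter from the rules


def is_legal(name):
    """True if `name` is usable as a Python 3 variable name."""
    return name.isidentifier() and not keyword.iskeyword(name)


def spell_out(ch):
    """Return the delimited replacement for one illegal character."""
    name = unicodedata.name(ch, "")
    if name:
        # Lowercase first so the "H" that stands in for a hyphen stays capital.
        name = name.lower().replace("-", "H").replace(" ", "_")
    else:
        # No Unicode name: "U" plus the code point in lowercase hex.
        name = "U{:x}".format(ord(ch))
    return DELIM + name + DELIM


def mangle_sketch(symbol):
    """Hypothetical re-statement of the mangling rules (nonempty input)."""
    # Rule 1: hyphens become underscores.
    name = symbol.replace("-", "_")
    # Set leading underscores aside so any prefix lands after them.
    body = name.lstrip("_")
    leading = name[: len(name) - len(body)]
    # Rule 2: a trailing "?" becomes a leading "is_".
    if body.endswith("?"):
        body = "is_" + body[:-1]
    # Rule 3: if still illegal, prefix "hyx_" and spell out bad characters.
    # A character is tested after a dummy "S" because some characters
    # (digits, for instance) are only illegal at the start of a name.
    if not is_legal(leading + body):
        body = "hyx_" + "".join(
            ch if ch != DELIM and ("S" + ch).isidentifier() else spell_out(ch)
            for ch in body
        )
    return leading + body


for sym in ["foo-bar", "tasty?", "_tasty?", "if", "green\u2618"]:
    print(sym, "->", mangle_sketch(sym))
```

The loop at the end runs the sketch on the examples used in the rules, so its output can be compared against the mangled names quoted there.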
Mangling isn't something you should have to think about often, but you may see
mangled names in error messages, the output of ``hy2py``, etc. A catch to be
aware of is that mangling, as well as the inverse "unmangling" operation
offered by the ``unmangle`` function, isn't one-to-one. Two different symbols
can mangle to the same string and hence compile to the same Python variable.
The chief practical consequence of this is that ``-`` and ``_`` are
interchangeable in all symbol names, so you shouldn't assign to the
one-character name ``_``, or else you'll interfere with certain uses of
subtraction.
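To see why mangling isn't one-to-one, the following Python sketch groups symbols by their mangled form. It is deliberately simplified: ``simple_mangle`` is a hypothetical helper that applies only the hyphen and trailing-``?`` rules and skips the ``hyx_`` step, so it is not a substitute for Hy's own ``mangle``.

```python
from collections import defaultdict


def simple_mangle(symbol):
    """Apply only the hyphen and trailing-"?" rules (an assumption-laden
    simplification; the hyx_ escaping step is left out)."""
    name = symbol.replace("-", "_")
    body = name.lstrip("_")
    leading = name[: len(name) - len(body)]
    if body.endswith("?"):
        body = "is_" + body[:-1]
    return leading + body


def collisions(symbols):
    """Map each mangled name to the distinct symbols that produce it,
    keeping only names produced by more than one symbol."""
    groups = defaultdict(list)
    for sym in symbols:
        groups[simple_mangle(sym)].append(sym)
    return {name: syms for name, syms in groups.items() if len(syms) > 1}


found = collisions(["foo-bar", "foo_bar", "valid?", "is-valid", "-", "_", "hello"])
for name, syms in found.items():
    print(name, "<-", syms)
```

The last group is the case the paragraph above warns about: the subtraction symbol ``-`` and the one-character name ``_`` share a single Python variable.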
discard prefix
--------------