Document mangling

commit eda0b89f67
parent 4d77dd0d40

@@ -699,6 +699,20 @@ Returns the single step macro expansion of *form*.
   HySymbol('e'),
   HySymbol('f')])])

.. _mangle-fn:

mangle
------

Usage: ``(mangle x)``

Stringify the input and translate it according to :ref:`Hy's mangling rules
<mangling>`.

.. code-block:: hylang

   => (mangle "foo-bar")
   'foo_bar'

.. _merge-with-fn:

@@ -1431,6 +1445,22 @@ Returns an iterator from *coll* as long as *pred* returns ``True``.
   => (list (take-while neg? [ 1 2 3 -4 5]))
   []

.. _unmangle-fn:

unmangle
--------

Usage: ``(unmangle x)``

Stringify the input and return a string that would :ref:`mangle <mangling>` to
it. Note that this isn't a one-to-one operation, and neither is ``mangle``, so
``mangle`` and ``unmangle`` don't always round-trip.

.. code-block:: hylang

   => (unmangle "foo_bar")
   'foo-bar'

Included itertools
==================

@@ -157,17 +157,8 @@ HySymbol
``hy.models.HySymbol`` is the model used to represent symbols
in the Hy language. It inherits :ref:`HyString`.

``HySymbol`` objects are mangled in the parsing phase, to help Python
interoperability:

- Symbols surrounded by asterisks (``*``) are turned into uppercase;
- Dashes (``-``) are turned into underscores (``_``);
- One trailing question mark (``?``) is turned into a leading ``is_``.

Caveat: as the mangling is done during the parsing phase, it is possible
to programmatically generate HySymbols that can't be generated with Hy
source code. Such a mechanism is used by :ref:`gensym` to generate
"uninterned" symbols.
Symbols are :ref:`mangled <mangling>` when they are compiled
to Python variable names.

.. _hykeyword:

@@ -340,7 +331,7 @@ Since they have no "value" to Python, this makes working in Hy hard, since
doing something like ``(print (if True True False))`` is not just common, it's
expected.

As a result, we auto-mangle things using a ``Result`` object, where we offer
As a result, we reconfigure things using a ``Result`` object, where we offer
up any ``ast.stmt`` that need to get run, and a single ``ast.expr`` that can
be used to get the value of whatever was just run. Hy does this by forcing
assignment to things while running.
@@ -352,11 +343,11 @@ As example, the Hy::
Will turn into::

    if True:
        _mangled_name_here = True
        _temp_name_here = True
    else:
        _mangled_name_here = False
        _temp_name_here = False

    print _mangled_name_here
    print _temp_name_here


OK, that was a bit of a lie, since we actually turn that statement

@@ -8,6 +8,12 @@ Hy <-> Python interop
Despite being a Lisp, Hy aims to be fully compatible with Python. That means
every Python module or package can be imported in Hy code, and vice versa.

:ref:`Mangling <mangling>` allows variable names to be spelled differently in
Hy and Python. For example, Python's ``str.format_map`` can be written
``str.format-map`` in Hy, and a Hy function named ``valid?`` would be called
``is_valid`` in Python. In Python, you can import Hy's core functions
``mangle`` and ``unmangle`` directly from the ``hy`` package.

Using Python from Hy
====================

@@ -27,41 +33,6 @@ You can use it in Hy:

You can also import ``.pyc`` bytecode files, of course.

A quick note about mangling
---------------------------

In Python, snake_case is used by convention. Lisp dialects tend to use dashes
instead of underscores, so Hy does some magic to give you more pleasant names.

In the same way, ``UPPERCASE_NAMES`` from Python can be written
``*with-earmuffs*`` instead.

You can use either the original names or the new ones.

Imagine ``example.py``::

    def function_with_a_long_name():
        print(42)

    FOO = "bar"

Then, in Hy:

.. code-block:: clj

    (import example)
    (.function-with-a-long-name example) ; prints "42"
    (.function_with_a_long_name example) ; also prints "42"

    (print (. example *foo*)) ; prints "bar"
    (print (. example FOO)) ; also prints "bar"

.. warning::
   Mangling isn’t that simple; there is more to discuss about it, yet it doesn’t
   belong in this section.
.. TODO: link to mangling section, when it is done


Using Hy from Python
====================

@@ -2,25 +2,10 @@
Syntax
==============

Hy maintains, over everything else, 100% compatibility in both directions
with Python itself. All Hy code follows a few simple rules. Memorize
this, as it's going to come in handy.
identifiers
-----------

These rules help ensure that Hy code is idiomatic and interfaceable in both
languages.

* Symbols in earmuffs will be translated to the upper-cased version of that
  string. For example, ``*foo*`` will become ``FOO``.

* UTF-8 entities will be encoded using
  `punycode <https://en.wikipedia.org/wiki/Punycode>`_ and prefixed with
  ``hy_``. For instance, ``⚘`` will become ``hy_w7h``, ``♥`` will become
  ``hy_g6h``, and ``i♥u`` will become ``hy_iu_t0x``.

* Symbols that contain dashes will have them replaced with underscores. For
  example, ``render-template`` will become ``render_template``. This means
  that symbols with dashes will shadow their underscore equivalents, and vice
  versa.
An identifier consists of a nonempty sequence of Unicode characters that are neither whitespace nor any of the following: ``( ) [ ] { } ' "``. Hy first tries to parse each identifier into a numeric literal, then into a keyword if that fails, and finally into a symbol if that fails.

numeric literals
----------------
@@ -98,6 +83,53 @@ the error ``Keyword argument :foo needs a value``. To avoid this, you can quote
the keyword, as in ``(f ':foo)``, or use it as the value of another keyword
argument, as in ``(f :arg :foo)``.

.. _mangling:

symbols
-------

Symbols are identifiers that are neither legal numeric literals nor legal
keywords. In most contexts, symbols are compiled to Python variable names. Some
example symbols are ``hello``, ``+++``, ``3fiddy``, ``$40``, ``just✈wrong``,
and ``🦑``.

Since the rules for Hy symbols are much more permissive than the rules for
Python identifiers, Hy uses a mangling algorithm to convert its own names to
Python-legal names. The rules are:

- Convert all hyphens (``-``) to underscores (``_``). Thus, ``foo-bar`` becomes
  ``foo_bar``.
- If the name ends with ``?``, remove it and prepend ``is_``. Thus, ``tasty?``
  becomes ``is_tasty``.
- If the name still isn't Python-legal, make the following changes. A name
  could be Python-illegal because it contains a character that's never legal in
  a Python name, it contains a character that's illegal in that position, or
  it's equal to a Python reserved word.

  - Prepend ``hyx_`` to the name.
  - Replace each illegal character with ``ΔfooΔ`` (or on Python 2, ``XfooX``),
    where ``foo`` is the Unicode character name in lowercase, with spaces
    replaced by underscores and hyphens replaced by ``H``. Replace ``Δ`` itself
    (or on Python 2, ``X``) the same way. If the character doesn't have a name,
    use ``U`` followed by its code point in lowercase hexadecimal.

  Thus, ``green☘`` becomes ``hyx_greenΔshamrockΔ`` and ``if`` becomes
  ``hyx_if``.

- Finally, any added ``hyx_`` or ``is_`` is added after any leading
  underscores, because leading underscores have special significance to Python.
  Thus, ``_tasty?`` becomes ``_is_tasty`` instead of ``is__tasty``.
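
The rules above can be approximated in Python as follows. This is only a
sketch under stated assumptions, not Hy's actual implementation:
``sketch_mangle`` and ``_spell_out`` are hypothetical names, only the Python 3
delimiter ``Δ`` is handled, and "Python-legal" is assumed to mean that
``str.isidentifier`` accepts the name and ``keyword.iskeyword`` rejects it.

```python
import keyword
import unicodedata

DELIM = "Δ"  # Python 3 delimiter from the rules; the Python 2 "X" is ignored


def _spell_out(ch: str) -> str:
    """Hypothetical helper: keep a legal character, spell out an illegal one."""
    # The "S" prefix lets characters that are legal only in non-initial
    # position (such as digits) pass the check.
    if ch != DELIM and ("S" + ch).isidentifier():
        return ch
    # Lowercase Unicode name, hyphens -> "H", spaces -> "_".
    label = unicodedata.name(ch, "").lower().replace("-", "H").replace(" ", "_")
    # Characters without a name fall back to "U" plus the hex code point.
    return DELIM + (label or "U{:x}".format(ord(ch))) + DELIM


def sketch_mangle(name: str) -> str:
    """Illustrative approximation of the mangling rules listed above."""
    # Rule 1: hyphens become underscores.
    s = name.replace("-", "_")

    # Set leading underscores aside so prefixes are added after them.
    body = s.lstrip("_")
    leading = s[: len(s) - len(body)]

    # Rule 2: a trailing "?" becomes a leading "is_".
    if body.endswith("?"):
        body = "is_" + body[:-1]

    # Rule 3: if the name is still not Python-legal (assumed test: not an
    # identifier, or a reserved word), add "hyx_" and spell out bad characters.
    candidate = leading + body
    if not candidate.isidentifier() or keyword.iskeyword(candidate):
        body = "hyx_" + "".join(_spell_out(ch) for ch in body)

    return leading + body
```

With these definitions, ``sketch_mangle("green☘")`` gives
``hyx_greenΔshamrockΔ`` and ``sketch_mangle("_tasty?")`` gives ``_is_tasty``,
which agrees with the examples in the rules.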

Mangling isn't something you should have to think about often, but you may see
mangled names in error messages, the output of ``hy2py``, etc. A catch to be
aware of is that mangling, as well as the inverse "unmangling" operation
offered by the ``unmangle`` function, isn't one-to-one. Two different symbols
can mangle to the same string and hence compile to the same Python variable.
The chief practical consequence of this is that ``-`` and ``_`` are
interchangeable in all symbol names, so you shouldn't assign to the
one-character name ``_``, or else you'll interfere with certain uses of
subtraction.
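
The collision described here can be illustrated with a small Python sketch. It
applies only the hyphen rule, through a hypothetical helper ``hyphen_rule``
(not part of Hy), and groups a few symbols by the name they would share.

```python
from collections import defaultdict


def hyphen_rule(symbol: str) -> str:
    # Hypothetical helper: applies only the first mangling rule.
    return symbol.replace("-", "_")


# Group symbols by the Python name they would compile to under this rule.
shared = defaultdict(list)
for symbol in ["foo-bar", "foo_bar", "-", "_"]:
    shared[hyphen_rule(symbol)].append(symbol)

for python_name, symbols in shared.items():
    print(python_name, "<-", symbols)
```

Both ``-`` and ``_`` land on the Python name ``_``, which is why assigning to
``_`` can interfere with subtraction.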

discard prefix
--------------
