.. _syntax:

==============
Syntax
==============

identifiers
-----------

An identifier consists of a nonempty sequence of Unicode characters that are
neither whitespace nor any of the following: ``( ) [ ] { } ' "``. Hy first
tries to parse each identifier as a numeric literal, then as a keyword if that
fails, and finally as a symbol if that fails.

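
The parsing order described above can be modeled roughly in Python. This is
only an illustrative sketch, not Hy's actual lexer: the names ``classify``,
``is_identifier``, and ``DELIMITERS`` are hypothetical, and Python's own
``int`` and ``float`` parsers stand in for Hy's numeric rules, which differ in
some details.

.. code-block:: python

    # Characters that can never appear inside an identifier, per the rule above.
    DELIMITERS = set("()[]{}'\"")


    def is_identifier(text):
        """Return True if text is nonempty and free of whitespace and delimiters."""
        return bool(text) and not any(c.isspace() or c in DELIMITERS for c in text)


    def _looks_numeric(text):
        """Assumption: Python's int/float parsing approximates Hy's numeric rules."""
        for parse in (lambda t: int(t, 0), float):
            try:
                parse(text)
                return True
            except ValueError:
                continue
        return False


    def classify(text):
        """Try numeric literal first, then keyword, then fall back to symbol."""
        if not is_identifier(text):
            raise ValueError("not an identifier: {!r}".format(text))
        if _looks_numeric(text):
            return "numeric literal"
        if text.startswith(":"):
            return "keyword"
        return "symbol"

Under these assumptions, ``classify("0x80")`` gives ``"numeric literal"``,
``classify(":foo")`` gives ``"keyword"``, and ``classify("3fiddy")`` gives
``"symbol"``.
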
numeric literals
----------------

In addition to regular numbers, Hy uses Python's standard notation for
non-base-10 integers: ``0x`` for hexadecimal, ``0o`` for octal, and ``0b`` for
binary.

.. code-block:: clj

    (print 0x80 0b11101 0o102 30)

Underscores and commas can appear anywhere in a numeric literal except the very
beginning. They have no effect on the value of the literal, but they're useful
for visually separating digits.

.. code-block:: clj

    (print 10,000,000,000 10_000_000_000)

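
As a rough Python model of the separator rule, the sketch below strips
underscores and commas from everything after the first character and then
parses the result with Python's base-prefix rules. The function name
``parse_integer_literal`` is hypothetical, and the sketch covers integers only;
it is not how Hy itself is implemented.

.. code-block:: python

    def parse_integer_literal(text):
        """Parse an integer, ignoring ``_`` and ``,`` after the first character."""
        if not text or text[0] in "_,":
            # Separators are not allowed at the very beginning.
            raise ValueError("invalid numeric literal: {!r}".format(text))
        cleaned = text[0] + text[1:].replace("_", "").replace(",", "")
        # Base 0 makes int() honor the 0x, 0o, and 0b prefixes.
        return int(cleaned, 0)


    values = [
        parse_integer_literal(s)
        for s in ("0x80", "0b11101", "0o102", "10,000,000,000", "10_000_000_000")
    ]

Here ``values`` holds ``128``, ``29``, ``66``, and then ``10000000000`` twice,
since the two separator styles denote the same number.
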
Unlike Python, Hy provides literal forms for NaN and infinity: ``NaN``,
``Inf``, and ``-Inf``.

string literals
---------------

Hy allows double-quoted strings (e.g., ``"hello"``), but, unlike Python, not
single-quoted strings. The single-quote character ``'`` is reserved for
preventing the evaluation of a form (e.g., ``'(+ 1 1)``), as in most Lisps.

.. _syntax-bracket-strings:

Python's so-called triple-quoted strings (e.g., ``'''hello'''`` and
``"""hello"""``) aren't supported. However, in Hy, unlike Python, any string
literal can contain newlines. Furthermore, Hy supports an alternative form of
string literal called a "bracket string" similar to Lua's long brackets.
Bracket strings have customizable delimiters, like the here-documents of other
languages. A bracket string begins with ``#[FOO[`` and ends with ``]FOO]``,
where ``FOO`` is any string not containing ``[`` or ``]``, including the empty
string. (If ``FOO`` is exactly ``f`` or begins with ``f-``, the bracket string
is interpreted as a :ref:`format string <syntax-fstrings>`.) For example::

    => (print #[["That's very kind of yuo [sic]" Tom wrote back.]])
    "That's very kind of yuo [sic]" Tom wrote back.
    => (print #[==[1 + 1 = 2]==])
    1 + 1 = 2

A bracket string can contain newlines, but if it begins with one, the newline
is removed, so you can begin the content of a bracket string on the line
following the opening delimiter with no effect on the content. Any leading
newlines past the first are preserved.

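
The delimiter and leading-newline rules for bracket strings can be sketched
with a Python regular expression. This is an approximation for illustration
only: ``BRACKET_STRING`` and ``parse_bracket_string`` are hypothetical names,
and Hy's real lexer is not built this way.

.. code-block:: python

    import re

    # "#[FOO[", then the content, then "]FOO]"; FOO contains no "[" or "]".
    BRACKET_STRING = re.compile(r"#\[([^\[\]]*)\[(.*?)\]\1\]", re.DOTALL)


    def parse_bracket_string(source):
        """Return (content, is_format_string) for a bracket string literal."""
        match = BRACKET_STRING.match(source)
        if match is None:
            raise ValueError("not a bracket string: {!r}".format(source))
        delimiter, content = match.group(1), match.group(2)
        # Only the first leading newline is dropped; later ones are kept.
        if content.startswith("\n"):
            content = content[1:]
        # A delimiter of "f", or one starting with "f-", marks a format string.
        is_format = delimiter == "f" or delimiter.startswith("f-")
        return content, is_format

For instance, ``parse_bracket_string("#[==[1 + 1 = 2]==]")`` returns
``("1 + 1 = 2", False)``, and a literal written as ``#[f-x[`` followed by two
newlines and ``hi]f-x]`` keeps exactly one of those newlines in its content.
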
Plain string literals support :ref:`a variety of backslash escapes
<py:strings>`. To create a "raw string" that interprets all backslashes
literally, prefix the string with ``r``, as in ``r"slash\not"``. Bracket
strings are always raw strings and don't allow the ``r`` prefix.

Like Python, Hy treats all string literals as sequences of Unicode characters
by default. You may prefix a plain string literal (but not a bracket string)
with ``b`` to treat it as a sequence of bytes.

Unlike Python, Hy only recognizes string prefixes (``r``, etc.) in lowercase.

.. _syntax-fstrings:

format strings
--------------

A format string (or "f-string", or "formatted string literal") is a string
literal with embedded code, possibly accompanied by formatting commands. Hy
f-strings work much like :ref:`Python f-strings <py:f-strings>` except that the
embedded code is in Hy rather than Python, and they're supported on all
versions of Python.

::

    => (print f"The sum is {(+ 1 1)}.")
    The sum is 2.

Since ``!`` and ``:`` are identifier characters in Hy, Hy decides where the
code in a replacement field ends, and any conversion or format specifier
begins, by parsing exactly one form. You can use ``do`` to combine several
forms into one, as usual. Whitespace may be necessary to terminate the form::

    => (setv foo "a")
    => (print f"{foo:x<5}")
    …
    NameError: name 'hyx_fooXcolonXxXlessHthan_signX5' is not defined
    => (print f"{foo :x<5}")
    axxxx

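
For readers unfamiliar with the ``x<5`` specifier used above, the following
Python lines show what the same conversion and format specifier do in Python
itself, on the assumption (suggested by the ``axxxx`` output above) that Hy
passes them through with the same meaning. The variable names are
illustrative.

.. code-block:: python

    foo = "a"

    # Fill with "x", left-align, pad to a width of 5.
    padded = format(foo, "x<5")

    # Apply the "r" conversion (repr) first, then pad the result to a width of 7.
    with_conversion = f"{foo!r:x<7}"

Here ``padded`` is ``"axxxx"`` and ``with_conversion`` is ``"'a'xxxx"``.
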
Unlike Python, whitespace is allowed between a conversion and a format
specifier.

Also unlike Python, comments and backslashes are allowed in replacement fields.
Hy's lexer will still process the whole format string normally, like any other
string, before any replacement fields are considered, so you may need to
backslash your backslashes, and you can't comment out a closing brace or the
string delimiter.

.. _syntax-keywords:

keywords
--------

An identifier headed by a colon, such as ``:foo``, is a keyword. If a
literal keyword appears in a function call, it's used to indicate a keyword
argument rather than passed in as a value. For example, ``(f :foo 3)`` calls
the function ``f`` with the keyword argument named ``foo`` set to ``3``. Hence,
trying to call a function on a literal keyword may fail: ``(f :foo)`` yields
the error ``Keyword argument :foo needs a value``. To avoid this, you can quote
the keyword, as in ``(f ':foo)``, or use it as the value of another keyword
argument, as in ``(f :arg :foo)``.

Keywords can be called like functions as shorthand for ``get``. ``(:foo obj)``
is equivalent to ``(get obj :foo)``. An optional ``default`` argument is also
allowed: ``(:foo obj 2)`` or ``(:foo obj :default 2)`` returns ``2`` if ``(get
obj :foo)`` raises a ``KeyError``.

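
The lookup behavior of a called keyword can be approximated in Python as
below. This is a sketch of the semantics described above, not Hy's source
code: ``keyword_get`` and ``_MISSING`` are hypothetical names, and a plain
string key stands in for the keyword object.

.. code-block:: python

    _MISSING = object()  # sentinel meaning "no default was supplied"


    def keyword_get(obj, key, default=_MISSING):
        """Index obj by key, falling back to default only on KeyError."""
        try:
            return obj[key]
        except KeyError:
            if default is _MISSING:
                raise
            return default


    settings = {"foo": 1}
    found = keyword_get(settings, "foo")
    fallback = keyword_get(settings, "bar", 2)

Here ``found`` is ``1`` and ``fallback`` is ``2``, while
``keyword_get(settings, "bar")`` with no default re-raises the ``KeyError``.
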
.. _mangling:

symbols
-------

Symbols are identifiers that are neither legal numeric literals nor legal
keywords. In most contexts, symbols are compiled to Python variable names. Some
example symbols are ``hello``, ``+++``, ``3fiddy``, ``$40``, ``just✈wrong``,
and ``🦑``.

Since the rules for Hy symbols are much more permissive than the rules for
Python identifiers, Hy uses a mangling algorithm to convert its own names to
Python-legal names. The rules are:

- Convert all hyphens (``-``) to underscores (``_``). Thus, ``foo-bar`` becomes
  ``foo_bar``.
- If the name ends with ``?``, remove it and prepend ``is_``. Thus, ``tasty?``
  becomes ``is_tasty``.
- If the name still isn't Python-legal, make the following changes. A name
  could be Python-illegal because it contains a character that's never legal in
  a Python name, it contains a character that's illegal in that position, or
  it's equal to a Python reserved word.

  - Prepend ``hyx_`` to the name.
  - Replace each illegal character with ``XfooX``, where ``foo`` is the Unicode
    character name in lowercase, with spaces replaced by underscores and
    hyphens replaced by ``H``. Replace ``X`` itself the same way. If the
    character doesn't have a name, use ``U`` followed by its code point in
    lowercase hexadecimal.

  Thus, ``green☘`` becomes ``hyx_greenXshamrockX`` and ``if`` becomes
  ``hyx_if``.

- Finally, any added ``hyx_`` or ``is_`` is added after any leading
  underscores, because leading underscores have special significance to Python.
  Thus, ``_tasty?`` becomes ``_is_tasty`` instead of ``is__tasty``.

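
The rules above can be followed step by step in a short Python sketch. It is
written from the rules as listed here rather than taken from Hy's source, so
treat it as an approximation: ``mangle_sketch`` and its helpers are
hypothetical names, and edge cases may differ from Hy's real ``mangle``.

.. code-block:: python

    import keyword
    import unicodedata


    def _is_python_legal(name):
        """A usable Python name: a valid identifier that isn't a reserved word."""
        return name.isidentifier() and not keyword.iskeyword(name)


    def _escape(char):
        """Encode one character as XnameX, or XU<hex>X if it has no Unicode name."""
        name = unicodedata.name(char, "")
        if name:
            body = name.lower().replace("-", "H").replace(" ", "_")
        else:
            body = "U{:x}".format(ord(char))
        return "X" + body + "X"


    def mangle_sketch(symbol):
        name = symbol.replace("-", "_")          # hyphens become underscores
        stem = name.lstrip("_")                  # set leading underscores aside
        leading = name[: len(name) - len(stem)]
        if stem.endswith("?"):                   # "tasty?" becomes "is_tasty"
            stem = "is_" + stem[:-1]
        if not _is_python_legal(leading + stem):
            # Prefixing "S" tests whether a character is legal in a non-initial
            # position; "X" is always escaped so it stays unambiguous.
            stem = "hyx_" + "".join(
                c if c != "X" and ("S" + c).isidentifier() else _escape(c)
                for c in stem
            )
        return leading + stem                    # prefixes go after underscores

With this sketch, ``mangle_sketch("green☘")`` gives ``"hyx_greenXshamrockX"``
and ``mangle_sketch("_tasty?")`` gives ``"_is_tasty"``, matching the examples
in the list. Both ``mangle_sketch("-")`` and ``mangle_sketch("_")`` give
``"_"``, which shows that two different symbols can share one mangled name.
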
Mangling isn't something you should have to think about often, but you may see
mangled names in error messages, the output of ``hy2py``, etc. A catch to be
aware of is that mangling, as well as the inverse "unmangling" operation
offered by the ``unmangle`` function, isn't one-to-one. Two different symbols
can mangle to the same string and hence compile to the same Python variable.
The chief practical consequence of this is that ``-`` and ``_`` are
interchangeable in all symbol names, so you shouldn't assign to the
one-character name ``_``, or else you'll interfere with certain uses of
subtraction.

discard prefix
--------------

Hy supports the Extensible Data Notation discard prefix, like Clojure.
Any form prefixed with ``#_`` is discarded instead of compiled.
This completely removes the form so it doesn't evaluate to anything,
not even ``None``.
It's often more useful than linewise comments for commenting out a
form, because it respects code structure even when part of another
form is on the same line. For example:

.. code-block:: clj

    => (print "Hy" "cruel" "World!")
    Hy cruel World!
    => (print "Hy" #_"cruel" "World!")
    Hy World!
    => (+ 1 1 (print "Math is hard!"))
    Math is hard!
    Traceback (most recent call last):
       ...
    TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'
    => (+ 1 1 #_(print "Math is hard!"))
    2