.. _syntax:

==============
Syntax
==============

identifiers
-----------

An identifier consists of a nonempty sequence of Unicode characters that are not
whitespace nor any of the following: ``( ) [ ] { } ' "``. Hy first tries to
parse each identifier into a numeric literal, then into a keyword if that fails,
and finally into a symbol if that fails.

numeric literals
----------------

In addition to regular numbers, standard notation from Python for non-base 10
integers is used. ``0x`` for Hex, ``0o`` for Octal, ``0b`` for Binary.

.. code-block:: clj

    (print 0x80 0b11101 0o102 30)

Underscores and commas can appear anywhere in a numeric literal except the very
beginning. They have no effect on the value of the literal, but they're useful
for visually separating digits.

.. code-block:: clj

    (print 10,000,000,000 10_000_000_000)

Unlike Python, Hy provides literal forms for NaN and infinity: ``NaN``,
``Inf``, and ``-Inf``.

string literals
---------------

Hy allows double-quoted strings (e.g., ``"hello"``), but not single-quoted
strings like Python. The single-quote character ``'`` is reserved for
preventing the evaluation of a form (e.g., ``'(+ 1 1)``), as in most Lisps.

Python's so-called triple-quoted strings (e.g., ``'''hello'''`` and
``"""hello"""``) aren't supported. However, in Hy, unlike Python, any string
literal can contain newlines. Furthermore, Hy supports an alternative form of
string literal called a "bracket string" similar to Lua's long brackets.
Bracket strings have customizable delimiters, like the here-documents of other
languages. A bracket string begins with ``#[FOO[`` and ends with ``]FOO]``,
where ``FOO`` is any string not containing ``[`` or ``]``, including the empty
string. (If ``FOO`` is exactly ``f`` or begins with ``f-``, the bracket string
is interpreted as a :ref:`format string <syntax-fstrings>`.) For example::

    => (print #[["That's very kind of yuo [sic]" Tom wrote back.]])
    "That's very kind of yuo [sic]" Tom wrote back.
    => (print #[==[1 + 1 = 2]==])
    1 + 1 = 2

A bracket string can contain newlines, but if it begins with one, the newline
is removed, so you can begin the content of a bracket string on the line
following the opening delimiter with no effect on the content. Any leading
newlines past the first are preserved.

Plain string literals support :ref:`a variety of backslash escapes
<py:strings>`. To create a "raw string" that interprets all backslashes
literally, prefix the string with ``r``, as in ``r"slash\not"``. Bracket
strings are always raw strings and don't allow the ``r`` prefix.

Like Python, Hy treats all string literals as sequences of Unicode characters
by default. You may prefix a plain string literal (but not a bracket string)
with ``b`` to treat it as a sequence of bytes.

Unlike Python, Hy only recognizes string prefixes (``r``, etc.) in lowercase.

.. _syntax-fstrings:

format strings
--------------

A format string (or "f-string", or "formatted string literal") is a string
literal with embedded code, possibly accompanied by formatting commands. Hy
f-strings work much like :ref:`Python f-strings <py:f-strings>` except that the
embedded code is in Hy rather than Python, and they're supported on all
versions of Python.

::

    => (print f"The sum is {(+ 1 1)}.")
    The sum is 2.

Since ``!`` and ``:`` are identifier characters in Hy, Hy decides where the
code in a replacement field ends, and any conversion or format specifier
begins, by parsing exactly one form. You can use ``do`` to combine several
forms into one, as usual. Whitespace may be necessary to terminate the form::

    => (setv foo "a")
    => (print f"{foo:x<5}")
    …
    NameError: name 'hyx_fooXcolonXxXlessHthan_signX5' is not defined
    => (print f"{foo :x<5}")
    axxxx

Unlike Python, whitespace is allowed between a conversion and a format
specifier.

Also unlike Python, comments and backslashes are allowed in replacement fields.
Hy's lexer will still process the whole format string normally, like any other
string, before any replacement fields are considered, so you may need to
backslash your backslashes, and you can't comment out a closing brace or the
string delimiter.

.. _syntax-keywords:

keywords
--------

An identifier headed by a colon, such as ``:foo``, is a keyword. If a
literal keyword appears in a function call, it's used to indicate a keyword
argument rather than passed in as a value. For example, ``(f :foo 3)`` calls the
function ``f`` with the keyword argument named ``foo`` set to ``3``. Hence,
trying to call a function on a literal keyword may fail: ``(f :foo)`` yields
the error ``Keyword argument :foo needs a value``. To avoid this, you can quote
the keyword, as in ``(f ':foo)``, or use it as the value of another keyword
argument, as in ``(f :arg :foo)``.

Keywords can be called like functions as shorthand for ``get``. ``(:foo obj)``
is equivalent to ``(get obj :foo)``. An optional ``default`` argument is also
allowed: ``(:foo obj 2)`` or ``(:foo obj :default 2)`` returns ``2`` if
``(get obj :foo)`` raises a ``KeyError``.

.. _mangling:

symbols
-------

Symbols are identifiers that are neither legal numeric literals nor legal
keywords. In most contexts, symbols are compiled to Python variable names. Some
example symbols are ``hello``, ``+++``, ``3fiddy``, ``$40``, ``just✈wrong``,
and ``🦑``.

Since the rules for Hy symbols are much more permissive than the rules for
Python identifiers, Hy uses a mangling algorithm to convert its own names to
Python-legal names. The rules are:

- Convert all hyphens (``-``) to underscores (``_``). Thus, ``foo-bar`` becomes
  ``foo_bar``.
- If the name ends with ``?``, remove it and prepend ``is_``. Thus, ``tasty?``
  becomes ``is_tasty``.
- If the name still isn't Python-legal, make the following changes. A name
  could be Python-illegal because it contains a character that's never legal in
  a Python name, it contains a character that's illegal in that position, or
  it's equal to a Python reserved word.

  - Prepend ``hyx_`` to the name.
  - Replace each illegal character with ``XfooX``, where ``foo`` is the Unicode
    character name in lowercase, with spaces replaced by underscores and
    hyphens replaced by ``H``. Replace ``X`` itself the same way. If the
    character doesn't have a name, use ``U`` followed by its code point in
    lowercase hexadecimal.

  Thus, ``green☘`` becomes ``hyx_greenXshamrockX`` and ``if`` becomes
  ``hyx_if``.

- Finally, any added ``hyx_`` or ``is_`` is added after any leading
  underscores, because leading underscores have special significance to Python.
  Thus, ``_tasty?`` becomes ``_is_tasty`` instead of ``is__tasty``.

Mangling isn't something you should have to think about often, but you may see
mangled names in error messages, the output of ``hy2py``, etc. A catch to be
aware of is that mangling, as well as the inverse "unmangling" operation
offered by the ``unmangle`` function, isn't one-to-one. Two different symbols
can mangle to the same string and hence compile to the same Python variable.
The chief practical consequence of this is that ``-`` and ``_`` are
interchangeable in all symbol names, so you shouldn't assign to the
one-character name ``_``, or else you'll interfere with certain uses of
subtraction.

discard prefix
--------------

Hy supports the Extensible Data Notation discard prefix, like Clojure. Any form
prefixed with ``#_`` is discarded instead of compiled. This completely removes
the form so it doesn't evaluate to anything, not even None. It's often more
useful than linewise comments for commenting out a form, because it respects
code structure even when part of another form is on the same line. For example:

.. code-block:: clj

    => (print "Hy" "cruel" "World!")
    Hy cruel World!
    => (print "Hy" #_"cruel" "World!")
    Hy World!
    => (+ 1 1 (print "Math is hard!"))
    Math is hard!
    Traceback (most recent call last):
      ...
    TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'
    => (+ 1 1 #_(print "Math is hard!"))
    2
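
As a footnote to the :ref:`mangling rules <mangling>` listed under symbols, the steps can be sketched in plain Python, which may help when decoding a mangled name seen in a traceback. This is only an illustration of the rules as written in this document, not Hy's own implementation; the names ``mangle_sketch`` and ``is_python_name`` are hypothetical, and the real function may differ in details.

```python
import keyword
import unicodedata


def is_python_name(s):
    """True if s can be used as a Python variable name."""
    return s.isidentifier() and not keyword.iskeyword(s)


def mangle_sketch(symbol):
    """Apply the documented mangling rules to a symbol name (illustrative only)."""
    # Rule 1: hyphens become underscores.
    name = symbol.replace("-", "_")

    # Set leading underscores aside so prefixes are added after them.
    stripped = name.lstrip("_")
    underscores = name[: len(name) - len(stripped)]
    name = stripped

    # Rule 2: a trailing "?" becomes an "is_" prefix.
    if name.endswith("?"):
        name = "is_" + name[:-1]

    # Rule 3: if the result still isn't a legal Python name, prefix it
    # with "hyx_" and spell out each illegal character (and "X" itself).
    if not is_python_name(underscores + name):

        def encode(ch):
            # Prepending "S" checks legality in a non-initial position.
            if ch != "X" and ("S" + ch).isidentifier():
                return ch
            label = unicodedata.name(ch, "").lower()
            label = label.replace("-", "H").replace(" ", "_")
            # Unnamed characters fall back to "U" plus the hex code point.
            return "X" + (label or "U{:x}".format(ord(ch))) + "X"

        name = "hyx_" + "".join(encode(ch) for ch in name)

    # Rule 4: restore the leading underscores in front of any prefix.
    return underscores + name
```

With this sketch, ``mangle_sketch("_tasty?")`` gives ``_is_tasty``, and ``mangle_sketch("foo:x<5")`` gives ``hyx_fooXcolonXxXlessHthan_signX5``, the name that appears in the f-string error shown earlier. Both ``mangle_sketch("-")`` and ``mangle_sketch("_")`` give ``_``, which is the collision the symbols section warns about.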