mirror of
https://github.com/ProgramSnail/Lama.git
synced 2025-12-06 14:58:50 +00:00
214 lines
9.1 KiB
TeX
214 lines
9.1 KiB
TeX
% !TEX TS-program = pdflatex
|
|
% !TeX spellcheck = en_US
|
|
% !TEX root = lama-spec.tex
|
|
\chapter{Extensions}
|
|
\label{sec:extensions}
|
|
|
|
There are some extensions for the core language defined in the previous chapters. These
|
|
extensions add some syntactic sugar, which makes writing programs in \lama a less
|
|
painful task.
|
|
|
|
\section{Custom Infix Operators}
|
|
\label{sec:custom_infix}
|
|
|
|
Besides the set of builtin infix operators (see Fig.~\ref{builtin_infixes}) users may define
|
|
custom infix operators. These operators may be declared at any scope level; when defined
|
|
at the top level they can be exported as well. However, there are some restrictions regarding the
|
|
redefinition of builtin infix operators:
|
|
|
|
\begin{itemize}
|
|
\item redefinitions of builtin infix operators can not be exported;
|
|
\item the assignment operator "\lstinline|:=|" can not be redefined;
|
|
\item infix definitions can not be mutually recursive (but this can be worked around by
|
|
definining infix synonyms for mutually-recursive functions).
|
|
\end{itemize}
|
|
|
|
The syntax for infix operator definition is shown in Fig.~\ref{custom_infix_construct}; a custom infix definition must specify exactly two arguments.
|
|
An associativity and precedence level has to be assigned to each custom infix operator. A precedence level is assigned by specifying at which
|
|
position, relative to other known infix operators, the operator being defined is inserted. Three kinds of specifications are allowed: at given level,
|
|
immediately before or immediately after. For example, "\lstinline|at +|" means that the operator is assigned exactly the same
|
|
level of precedence as "\lstinline|+|"; "\lstinline|after +|" creates a new precedence level immediately \emph{after} that for
|
|
"\lstinline|+|" (but \emph{before} that for "\lstinline|*|"), and "\lstinline|before *|" has exactly the same effect (provided
|
|
there were no insertions of precedence levels between those for "\lstinline|+|" and "\lstinline|*|").
|
|
|
|
When being inserted at an existing precedence level, an infix operator inherits the associativity from that level; hence, only "\lstinline|infix|"
|
|
keyword can be used for such definitions. When a new level is created, an associativity for this level has to be additionally specified
|
|
by using corresponding keyword ("\lstinline|infix|" for non-associative levels, "\lstinline|infixr|"~--- for levels with right
|
|
associativity, and "\lstinline|infixl|"~--- for levels with left associativity).
|
|
|
|
When public infix operators are exported, their relative precedence levels and associativity are exported as well; since not all
|
|
custom infix definitions may be made public some levels may disappear from the export. For example, let us have the following definitions:
|
|
|
|
\begin{lstlisting}
|
|
infixl ** before * (x, y) {...}
|
|
public infixr *** before ** (x, y) {...}
|
|
\end{lstlisting}
|
|
|
|
Here in the top scope for the compilation unit we have two additional precedence levels: one for the "\lstinline|**|" and another for the "\lstinline|***|".
|
|
However, as "\lstinline|**|" is not exported its precedence level will be forgotten during the import. Thus, only the precedence level for
|
|
"\lstinline|***|" will be created during the import as if is was defined at the level "\lstinline|before *|".
|
|
|
|
Respectively, multiple imports of units with custom infix operators will modify the precedence level in the order of their import. For example,
|
|
if there are two units "\lstinline|A|" and "\lstinline|B|" with declarations "\lstinline|infixl ++ before +|" and "\lstinline|infixl +++ before +|"
|
|
correspondingly, then importing "\lstinline|A|" after "\lstinline|B|" will result in "\lstinline|++|" having a \emph{lower} precedence, then
|
|
"\lstinline|+++|".
|
|
|
|
\begin{figure}[t]
|
|
\[
|
|
\begin{array}{rcl}
|
|
\defterm{infixDefinition} & : & \nonterm{infixHead}\s\term{(}\s\nonterm{functionArguments}\s\term{)}\s\nonterm{functionBody}\\
|
|
\defterm{infixHead} & : & [\s\term{public}\s]\s\nonterm{infixity}\s\token{INFIX}\s\nonterm{level}\\
|
|
\defterm{infixity} & : & \term{infix}\alt\term{infixl}\alt\term{infixr}\\
|
|
\defterm{level} & : & [\s\term{at}\alt\term{before}\alt\term{after}\s]\s\token{INFIX}
|
|
\end{array}
|
|
\]
|
|
\caption{The Syntax for Infix Operator Definition}
|
|
\label{custom_infix_construct}
|
|
\end{figure}
|
|
|
|
\section{Lazy Values and Eta-expansion}
|
|
|
|
An expression
|
|
|
|
\begin{lstlisting}
|
|
lazy$\;e$
|
|
\end{lstlisting}
|
|
|
|
where $e$~--- a $\nonterm{basicExpression}$~--- is converted into
|
|
|
|
\begin{lstlisting}
|
|
makeLazy (fun () {$e$})
|
|
\end{lstlisting}
|
|
|
|
where "\lstinline|makeLazy|"~--- a function from standard unit "\lstinline|Lazy|" (see Section~\ref{sec:std:lazy}). An import for
|
|
"\lstinline|Lazy|" is added implicitly.
|
|
|
|
An expression
|
|
|
|
\begin{lstlisting}
|
|
eta$\;e$
|
|
\end{lstlisting}
|
|
|
|
where $e$~--- a $\nonterm{basicExpression}$~--- is converted into
|
|
|
|
\begin{lstlisting}
|
|
fun ($x$) {$e$ ($x$)})
|
|
\end{lstlisting}
|
|
|
|
where "$x$"~--- a fresh variable which does not occur free in "$e$".
|
|
|
|
\section{Dot Notation}
|
|
\label{sec:dot-notation}
|
|
|
|
A function call
|
|
|
|
\begin{lstlisting}
|
|
$f$ ($e_1$, ..., $e_k$)
|
|
\end{lstlisting}
|
|
|
|
where $f$~--- an identifier~--- can be rewritten as
|
|
|
|
\begin{lstlisting}
|
|
$e_1$.$f$ ($e_2$, ..., $e_k$)
|
|
\end{lstlisting}
|
|
|
|
In particular, a call to a one-argument function $f (e)$ can be rewritten as $e.f$.
|
|
|
|
\section{Patterns in Function Arguments}
|
|
|
|
Patterns can be used in function argument specification: a declaration
|
|
|
|
\begin{lstlisting}
|
|
fun f ($p_1$, ..., $p_k$) { e }
|
|
\end{lstlisting}
|
|
|
|
is equivalent to
|
|
|
|
\begin{lstlisting}
|
|
fun f ($x_1$, ..., $x_k$) {
|
|
case $x_1$ of
|
|
$p_1$ -> case $x_2$ of
|
|
... -> e
|
|
esac
|
|
esac
|
|
}
|
|
\end{lstlisting}
|
|
|
|
where $x_i$~--- fresh variables, not free in $e$.
|
|
|
|
\section{Syntax Definitions}
|
|
|
|
Syntax definition extension represents an alternative simplified syntax for parsers written using standard unit \lstinline|Ostap| (see Section~\ref{sec:ostap}).
|
|
The syntax for syntax definition expressions is shown in Fig.~\ref{syntax_expressions}.
|
|
|
|
\begin{figure}[h]
|
|
\[
|
|
\begin{array}{rcll}
|
|
\defterm{syntaxExpression} & : & \term{syntax}\s\term{(}\s\nonterm{syntaxSeq}\s(\s\term{$\mid$}\s\nonterm{syntaxSeq}\s)^*\s\term{)}&\\
|
|
\defterm{syntaxSeq} & : & \nonterm{syntaxBinding}^+\s[\s\term{\{}\s\nonterm{expression}\s\term{\}}\s]&\\
|
|
\defterm{syntaxBinding} & : & [\s\term{-}\s]\s[\s\nonterm{pattern}\s\term{=}\s]\s\s\nonterm{syntaxPostfix}&\\
|
|
\defterm{syntaxPostfix} & : & \nonterm{syntaxPrimary}\s[\s\term{*}\s\alt\s\term{+}\s\alt\s\term{?}\s]&\\
|
|
\defterm{syntaxPrimary} & : & \token{LIDENT}\s(\s\term{[}\s[\s\nonterm{expression}\s(\s\term{,}\s\nonterm{expression}\s)^*\s]\s\term{]}\s)^*&\alt\\
|
|
& & \term{(}\s\nonterm{syntaxExpression}\s\term{)}&\alt\\
|
|
& & \term{\$(}\s\nonterm{expression}\s\term{)}&
|
|
\end{array}
|
|
\]
|
|
\caption{Syntax definition expressions}
|
|
\label{syntax_expressions}
|
|
\end{figure}
|
|
|
|
Syntax expressions can be used wherever regular expressions are allowed. Each syntax expressions is expanded in a certain combination of \lstinline|Ostap| primitives.
|
|
For example,
|
|
|
|
\begin{lstlisting}
|
|
fun sum (str) {
|
|
parseString (
|
|
syntax (l=DECIMAL token["+"] r=DECIMAL eof {
|
|
stringInt (l) + stringInt (r)
|
|
}),
|
|
str
|
|
)
|
|
}
|
|
\end{lstlisting}
|
|
|
|
defines a function which parses its arguments into an expression \lstinline|"l + r"|, where \lstinline|l| and \lstinline|r| are decimal literals, and evaluates its value.
|
|
|
|
A syntax expression itself is a sequence of alternatives, and each alternative is a sequential composition (\nonterm{syntaxSeq}) of primitive parsers equipped with optional
|
|
semantic action (a \emph{general} expression in curly brackets).
|
|
|
|
A primitive parser is either an l-indentfier (possibly supplied with arguments), or a \emph{general} expression, surrounded by brackets \term{\$(}..\term{)},
|
|
or a \emph{syntax} expression, surrounded by round brackets. Note, the arguments for primitive parsers in syntax expressions are surrounded by
|
|
\term{[}..\term{]} unlike general expressions; thus
|
|
|
|
\begin{lstlisting}
|
|
x ("a")
|
|
\end{lstlisting}
|
|
|
|
means a sequential composition of \lstinline|x| and "\lstinline|a|", not a combinator \lstinline|x| applied to "\lstinline|a|".
|
|
|
|
A primitive parser can be followed by one of postfix operators ("\term{*}", "\term{+}", or "\term{?}"), corresponding
|
|
to "\lstinline|rep0|", "\lstinline|rep|", or "\lstinline|opt|" combinators of \lstinline|Ostap| respectively, for example
|
|
|
|
\begin{lstlisting}
|
|
token["a"]+
|
|
identifier?
|
|
\end{lstlisting}
|
|
|
|
A value recognized by a primitive parser can be matched against a pattern, for example
|
|
|
|
\begin{lstlisting}
|
|
value=(identifier | constant)
|
|
h:tl=item+
|
|
\end{lstlisting}
|
|
|
|
The bindings provided by pattern-matching can be used in semantic actions.
|
|
|
|
Finally, if no semantic action is given, a sequential syntax expression returns a tuple of its components. However, if a parser
|
|
in a sequential composition is preceded by "\term{-}" then its value is not included into the default result. Thus,
|
|
|
|
\begin{lstlisting}
|
|
parse -eof
|
|
\end{lstlisting}
|
|
|
|
returns what "\lstinline|parse|" recognized; the input stream is parsed against "\lstinline|eof|", but the result of "\lstinline|eof|"
|
|
is omitted.
|