mirror of
https://github.com/ProgramSnail/Lama.git
synced 2026-01-03 04:28:19 +00:00
Spec initial commit
This commit is contained in:
parent
d89cd76cd9
commit
5a883d8fa9
12 changed files with 130 additions and 25 deletions
142
doc/01.tex
@ -1,142 +0,0 @@
\section{Introduction: Languages, Semantics, Interpreters, Compilers}

\subsection{Language and semantics}

A language is a collection of programs. A program is an \emph{abstract syntax tree} (AST), which describes the hierarchy of constructs. The abstract
syntax of a programming language describes the format of abstract syntax trees of programs in this language. Thus, a language is a set of constructive
objects, each of which can be constructively manipulated.

The semantics of a language $\mathscr L$ is a total map

$$
\sembr{\bullet}_{\mathscr L} : \mathscr L \to \mathscr D
$$

where $\mathscr D$ is some \emph{semantic domain}. The choice of the domain is up to us; for example, for Turing-complete languages $\mathscr D$ can
be the set of all partial recursive (computable) functions.
\subsection{Interpreters}

In reality, the semantics is often described using \emph{interpreters}:

$$
eval : \mathscr L \to \mbox{\lstinline|Input|} \to \mbox{\lstinline|Output|}
$$

where \lstinline|Input| and \lstinline|Output| are the sets of (all possible) inputs and outputs for the programs in the language $\mathscr L$. We expect $eval$ to
possess the following property:

$$
\forall p \in \mathscr L,\, \forall x\in \mbox{\lstinline|Input|} : \sembr{p}_{\mathscr L}\;x = eval\; p\; x
$$

In other words, an interpreter takes a program and its input as arguments, and returns what the program would return when run on that
input. The equality in the defining property of an interpreter has to be read as ``if the right-hand side is defined, then the left-hand side
is defined, too, and their values coincide'', and vice versa.

Why are interpreters so important? Because they can be written as programs in a \emph{meta-language}, or a \mbox{language of implementation}. For example,
if we take OCaml as the language of implementation, then an interpreter of a language $\mathscr L$ is some OCaml program $eval$, such that

$$
\forall p \in \mathscr L,\, \forall x\in \mbox{\lstinline|Input|} : \sembr{p}_{\mathscr L}\;x = \sembr{eval}_{\mbox{OCaml}}\; p\; x
$$

How do we define $\sembr{\bullet}_{\mbox{OCaml}}$? We can write an interpreter in some other language. Thus, a \emph{tower} of meta-languages and interpreters
comes into consideration. When do we stop? When the meta-language is simple enough for intuitive understanding (in reality: some math-based frameworks like
operational, denotational or game semantics, etc.).

Pragmatically: if you have a good implementation of a good programming language you trust, you can write interpreters of other languages in it.
\subsection{Compilers}

A compiler is just a language transformer

$$
comp :\mathscr L \to \mathscr M
$$

for two languages $\mathscr L$ and $\mathscr M$; we expect a compiler to be total and to possess the following property:

$$
\forall p\in\mathscr L\;\;\sembr{p}_{\mathscr L}=\sembr{comp\; p}_{\mathscr M}
$$

Again, the equality in this definition is understood functionally. The property itself is called \emph{complete} (or full) correctness. In reality,
compilers are \emph{partially} correct, which means that the domain of compiled programs can be wider.
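
One way to spell out partial correctness, assuming that $\mathrm{dom}\,f$ denotes the set of inputs on which a function $f$ is defined (a notation introduced here only for illustration), is to require the two semantics to agree wherever the source program is defined:

$$
\forall p\in\mathscr L,\;\forall x\in\mathrm{dom}\,\sembr{p}_{\mathscr L} : \sembr{p}_{\mathscr L}\;x=\sembr{comp\; p}_{\mathscr M}\;x
$$

Under this reading the compiled program may still produce some result on an input for which the source program is undefined.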

And, again, we expect compilers to be defined in terms of some implementation language. Thus, a compiler is a program (in, say, OCaml) such that
its semantics in OCaml possesses the following property (fill in the rest yourself).

\subsection{The first example: language of expressions}

Abstract syntax:

$$
\begin{array}{rcll}
\mathscr X & = & \{x,\, y,\, z,\, \dots\} & \mbox{(variables)}\\
\otimes & = & \{+,\, -,\, \times,\, /,\, \%,\, <,\, \le,\, >,\, \ge,\, =,\,\ne,\, \vee,\, \wedge\} & \mbox{(binary operators)}\\
\mathscr E & = & \mathscr X & \mbox{(expressions)}\\
& & \mathbb N & \\
& & \mathscr E \otimes \mathscr E &
\end{array}
$$
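
As a sketch of how this abstract syntax might be represented in OCaml, the definition below encodes expressions as an algebraic data type. The names \lstinline|expr|, \lstinline|Var|, \lstinline|Const|, \lstinline|Binop|, \lstinline|example| and \lstinline|size|, as well as the choice to represent operators by strings, are assumptions made for illustration only.

```ocaml
(* Hypothetical OCaml encoding of the abstract syntax; all names are illustrative. *)
type expr =
  | Var   of string                (* a variable: x, y, z, ... *)
  | Const of int                   (* a natural number literal *)
  | Binop of string * expr * expr  (* A op B, where op is "+", "-", "*", ... *)

(* The expression x + 2 * y as a tree. *)
let example = Binop ("+", Var "x", Binop ("*", Const 2, Var "y"))

(* Counts the nodes of a tree: one simple way to manipulate
   a program as a constructive object. *)
let rec size = function
  | Var _ | Const _ -> 1
  | Binop (_, a, b) -> 1 + size a + size b
```

Each alternative of the type corresponds to one line of the definition of $\mathscr E$, so every value of type \lstinline|expr| is a tree built from variables, constants and binary operators.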

Semantics of expressions:

\begin{itemize}
\item state $\sigma :\mathscr X \to \mathbb Z$ assigns values to (some) variables;
\item semantics $\sembr{\bullet}$ assigns each expression a total map $\Sigma \to \mathbb Z$, where
$\Sigma$ is the set of all states.
\end{itemize}

Empty state $\Lambda$: undefined for any variable.

Denotational style of semantic description:

$$
\begin{array}{rclcl}
\sembr{n} & = & \lambda \sigma . n & , & n\in \mathbb N \\
\sembr{x} & = & \lambda \sigma . \sigma x & , & x\in \mathscr X \\
\sembr{A\otimes B} & = & \lambda \sigma . (\sembr{A}\sigma \oplus \sembr{B}\sigma) & , & A,\,B \in \mathscr E
\end{array}
$$

\begin{center}
\begin{tabular}{c|cl}
$\otimes$ & $\oplus$ in OCaml\\
\hline
$+$ & \lstinline|+| \\
$-$ & \lstinline|-| \\
$\times$ & \lstinline|*| \\
$/$ & \lstinline|/| \\
$\%$ & \lstinline|mod| \\
$<$ & \lstinline|<| & \rdelim\}{6}{5mm}[ see note 1 below] \\
$>$ & \lstinline|>| \\
$\le$ & \lstinline|<=| \\
$\ge$ & \lstinline|>=| \\
$=$ & \lstinline|=| \\
$\ne$ & \lstinline|<>| \\
$\wedge$ & \lstinline|&&| & \rdelim\}{2}{5mm}[ see note 2 below]\\
$\vee$ & \lstinline/||/
\end{tabular}
\end{center}

Note 1: the result is converted to an integer (true $\to$ 1, false $\to$ 0).

Note 2: the arguments are converted to booleans (0 $\to$ false, nonzero $\to$ true); the result is converted to
an integer as in the previous note.
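
The equations, the operator table and the two notes can be put together as a small OCaml interpreter. This is a minimal sketch under several assumptions that are not part of the text above: the type \lstinline|expr| and its constructors, the operator spellings passed as strings, the helper names \lstinline|empty|, \lstinline|update|, \lstinline|of_bool|, \lstinline|to_bool| and \lstinline|oplus|, and the decision to model an undefined result by raising an exception.

```ocaml
(* A sketch of the semantics as an OCaml interpreter; all names are illustrative. *)
type expr =
  | Var   of string
  | Const of int
  | Binop of string * expr * expr

(* A state maps variable names to integers; it may be undefined on some of them. *)
type state = string -> int

(* The empty state: undefined (here: raises Failure) for any variable. *)
let empty : state = fun x -> failwith ("undefined variable " ^ x)

(* update x v s behaves like s, except that x is bound to v. *)
let update x v (s : state) : state = fun y -> if y = x then v else s y

(* Note 1: a boolean result is converted to an integer. *)
let of_bool b = if b then 1 else 0

(* Note 2: an integer argument is converted to a boolean. *)
let to_bool n = n <> 0

(* The meta-language interpretation of each binary operator. *)
let oplus op l r =
  match op with
  | "+"  -> l + r
  | "-"  -> l - r
  | "*"  -> l * r
  | "/"  -> l / r
  | "%"  -> l mod r
  | "<"  -> of_bool (l < r)
  | ">"  -> of_bool (l > r)
  | "<=" -> of_bool (l <= r)
  | ">=" -> of_bool (l >= r)
  | "="  -> of_bool (l = r)
  | "!=" -> of_bool (l <> r)
  | "&&" -> of_bool (to_bool l && to_bool r)
  | "||" -> of_bool (to_bool l || to_bool r)
  | _    -> failwith ("unknown operator " ^ op)

(* One clause per syntactic form, following the denotational equations. *)
let rec eval e (s : state) =
  match e with
  | Const n          -> n
  | Var x            -> s x
  | Binop (op, a, b) ->
    (* Both arguments are evaluated before the operator is applied. *)
    let l = eval a s in
    let r = eval b s in
    oplus op l r
```

For instance, \lstinline|eval (Binop ("+", Var "x", Const 1)) (update "x" 3 empty)| evaluates to 4. Representing a state as a function keeps the code close to the notation $\sigma\,x$; a finite map would serve equally well.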

Important observations:

\begin{enumerate}
\item $\sembr{\bullet}$ is defined \emph{compositionally}: the meaning of an expression is defined in terms of the meanings
of its proper subexpressions. This is an important property of the denotational style.
\item $\sembr{\bullet}$ is total, since it takes into account all possible ways to deconstruct any expression.
\item $\sembr{\bullet}$ is deterministic: there is no way to assign different meanings to the same expression, since
we deconstruct each expression unambiguously.
\item $\otimes$ is an element of the language \emph{syntax}, while $\oplus$ is its interpretation in the meta-language of
semantic description (simpler: in the language of interpreter implementation).
\item This concrete semantics is \emph{strict}: for a binary operator both its arguments are evaluated unconditionally; thus,
for example, \lstinline|1$\,\vee\,$x| is undefined in the empty state.
\end{enumerate}
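
The last observation can be checked by unfolding the defining equations for $1\vee x$ in the empty state $\Lambda$; in this worked instance $\oplus$ stands for the interpretation of $\vee$:

$$
\sembr{1\vee x}\,\Lambda = \sembr{1}\,\Lambda \oplus \sembr{x}\,\Lambda = 1 \oplus \Lambda\, x
$$

Since $\Lambda\, x$ is undefined, the whole right-hand side is undefined, even though the first argument alone would be enough to decide the result of a disjunction.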