mirror of
https://github.com/ProgramSnail/Lama.git
synced 2025-12-24 15:48:47 +00:00
Spec initial commit
This commit is contained in:
parent
d89cd76cd9
commit
5a883d8fa9
12 changed files with 130 additions and 25 deletions
142
doc/lectures/01.tex
Normal file
142
doc/lectures/01.tex
Normal file
|
|
@ -0,0 +1,142 @@
|
|||
\section{Introduction: Languages, Semantics, Interpreters, Compilers}
|
||||
|
||||
\subsection{Language and semantics}
|
||||
|
||||
A language is a collection of programs. A program is an \emph{abstract syntax tree} (AST), which describes the hierarchy of constructs. An abstract
|
||||
syntax of a programming language describes the format of abstract syntax trees of programs in this language. Thus, a language is a set of constructive
|
||||
objects, each of which can be constructively manipulated.
|
||||
|
||||
The semantics of a language $\mathscr L$ is a total map
|
||||
|
||||
$$
|
||||
\sembr{\bullet}_{\mathscr L} : \mathscr L \to \mathscr D
|
||||
$$
|
||||
|
||||
where $\mathscr D$ is some \emph{semantic domain}. The choice of the domain is at our command; for example, for Turing-complete languages $\mathscr D$ can
|
||||
be the set of all partially-recursive (computable) functions.
|
||||
|
||||
\subsection{Interpreters}
|
||||
|
||||
In reality, the semantics often is described using \emph{interpreters}:
|
||||
|
||||
$$
|
||||
eval : \mathscr L \to \mbox{\lstinline|Input|} \to \mbox{\lstinline|Output|}
|
||||
$$
|
||||
|
||||
where \lstinline|Input| and \lstinline|Output| are sets of (all possible) inputs and outputs for the programs in the language $\mathscr L$. We claim $eval$ to
|
||||
possess the following property
|
||||
|
||||
$$
|
||||
\forall p \in \mathscr L,\, \forall x\in \mbox{\lstinline|Input|} : \sembr{p}_{\mathscr L}\;x = eval\; p\; x
|
||||
$$
|
||||
|
||||
In other words, an interpreter takes a program and its input as arguments, and returns what the program would return, being run on that
|
||||
argument. The equality in the definitional property of an interpreter has to be read ``if the right hand side is defined, then the left hand side
|
||||
is defined, too, and their values coinside'', and vice-versa.
|
||||
|
||||
Why interpreters are so important? Because they can be written as programs in a \emph{meta-lanaguge}, or a \mbox{language of implementation}. For example,
|
||||
if we take ocaml as a language of implementation, then an interpreter of a language $\mathscr L$ is some ocaml program $eval$, such that
|
||||
|
||||
$$
|
||||
\forall p \in \mathscr L,\, \forall x\in \mbox{\lstinline|Input|} : \sembr{p}_{\mathscr L}\;x = \sembr{eval}_{\mbox{ocaml}}\; p\; x
|
||||
$$
|
||||
|
||||
How to define $\sembr{\bullet}_{\mbox{ocaml}}$? We can write an interpreter in some other language. Thus, a \emph{tower} of meta-languages and interpreters
|
||||
comes into consideration. When to stop? When the meta-language is simple enough for intuitive understanding (in reality: some math-based frameworks like
|
||||
operational, denotational or game semantics, etc.)
|
||||
|
||||
Pragmatically: if you have a good implementation of a good programming language you trust, you can write interpreters of other languages.
|
||||
|
||||
\subsection{Compilers}
|
||||
|
||||
A compiler is just a language transformer
|
||||
|
||||
$$
|
||||
comp :\mathscr L \to \mathscr M
|
||||
$$
|
||||
|
||||
for two languages $\mathscr L$ and $\mathscr M$; we expect a compiler to be total and to possess the following property:
|
||||
|
||||
$$
|
||||
\forall p\in\mathscr L\;\;\sembr{p}_{\mathscr L}=\sembr{comp\; p}_{\mathscr M}
|
||||
$$
|
||||
|
||||
Again, the equality in this definition is understood functionally. The property itself is called a \emph{complete} (or full) correctness. In reality
|
||||
compilers are \emph{partially} correct, which means, that the domain of compiled programs can be wider.
|
||||
|
||||
And, again, we expect compilers to be defined in terms of some implementation language. Thus, a compiler is a program (in, say, ocaml), such, that
|
||||
its semantics in ocaml possesses the following property (fill the rest yourself).
|
||||
|
||||
|
||||
\subsection{The first example: language of expressions}
|
||||
|
||||
Abstract syntax:
|
||||
|
||||
$$
|
||||
\begin{array}{rcll}
|
||||
\mathscr X & = & \{x,\, y,\, z,\, \dots\} & \mbox{(variables)}\\
|
||||
\otimes & = & \{+,\, -,\, \times,\, /,\, \%,\, <,\, \le,\, >,\, \ge,\, =,\,\ne,\, \vee,\, \wedge\} & \mbox{(binary operators)}\\
|
||||
\mathscr E & = & \mathscr X & \mbox{(expressions)}\\
|
||||
& & \mathbb N & \\
|
||||
& & \mathscr E \otimes \mathscr E &
|
||||
\end{array}
|
||||
$$
|
||||
|
||||
Semantics of expressions:
|
||||
|
||||
\begin{itemize}
|
||||
\item state $\sigma :\mathscr X \to \mathbb Z$ assigns values to (some) variables;
|
||||
\item semantics $\sembr{\bullet}$ assigns each expression a total map $\Sigma \to \mathbb Z$, where
|
||||
$\Sigma$ is the set of all states.
|
||||
\end{itemize}
|
||||
|
||||
Empty state $\Lambda$: undefined for any variable.
|
||||
|
||||
Denotational style of semantic description:
|
||||
|
||||
$$
|
||||
\begin{array}{rclcl}
|
||||
\sembr{n} & = & \lambda \sigma . n & , & n\in \mathbb N \\
|
||||
\sembr{x} & = & \lambda \sigma . \sigma x & , & x\in \mathscr X \\
|
||||
\sembr{A\otimes B} & = & \lambda \sigma . (\sembr{A}\sigma \oplus \sembr{B}\sigma) & , & A,\,B \in \mathscr E
|
||||
\end{array}
|
||||
$$
|
||||
|
||||
\begin{center}
|
||||
\begin{tabular}{c|cl}
|
||||
$\otimes$ & $\oplus$ in ocaml\\
|
||||
\hline
|
||||
$+$ & \lstinline|+| \\
|
||||
$-$ & \lstinline|-| \\
|
||||
$\times$ & \lstinline|*| \\
|
||||
$/$ & \lstinline|/| \\
|
||||
$\%$ & \lstinline|mod| \\
|
||||
$<$ & \lstinline|<| & \rdelim\}{6}{5mm}[ see note 1 below] \\
|
||||
$>$ & \lstinline|>| \\
|
||||
$\le$ & \lstinline|<=| \\
|
||||
$\ge$ & \lstinline|>=| \\
|
||||
$=$ & \lstinline|=| \\
|
||||
$\ne$ & \lstinline|<>| \\
|
||||
$\wedge$ & \lstinline|&&| & \rdelim\}{2}{5mm}[ see note 2 below]\\
|
||||
$\vee$ & \lstinline/||/
|
||||
\end{tabular}
|
||||
\end{center}
|
||||
|
||||
Note 1: the result is converted into integers (true $\to$ 1, false $\to$ 0).
|
||||
|
||||
Note 2: the arguments are converted to booleans (0 $\to$ false, not 0 $\to$ true), the result is converted to
|
||||
integers as in the previous note.
|
||||
|
||||
Important observations:
|
||||
|
||||
\begin{enumerate}
|
||||
\item $\sembr{\bullet}$ is defined \emph{compositionally}: the meaning of an expression is defined in terms of meanings
|
||||
of its proper subexpressions. This is an important property of denotational style.
|
||||
\item $\sembr{\bullet}$ is total, since it takes into account all possible ways to deconstruct any expression.
|
||||
\item $\sembr{\bullet}$ is deterministic: there is no way to assign different meanings to the same expression, since
|
||||
we deconstruct each expression unambiguously.
|
||||
\item $\otimes$ is an element of language \emph{syntax}, while $\oplus$ is its interpretation in the meta-language of
|
||||
semanic description (simpler: in the language of interpreter implementation).
|
||||
\item This concrete semantics is \emph{strict}: for a binary operator both its arguments are evaluated unconditionally; thus,
|
||||
for example, \lstinline|1$\,\vee\,$x| is undefined in empty state.
|
||||
\end{enumerate}
|
||||
151
doc/lectures/02.tex
Normal file
151
doc/lectures/02.tex
Normal file
|
|
@ -0,0 +1,151 @@
|
|||
\section{Statements, Stack Machine, Stack Machine Compiler}
|
||||
|
||||
\subsection{Statements}
|
||||
|
||||
More interesting language~--- a language of simple statements:
|
||||
|
||||
$$
|
||||
\begin{array}{rcl}
|
||||
\mathscr S & = & \mathscr X \mbox{\lstinline|:=|} \;\mathscr E \\
|
||||
& & \mbox{\lstinline|read (|} \mathscr X \mbox{\lstinline|)|} \\
|
||||
& & \mbox{\lstinline|write (|} \mathscr E \mbox{\lstinline|)|} \\
|
||||
& & \mathscr S \mbox{\lstinline|;|} \mathscr S
|
||||
\end{array}
|
||||
$$
|
||||
|
||||
Here $\mathscr E, \mathscr X$ stand for the sets of expressions and variables, as in the previous lecture.
|
||||
|
||||
Again, we define the semantics for this language
|
||||
|
||||
$$
|
||||
\sembr{\bullet}_{\mathscr S} : \mathscr S \mapsto \mathbb Z^* \to \mathbb Z^*
|
||||
$$
|
||||
|
||||
with the semantic domain of partial functions from integer strings to integer strings. This time we will
|
||||
use \emph{big-step operational semantics}: we define a ternary relation ``$\Rightarrow$''
|
||||
|
||||
$$
|
||||
\Rightarrow \subseteq \mathscr C \times \mathscr S \times \mathscr C
|
||||
$$
|
||||
|
||||
where $\mathscr C = \Sigma \times \mathbb Z^* \times \mathbb Z^*$~--- a set of all configurations during a
|
||||
program execution. We will write $c_1\xRightarrow{S}c_2$ instead of $(c_1, S, c_2)\in\Rightarrow$ and informally
|
||||
interpret the former as ``the execution of a statement $S$ in a configuration $c_1$ completes with the configuration
|
||||
$c_2$''. The components of a configuration are state, which binds (some) variables to their values, and input and
|
||||
output streams, represented as (finite) strings of integers.
|
||||
|
||||
The relation ``$\Rightarrow$'' is defined by the following deductive system (see Fig.~\ref{bs_stmt}). The first
|
||||
three rules are \emph{axioms} as they do not have any premises. Note, according to these rules sometimes a program
|
||||
cannot do a step in a given configuration: a value of an expression can be undefined in a given state in rules
|
||||
$\rulename{Assign}$ and $\rulename{Write}$, and there can be no input value in rule $\rulename{Read}$. This style of
|
||||
a semantics description is called big-step operational semantics, since the results of a computation are
|
||||
immediately observable at the right hand side of ``$\Rightarrow$'' and, thus, the computation is performed in
|
||||
a single ``big'' step. And, again, this style of a semantic description can be used to easily implement a
|
||||
reference interpreter.
|
||||
|
||||
With the relation ``$\Rightarrow$'' defined we can abbreviate the ``surface'' semantics for the language of statements:
|
||||
|
||||
\setarrow{\xRightarrow}
|
||||
|
||||
\[
|
||||
\forall S\in\mathscr S,\,\forall \iota\in\mathbb Z^*\;:\;\sembr{S}_{\mathscr S} \iota = o \Leftrightarrow \trans{\inbr{\Lambda, i, \epsilon}}{S}{\inbr{\_, \_, o}}
|
||||
\]
|
||||
|
||||
|
||||
\begin{figure}[t]
|
||||
\arraycolsep=10pt
|
||||
\[\trans{\inbr{\sigma,\, \iota,\, o}}{\llang{x := $\;\;e$}}{\inbr{\sigma\,[x\gets\sembr{e}_{\mathscr E}\;\sigma],\, \iota,\, o}}\ruleno{Assign}\]
|
||||
\[\trans{\inbr{\sigma,\, z\iota,\, o}}{\llang{read ($x$)}}{\inbr{\sigma\,[x\gets z],\, \iota,\, o}}\ruleno{Read}\]
|
||||
\[\trans{\inbr{\sigma,\, \iota,\, o}}{\llang{write ($e$)}}{\inbr{\sigma,\, \iota,\, o(\sembr{e}_{\mathscr E}\;\sigma)}}\ruleno{Write}\]
|
||||
\[\trule{\begin{array}{cc}
|
||||
\trans{c_1}{S_1}{c^\prime} & \trans{c^\prime}{S_2}{c_2}
|
||||
\end{array}}
|
||||
{\trans{c_1}{S_1\llang{;}S_2}{c_2}}\ruleno{Seq}\]
|
||||
\caption{Big-step operational semantics for statements}
|
||||
\label{bs_stmt}
|
||||
\end{figure}
|
||||
|
||||
\section{Stack Machine}
|
||||
|
||||
Stack machine is a simple abstract computational device, which can be used as a convenient model to constructively describe
|
||||
the compilation process.
|
||||
|
||||
In short, stack machine operates on the same configurations, as the language of statements, plus a stack of integers. The
|
||||
computation, performed by the stack machine, is controlled by a program, which is described as follows:
|
||||
|
||||
\[
|
||||
\begin{array}{rcl}
|
||||
\mathscr I & = & \llang{BINOP $\;\otimes$} \\
|
||||
& & \llang{CONST $\;\mathbb N$} \\
|
||||
& & \llang{READ} \\
|
||||
& & \llang{WRITE} \\
|
||||
& & \llang{LD $\;\mathscr X$} \\
|
||||
& & \llang{ST $\;\mathscr X$} \\
|
||||
\mathscr P & = & \epsilon \\
|
||||
& & \mathscr I\mathscr P
|
||||
\end{array}
|
||||
\]
|
||||
|
||||
Here the syntax category $\mathscr I$ stands for \emph{instructions}, $\mathscr P$~--- for \emph{programs}; thus, a program is a finite
|
||||
string of instructions.
|
||||
|
||||
The semantics of stack machine program can be described, again, in the form of big-step operational semantics. This time the set of
|
||||
stack machine configurations is
|
||||
|
||||
\[
|
||||
\mathscr C_{SM} = \mathbb Z^* \times \mathscr C
|
||||
\]
|
||||
|
||||
where the first component is a stack, and the second~--- a configuration as in the semantics of statement language. The rules are shown on Fig.~\ref{bs_sm}; note,
|
||||
now we have one axiom and six inference rules (one per instruction).
|
||||
|
||||
As for the statement, with the aid of the relation ``$\Rightarrow$'' we can define the surface semantics of stack machine:
|
||||
|
||||
\[
|
||||
\forall p\in\mathscr P,\,\forall i\in\mathbb Z^*\;:\;\sembr{p}_{SM}\;i=o\Leftrightarrow\trans{\inbr{\epsilon, \inbr{\Lambda, i, \epsilon}}}{p}{\inbr{\_, \inbr{\_, \_, o}}}
|
||||
\]
|
||||
|
||||
\begin{figure}[t]
|
||||
\[\trans{c}{\epsilon}{c}\ruleno{Stop$_{SM}$}\]
|
||||
\[\trule{\trans{\inbr{(x\oplus y)\llang{::}st, c}}{p}{c^\prime}}{\trans{\inbr{y\llang{::}x\llang{::}st, c}}{(\llang{BINOP $\;\otimes$})p}{c^\prime}}\ruleno{Binop$_{SM}$}\]
|
||||
\[\trule{\trans{\inbr{z\llang{::}st, c}}{p}{c^\prime}}{\trans{\inbr{st, c}}{(\llang{CONST $\;z$})p}{c^\prime}}\ruleno{Const$_{SM}$}\]
|
||||
\[\trule{\trans{\inbr{z\llang{::}st, \inbr{s, i, o}}}{p}{c^\prime}}{\trans{\inbr{st, \inbr{s, z\llang{::}i, o}}}{(\llang{READ})p}{c^\prime}}\ruleno{Read$_{SM}$}\]
|
||||
\[\trule{\trans{\inbr{st, \inbr{s, i, o\llang{@}z}}}{p}{c^\prime}}{\trans{\inbr{z\llang{::}st, \inbr{s, i, o}}}{(\llang{WRITE})p}{c^\prime}}\ruleno{Write$_{SM}$}\]
|
||||
\[\trule{\trans{\inbr{(s\;x)\llang{::}st, \inbr{s, i, o}}}{p}{c^\prime}}{\trans{\inbr{st, \inbr{s, i, o}}}{(\llang{LD $\;x$})p}{c^\prime}}\ruleno{LD$_{SM}$}\]
|
||||
\[\trule{\trans{\inbr{st, \inbr{s[x\gets z], i, o}}}{p}{c^\prime}}{\trans{\inbr{z\llang{::}st, \inbr{s, i, o}}}{(\llang{ST $\;x$})p}{c^\prime}}\ruleno{ST$_{SM}$}\]
|
||||
\caption{Big-step operational semantics for stack machine}
|
||||
\label{bs_sm}
|
||||
\end{figure}
|
||||
|
||||
\subsection{A Compiler for the Stack Machine}
|
||||
|
||||
A compiler of the statement language into the stack machine is a total mapping
|
||||
|
||||
\[
|
||||
\sembr{\bullet}_{comp} : \mathscr S \mapsto \mathscr P
|
||||
\]
|
||||
|
||||
We can describe the compiler in the form of denotational semantics for the source language. In fact, we can treat the compiler as a \emph{static} semantics, which
|
||||
maps each program into its stack machine equivalent.
|
||||
|
||||
As the source language consists of two syntactic categories (expressions and statments), the compiler has to be ``bootstrapped'' from the compiler for expressions
|
||||
$\sembr{\bullet}^{\mathscr E}_{comp}$:
|
||||
|
||||
\[
|
||||
\begin{array}{rcl}
|
||||
\sembr{x}^{\mathscr E}_{comp}&=&\llang{[LD $\;x$]}\\
|
||||
\sembr{n}^{\mathscr E}_{comp}&=&\llang{[CONST $\;n$]}\\
|
||||
\sembr{A\otimes B}^{\mathscr E}_{comp}&=&\sembr{A}^{\mathscr E}_{comp}\llang{@}\sembr{B}^{\mathscr E}_{comp}\llang{@}(\llang{BINOP $\;\otimes$})
|
||||
\end{array}
|
||||
\]
|
||||
|
||||
And now the main dish:
|
||||
|
||||
\[
|
||||
\begin{array}{rcl}
|
||||
\sembr{\llang{$x$ := $e$}}_{comp}&=&\sembr{e}^{\mathscr E}_{comp}\llang{@}\llang{[ST $\;x$]}\\
|
||||
\sembr{\llang{read ($x$)}}_{comp}&=&\llang{[READ; ST $\;x$]}\\
|
||||
\sembr{\llang{write ($e$)}}_{comp}&=&\sembr{e}^{\mathscr E}_{comp}\llang{@}\llang{[WRITE]}\\
|
||||
\sembr{\llang{$S_1$;$\;S_2$}}_{comp}&=&\sembr{S_1}_{comp}\llang{@}\sembr{S_2}_{comp}
|
||||
\end{array}
|
||||
\]
|
||||
169
doc/lectures/03.tex
Normal file
169
doc/lectures/03.tex
Normal file
|
|
@ -0,0 +1,169 @@
|
|||
\section{Structural Induction}
|
||||
|
||||
\begin{figure}
|
||||
\begin{subfigure}{\textwidth}
|
||||
\[
|
||||
\begin{array}{rclr}
|
||||
\sembr{n} & = & \lambda \sigma . n & \mbox{\scriptsize\rulename{[Const]}}\\
|
||||
\sembr{x} & = & \lambda \sigma . \sigma x & \mbox{\scriptsize\rulename{[Var]}} \\
|
||||
\sembr{A\otimes B} & = & \lambda \sigma . (\sembr{A}\sigma \oplus \sembr{B}\sigma) & \mbox{\scriptsize\rulename{[Binop]}}
|
||||
\end{array}
|
||||
\]
|
||||
\caption{Denotational semantics for expressions}
|
||||
\end{subfigure}
|
||||
\vskip5mm
|
||||
\begin{subfigure}{\textwidth}
|
||||
\[\trans{c}{\epsilon}{c}\ruleno{Stop$_{SM}$}\]
|
||||
\[\trule{\trans{\inbr{(x\oplus y)\llang{::}st, s}}{p}{c^\prime}}{\trans{\inbr{y\llang{::}x\llang{::}st, s}}{(\llang{BINOP $\;\otimes$})p}{c^\prime}}\ruleno{Binop$_{SM}$}\]
|
||||
\[\trule{\trans{\inbr{z\llang{::}st, s}}{p}{c^\prime}}{\trans{\inbr{st, s}}{(\llang{CONST $\;z$})p}{c^\prime}}\ruleno{Const$_{SM}$}\]
|
||||
\[\trule{\trans{\inbr{(s\;x)\llang{::}st, s}}{p}{c^\prime}}{\trans{\inbr{st, s}}{(\llang{LD $\;x$})p}{c^\prime}}\ruleno{LD$_{SM}$}\]
|
||||
\caption{Big-step operational semantics for stack machine}
|
||||
\end{subfigure}
|
||||
\vskip5mm
|
||||
\begin{subfigure}{\textwidth}
|
||||
\[
|
||||
\begin{array}{rclr}
|
||||
\sembr{x}^{\mathscr E}_{comp}&=&\llang{[LD $\;x$]} & \mbox{\scriptsize\rulename{[Var$_{comp}$]}} \\
|
||||
\sembr{n}^{\mathscr E}_{comp}&=&\llang{[CONST $\;n$]} & \mbox{\scriptsize\rulename{[Const$_{comp}$]}}\\
|
||||
\sembr{A\otimes B}^{\mathscr E}_{comp}&=&\sembr{A}^{\mathscr E}_{comp}\llang{@}\sembr{B}^{\mathscr E}_{comp}\llang{@[BINOP $\otimes$]}) & \mbox{\scriptsize\rulename{[Binop$_{comp}$]}}
|
||||
\end{array}
|
||||
\]
|
||||
\caption{Compilation}
|
||||
\end{subfigure}
|
||||
\caption{All relevant definitions}
|
||||
\label{definitions}
|
||||
\end{figure}
|
||||
|
||||
|
||||
We have considered two languages (a language of expressions $\mathscr E$ and a language of stack machine programs $\mathscr P$), and a compiler from the former to the latter.
|
||||
It can be formally proven, that the compiler is (fully) correct in the sense, given in the lecture 1. Due to the simplicity of the languages, the proof technique~---
|
||||
\emph{structural induction}~--- is simple as well.
|
||||
|
||||
First, we collect all needed definitions in one place (see Fig.~\ref{definitions}). We simplified the description of stack machine semantics a little bit: first,
|
||||
we dropped off all instructions, which cannot be generated by the expression compiler, and then, we removed the input and output streams from the stack machine
|
||||
configurations, since they are never affected by the remaining instructions.
|
||||
|
||||
\begin{lemma}(Determinism)
|
||||
Let $p$ be an arbitrary stack machine program, and let $c$, $c_1$ and $c_2$ be arbitrary configurations. Then
|
||||
|
||||
\[
|
||||
\trans{c}{p}{c_1} \wedge \trans{c}{p}{c_2} \Rightarrow c_1= c_2
|
||||
\]
|
||||
\end{lemma}
|
||||
\begin{proof}
|
||||
Induction on the structure of $p$.
|
||||
|
||||
\textbf{Base case}. If $p=\epsilon$, then, by the rule \rulename{Stop$_{SM}$}, $c_1=c$ and $c_2=c$. Since no other rule can be
|
||||
applied, we're done.
|
||||
|
||||
\textbf{Induction step}. If $p=\iota p^\prime$, then, by condition, we have
|
||||
|
||||
\[
|
||||
\trule{\trans{c^\prime}{p^\prime}{c_1}}{\trans{c}{\iota p^\prime}{c_1}}
|
||||
\]
|
||||
|
||||
and
|
||||
|
||||
\[
|
||||
\trule{\trans{c^{\prime\prime}}{p^\prime}{c_2}}{\trans{c}{\iota p^\prime}{c_2}}
|
||||
\]
|
||||
|
||||
where $c^\prime$ and $c^{\prime\prime}$ depend only on $c$ and $\iota$. By the case analysis on $\iota$ we conclude, that
|
||||
$c^\prime=c^{\prime\prime}$. Since $p^\prime$ is shorter, than $p$, we can apply the induction hypothesis, which gives us
|
||||
$c_1=c_2$.
|
||||
\end{proof}
|
||||
|
||||
\FloatBarrier
|
||||
|
||||
\begin{lemma} (Compositionality)
|
||||
Let $p=p_1p_2$ be an arbitrary stack machine program, subdivided into arbitrary subprograms $p_1$ and $p_2$. Then,
|
||||
|
||||
\[
|
||||
\forall c_1, c_2:\;\trans{c_1}{p}{c_2}\;\Leftrightarrow\;\exists c^\prime:\; \trans{c_1}{p_1}{c^\prime} \wedge \trans{c^\prime}{p_2}{c_2}
|
||||
\]
|
||||
\end{lemma}
|
||||
\begin{proof}
|
||||
Induction on the structure of $p$.
|
||||
|
||||
\textbf{Base case}. The base case $p=\epsilon$ is trivial: use the rule \rulename{Stop$_{SM}$} and get $c^\prime=c_2=c_1$.
|
||||
|
||||
\textbf{Induction step}. When $p=\iota p^\prime$, then there are two cases:
|
||||
|
||||
\begin{itemize}
|
||||
\item Either $p_1=\epsilon$, then $c^\prime=c_1$ trivially by the rule \rulename{Stop$_{SM}$}, and we're done.
|
||||
\item Otherwise $p_1=\iota p_1^\prime$, and, thus, $p=\iota p_1^\prime p_2$. In order to prove the lemma, we need to prove two implications:
|
||||
\begin{enumerate}
|
||||
\item Let $\trans{c_1}{p=\iota p_1^\prime p_2}{c_2}$. Technically, we need here to consider three cases (one for each type of the instruction
|
||||
$\iota$), but in all cases the outcome would be the same: we have the picture
|
||||
|
||||
\[
|
||||
\trule{\trans{c^{\prime\prime}}{p_1^\prime p_2}{c_2}}{\trans{c_1}{p=\iota p_1^\prime p_2}{c_2}}
|
||||
\]
|
||||
|
||||
where $c^{\prime\prime}$ depends only on $\iota$ and $c_1$. Since $p_1^\prime p_2$ is shorter, than $p$, we can apply the induction hypothesis, which gives us a
|
||||
configuration $c^\prime$, such, that $\trans{c^{\prime\prime}}{p_1^\prime}{c^\prime}$ and $\trans{c^\prime}{p_2}{c_2}$. The observation $\trans{c_1}{\iota p_1^\prime}{c^\prime}$
|
||||
concludes the proof (note, we implicitly use determinism here).
|
||||
\item Let there exists $c^\prime$, such that $\trans{c_1}{\iota p_1^\prime}{c^\prime}$ and $\trans{c^\prime}{p_2}{c_2}$. From the first relation we have
|
||||
|
||||
\[
|
||||
\trule{\trans{c^{\prime\prime}}{p_1^\prime}{c^\prime}}{\trans{c_1}{\iota p_1^\prime}{c^\prime}}
|
||||
\]
|
||||
|
||||
where $c^{\prime\prime}$ depends only on $\iota$ and $c_1$. Since $p_1^\prime p_2$ is shorter, than $p$, we can apply the induction hypothesis, which
|
||||
gives us $\trans{c^{\prime\prime}}{p_1^\prime p_2}{c_2}$, and, thus, $\trans{c_1}{\iota p_1^\prime p_2}{c_2}$ (again, we implicitly use determinism here).
|
||||
\end{enumerate}
|
||||
\end{itemize}
|
||||
\end{proof}
|
||||
|
||||
\begin{theorem}(Correctness of compilation)
|
||||
Let $e\in\mathscr E$ be arbitrary expression, $s$~--- arbitrary state, and $st$~--- arbitrary stack. Then
|
||||
|
||||
\[
|
||||
\trans{\inbr{st, s}}{\sembr{e}^{\mathscr E}_{comp}}{\inbr{(\sembr{e}\,s)::st, s}}\; \mbox{iff} \; (\sembr{e}\,s)\; \mbox{is defined}
|
||||
\]
|
||||
\end{theorem}
|
||||
\begin{proof}
|
||||
Induction on the structure of $e$.
|
||||
|
||||
\textbf{Base case}. There are two subcases:
|
||||
|
||||
\begin{enumerate}
|
||||
\item $e$ is a constant $z$. Then:
|
||||
\begin{itemize}
|
||||
\item $\sembr{e}\,s=z$ for each state $s$;
|
||||
\item \mbox{$\sembr{e}^{\mathscr E}_{comp}=[\llang{CONST z}]$};
|
||||
\item $\trans{\inbr{st, s}}{[\llang{CONST z}]}{\inbr{z::st, s}}$ for arbitrary $st$ and $s$.
|
||||
\end{itemize}
|
||||
|
||||
This concludes the first base case.
|
||||
|
||||
\item $e$ is a variable $x$. Then:
|
||||
\begin{itemize}
|
||||
\item $\sembr{s}\,s=s\,x$ for each state $s$, such that $s\,x$ is defined;
|
||||
\item \mbox{$\sembr{e}^{\mathscr E}_{comp}=[\llang{LD x}]$};
|
||||
\item $\trans{\inbr{st, s}}{[\llang{CONST z}]}{\inbr{(s\,x)::st, s}}$ for arbitrary $st$ and arbitrary $s$, such that $s\, x$ is defined.
|
||||
\end{itemize}
|
||||
|
||||
This concludes the second base case.
|
||||
\end{enumerate}
|
||||
|
||||
\textbf{Induction step}. Let $e$ be $A\otimes B$. Then:
|
||||
|
||||
\begin{itemize}
|
||||
\item $\sembr{A\otimes B}s=\sembr{A}s\oplus\sembr{B}s$ for each state $s$, such that both $\sembr{A}s$ and $\sembr{B}s$ are defined;
|
||||
\item $\sembr{A\otimes B}^{\mathscr E}_{comp}=\sembr{A}^{\mathscr E}_{comp}\llang{@}\sembr{B}^{\mathscr E}_{comp}\llang{@[BINOP $\oplus$]}$;
|
||||
\item by the inductive hypothesis, for arbitrary $st$ and $s$
|
||||
|
||||
\[
|
||||
\trans{\inbr{st, s}}{\sembr{A}^{\mathscr E}_{comp}}{\inbr{(\sembr{A}s)::st, s}} \mbox{iff} \; (\sembr{A}\,s)\; \mbox{is defined}
|
||||
\]
|
||||
|
||||
and
|
||||
|
||||
\[
|
||||
\trans{\inbr{(\sembr{A}s)::st, s}}{\sembr{B}^{\mathscr E}_{comp}}{\inbr{(\sembr{B}s)::(\sembr{A}s)::st, s}} \mbox{iff} \; (\sembr{A}\,s) \;\mbox{and}\; (\sembr{A}\,s) \; \mbox{are defined}
|
||||
\]
|
||||
|
||||
Taking into account the semantics of \llang{BINOP $\otimes$} and applying the compositionality lemma, the theorem follows.
|
||||
\end{itemize}
|
||||
\end{proof}
|
||||
228
doc/lectures/04.tex
Normal file
228
doc/lectures/04.tex
Normal file
|
|
@ -0,0 +1,228 @@
|
|||
\title{Control Flow Constructs}
|
||||
|
||||
\subsection{Structural Control Flow}
|
||||
|
||||
We add a few structural control flow constructs to the language:
|
||||
|
||||
\[
|
||||
\begin{array}{rcl}
|
||||
\mathscr S & += & \llang{skip} \\
|
||||
& & \llang{if $\;\mathscr E\;$ then $\;\mathscr S\;$ else $\;\mathscr S$} \\
|
||||
& & \llang{while $\;\mathscr E\;$ do $\;\mathscr S$}
|
||||
\end{array}
|
||||
\]
|
||||
|
||||
The big-step operational semantics is straightforward and is shown on Fig.~\ref{bs_stmt_cf}.
|
||||
|
||||
In the concrete syntax for the constructs we add the closing keywords ``\llang{if}'' and ``\llang{od}'' as follows:
|
||||
|
||||
\[
|
||||
\begin{array}{c}
|
||||
\llang{if $\;e\;$ then $\;s_1\;$ else $\;s_2\;$ fi}\\
|
||||
\llang{while $\;e\;$ do $\;s\;$ od}
|
||||
\end{array}
|
||||
\]
|
||||
|
||||
\begin{figure}[t]
|
||||
\arraycolsep=10pt
|
||||
%\def\arraystretch{2.9}
|
||||
\[\trans{c}{\llang{skip}}{c}\ruleno{Skip}\]\vskip2mm
|
||||
\[
|
||||
\trule{\begin{array}{cc}
|
||||
\sembr{e}\;\sigma\ne 0 & \trans{c}{\mbox{$S_1$}}{c^\prime}
|
||||
\end{array}}
|
||||
{\trans{c=\inbr{\sigma,\, \_,\, \_}}{\llang{if $\;e\;$ then $\;S_1\;$ else $\;S_2\;$}}{c^\prime}}
|
||||
\ruleno{If-True}
|
||||
\]\vskip2mm
|
||||
\[
|
||||
\trule{\begin{array}{cc}
|
||||
\sembr{e}\;\sigma=0 & \trans{c}{\mbox{$S_1$}}{c^\prime}
|
||||
\end{array}}
|
||||
{\trans{c=\inbr{\sigma,\, \_,\, \_}}{\llang{if $\;e\;$ then $\;S_1\;$ else $\;S_2\;$}}{c^\prime}}
|
||||
\ruleno{If-False}
|
||||
\]\vskip3mm
|
||||
\[
|
||||
\trule{\begin{array}{ccc}
|
||||
{\sembr{e}\;\sigma\ne 0} & \trans{c}{\llang{$S$}}{c^\prime} & \trans{c^\prime}{\llang{while $\;e\;$ do $\;S\;$}}{c^{\prime\prime}}\\
|
||||
\end{array}
|
||||
}
|
||||
{\trans{c=\inbr{\sigma,\, \_,\, \_}}{\llang{while $\;e\;$ do $\;S\;$}}{c^{\prime\prime}}}
|
||||
\ruleno{While-True}
|
||||
\]\vskip3mm
|
||||
\[
|
||||
\trule{\sembr{e}\;\sigma=0}
|
||||
{\trans{c=\inbr{\sigma,\, \_,\, \_}}{\llang{while $\;e\;$ do $\;S\;$}}{c}}
|
||||
\ruleno{While-False}
|
||||
\]
|
||||
\caption{Big-step operational semantics for control flow statements}
|
||||
\label{bs_stmt_cf}
|
||||
\end{figure}
|
||||
|
||||
\subsection{Extended Stack Machine}
|
||||
|
||||
In order to compile the extended language into a program for the stack machine the latter has to be extended. First, we introduce a set of label names
|
||||
|
||||
\[
|
||||
\mathscr L = \{l_1, l_2, \dots\}
|
||||
\]
|
||||
|
||||
Then, we add three extra control flow instructions:
|
||||
|
||||
\[
|
||||
\begin{array}{rcl}
|
||||
\mathscr I & += & \llang{LABEL $\;\mathscr L$} \\
|
||||
& & \llang{JMP $\;\mathscr L$} \\
|
||||
& & \llang{CJMP$_x$ $\;\mathscr L$},\;\mbox{where $x\in\{\llang{nz}, \llang{z}\}$}
|
||||
\end{array}
|
||||
\]
|
||||
|
||||
In order to give the semantics to these instructions, we need to extend the syntactic form of rules, used in the description of big-step operational smeantics. Instead of
|
||||
the rules in the form
|
||||
|
||||
\setarrow{\xRightarrow}
|
||||
|
||||
\[
|
||||
\trule{\trans{c}{p}{c^\prime}}{\trans{c^\prime}{p^\prime}{c^{\prime\prime}}}
|
||||
\]
|
||||
|
||||
we use the following form
|
||||
|
||||
\[
|
||||
\trule{\withenv{\Gamma}{\trans{c}{p}{c^\prime}}}{\withenv{\Gamma^\prime}{\trans{c^\prime}{p^\prime}{c^{\prime\prime}}}}
|
||||
\]
|
||||
|
||||
where $\Gamma, \Gamma^\prime$~--- \emph{environments}. The structure of environments can be different in different cases; for now environment is just a program. Informally,
|
||||
the semantics of control flow instructions can not be described in terms of just a current instruction and current configuration~--- we need to take the whole
|
||||
program into account. Thus, the enclosing program is used as an environment.
|
||||
|
||||
Additionally, for a program $P$ and a label $l$ we define a subprogram $P[l]$, such that $P$ is uniquely represented as $p^\prime(\llang{LABEL $\;l$})P[l]$.
|
||||
In other words $P[l]$ is a unique suffix of $P$, immediately following the label $l$ (if there are multiple (or no) occurrences of label $l$ in $P$, then $P[l]$ is
|
||||
undefined).
|
||||
|
||||
All existing rules have to be rewritten~--- we need to formally add a $\withenv{P}{\dots}$ part everywhere. For the new instructions the rules are given on Fig.~\ref{bs_sm_cc}.
|
||||
|
||||
\begin{figure}[t]
|
||||
\[\trule{\withenv{P}{\trans{c}{p}{c^\prime}}}{\withenv{P}{\trans{c}{(\llang{LABEL $\;l$})p}{c^\prime}}}\ruleno{Label$_{SM}$}\]
|
||||
\[\trule{\withenv{P}{\trans{c}{P[l]}{c^\prime}}}{\withenv{P}{\trans{c}{(\llang{JMP $\;l$})p}{c^\prime}}}\ruleno{JMP$_{SM}$}\]
|
||||
\[\crule{\withenv{P}{\trans{c}{P[l]}{c^\prime}}}{\withenv{P}{\trans{\inbr{z::st, c}}{(\llang{CJMP$_{nz}$ $\;l$})p}{c^\prime}}}{z\ne 0}\ruleno{CJMP$_{nz}^+$$_{SM}$}\]
|
||||
\[\crule{\withenv{P}{\trans{c}{p}{c^\prime}}}{\withenv{P}{\trans{\inbr{z::st, c}}{(\llang{CJMP$_{nz}$ $\;l$})p}{c^\prime}}}{z = 0}\ruleno{CJMP$_{nz}^-$$_{SM}$}\]
|
||||
\[\crule{\withenv{P}{\trans{c}{P[l]}{c^\prime}}}{\withenv{P}{\trans{\inbr{z::st, c}}{(\llang{CJMP$_z$ $\;l$})p}{c^\prime}}}{z = 0}\ruleno{CJMP$_{z}^+$$_{SM}$}\]
|
||||
\[\crule{\withenv{P}{\trans{c}{p}{c^\prime}}}{\withenv{P}{\trans{\inbr{z::st, c}}{(\llang{CJMP$_z$ $\;l$})p}{c^\prime}}}{z \ne 0}\ruleno{CJMP$_{z}^-$$_{SM}$}\]
|
||||
\caption{Big-step operational semantics for extended stack machine}
|
||||
\label{bs_sm_cc}
|
||||
\end{figure}
|
||||
|
||||
Finally, the top-level semantics for the extended stack machine can be redefined as follows:
|
||||
|
||||
\[
|
||||
\forall p\in\mathscr P,\,\forall i\in\mathbb Z^*\;:\;\sembr{p}_{SM}\;i=o\Leftrightarrow\withenv{p}{\trans{\inbr{\epsilon, \inbr{\lambda, i, \epsilon}}}{p}{\inbr{\_, \inbr{\_, \_, o}}}}
|
||||
\]
|
||||
|
||||
\subsection{Syntax Extensions}
|
||||
|
||||
With the structural control flow constructs already implemented, it is rather simple to ``saturate'' the language with more elaborated control constructs, using
|
||||
the method of \emph{syntactic extension}. Namely, we may introduce the following constructs
|
||||
|
||||
\begin{lstlisting}
|
||||
if $\;e_1\;$ then $\;s_1\;$
|
||||
elif $\;e_2\;$ then $\;s_2\;$
|
||||
$\dots$
|
||||
elif $\;e_k\;$ then $\;s_k\;$
|
||||
[ else $\;s_{k+1}\;$ ]
|
||||
fi
|
||||
\end{lstlisting}
|
||||
|
||||
and
|
||||
|
||||
\begin{lstlisting}
|
||||
for $\;s_1$, $\;e$, $\;s_2\;$ do $\;s_3\;$ od
|
||||
\end{lstlisting}
|
||||
|
||||
only at the syntactic level, directly parsing these constructs into the original abstract syntax tree, using the following conversions:
|
||||
|
||||
\begin{tabular}{p{3cm}p{1cm}p{3cm}}
|
||||
\begin{lstlisting}
|
||||
if $\;e_1\;$ then $\;s_1\;$
|
||||
elif $\;e_2\;$ then $\;s_2\;$
|
||||
$\dots$
|
||||
elif $\;e_k\;$ then $\;s_k\;$
|
||||
else $\;s_{k+1}\;$
|
||||
fi
|
||||
\end{lstlisting} &
|
||||
\begin{center}
|
||||
\vskip1cm
|
||||
$\leadsto$
|
||||
\end{center} &
|
||||
\begin{lstlisting}
|
||||
if $\;e_1\;$ then $\;s_1\;$
|
||||
else if $\;e_2\;$ then $\;s_2\;$
|
||||
$\dots$
|
||||
else if $\;e_k\;$ then $\;s_k\;$
|
||||
else $\;s_{k+1}\;$
|
||||
fi
|
||||
$\dots$
|
||||
fi
|
||||
\end{lstlisting}
|
||||
\end{tabular}
|
||||
|
||||
\begin{tabular}{p{3cm}p{1cm}p{3cm}}
|
||||
\begin{lstlisting}
|
||||
if $\;e_1\;$ then $\;s_1\;$
|
||||
elif $\;e_2\;$ then $\;s_2\;$
|
||||
$\dots$
|
||||
elif $\;e_k\;$ then $\;s_k\;$
|
||||
fi
|
||||
\end{lstlisting} &
|
||||
\begin{center}
|
||||
\vskip1cm
|
||||
$\leadsto$
|
||||
\end{center} &
|
||||
\begin{lstlisting}
|
||||
if $\;e_1\;$ then $\;s_1\;$
|
||||
else if $\;e_2\;$ then $\;s_2\;$
|
||||
$\dots$
|
||||
else if $\;e_k\;$ then $\;s_k\;$
|
||||
else skip
|
||||
fi
|
||||
$\dots$
|
||||
fi
|
||||
\end{lstlisting}
|
||||
\end{tabular}
|
||||
|
||||
\begin{tabular}{p{5cm}p{1cm}p{3cm}}
|
||||
\begin{lstlisting}
|
||||
for $\;s_1$, $\;e$, $\;s_2\;$ do $\;s_3\;$ od
|
||||
\end{lstlisting} &
|
||||
\begin{center}
|
||||
$\leadsto$
|
||||
\end{center} &
|
||||
\begin{lstlisting}
|
||||
$s_1$;
|
||||
while $\;e\;$ do
|
||||
$s_3$;
|
||||
$s_2$
|
||||
od
|
||||
\end{lstlisting}
|
||||
\end{tabular}
|
||||
|
||||
The advantage of syntax extension method is that it makes it possible to add certain constructs with almost zero cost~--- indeed, no steps have to be made in order
|
||||
to implement the extended constructs (besides parsing). Note, the semantics of extended constructs is provided for free as well (which is not always desirable).
|
||||
Another potential problem with syntax extensions is that they can easily provide unreasonable results. For example, one may be tempted to implement a post-condition
|
||||
loop using syntax extension:
|
||||
|
||||
\begin{tabular}{p{5cm}p{1cm}p{3cm}}
|
||||
\begin{lstlisting}
|
||||
repeat $\;s$ until $\;e$
|
||||
\end{lstlisting} &
|
||||
\begin{center}
|
||||
$\leadsto$
|
||||
\end{center} &
|
||||
\begin{lstlisting}
|
||||
$s$;
|
||||
while $\;e\;$ == 0 do
|
||||
$s$
|
||||
od
|
||||
\end{lstlisting}
|
||||
\end{tabular}
|
||||
|
||||
However, for nested \lstinline{repeat} constructs the size of extended program is exponential w.r.t. the nesting depth, which makes the whole idea unreasonable.
|
||||
173
doc/lectures/05.tex
Normal file
173
doc/lectures/05.tex
Normal file
|
|
@ -0,0 +1,173 @@
|
|||
\section{Procedures}
|
||||
|
||||
Procedures are unit-returning functions. We consider adding procedures as a separate step, since introducing full-fledged functions would require
|
||||
essential redefinition of the semantics for expressions; at the same time the code generation for funictions is a little trickier, than for procedures, so
|
||||
it is reasonable to split the implementation in two steps.
|
||||
|
||||
At the source level procedures are added as a separate syntactic category~--- definition $\mathscr D$:
|
||||
|
||||
\[
|
||||
\begin{array}{rcl}
|
||||
\mathscr D & = & \epsilon \\
|
||||
& & (\llang{fun $\;\mathscr X\;$ ($\mathscr X^*$) local $\;\mathscr X^*\;$ \{$\mathscr S$\}})\mathscr D
|
||||
\end{array}
|
||||
\]
|
||||
|
||||
In other words, a definition is a (possibly empty) sequence of procedure descriptions. Each description consists of a name for the procedure, a
|
||||
list of names for its arguments and local variables, and a body (statement). In concrete syntax a single definition looks like
|
||||
|
||||
\begin{lstlisting}
|
||||
fun $name$ ($a_1$, $\;a_2$, $\dots$, $\;a_k$)
|
||||
local $l_1$, $\;l_2$, $\dots$, $\;l_n$ {
|
||||
$s$
|
||||
}
|
||||
\end{lstlisting}
|
||||
|
||||
where $name$~--- a name for the procedure, $a_i$~--- its arguments, $l_i$~--- local variables, $s$~--- body.
|
||||
|
||||
We also need to add a call statement to the language:
|
||||
|
||||
\[
|
||||
\mathscr S += \mathscr X (\mathscr E^*)
|
||||
\]
|
||||
|
||||
In a concrete syntax a call to a procedure $f$ with arguments $e_1,\dots,e_k$ looks like
|
||||
|
||||
\[
|
||||
\llang{$f\;$ ($e_1,\dots,e_k$)}
|
||||
\]
|
||||
|
||||
Finally, we have to redefine the syntax category for programs at the top level:
|
||||
|
||||
\[
|
||||
\mathscr L = \mathscr D\mathscr S
|
||||
\]
|
||||
|
||||
In other words, we extend a statement with a set of definitions.
|
||||
|
||||
With procedures, we need to introduce the notion of \emph{scope}. When a procedure is called, its arguments
|
||||
are associated with actual parameter values. A procedure is also in ``posession'' of its local variables.
|
||||
So, in principle, the context of a procedure execution is a set of arguments, local variables and, possibly, some
|
||||
other variables, for example, the global ones. However, the exact details of procedure context manipilation can
|
||||
differ essentially from language to language.
|
||||
|
||||
In our case, we choose a static scoping~--- each procedure, besides its arguments and local variables, has an access
|
||||
only to global variables. To describe this semantics, we need to change the definition of a state we've used so far:
|
||||
|
||||
\[
|
||||
\Sigma = (\mathscr X \to \mathbb Z) \times 2^{\mathscr X} \times (\mathscr X \to \mathbb Z)
|
||||
\]
|
||||
|
||||
Now the state is a triple: a \emph{global} state, a set of variables, and a \emph{local} state. Informally, in a new
|
||||
state $\inbr{\sigma_g,\,S,\,\sigma_l}$ $S$ describes a set of local variables, $\sigma_l$~--- their values, and $\sigma_g$~---
|
||||
the values of all other accessible variables.
|
||||
|
||||
We need to redefine all state-manipulation primitives; first, the valuation of variables:
|
||||
|
||||
\[
|
||||
\inbr{\sigma_g,\,S,\,\sigma_l}\;x=
|
||||
\left\{\begin{array}{rcl}
|
||||
\sigma_g\;x & , & x\not\in S\\
|
||||
\sigma_l\;x & , & x \in S
|
||||
\end{array}
|
||||
\right.
|
||||
\]
|
||||
|
||||
Then, updating the state:
|
||||
|
||||
\[
|
||||
\inbr{\sigma_g,\,S,\,\sigma_l}[x\gets z] =
|
||||
\left\{\begin{array}{rcl}
|
||||
\inbr{\sigma_g[x\gets z],\,S,\,\sigma_l} & , & x\not\in S\\
|
||||
\inbr{\sigma_g,\,S,\,\sigma_l[x\gets z]} & , & x \in S
|
||||
\end{array}
|
||||
\right.
|
||||
\]
|
||||
|
||||
As an empty state, we take the following triple:
|
||||
|
||||
\[
|
||||
\Lambda=\inbr{\Lambda,\,\emptyset,\,\Lambda}
|
||||
\]
|
||||
|
||||
Finally, we need two transformations for states:
|
||||
|
||||
\[
|
||||
\begin{array}{rcl}
|
||||
\primi{enter}{\inbr{\sigma_g,\,\_,\,\_}\,S} & = & \inbr{\sigma_g,\,S,\,\Lambda}\\
|
||||
\primi{leave}{\inbr{\sigma_g,\,\_,\,\_}\,\inbr{\_,\,S,\,\sigma_l}}& = & \inbr{\sigma_g,\,S,\,\sigma_l}
|
||||
\end{array}
|
||||
\]
|
||||
|
||||
The first one simlulates entering the new scope with a set of local variables $S$; the second one simulates leaving
|
||||
from an inner scope (described by the first state) to an outer one (described by the second).
|
||||
|
||||
\setarrow{\xRightarrow}
|
||||
|
||||
All exising rules for big-step operational semantics have to be enriched by \emph{functional environment} $\Gamma$, which
|
||||
binds procedure names to their definitions. As this binding never changes during the program interpretation, we need only to
|
||||
propagate this environment, adding $\withenv{\Gamma}{...}$ for each transition ``$\trans{...}{}{...}$''. The only thing we
|
||||
need now is to describe the rule for procedure calls:
|
||||
|
||||
\[
|
||||
\arraycolsep=10pt
|
||||
\trule{\begin{array}{cc}
|
||||
\llang{fun $\;f\;$ ($\bar{a}$) local $\;\bar{l}\;$ \{$S$\}}=\Gamma\,f &
|
||||
\withenv{\Gamma}{\trans{\inbr{\primi{enter}{\sigma\,(\bar{a}\bar{l})[\overline{a\gets\sembr{e}\sigma}]},\,\iota,\,o}}{S}{\inbr{\sigma^\prime,\,\iota^\prime,\,o^\prime}}}
|
||||
\end{array}
|
||||
}
|
||||
{\withenv{\Gamma}{\trans{\inbr{\sigma,\,\iota,\,o}}{f (\bar{e})}{\inbr{\primi{leave}{\sigma^\prime\,\sigma},\,\iota^\prime,o^\prime}}}}
|
||||
\ruleno{Call}
|
||||
\]
|
||||
|
||||
where $\Gamma\,f = \llang{fun $\;f\;$ ($\bar{a}$) local $\;\bar{l}\;$ \{$S$\}}$.
|
||||
|
||||
|
||||
\section{Extended Stack Machine}
|
||||
|
||||
In order to support procedures and calls, we enrich the stack machine with three following instructions:
|
||||
|
||||
\[
|
||||
\begin{array}{rcl}
|
||||
\mathscr I & += & \llang{BEGIN $\;\mathscr X^*\;\mathscr X^*$} \\
|
||||
& & \llang{CALL $\;\mathscr X$}\\
|
||||
& & \llang{END}
|
||||
\end{array}
|
||||
\]
|
||||
|
||||
Informally speaking, instruction \llang{BEGIN} performs entering into the scope of a procedure; its operands are the lists of argument names and
|
||||
local variables; \llang{END} leaves the scope and returns to the call site, and \llang{CALL} performs the call itself.
|
||||
|
||||
We need to enrich the configurations for the stack machine as well:
|
||||
|
||||
\[
|
||||
\mathscr C_{SM} = (\mathscr P\times \Sigma)\times\mathbb Z^*\times \mathscr C
|
||||
\]
|
||||
|
||||
Here we added a \emph{control stack}~--- a stack of pairs of programs and states. Informally, when performing the \llang{CALL} instruction, we put the following
|
||||
program and current state on a stack to use them later on, when corresponding \llang{END} instruction will be encountered. As all other instructions does not
|
||||
affect the control stack, it gets threaded through all rules of operational semantics unchanged.
|
||||
|
||||
Now we specify additional rules for the new instructions:
|
||||
|
||||
\[
|
||||
\trule{\withenv{P}{\trans{\inbr{cs,\,st,\,\inbr{\primi{enter}{\sigma\,(\bar{a}@\bar{l})\overline{[a\gets z]}},\,i,\,o}}}{p}{c^\prime}}}
|
||||
{\withenv{P}{\trans{\inbr{cs,\,\bar{z}@st,\,\inbr{\sigma,\,i,\,o}}}{(\llang{BEGIN $\;\bar{a}\,\bar{l}$})p}{c^\prime}}}
|
||||
\ruleno{Begin$_{SM}$}
|
||||
\]
|
||||
|
||||
\[
|
||||
\trule{\withenv{P}{\trans{\inbr{(p,\,\sigma)::cs,\,st,\,\inbr{\sigma,\,i,\,o}}}{P[f]}{c^\prime}}}
|
||||
{\withenv{P}{\trans{\inbr{cs,\,st,\,\inbr{\sigma,\,i,\,o}}}{(\llang{CALL $\;f$})p}{c^\prime}}}
|
||||
\ruleno{Call$_{SM}$}
|
||||
\]
|
||||
|
||||
\[
|
||||
\trule{\withenv{P}{\trans{\inbr{cs,\,st,\,\inbr{\primi{leave}{\sigma\,\sigma^\prime},\,i,\,o}}}{p^\prime}{c^\prime}}}
|
||||
{\withenv{P}{\trans{\inbr{(p^\prime,\,\sigma^\prime)::cs,\,st,\,\inbr{\sigma,\,i,\,o}}}{\llang{END}p}{c^\prime}}}
|
||||
\ruleno{EndRet$_{SM}$}
|
||||
\]
|
||||
|
||||
\[
|
||||
\withenv{P}{\trans{\inbr{\epsilon,\,st,\,c}}{\llang{END}p}{\inbr{\epsilon,\,st,\,c}}}\ruleno{EndStop$_{SM}$}
|
||||
\]
|
||||
193
doc/lectures/06.tex
Normal file
193
doc/lectures/06.tex
Normal file
|
|
@ -0,0 +1,193 @@
|
|||
\section{Functions}
|
||||
|
||||
\subsection{Functions in Expressions}
|
||||
|
||||
At the syntax level function calls are introduced with the following construct:
|
||||
|
||||
\[
|
||||
\begin{array}{rcl}
|
||||
\mathscr E & += & \mathscr X \mathscr E^*
|
||||
\end{array}
|
||||
\]
|
||||
|
||||
In the concrete syntax function call looks conventional:
|
||||
|
||||
\begin{lstlisting}
|
||||
$f$ ($e_1, e_2, \dots, e_k$)
|
||||
\end{lstlisting}
|
||||
|
||||
where $f$~--- a function name, $e_i$~--- its actual parameters.
|
||||
|
||||
Surpisingly, such a mild extension results in a complete redefinition of the semantics. Indeed, as function body can perform
|
||||
arbitrary actions on state, input and output streams, expressions with function calls can now have side effects. In order to express these
|
||||
side effects we need to redefine the semantics completely, this time using big-step operational style.
|
||||
|
||||
We extend the configuration, used in the semantic description for statements, with a fourth component~--- an optional integer value:
|
||||
|
||||
\[
|
||||
\mathscr C = \Sigma \times \mathbb Z^* \times \mathbb Z^* \times \mathbb Z^?
|
||||
\]
|
||||
|
||||
This component will correspond to an optional return value for function/procedure calls (either integer value $n$ or ``$\hbox{---}$'', nothing).
|
||||
|
||||
We introduce two primitives to make the semantic description shorter:
|
||||
|
||||
\[
|
||||
\begin{array}{c}
|
||||
\primi{val}{\inbr{s,\,i,\,o,\,v}}=v\\
|
||||
\primi{ret}{\inbr{s,\,i,\,o,\,\_}\,v}=\inbr{s,\,i,\,o,\,v}
|
||||
\end{array}
|
||||
\]
|
||||
|
||||
|
||||
The rules themselves are summarized in the Fig.~\ref{bs_expr}. Note the use of double environment for evaluating the body of a function in the rule
|
||||
\rulename{Call}; note also, that now semantics of expressions and statements are mutually recursive.
|
||||
|
||||
\setarrow{\xRightarrow}
|
||||
\setsubarrow{_{\mathscr E}}
|
||||
|
||||
\begin{figure}
|
||||
\arraycolsep=10pt
|
||||
\[\withenv{\Phi}{\trans{\inbr{\sigma,\, i,\, o,\, \_}}{n}{\inbr{\sigma,\, i,\, o,\, n}}} \ruleno{Const} \]\vskip2mm
|
||||
\[\withenv{\Phi}{\trans{\inbr{\sigma,\, i,\, o,\, \_}}{x}{\inbr{\sigma,\, i,\, o,\, \sigma\;x}}} \ruleno{Var} \]\vskip2mm
|
||||
\[\trule{\begin{array}{cc}
|
||||
\withenv{\Phi}{\trans{c}{A}{c^\prime}} & \withenv{\Phi}{\trans{c^\prime}{B}{c^{\prime\prime}}}
|
||||
\end{array}
|
||||
}
|
||||
{\withenv{\Phi}{\trans{c}{A\otimes B}{\primi{ret}{(\primi{val}{c^\prime}\oplus\primi{val}{c^{\prime\prime}})}}}}
|
||||
\ruleno{Binop}
|
||||
\]\vskip2mm
|
||||
\[\trule{\begin{array}{c}
|
||||
\withenv{\Phi}{\trans{c_{j-1}}{e_j}{c_j=\inbr{\sigma_j,\,i_j,\,o_j,\,v_j}}}\\
|
||||
\Phi\;f = \llang{fun $\;f\;$ ($\overline{a}$) local $\;\overline{l}\;$ \{$s$\}}\\
|
||||
{\setsubarrow{}
|
||||
\withenv{\llang{skip},\,\Phi}{\trans{\inbr{\primi{enter}{\sigma_k\; (\overline{a}@\overline{l})\; [\overline{a_j\gets v_j}]},\,i_k,o_k,\hbox{---}}}{s}{\inbr{\sigma^\prime, i^\prime, o^\prime, n}}}}
|
||||
\end{array}
|
||||
}
|
||||
{\withenv{\Phi}{\trans{c_0=\inbr{\sigma_0,\,\_,\,\_,\,\_}}{f (\overline{e_k})}{\inbr{\primi{leave}{\sigma^\prime\;\sigma_0}, i^\prime, o^\prime, n}}}}
|
||||
\ruleno{Call}
|
||||
\]
|
||||
\caption{Big-step Operational Semantics for Expressions}
|
||||
\label{bs_expr}
|
||||
\end{figure}
|
||||
|
||||
\subsection{Return Statement}
|
||||
|
||||
In order to make it possible to return values from procedures we add a return statement to the language of statements:
|
||||
|
||||
\[
|
||||
\mathscr S += \llang{return $\;\mathscr E^?$}
|
||||
\]
|
||||
|
||||
And, again, this small addition leads to redefinition of the semantics in \emph{continuation-passing style} (CPS).
|
||||
|
||||
First, we define the following meta-operator ``$\diamond$'' on statements:
|
||||
|
||||
\[
|
||||
\begin{array}{rcl}
|
||||
s \diamond \llang{skip} & = & s\\
|
||||
s_1 \diamond s_2 & = & \llang{$s_1$; $\;s_2$}
|
||||
\end{array}
|
||||
\]
|
||||
|
||||
\setsubarrow{}
|
||||
|
||||
Then, we add another environment component, $K$, in the description of semantic relation ``$\withenv{...}{\trans{...}{...}{...}}$''. Informally
|
||||
speaking, now
|
||||
|
||||
\[
|
||||
\withenv{K,\,\Phi}{\trans{c}{s}{c^\prime}}
|
||||
\]
|
||||
|
||||
is read ``the execution of $s$, immediately followed by $K$, in the configuration $c$ results in the configuration $c^\prime$''. Statement $K$ is called \emph{continuation}.
|
||||
The rules themselves are shown on Fig.~\ref{bs_cps}. Note, the rule for the call statement is exactly the same, as for the call expression.
|
||||
|
||||
\begin{figure}
|
||||
\[\withenv{\llang{skip},\,\Phi}{\trans{c}{\llang{skip}}{c}}\ruleno{SkipSkip}\]
|
||||
\[\trule{\withenv{\llang{skip},\,\Phi}{\trans{c}{K}{c^\prime}}}
|
||||
{\withenv{K,\,\Phi}{\trans{c}{\llang{skip}}{c^\prime}}}
|
||||
\ruleno{Skip}
|
||||
\]
|
||||
\[\trule{
|
||||
\begin{array}{cc}
|
||||
{\setsubarrow{_{\mathscr E}}\withenv{\Phi}{\trans{c}{e}{\inbr{\sigma,\,i,\,o,\,n}}}} & \withenv{\llang{skip},\,\Phi}{\trans{\inbr{\sigma[x\gets n],\,i,\,o,\,\hbox{---}}}{K}{c^\prime}}
|
||||
\end{array}}
|
||||
{\withenv{K,\,\Phi}{\trans{c}{\llang{$x\,$:=$\,e$}}{c^\prime}}}
|
||||
\ruleno{Assign}
|
||||
\]
|
||||
\[\trule{\begin{array}{cc}
|
||||
{\setsubarrow{_{\mathscr E}}\withenv{\Phi}{\trans{c}{e}{\inbr{\sigma,\,i,\,o,\,n}}}} & \withenv{\llang{skip},\,\Phi}{\trans{\inbr{\sigma,\,i,\,o@[n],\,\hbox{---}}}{K}{c^\prime}}
|
||||
\end{array}
|
||||
}
|
||||
{\withenv{K,\,\Phi}{\trans{c}{\llang{write ($e$)}}{c^\prime}}}
|
||||
\ruleno{Write}
|
||||
\]
|
||||
\[\trule{
|
||||
\withenv{\llang{skip},\,\Phi}{\trans{\inbr{\sigma[x\gets z],\,i,\,o,\,\hbox{---}}}{K}{c^\prime}}
|
||||
}
|
||||
{\withenv{K,\,\Phi}{\trans{\inbr{\sigma,\,z:i,\,o,\,\hbox{---}}}{\llang{read ($x$)}}{c^\prime}}}
|
||||
\ruleno{Read}
|
||||
\]
|
||||
|
||||
\[\trule{\withenv{s_2\diamond K,\,\Phi}{\trans{c}{\llang{$s_1$}}{c^\prime}}}
|
||||
{\withenv{K,\,\Phi}{\trans{c}{\llang{$s_1$; $\;s_2$}}{c^\prime}}}
|
||||
\ruleno{Seq}
|
||||
\]
|
||||
|
||||
\[\trule{\begin{array}{cc}
|
||||
{\setsubarrow{_{\mathscr E}}\withenv{\Phi}{\trans{c}{e}{\inbr{\sigma,\,i,\,o,\,n}}}} & \withenv{K,\,\Phi}{\trans{\inbr{\sigma,\,i,\,o,\,\hbox{---}}}{\llang{$s_1$}}{c^\prime}}\\
|
||||
n\ne 0 &
|
||||
\end{array}
|
||||
}
|
||||
{\withenv{K,\,\Phi}{\trans{c}{\llang{if $\;e\;$ then $\;s_1\;$ else $\;s_2$}}{c^\prime}}}
|
||||
\ruleno{IfTrue}
|
||||
\]
|
||||
|
||||
\[\trule{\begin{array}{cc}
|
||||
{\setsubarrow{_{\mathscr E}} \withenv{\Phi}{\trans{c}{e}{\inbr{\sigma,\,i,\,o,\,0}}}} & \withenv{K,\,\Phi}{\trans{\inbr{\sigma,\,i,\,o,\,\hbox{---}}}{\llang{$s_2$}}{c^\prime}}
|
||||
\end{array}
|
||||
}
|
||||
{\withenv{K,\,\Phi}{\trans{c}{\llang{if $\;e\;$ then $\;s_1\;$ else $\;s_2$}}{c^\prime}}}
|
||||
\ruleno{IfFalse}
|
||||
\]
|
||||
|
||||
\[\trule{\begin{array}{cc}
|
||||
{\setsubarrow{_{\mathscr E}} \withenv{\Phi}{\trans{c}{e}{\inbr{\sigma,\,i,\,o,\,n}}}} &
|
||||
\withenv{\llang{while $\;e\;$ do $\;s$}\diamond K,\,\Phi}{\trans{\inbr{\sigma,\,i,\,o,\,\hbox{---}}}{\llang{$s$}}{c^\prime}}\\
|
||||
n \ne 0 &
|
||||
\end{array}}
|
||||
{\withenv{K,\,\Phi}{\trans{c}{\llang{while $\;e\;$ do $\;s$}}{c^\prime}}}
|
||||
\ruleno{WhileTrue}
|
||||
\]
|
||||
|
||||
\[\trule{\begin{array}{cc}
|
||||
{\setsubarrow{_{\mathscr E}} \withenv{\Phi}{\trans{c}{e}{\inbr{\sigma,\,i,\,o,\,0}}}} & \withenv{\llang{skip},\,\Phi}{\trans{\inbr{\sigma,\,i,\,o,\,\hbox{---}}}{K}{c^\prime}}
|
||||
\end{array}}
|
||||
{\withenv{K,\,\Phi}{\trans{c}{\llang{while $\;e\;$ do $\;s$}}{c^\prime}}}
|
||||
\ruleno{WhileFalse}
|
||||
\]
|
||||
|
||||
\[\trule{\begin{array}{c}
|
||||
{\setsubarrow{_{\mathscr E}}\withenv{\Phi}{\trans{c_{j-1}}{e_j}{c_j=\inbr{\sigma_j,\,i_j,\,o_j,\,v_j}}}}\\
|
||||
\Phi\;f = \llang{fun $\;f\;$ ($\overline{a}$) local $\;\overline{l}\;$ \{$s$\}} \\
|
||||
\withenv{\llang{skip},\,\Phi}{\trans{\inbr{\primi{enter}{\sigma_k\; (\overline{a}@\overline{l})\; [\overline{a_j\gets v_j}]},i_k,o_k,\hbox{---}}}{s}{\inbr{\sigma^\prime, i^\prime, o^\prime, \hbox{---}}}}\\
|
||||
\withenv{\llang{skip},\,\Phi}{\trans{\inbr{\primi{leave}{\sigma^\prime\;\sigma_0}, i^\prime, o^\prime, \hbox{---}}}{K}{\inbr{\sigma^{\prime\prime}, i^{\prime\prime}, o^{\prime\prime}, n}}}
|
||||
\end{array}}
|
||||
{\withenv{K,\,\Phi}{\trans{c_0=\inbr{\sigma_0,\,\_,\,\_,\,\_}}{f (\overline{e_k})}{\inbr{\sigma^{\prime\prime}, i^{\prime\prime}, o^{\prime\prime}, n}}}}
|
||||
\ruleno{Call}
|
||||
\]
|
||||
|
||||
\[
|
||||
\withenv{K,\,\Phi}{\trans{c}{\llang{return}}{c}}
|
||||
\ruleno{ReturnEmpty}
|
||||
\]
|
||||
|
||||
\[
|
||||
\trule{{\setsubarrow{_{\mathscr E}} \withenv{\Phi}{\trans{c}{e}{c^\prime}}}}
|
||||
{\withenv{K,\,\Phi}{\trans{c}{\llang{return $\;e\;$}}{c^\prime}}}
|
||||
\ruleno{Return}
|
||||
\]
|
||||
|
||||
\caption{Continuation-passing Style Semantics for Statements}
|
||||
\label{bs_cps}
|
||||
\end{figure}
|
||||
155
doc/lectures/07.tex
Normal file
155
doc/lectures/07.tex
Normal file
|
|
@ -0,0 +1,155 @@
|
|||
\section{Arrays and strings}
|
||||
|
||||
An array can be represented as a pair: the length of the array and a mapping from indices to elements. If we denote
|
||||
$\mathscr E$ the set of elements then the set of all arrays $\mathscr A (\mathscr E)$ can be defined as follows:
|
||||
|
||||
\[
|
||||
\mathscr A(\mathscr E) = \mathbb N \times (\mathbb N \to \mathscr E)
|
||||
\]
|
||||
|
||||
For an array $(n, f)$ we assume $\dom{f}=[0\,..\,n-1]$. An element selection function:
|
||||
|
||||
\[
|
||||
\begin{array}{c}
|
||||
\bullet[\bullet] : \mathscr A (\mathscr E) \to \mathbb N \to \mathscr E\\[2mm]
|
||||
(n, f) [i] = \left\{
|
||||
\begin{array}{rcl}
|
||||
f\;i &, & i < n\\
|
||||
\bot&,&\;\mbox{otherwise}
|
||||
\end{array}
|
||||
\right.
|
||||
\end{array}
|
||||
\]
|
||||
|
||||
We represent arrays by \emph{references}. Thus, we introduce a (linearly) ordered set of locations
|
||||
|
||||
\[
|
||||
\mathscr L = \{l_0, l_1, \dots\}
|
||||
\]
|
||||
|
||||
Now, the set of all values the programs operate on can be described as follows:
|
||||
|
||||
\[
|
||||
\mathscr V = \mathbb Z \uplus \mathscr L
|
||||
\]
|
||||
|
||||
Here, every value is either an integer, or a reference (some location). The disjoint union ``$\uplus$'' makes it possible to
|
||||
unambiguously discriminate between the shapes of each value. To access arrays, we introduce an abstraction of memory:
|
||||
|
||||
|
||||
\[
|
||||
\mathscr M = \mathscr L \to \mathscr A\,(\mathscr V)
|
||||
\]
|
||||
|
||||
We now add two more components to the configurations: a memory function $\mu$ and the first free memory location $l_m$, and
|
||||
define the following primitive:
|
||||
|
||||
\[
|
||||
\primi{mem}{\inbr{s,\,\mu,\,l_m,\,i,\,o,\,v}}=\mu
|
||||
\]
|
||||
|
||||
which gives a memory function from a configuration.
|
||||
|
||||
\subsection{Adding arrays on expression level}
|
||||
|
||||
On expression level, abstractly/concretely:
|
||||
|
||||
\[
|
||||
\begin{array}{rclll}
|
||||
\mathscr E += & &\mathscr E \mathtt{[} \mathscr E \mathtt{]} & (\llang{$a$ [$e$]}) & \mbox{taking an element}\\
|
||||
& \mid & \mathtt{[} \mathscr E^* \mathtt{]} & (\llang{[$e_1$, $e_2$,.., $e_k$]}) & \mbox{creating an array}\\
|
||||
& \mid & \mathscr E \mathtt{.length} & ($e$\mathtt{.length}) & \mbox{taking the length}
|
||||
\end{array}
|
||||
\]
|
||||
|
||||
%In addition, in a concrete syntax we supply two special forms for strings: \llang{'$x$'} as a denotation for integer code of ASCII symbol
|
||||
%$x$, and \llang{"..."}~--- a string constant.
|
||||
|
||||
The semantics of enriched expressions is modified as follows. First, we add two additional premises to the rule for binary operators:
|
||||
|
||||
\setsubarrow{_{\mathscr E}}
|
||||
\arraycolsep=10pt
|
||||
\[\trule{\begin{array}{cc}
|
||||
\withenv{\Phi}{\trans{c}{A}{c^\prime}} & \withenv{\Phi}{\trans{c^\prime}{B}{c^{\prime\prime}}}\\
|
||||
\primi{val}{c^\prime}\in\mathbb Z & \primi{val}{c^{\prime\prime}}\in\mathbb Z
|
||||
\end{array}
|
||||
}
|
||||
{\withenv{\Phi}{\trans{c}{A\otimes B}{\primi{ret}{c^{\prime\prime}\;(\primi{val}{c^\prime}\oplus \primi{val}{c^{\prime\prime}})}}}}
|
||||
\ruleno{Binop}
|
||||
\]
|
||||
|
||||
These two premises ensure that both operand expressions are evaluated into integer values. Second, we have to add the rules for new
|
||||
kinds of expressions (see Figure~\ref{array_expressions}).
|
||||
|
||||
\begin{figure}
|
||||
\arraycolsep=10pt
|
||||
\[\trule{\begin{array}{cc}
|
||||
\withenv{\Phi}{\trans{c}{e}{c^\prime}} &\withenv{\Phi}{\trans{c^\prime}{j}{c^{\prime\prime}}}\\
|
||||
l=\primi{val}{c^\prime} &j=\primi{val}{c^{\prime\prime}}\\
|
||||
l\in\mathscr L &j\in\mathbb N\\
|
||||
(n,\,f)=\primi{mem}{l} &j<n
|
||||
\end{array}}
|
||||
{\withenv{\Phi}{\trans{c}{e\,\mathtt{[}j\mathtt{]}}{\primi{ret}{c^{\prime\prime}(f\;j)}}}}
|
||||
\ruleno{ArrayElement}
|
||||
\]
|
||||
\vskip5mm
|
||||
\[\trule{\begin{array}{c}
|
||||
\withenv{\Phi}{\trans{c_j}{e_j}{c_{j+1}}},\,j\in [0..k]\\
|
||||
\inbr{s,\,\mu,\,l_m,\,i,\,o,\,\_}=c_{k+1}
|
||||
\end{array}
|
||||
}
|
||||
{\withenv{\Phi}{\trans{c_0}{\mathtt{[}e_0, e_1,...,e_k\mathtt{]}}{\inbr{s,\,\mu\,[l_m\gets(k+1,\,\lambda n.\primi{val}{c_n})],\,l_{m+1},\,i,\,o,\,l_m}}}}
|
||||
\ruleno{Array}
|
||||
\]
|
||||
\vskip5mm
|
||||
\[\trule{\begin{array}{c}
|
||||
\withenv{\Phi}{\trans{c}{e}{c^\prime}}\\
|
||||
l=\primi{val}{c^\prime}\\
|
||||
l\in\mathscr L\\
|
||||
(n,\,f)=(\primi{mem}{c^\prime})\;l
|
||||
\end{array}
|
||||
}
|
||||
{\withenv{\Phi}{\trans{c}{e\mathtt{.length}}{\primi{return}{c^\prime\;n}}}}
|
||||
\ruleno{ArrayLength}
|
||||
\]
|
||||
\caption{Big-step Operational Semantics for Array Expressions}
|
||||
\label{array_expressions}
|
||||
\end{figure}
|
||||
|
||||
\subsection{Adding arrays on statement level}
|
||||
|
||||
On statement level, we add the single construct:
|
||||
|
||||
\[
|
||||
\mathscr S += \mathscr E \mathtt{[} \mathscr E \mathtt{]} \llang{:=} \mathscr E
|
||||
\]
|
||||
|
||||
This construct is interpreted as an assignment to an element of an array. The semantics of this construct is described by the following rule:
|
||||
\[
|
||||
\trule{\setsubarrow{_{\mathscr E}}
|
||||
\arraycolsep=10pt
|
||||
\begin{array}{ccc}
|
||||
\withenv{\Phi}{\trans{c}{e}{c^\prime}} & \withenv{\Phi}{\trans{c^\prime}{j}{c^{\prime\prime}}} & \withenv{\Phi}{\trans{c^{\prime\prime}}{g}{\inbr{s,\mu,l_m,i,o,v}}}\\
|
||||
l=\primi{val}{c^\prime} & i=\primi{val}{c^{\prime\prime}} & \\
|
||||
l\in\mathscr L & i\in\mathbb N & \\[3mm]
|
||||
\multicolumn{3}{c}{(n,\,f)=\mu\;l}\\
|
||||
\multicolumn{3}{c}{i<n}\\
|
||||
\multicolumn{3}{c}{\setsubarrow{}\withenv{\llang{skip},\,\Phi}{\trans{\inbr{s,\,\mu\,[l\gets (n,\,f\,[i\gets x])],\,l_m,\,i,\,o,\,\hbox{---}}}{K}{\widetilde{c}}}}
|
||||
\end{array}
|
||||
}
|
||||
{\setsubarrow{}\withenv{K,\,\Phi}{\trans{c}{e\mathtt{[}j\mathtt{]}\llang{:=}g}{\widetilde{c}}}}
|
||||
\ruleno{ArrayAssign}
|
||||
\]
|
||||
|
||||
\subsection{Strings}
|
||||
|
||||
With arrays in our hands, we can easily add strings as arrays of characters. In fact, on the source language the strings can be
|
||||
introduced as a syntactic extension:
|
||||
|
||||
\begin{enumerate}
|
||||
\item we add a character constants \llang{'c'} as a shortcut for their integer codes;
|
||||
\item we add a string literals \llang{"abcd..."} as a shortcut for arrays \llang{['a', 'b', 'c', 'd', ...]}.
|
||||
\end{enumerate}
|
||||
|
||||
Nothing else has to be done~--- now we have mutable reference-representable strings.
|
||||
|
||||
215
doc/lectures/08.tex
Normal file
215
doc/lectures/08.tex
Normal file
|
|
@ -0,0 +1,215 @@
|
|||
\section{S-expressions and Pattern Matching}
|
||||
|
||||
S-expressions can be considered as arrays with additionally attached \emph{tags} (of constructors). We denote
|
||||
the set of all constructors as $\mathscr C$:
|
||||
|
||||
\[
|
||||
\mathscr{C}=\{C_1, C_2, \dots\}
|
||||
\]
|
||||
|
||||
Thus, the set of all S-expressions can be described as
|
||||
|
||||
\[
|
||||
\mathscr{S}_{exp}=\mathscr{C}\times\mathscr{A}(\mathscr{V})
|
||||
\]
|
||||
|
||||
In the concrete syntax constructors are represented by identifiers started with capital letters (A-Z). As
|
||||
being represented as augmented arrays, the values of S-expressions can be manipulated by array
|
||||
primitives (in particular, their elements can be accessed and assigned using square brackets); like arrays
|
||||
they are stored in abstract memory and accessed via locations.
|
||||
|
||||
\subsection{S-expressions on expression level}
|
||||
|
||||
At the expression level S-expressions are represented by a single additional syntactic form:
|
||||
|
||||
\[
|
||||
\mathscr{E} += \mathscr{C}\mathscr{E}^*
|
||||
\]
|
||||
|
||||
In the concrete syntax this form is represented by expressions ``\lstinline|C (e$_1$, e$_1$, ..., e$_k$)|'' or just ``\lstinline|C|''
|
||||
if no constructor arguments are specified. The semantics of this form of expression is straightforward: all arguments are
|
||||
sequentially evauated left-to right and their values are packed into an array with appropriate tag attached:
|
||||
|
||||
\setsubarrow{_{\mathscr E}}
|
||||
\arraycolsep=10pt
|
||||
\[\trule{\begin{array}{c}
|
||||
\withenv{\Phi}{\trans{c_j}{e_j}{c_{j+1}}},\,j\in [0..k]\\
|
||||
\inbr{s,\,\mu,\,l_m,\,i,\,o,\,\_}=c_{k+1}
|
||||
\end{array}
|
||||
}
|
||||
{\withenv{\Phi}{\trans{c_0}{C\mathtt{(}e_0, e_1,...,e_k\mathtt{)}}{\inbr{s,\,\mu\,[l_m\gets(C, (k+1,\,\lambda n.\primi{val}{c_n}))],\,l_{m+1},\,i,\,o,\,l_m}}}}
|
||||
\ruleno{S$_{exp}$}
|
||||
\]
|
||||
|
||||
|
||||
\subsection{Patterns Matching}
|
||||
|
||||
Pattern-matching is a new control construct which allows to discriminate between different constructors of S-expressions. Pattern-matching is added at
|
||||
the statement level, its abstract syntax is as follows:
|
||||
|
||||
\[
|
||||
\mathscr S += \primi{case}\,\mathscr{E}\,(\mathscr{P}\times\mathscr{S})^*
|
||||
\]
|
||||
|
||||
Here $\mathscr P$ stands for a new syntactic category~--- \emph{patterns} (see below). In the concrete syntax:
|
||||
|
||||
\begin{lstlisting}
|
||||
case $e$ of $p_1$ -> $s_1$ | ... | $p_k$ -> $s_k$ esac
|
||||
\end{lstlisting}
|
||||
|
||||
where $e$~--- expression (\emph{scrutinee}), $p_i$~--- patterns, $s_i$~--- statements.
|
||||
|
||||
The syntactic category of patterns is defined as follows:
|
||||
|
||||
\[
|
||||
\mathscr P = \_ \mid \mathscr X \mid C \mathscr{P}^*
|
||||
\]
|
||||
|
||||
In the concrete syntax: ``\lstinline|_|'', ``\lstinline|x|'', ``\lstinline|C ($p_1$, $p_2$, ..., $p_k$)|'' or just ``\lstinline|C|'',
|
||||
where $x$~--- variable, $C$~--- constructor, $p_i$~--- patterns. Informally speaking, each pattern describes a
|
||||
set of S-expressions and a set of bindings of variables to sub sub-expression of each of these expressions. To specify
|
||||
this precisly we introduce the following semantics for patterns: for a pattern $p$ and abstract memory function $\mu$ we
|
||||
define
|
||||
|
||||
\[
|
||||
\sembr{p}^\mu : \mathscr{V}\to\bot\uplus(2^{\mathscr X}\times(\mathscr{X}\to\mathscr{V}))
|
||||
\]
|
||||
|
||||
$\sembr{p}^\mu$ defines a procedure to inspect a value $v$; as a result either $\bot$ (which can be interpreted as a witness that $v$
|
||||
does not belong to the set of values defined by $p$) or a set of variables in the patterns with their bindings is returned.
|
||||
The definition is as follows:
|
||||
|
||||
\[
|
||||
\begin{array}{rclcl}
|
||||
\sembr{\_}^\mu\; v &=& \inbr{\emptyset,\,\mbox{empty state}} & & \\
|
||||
\sembr{x}^\mu\; v &=& \inbr{\{x\},\, [x \gets v]} & & \\
|
||||
\sembr{C (p_1, p_2, \dots, p_k)}^\mu\; l &=& \bigoplus_i\,(\sembr{p_i}^\mu\;v_i) &,& \mu\,(l)=C\,(v_1,v_2,\dots,v_k)\\
|
||||
\sembr{p}^\mu\; v &=& \bot &,& \mbox{otherwise}
|
||||
\end{array}
|
||||
\]
|
||||
|
||||
where $\bigoplus$ unions the bindings:
|
||||
|
||||
\[
|
||||
\begin{array}{rcl}
|
||||
\bot \oplus \_ & = & \bot \\
|
||||
\_ \oplus \bot & = & \bot \\
|
||||
\inbr{S_1,\,\sigma_1} \oplus \inbr{S_2,\,\sigma_2} & = & \inbr{S_1\cup S_2,\,\lambda x . \left\{
|
||||
\begin{array}{rcl}
|
||||
\sigma_2\, x &,& x\in S_2\\
|
||||
\sigma_1\, x &,& x\in S_1\setminus S_2
|
||||
\end{array}
|
||||
\right.}
|
||||
\end{array}
|
||||
\]
|
||||
|
||||
As pattern matching may introduce new bindings, we need also to modify the definition of state:
|
||||
|
||||
\[
|
||||
\Sigma=(\mathscr X\to \mathscr V) \times (2^{\mathscr X} \times (\mathscr X\to \mathscr V))^*
|
||||
\]
|
||||
|
||||
|
||||
Now we have one global state and a list of local scopes with sets of variables and their bindings. The redefined state primitives
|
||||
look like the follows. Evaluating in the state:
|
||||
|
||||
|
||||
\[
|
||||
\begin{array}{rcl}
|
||||
\inbr{\sigma_g,\,\epsilon}\;x&=&\sigma_g\;x\\
|
||||
\inbr{\sigma_g,\,\inbr{S,\,\sigma_l}::t}\;x&=&
|
||||
\left\{\begin{array}{rcl}
|
||||
\sigma_l\;x & , & x\in S\\
|
||||
\inbr{\sigma_g,\,t}\;x & , & x\not\in S
|
||||
\end{array}
|
||||
\right.
|
||||
\end{array}
|
||||
\]
|
||||
|
||||
Updating the state:
|
||||
|
||||
\[
|
||||
\inbr{\sigma_g,\,s}[x\gets z]=\primi{update}{\inbr{\sigma_g,\,s}\;\epsilon\;x\;z}
|
||||
\]
|
||||
|
||||
where $\primi{update}{}$ is defined as follows:
|
||||
|
||||
\[
|
||||
\begin{array}{rcl}
|
||||
\primi{update}{\inbr{\sigma_g,\,\epsilon}\;s\;x\;z}&=&\inbr{\sigma_g[x\gets z],\,s}\\
|
||||
\primi{update}{\inbr{\sigma_g,\,\inbr{S,\,\sigma_l}::t}\;s\;x\;z}&=&\left\{
|
||||
\begin{array}{rcl}
|
||||
\inbr{\sigma_g,\,s\llang{@}\inbr{S,\,\sigma_l[x\gets z]}t} &,&x\in S\\
|
||||
\primi{update}{\inbr{\sigma_g,\,t}\;s\llang{@}\inbr{S,\,\sigma_l}\;x\;z} &,&x\not\in S
|
||||
\end{array}\right.
|
||||
\end{array}
|
||||
\]
|
||||
|
||||
Empty state:
|
||||
|
||||
\[
|
||||
\Lambda=\inbr{\Lambda,\,\epsilon}
|
||||
\]
|
||||
|
||||
Primitives $\primi{enter}{}$/$\primi{leave}{}$:
|
||||
|
||||
\[
|
||||
\begin{array}{rcl}
|
||||
\primi{enter}{\inbr{\sigma_g,\,\_}\,S} & = & \inbr{\sigma_g,\,\inbr{S,\,\Lambda}}\\
|
||||
\primi{leave}{\inbr{\sigma_g,\,\_}\,\inbr{\_,\,t}}& = & \inbr{\sigma_g,\,t}
|
||||
\end{array}
|
||||
\]
|
||||
|
||||
We will also need two additional primitives to push/drop new scopes:
|
||||
|
||||
\[
|
||||
\begin{array}{rcl}
|
||||
\primi{push}{\inbr{\sigma_g,\,t}\;s}&=&\inbr{\sigma_g,\,s::t}\\
|
||||
\primi{drop}{\inbr{\sigma_g,\,\_::t}}&=&\inbr{\sigma_g,\,t}
|
||||
\end{array}
|
||||
\]
|
||||
|
||||
We will also use these primitives for the whole configurations, meaning that they apply only to their state parts and leave
|
||||
all other components intact.
|
||||
|
||||
To specify big-step semantics for pattern-matching we need a yet another ``intrinsic'' statement
|
||||
$\primi{leave}{}$ (remember, the first one was ``$\diamond$''). The rules are shown on Fig.~\ref{pattern_matching}; note,
|
||||
we need an additional transition relation ``${\setsubarrow{_{\mathscr P}}\transarrow{}\subarrow}$'' to reflect the top-down
|
||||
pattern-matching discipline.
|
||||
|
||||
\begin{figure}
|
||||
\[\trule{\withenv{\llang{skip},\,\Phi}{\trans{\primi{drop}{c}}{K}{c^\prime}}}
|
||||
{\withenv{K,\,\Phi}{\trans{c}{\primi{leave}{}}{c^\prime}}}
|
||||
\ruleno{Leave}
|
||||
\]
|
||||
|
||||
\[\trule{\begin{array}{cc}
|
||||
{\setsubarrow{_{\mathscr E}}\withenv{\Phi}{\trans{c}{e}{c^\prime}}} & {\setsubarrow{_{\mathscr P}}\withenv{K,\,\Phi}{\trans{c^\prime}{\inbr{\primi{val}{c^\prime},\,ps}}{c^{\prime\prime}}}}
|
||||
\end{array}
|
||||
}
|
||||
{\withenv{K,\,\Phi}{\trans{c}{\llang{case $\;e\;$ $\;ps$}}{c^{\prime\prime}}}}
|
||||
\ruleno{Case}
|
||||
\]
|
||||
|
||||
\setsubarrow{_{\mathscr P}}
|
||||
|
||||
\[\trule{\begin{array}{cc}
|
||||
\sembr{p}^{\primi{mem}{c}}\;v=r\;(\neq\bot) & \withenv{\primi{leave}{}\diamond K,\,\Phi}{\trans{\primi{push}{c\;r}}{s}{c^\prime}}
|
||||
\end{array}
|
||||
}
|
||||
{\withenv{K,\,\Phi}{\trans{c}{\inbr{v,\,\inbr{p,\,s}::t}}{c^\prime}}}
|
||||
\ruleno{PatternMatched}
|
||||
\]
|
||||
|
||||
\[\trule{\begin{array}{cc}
|
||||
\sembr{p}^{\primi{mem}{c}}\;v=\bot & \withenv{K,\,\Phi}{\trans{c}{\inbr{v,\,t}}{c^\prime}}
|
||||
\end{array}
|
||||
}
|
||||
{\withenv{K,\,\Phi}{\trans{c}{\inbr{v,\,\inbr{p,\,s}::t}}{c^\prime}}}
|
||||
\ruleno{PatternNotMatched}
|
||||
\]
|
||||
|
||||
\caption{Continuation-passing Style Semantics for Pattern Matching}
|
||||
\label{pattern_matching}
|
||||
\end{figure}
|
||||
|
||||
130
doc/lectures/lectures.tex
Normal file
130
doc/lectures/lectures.tex
Normal file
|
|
@ -0,0 +1,130 @@
|
|||
\documentclass{article}
|
||||
|
||||
\usepackage{amssymb, amsmath}
|
||||
\usepackage{alltt}
|
||||
\usepackage{pslatex}
|
||||
\usepackage{epigraph}
|
||||
\usepackage{verbatim}
|
||||
\usepackage{latexsym}
|
||||
\usepackage{array}
|
||||
\usepackage{comment}
|
||||
\usepackage{makeidx}
|
||||
\usepackage{listings}
|
||||
\usepackage{indentfirst}
|
||||
\usepackage{verbatim}
|
||||
\usepackage{color}
|
||||
\usepackage{url}
|
||||
\usepackage{xspace}
|
||||
\usepackage{hyperref}
|
||||
\usepackage{stmaryrd}
|
||||
\usepackage{amsmath, amsthm, amssymb}
|
||||
\usepackage{graphicx}
|
||||
\usepackage{euscript}
|
||||
\usepackage{mathtools}
|
||||
\usepackage{mathrsfs}
|
||||
\usepackage{multirow,bigdelim}
|
||||
\usepackage{subcaption}
|
||||
\usepackage{placeins}
|
||||
|
||||
\makeatletter
|
||||
|
||||
\makeatother
|
||||
|
||||
\definecolor{shadecolor}{gray}{1.00}
|
||||
\definecolor{darkgray}{gray}{0.30}
|
||||
|
||||
\def\transarrow{\xrightarrow}
|
||||
\newcommand{\setarrow}[1]{\def\transarrow{#1}}
|
||||
|
||||
\def\padding{\phantom{X}}
|
||||
\newcommand{\setpadding}[1]{\def\padding{#1}}
|
||||
|
||||
\def\subarrow{}
|
||||
\newcommand{\setsubarrow}[1]{\def\subarrow{#1}}
|
||||
|
||||
\newcommand{\trule}[2]{\frac{#1}{#2}}
|
||||
\newcommand{\crule}[3]{\frac{#1}{#2},\;{#3}}
|
||||
\newcommand{\withenv}[2]{{#1}\vdash{#2}}
|
||||
\newcommand{\trans}[3]{{#1}\transarrow{\padding{\textstyle #2}\padding}\subarrow{#3}}
|
||||
\newcommand{\ctrans}[4]{{#1}\transarrow{\padding#2\padding}\subarrow{#3},\;{#4}}
|
||||
\newcommand{\llang}[1]{\mbox{\lstinline[mathescape]|#1|}}
|
||||
\newcommand{\pair}[2]{\inbr{{#1}\mid{#2}}}
|
||||
\newcommand{\inbr}[1]{\left<{#1}\right>}
|
||||
\newcommand{\highlight}[1]{\color{red}{#1}}
|
||||
\newcommand{\ruleno}[1]{\eqno[\scriptsize\textsc{#1}]}
|
||||
\newcommand{\rulename}[1]{\textsc{#1}}
|
||||
\newcommand{\inmath}[1]{\mbox{$#1$}}
|
||||
\newcommand{\lfp}[1]{fix_{#1}}
|
||||
\newcommand{\gfp}[1]{Fix_{#1}}
|
||||
\newcommand{\vsep}{\vspace{-2mm}}
|
||||
\newcommand{\supp}[1]{\scriptsize{#1}}
|
||||
\newcommand{\sembr}[1]{\llbracket{#1}\rrbracket}
|
||||
\newcommand{\cd}[1]{\texttt{#1}}
|
||||
\newcommand{\free}[1]{\boxed{#1}}
|
||||
\newcommand{\binds}{\;\mapsto\;}
|
||||
\newcommand{\dbi}[1]{\mbox{\bf{#1}}}
|
||||
\newcommand{\sv}[1]{\mbox{\textbf{#1}}}
|
||||
\newcommand{\bnd}[2]{{#1}\mkern-9mu\binds\mkern-9mu{#2}}
|
||||
\newtheorem{lemma}{Lemma}
|
||||
\newtheorem{theorem}{Theorem}
|
||||
\newcommand{\meta}[1]{{\mathcal{#1}}}
|
||||
\renewcommand{\emptyset}{\varnothing}
|
||||
\newcommand{\dom}[1]{\mathtt{dom}\;{#1}}
|
||||
\newcommand{\primi}[2]{\mathbf{#1}\;{#2}}
|
||||
|
||||
\definecolor{light-gray}{gray}{0.90}
|
||||
\newcommand{\graybox}[1]{\colorbox{light-gray}{#1}}
|
||||
|
||||
\lstdefinelanguage{ocaml}{
|
||||
keywords={let, begin, end, in, match, type, and, fun,
|
||||
function, try, mod, with, class, object, method, of, rec, repeat, until,
|
||||
while, not, do, done, as, val, inherit, module, sig, @type, struct,
|
||||
if, then, else, open, virtual, new, fresh, skip, od, fi, elif, for, local, return, read, write, fi, case, esac, od},
|
||||
sensitive=true,
|
||||
%basicstyle=\small,
|
||||
commentstyle=\scriptsize\rmfamily,
|
||||
keywordstyle=\ttfamily\bfseries,
|
||||
identifierstyle=\ttfamily,
|
||||
basewidth={0.5em,0.5em},
|
||||
columns=fixed,
|
||||
fontadjust=true,
|
||||
literate={->}{{$\to$}}3 {===}{{$\equiv$}}1 {=/=}{{$\not\equiv$}}1 {|>}{{$\triangleright$}}3 {\&\&\&}{{$\wedge$}}2 {|||}{{$\vee$}}2 {^}{{$\uparrow$}}1,
|
||||
morecomment=[s]{(*}{*)}
|
||||
}
|
||||
|
||||
\lstset{
|
||||
mathescape=true,
|
||||
%basicstyle=\small,
|
||||
identifierstyle=\ttfamily,
|
||||
keywordstyle=\bfseries,
|
||||
commentstyle=\scriptsize\rmfamily,
|
||||
basewidth={0.5em,0.5em},
|
||||
fontadjust=true,
|
||||
escapechar=!,
|
||||
language=ocaml
|
||||
}
|
||||
|
||||
\sloppy
|
||||
|
||||
\newcommand{\ocaml}{\texttt{OCaml}\xspace}
|
||||
|
||||
\theoremstyle{definition}
|
||||
|
||||
\title{Introduction to Programming Languages, Compilers and Tools}
|
||||
|
||||
\author{Dmitry Boulytchev}
|
||||
|
||||
\begin{document}
|
||||
|
||||
\maketitle
|
||||
|
||||
\input{01}
|
||||
\input{02}
|
||||
\input{03}
|
||||
\input{04}
|
||||
\input{05}
|
||||
\input{06}
|
||||
\input{07}
|
||||
\input{08}
|
||||
|
||||
\end{document}
|
||||
Loading…
Add table
Add a link
Reference in a new issue