\section{Statements, Stack Machine, Stack Machine Compiler}

\subsection{Statements}

We now turn to a more interesting language: a language of simple statements.
$$
\begin{array}{rcl}
  \mathscr S & = & \mathscr X \mbox{\lstinline|:=|} \;\mathscr E \\
             &   & \mbox{\lstinline|read (|} \mathscr X \mbox{\lstinline|)|} \\
             &   & \mbox{\lstinline|write (|} \mathscr E \mbox{\lstinline|)|} \\
             &   & \mathscr S \mbox{\lstinline|;|} \mathscr S
\end{array}
$$
Here $\mathscr E$ and $\mathscr X$ stand for the sets of expressions and variables, as in the previous lecture.

Again, we define the semantics for this language
$$
\sembr{\bullet}_{\mathscr S} : \mathscr S \mapsto \mathbb Z^* \to \mathbb Z^*
$$
with the semantic domain of partial functions from integer strings to integer strings. This time we will use
\emph{big-step operational semantics}: we define a ternary relation ``$\Rightarrow$''
$$
\Rightarrow \subseteq \mathscr C \times \mathscr S \times \mathscr C
$$
where $\mathscr C = \Sigma \times \mathbb Z^* \times \mathbb Z^*$ is the set of all configurations that can occur during a program execution.
We will write $c_1\xRightarrow{S}c_2$ instead of $(c_1, S, c_2)\in\Rightarrow$ and informally read it as
``the execution of a statement $S$ in a configuration $c_1$ completes with the configuration $c_2$''.
A configuration consists of a state, which binds (some) variables to their values, and input and output streams, represented as (finite) strings of integers.

The relation ``$\Rightarrow$'' is defined by the following deductive system (see Fig.~\ref{bs_stmt}).
The first three rules are \emph{axioms}, as they do not have any premises. Note that, according to these rules, a program sometimes cannot make a step in a given configuration:
the value of an expression can be undefined in a given state in rules $\rulename{Assign}$ and $\rulename{Write}$, and there can be no input value in rule $\rulename{Read}$.
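
As a small illustration (a sketch under two assumptions: $\Lambda$ denotes the empty state, and $\sembr{x}_{\mathscr E}\;\sigma = \sigma\,x$ for a variable $x$),
consider the program $\llang{read ($x$); write ($x$)}$ run on the one-element input $5$. Rules $\rulename{Read}$ and $\rulename{Write}$ give
\[
\begin{array}{c}
\inbr{\Lambda,\, 5,\, \epsilon}\xRightarrow{\llang{read ($x$)}}\inbr{\Lambda\,[x\gets 5],\, \epsilon,\, \epsilon} \\[2mm]
\inbr{\Lambda\,[x\gets 5],\, \epsilon,\, \epsilon}\xRightarrow{\llang{write ($x$)}}\inbr{\Lambda\,[x\gets 5],\, \epsilon,\, 5}
\end{array}
\]
and rule $\rulename{Seq}$ combines the two steps into
\[
\inbr{\Lambda,\, 5,\, \epsilon}\xRightarrow{\llang{read ($x$); write ($x$)}}\inbr{\Lambda\,[x\gets 5],\, \epsilon,\, 5}
\]
On the empty input, by contrast, no instance of rule $\rulename{Read}$ applies to $\inbr{\Lambda,\, \epsilon,\, \epsilon}$, so no derivation exists and the program has no result.
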
This style of semantic description is called big-step operational semantics because the results of a computation are
immediately observable on the right-hand side of ``$\Rightarrow$''; thus, the computation is performed in a single ``big'' step. And, again, this
style of semantic description lends itself to a straightforward implementation of a reference interpreter.

With the relation ``$\Rightarrow$'' defined, we can define the ``surface'' semantics for the language of statements:

\setarrow{\xRightarrow}
\[
\forall S\in\mathscr S,\,\forall \iota\in\mathbb Z^*\;:\;\sembr{S}_{\mathscr S} \iota = o \Leftrightarrow \trans{\inbr{\Lambda, \iota, \epsilon}}{S}{\inbr{\_, \_, o}}
\]

\begin{figure}[t]
\arraycolsep=10pt
\[\trans{\inbr{\sigma,\, \iota,\, o}}{\llang{x := $\;\;e$}}{\inbr{\sigma\,[x\gets\sembr{e}_{\mathscr E}\;\sigma],\, \iota,\, o}}\ruleno{Assign}\]
\[\trans{\inbr{\sigma,\, z\iota,\, o}}{\llang{read ($x$)}}{\inbr{\sigma\,[x\gets z],\, \iota,\, o}}\ruleno{Read}\]
\[\trans{\inbr{\sigma,\, \iota,\, o}}{\llang{write ($e$)}}{\inbr{\sigma,\, \iota,\, o(\sembr{e}_{\mathscr E}\;\sigma)}}\ruleno{Write}\]
\[\trule{\begin{array}{cc}
           \trans{c_1}{S_1}{c^\prime} & \trans{c^\prime}{S_2}{c_2}
         \end{array}}
        {\trans{c_1}{S_1\llang{;}S_2}{c_2}}\ruleno{Seq}\]
\caption{Big-step operational semantics for statements}
\label{bs_stmt}
\end{figure}

\subsection{Stack Machine}

A stack machine is a simple abstract computational device that serves as a convenient model for describing the compilation process constructively.
In short, the stack machine operates on the same configurations as the language of statements, extended with a stack of integers.
The computation performed by the stack machine is controlled by a program with the following syntax:
\[
\begin{array}{rcl}
  \mathscr I & = & \llang{BINOP $\;\otimes$} \\
             &   & \llang{CONST $\;\mathbb N$} \\
             &   & \llang{READ} \\
             &   & \llang{WRITE} \\
             &   & \llang{LD $\;\mathscr X$} \\
             &   & \llang{ST $\;\mathscr X$} \\
  \mathscr P & = & \epsilon \\
             &   & \mathscr I\mathscr P
\end{array}
\]
Here the syntax category $\mathscr I$ stands for \emph{instructions} and $\mathscr P$ for \emph{programs}; thus, a program is a finite string of instructions.

The semantics of a stack machine program can be described, again, in the form of big-step operational semantics. This time the set of stack machine configurations is
\[
\mathscr C_{SM} = \mathbb Z^* \times \mathscr C
\]
where the first component is a stack, and the second is a configuration as in the semantics of the statement language. The rules are shown in Fig.~\ref{bs_sm}; note that
we now have one axiom and six inference rules (one per instruction).

As with statements, the relation ``$\Rightarrow$'' lets us define the surface semantics of the stack machine:
\[
\forall p\in\mathscr P,\,\forall i\in\mathbb Z^*\;:\;\sembr{p}_{SM}\;i=o\Leftrightarrow\trans{\inbr{\epsilon, \inbr{\Lambda, i, \epsilon}}}{p}{\inbr{\_, \inbr{\_, \_, o}}}
\]

\begin{figure}[t]
\[\trans{c}{\epsilon}{c}\ruleno{Stop$_{SM}$}\]
\[\trule{\trans{\inbr{(x\otimes y)\llang{::}st, c}}{p}{c^\prime}}{\trans{\inbr{y\llang{::}x\llang{::}st, c}}{(\llang{BINOP $\;\otimes$})p}{c^\prime}}\ruleno{Binop$_{SM}$}\]
\[\trule{\trans{\inbr{z\llang{::}st, c}}{p}{c^\prime}}{\trans{\inbr{st, c}}{(\llang{CONST $\;z$})p}{c^\prime}}\ruleno{Const$_{SM}$}\]
\[\trule{\trans{\inbr{z\llang{::}st, \inbr{s, i, o}}}{p}{c^\prime}}{\trans{\inbr{st, \inbr{s, z\llang{::}i, o}}}{(\llang{READ})p}{c^\prime}}\ruleno{Read$_{SM}$}\]
\[\trule{\trans{\inbr{st, \inbr{s, i, o\llang{@}z}}}{p}{c^\prime}}{\trans{\inbr{z\llang{::}st, \inbr{s, i, o}}}{(\llang{WRITE})p}{c^\prime}}\ruleno{Write$_{SM}$}\]
\[\trule{\trans{\inbr{(s\;x)\llang{::}st, \inbr{s, i, o}}}{p}{c^\prime}}{\trans{\inbr{st, \inbr{s, i, o}}}{(\llang{LD $\;x$})p}{c^\prime}}\ruleno{LD$_{SM}$}\]
\[\trule{\trans{\inbr{st, \inbr{s[x\gets z], i, o}}}{p}{c^\prime}}{\trans{\inbr{z\llang{::}st, \inbr{s, i, o}}}{(\llang{ST $\;x$})p}{c^\prime}}\ruleno{ST$_{SM}$}\]
\caption{Big-step operational semantics for stack machine}
\label{bs_sm}
\end{figure}

\subsection{A Compiler for the Stack Machine}

A compiler of the statement language into the stack machine is a total mapping
\[
\sembr{\bullet}_{comp} : \mathscr S \mapsto \mathscr P
\]
We can describe the compiler in the form of denotational semantics for the source language. In fact, we can treat the compiler as a \emph{static} semantics, which
maps each program into its stack machine equivalent.

As the source language consists of two syntactic categories (expressions and statements), the compiler has to be ``bootstrapped'' from the compiler for expressions
$\sembr{\bullet}^{\mathscr E}_{comp}$:
\[
\begin{array}{rcl}
  \sembr{x}^{\mathscr E}_{comp}&=&\llang{[LD $\;x$]}\\
  \sembr{n}^{\mathscr E}_{comp}&=&\llang{[CONST $\;n$]}\\
  \sembr{A\otimes B}^{\mathscr E}_{comp}&=&\sembr{A}^{\mathscr E}_{comp}\llang{@}\sembr{B}^{\mathscr E}_{comp}\llang{@}\llang{[BINOP $\;\otimes$]}
\end{array}
\]
And now the main dish, the compiler for statements:
\[
\begin{array}{rcl}
  \sembr{\llang{$x$ := $e$}}_{comp}&=&\sembr{e}^{\mathscr E}_{comp}\llang{@}\llang{[ST $\;x$]}\\
  \sembr{\llang{read ($x$)}}_{comp}&=&\llang{[READ; ST $\;x$]}\\
  \sembr{\llang{write ($e$)}}_{comp}&=&\sembr{e}^{\mathscr E}_{comp}\llang{@}\llang{[WRITE]}\\
  \sembr{\llang{$S_1$;$\;S_2$}}_{comp}&=&\sembr{S_1}_{comp}\llang{@}\sembr{S_2}_{comp}
\end{array}
\]
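
The compilation scheme and the stack machine rules can be prototyped in a few lines. The following Python sketch is only an illustration under stated assumptions:
the tuple encoding of expressions, statements and instructions, the function names \lstinline|compile_expr|, \lstinline|compile_stmt| and \lstinline|run|,
and the operator table \lstinline|OPS| are all hypothetical choices made for this example and do not come from the lecture. The stack is modelled as a Python list whose top is the last element.

```python
import operator

# Hypothetical operator table: the actual set of binary operators is
# fixed by the expression language of the previous lecture.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def compile_expr(e):
    """Expression compiler: operands first, then the operator."""
    tag = e[0]
    if tag == "var":
        return [("LD", e[1])]
    if tag == "const":
        return [("CONST", e[1])]
    if tag == "binop":
        _, op, a, b = e
        return compile_expr(a) + compile_expr(b) + [("BINOP", op)]
    raise ValueError(f"unknown expression: {e!r}")


def compile_stmt(s):
    """Statement compiler, one case per statement form."""
    tag = s[0]
    if tag == "assign":
        return compile_expr(s[2]) + [("ST", s[1])]
    if tag == "read":
        return [("READ",), ("ST", s[1])]
    if tag == "write":
        return compile_expr(s[1]) + [("WRITE",)]
    if tag == "seq":
        return compile_stmt(s[1]) + compile_stmt(s[2])
    raise ValueError(f"unknown statement: {s!r}")


def run(program, inp):
    """Run a stack machine program on an input list; return the output list.

    A configuration with no applicable rule (empty stack, exhausted input,
    unbound variable) surfaces here as a Python exception.
    """
    stack, state, inp, out = [], {}, list(inp), []
    for ins in program:
        tag = ins[0]
        if tag == "BINOP":
            y = stack.pop()  # right operand is on top of the stack
            x = stack.pop()
            stack.append(OPS[ins[1]](x, y))
        elif tag == "CONST":
            stack.append(ins[1])
        elif tag == "READ":
            stack.append(inp.pop(0))
        elif tag == "WRITE":
            out.append(stack.pop())
        elif tag == "LD":
            stack.append(state[ins[1]])
        elif tag == "ST":
            state[ins[1]] = stack.pop()
        else:
            raise ValueError(f"unknown instruction: {ins!r}")
    return out


# read (x); y := x * 2 + 1; write (y)
example = ("seq", ("read", "x"),
           ("seq", ("assign", "y",
                    ("binop", "+",
                     ("binop", "*", ("var", "x"), ("const", 2)),
                     ("const", 1))),
            ("write", ("var", "y"))))

print(run(compile_stmt(example), [5]))  # prints [11]
```

The order of the two \lstinline|pop| calls in the \lstinline|BINOP| case mirrors rule \rulename{Binop$_{SM}$}: the right operand lies on top of the stack, which matters for non-commutative operators such as subtraction.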