Introduction
Sail is a language for expressing the instruction-set architecture (ISA) semantics of processors.
Vendor architecture specification documents typically describe the sequential behaviour of their ISA with a combination of prose, tables, and pseudocode for each instruction.
They vary in how precise that pseudocode is: in some it is just suggestive, while in others it is close to a complete description of the envelope of architecturally allowed behaviour for sequential code.
For x86 [1], the Intel pseudocode is just suggestive, with embedded prose, while the AMD descriptions [2] are prose alone. For IBM Power [3], there is detailed pseudocode which has recently become machine-processed [4]. For Arm [5], there is detailed pseudocode, which has recently become machine-processed [6]. For MIPS [7, 8] there is also reasonably detailed pseudocode.
Sail is intended:
-
To support precise definition of real-world ISA semantics;
-
To be accessible to engineers familiar with existing vendor pseudocodes, with a similar style to the pseudocodes used by ARM and IBM Power (modulo minor syntactic differences);
-
To expose the structure needed to combine the sequential ISA semantics with the relaxed-memory concurrency models we have developed;
-
To provide an expressive type system that can statically check the bitvector length and indexing computation that arises in these specifications, to detect errors and to support code generation, with type inference to minimise the required type annotations;
-
To support execution, for architecturally complete emulation automatically based on the definition;
-
To support automatic generation of theorem-prover definitions, for mechanised reasoning about ISA specifications; and
-
To be as minimal as possible given the above, to ease the tasks of code generation and theorem-prover definition generation.
A Sail specification will typically define an abstract syntax type/tre (AST) of machine instructions, a decode function that takes binary values to AST values, and an execute function that describes how each of those behaves at runtime, together with whatever auxiliary functions and types are needed.
Given such a specification, the Sail implementation can typecheck it and generate:
-
An internal representation of the fully type-annotated definition (a deep embedding of the definition) in a form that can be executed by the Sail interpreter. These are both expressed in Lem [9, 10], a language of type, function, and relation definitions that can be compiled into OCaml and various theorem provers. The Sail interpreter can also be used to analyse instruction definitions (or partially executed instructions) to determine their potential register and memory footprints.
-
A shallow embedding of the definition, also in Lem, that can be executed or converted to theorem-prover code more directly. Currently this is aimed at Isabelle/HOL or HOL4, though the Sail dependent types should support generation of idiomatic Coq definitions (directly rather than via Lem).
-
A compiled version of the specification directly into OCaml.
-
A more efficient executable version of the specification, compiled into C code.
Sail has been used to develop models of parts of several architectures:
Arm-v8 (ASL) |
generated from Arm’s v8.5 public ASL spec |
|
Arm-v9 (ASL) |
generated from Arm’s v9.3 public ASL spec |
|
RISC-V |
hand-written |
|
CHERI-MIPS |
hand-written |
The Arm-v8 (ASL) model is based on an automatic translation of Arm’s machine-readable public v8.3 ASL specification [1]. It includes everything in ARM’s specification.
The MIPS model is hand-written based on the MIPS64 manual version 2.5 [7, 8], but covering only the features in the BERI hardware reference [11], which in turn drew on MIPS4000 and MIPS32 [12, 13].
The CHERI model is based on that and the CHERI ISA reference manual version 5 [14]. These two are both principally by Norton-Wright; they cover all basic user and kernel mode MIPS features sufficient to boot FreeBSD, including a TLB, exceptions and a basic UART for console interaction. ISA extensions such as floating point are not covered. The CHERI model supports either 256-bit capabilities or 128-bit compressed capabilities.
A tutorial RISC-V style example
We introduce the basic features of Sail via a small example from our RISC-V model that includes just two instructions: add immediate and load double. We start by defining the default order (see Vectors for details), and including the Sail prelude:
default Order dec
$include <prelude.sail>
The Sail prelude is very minimal, and it is expected that Sail
specifications will usually build upon it. Some Sail specifications
are derived from pre-existing pseudocode, which already use specific
idioms — our Sail Arm specification uses ZeroExtend
and
SignExtend
mirroring the ASL code, whereas our MIPS and RISC-V
specification use EXTZ
and EXTS
for the same functions. Therefore for
this example we define zero-extension and sign-extension functions as:
val EXTZ : forall 'n 'm, 'm >= 'n. (implicit('m), bits('n)) -> bits('m)
function EXTZ(m, v) = sail_zero_extend(v, m)
val EXTS : forall 'n 'm, 'm >= 'n. (implicit('m), bits('n)) -> bits('m)
function EXTS(m, v) = sail_sign_extend(v, m)
We now define an integer type synonym xlen
, which for this example
will be equal to 64. Sail supports definitions which are generic over
both regular types, and integers (think const generics in C++, but
more expressive). We also create a type xlenbits
for bitvectors of
length xlen
.
type xlen : Int = 64
type xlen_bytes : Int = 8
type xlenbits = bits(xlen)
For the purpose of this example, we also introduce a type synonym for bitvectors of length 5, which represent registers.
type regbits = bits(5)
We now set up some basic architectural state. First creating a
register of type xlenbits
for both the program counter PC
, and
the next program counter, nextPC
. We define the general purpose
registers as a vector of 32 xlenbits
bitvectors. The dec
keyword isn’t important in this example, but Sail supports two
different numbering schemes for (bit)vectors inc
, for most
significant bit is zero, and dec
for least significant bit is
zero. We then define a getter and setter for the registers, which
ensure that the zero register is treated specially (in
RISC-V register 0 is always hardcoded to be 0). Finally we overload
both the read (rX
) and write (wX
) functions as simply
X
. This allows us to write registers as X(r) = value
and
read registers as value = X(r)
. Sail supports flexible ad-hoc
overloading, and has an expressive l-value language in assignments,
with the aim of allowing pseudo-code like definitions.
register PC : xlenbits
register nextPC : xlenbits
register Xs : vector(32, dec, xlenbits)
val rX : regbits -> xlenbits
function rX(r) =
match r {
0b00000 => EXTZ(0x0),
_ => Xs[unsigned(r)]
}
val wX : (regbits, xlenbits) -> unit
function wX(r, v) =
if r != 0b00000 then {
Xs[unsigned(r)] = v;
}
overload X = {rX, wX}
We also give a function MEMr
for reading memory, this function just
points at a builtin we have defined elsewhere. The builtin is very
general (it allows addresses of multiple bitwidths), so we also derive
a simpler read_mem
function.
val MEMr = impure { lem: "MEMr", coq: "MEMr", _ : "read_ram" } : forall 'n 'm, 'n >= 0.
(int('m), int('n), bits('m), bits('m)) -> bits(8 * 'n)
val read_mem : forall 'n, 'n >= 0. (xlenbits, int('n)) -> bits(8 * 'n)
function read_mem(addr, width) =
MEMr(sizeof(xlen), width, EXTZ(0x0), addr)
It’s common when defining architecture specifications to break
instruction semantics down into separate functions that handle
decoding (possibly even in several stages) into custom intermediate
datatypes and executing the decoded instructions. However it’s often
desirable to group the relevant parts of these functions and datatypes
together in one place, as they would usually be found in an
architecture reference manual. To support this Sail supports
scattered definitions. First we give types for the execute and
decode functions, as well as the ast
union.
enum iop = {RISCV_ADDI, RISCV_SLTI, RISCV_SLTIU, RISCV_XORI, RISCV_ORI, RISCV_ANDI}
scattered union ast
val decode : bits(32) -> option(ast)
val execute : ast -> unit
Now we provide the clauses for the add-immediate ast
type, as well
as its execute and decode clauses. We can define the decode function
by directly pattern matching on the bitvector representing the
instruction. Sail supports vector concatenation patterns (@
is the
vector concatenation operator), and uses the types provided
(e.g. bits(12)
and regbits
) to destructure the vector in the
correct way. We use the EXTS
function that sign-extends its
argument.
union clause ast = ITYPE : (bits(12), regbits, regbits, iop)
function clause decode imm : bits(12) @ rs1 : regbits @ 0b000 @ rd : regbits @ 0b0010011
= Some(ITYPE(imm, rs1, rd, RISCV_ADDI))
function clause execute (ITYPE (imm, rs1, rd, RISCV_ADDI)) = {
let rs1_val = X(rs1);
let imm_ext : xlenbits = EXTS(imm);
let result = rs1_val + imm_ext;
X(rd) = result
}
Now we do the same thing for the load-double instruction:
union clause ast = LOAD : (bits(12), regbits, regbits)
function clause decode imm : bits(12) @ rs1 : regbits @ 0b011 @ rd : regbits @ 0b0000011
= Some(LOAD(imm, rs1, rd))
function clause execute(LOAD(imm, rs1, rd)) = {
let addr : xlenbits = X(rs1) + EXTS(imm);
let result : xlenbits = read_mem(addr, sizeof(xlen_bytes));
X(rd) = result
}
Finally we define the fallthrough case for the decode function. Note that the clauses in a scattered function will be matched in the order they appear in the file.
function clause decode _ = None()
Usage
In its most basic use-case Sail is a command-line tool, analogous to a compiler: one gives it a list of input Sail files; it type-checks them and provides translated output.
To simply typecheck Sail files, one can pass them on the command line with no other options, for example:
sail prelude.sail riscv_types.sail riscv_mem.sail riscv_sys.sail riscv_vmem.sail riscv.sail
The sail files passed on the command line are simply treated as if
they are one large file concatenated together, although the parser
will keep track of locations on a per-file basis for
error-reporting. As can be seen, this specification is split into
several logical components. prelude.sail
defines the initial
type environment and builtins, riscv_types.sail
gives type
definitions used in the rest of the specification, riscv_mem.sail
and riscv_vmem.sail
describe the physical and virtual memory
interaction, and then riscv_sys.sail
and riscv.sail
implement most of the specification.
One can use also use $include
directives in Sail source, for
example:
$include <library.sail>
$include "file.sail"
Here, Sail will look for library.sail
in the $SAIL_DIR/lib
directory, where $SAIL_DIR
is usually the root of the sail
repository (or opam var sail:share
when Sail is installed using
opam). It will search for file.sail
relative to the location of the
file containing the $include
. The space after the $include
before
the filename is mandatory. Sail also supports $define
, $ifdef
, and
$ifndef
for basic conditional compilation. These are things that are
understood by Sail itself, not a separate preprocessor, and are
handled after the AST is parsed.
For more complex projects, a module hierarchy can be defined. See the Modules section for details.
Sail options
For backwards compatibility reasons, Sail accepts arguments of the
form -long_opt
, i.e. leading with a single -
and words
separated by _
. Such an option will be treated the same as
--long-opt
.
For a list of all options, one can call Sail as sail --help
.
C compilation
To compile Sail into C, the -c
option is used, like so:
sail -c FILES 1> out.c
The translated C is by default printed to stdout, but one can also use
the -o
option to output to a file, so
sail -c FILES -o out
will generate a file called out.c
. To produce an executable
this needs to be compiled and linked with the C files in the
$SAIL_DIR/lib
directory:
gcc out.c $SAIL_DIR/lib/*.c -lgmp -lz -I $SAIL_DIR/lib/ -o out
The C output requires the GMP library for arbitrary precision arithmetic, as well as zlib for working with compressed ELF binaries.
There are several Sail options that affect the C output:
-
-O
turns on optimisations. The generated C code will be quite slow unless this flag is set. -
--Oconstant-fold
apply constant folding optimisations. -
--c-include
Supply additional header files to be included in the generated C. -
--c-no-main
Do not generate amain()
function. -
--static
Mark generated C functions as static where possible. This is useful for measuring code coverage.
SystemVerilog compilation (Experimental)
Caution
|
This feature is new and experimental, so it is not guaranteed to provide working SystemVerilog. Furthermore, it is intended for hardware model checking against a hand-written design. Sail is not a hardware description language! |
To compile Sail into SystemVerilog, the --sv
option is used. The -o
option provides a prefix that is used on the various generated files.
There are several options for the SystemVerilog output:
-
--sv-output-dir
Generate all files in the specified directory -
--sv-include
Add an include directive to the generated SystemVerilog -
--sv-verilate
Can be used as either--sv-verilate run
or--sv-verilate compile
. If used Sail will automatically invoke verilator on the generated SystemVerilog -
--sv-lines
Output SystemVerilog `line directives to aid debugging. -
--sv-int-size
Set the maximum integer size allowed in the specification. -
--sv-bits-size
Bound the maximum bitvector width on the generated SystemVerilog. -
--sv-specialize
The--sv-specialize n
option will performn
rounds of specialisation on the Sail code before generating SystemVerilog. This will make bitvectors more monomorphic, but at the cost of code duplication.
The are various other options that control various minutae about the
generated SystemVerilog, see sail --help
for more details.
Automatic formatting (Experimental)
Caution
|
This feature is new and experimental, so be sure to inspect changes to source made by the tool and use at your own risk! |
Sail supports automatic code formatting similar to tools like go fmt
or rustfmt
. This is built into Sail itself, and can be used via the
--fmt
flag. To format a file my_file.sail
, we would use the command:
sail --fmt my_file.sail
Note that Sail does not attempt to type-check files when formatting
them, so in this case we do not necessarily have to pass the other
files that my_file.sail
would otherwise require to
type-check. However, it is perfectly fine to pass multiple files like
so:
sail --fmt file1.sail file2.sail file3.sail
The one exception is if a file uses a custom infix operator, then the
file that declares that operator must be passed before any file that
uses it. So if my_file.sail
uses an operator declared in
operator.sail
(otherwise it would not be able to parenthesize infix
expressions correctly), we would be required to do:
sail --fmt operator.sail my_file.sail
This will format both files. If we want to avoid formatting
operator.sail
, we could either use --fmt-skip
, like so:
sail --fmt-skip operator.sail --fmt operator.sail my_file.sail
or the --fmt-only
option, like so:
sail --fmt-only my_file.sail --fmt operator.sail my_file.sail
Both of these options can be passed multiple times if required.
Formatting configuration is done using a JSON configuration file: as:
{
"fmt": {
"indent": 4,
"preserve_structure": false,
"line_width": 120,
"ribbon_width": 1.0
}
}
which is passed to sail with the --config
flag.
The various keys supported under the "fmt"
key are as follows:
-
indent
The default indentation level -
preserve_structure
Preserve the structure of the syntax tree as much as possible. Note that the use of this operation is not recommended as it inhibits many formatting options, such as inserting blocks on if statements and loops. -
line_width
The desired maximum line-width. Note that this is a soft limit, and the line-width can go beyond if there are no possible line break options (e.g. if you have extremely long identifiers). -
ribbon_width
A soft limit on what proportion (between 0.0 and 1.0) of the line should be non-whitespace. A value of 1.0 indicates that the entirity ofline_width
can be taken up by non-whitespace.
If this file is not passed on the command line, Sail will check the
$SAIL_CONFIG
environment variable, and if that is unset it will
search for a file named sail_config.json
in the current working
directory, then recursively backwards through parent directories.
Interactive mode
Sail has a GHCi-style interactive interpreter. This can be used by
starting Sail with sail -i
. Sail will still handle any other
command line arguments as per usual. To use Sail files within the
interpreter they must be passed on the command line as if they were
being compiled normally. One can also pass a list of commands to the
interpreter by using the --is
flag, as
sail --is <file>
where <file>
contains a list of commands. Once inside the
interactive mode, a list of available commands can be accessed by
typing :commands
, while :help <command>
can be used to provide
some documentation for each command.
Other options
Here we summarize most of the other options available for
Sail. Debugging options (usually for debugging Sail itself) are
indicated by starting with the letter d
.
-
-v
Print the Sail version. -
-help
Print a list of options. -
--no-warn
Turn off warnings. -
--enum-casts
Allow elements of enumerations to be automatically cast to numbers. -
--memo-z3
Memoize calls to the Z3 solver. This can greatly improve typechecking times if you are repeatedly typechecking the same specification while developing it. -
--no-lexp-bounds-check
Turn off bounds checking in the left hand side of assignments. This option only exists for some Sail translated from ASL (as ASL does not do compile-time bounds checking here). -
--just-check
Force Sail to terminate immediately after typechecking. -
--require-version
. Check that the Sail version is newer than the specified version, if not Sail will print an error and exit with a non-zero exit code.
Debug options
These options are mostly used for debugging the Sail compiler itself.
They are included as they might be useful for those writing custom Sail
plugins. These options are all prefixed with d
.
-
--dtc-verbose <verbosity>
Make the typechecker print a trace of typing judgements. If the verbosity level is 1, then this should only include fairly readable judgements about checking and inference rules. If verbosity is 2 then it will include a large amount of debugging information. This option can be useful to diagnose tricky type-errors, especially if the error message isn’t very good. -
--ddump-initial-ast
Write the AST out immediately after parsing and desugaring. -
--ddump-tc-ast
Write the typechecked AST to stdout after typechecking -
--ddump-side-effects
Print inferred information about function side effects -
--ddump-rewrite-ast <prefix>
Write the AST out after each re-writing pass. The output from each pass is placed in a file starting withprefix
. -
--dmagic-hash
Allow the#
symbol in identifiers. It’s currently used as a magic symbol to separate generated identifiers from those the user can write, so this option allows for the output of the various other debugging options to be fed back into Sail. The name comes from the GHC option with the same purpose: ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/magic_hash.html
The Sail Language
Functions
In Sail, we often define functions in two parts. First we can write
the type signature for the function using the val
keyword, then
define the body of the function using the function
keyword. In this
Subsection, we will write our own version of the replicate_bits
function from the Sail library. This function takes a number n
and a
bitvector, and creates a new bitvector containing that bitvector
copied n
times.
val my_replicate_bits : forall 'n 'm, 'm >= 1 & 'n >= 1.
(int('n), bits('m)) -> bits('n * 'm)
This signature shows how Sail can track the length of bitvectors and
the value of integer variables in type signatures, using type
variables. Type variables are written with a leading 'tick', so 'n
and 'm
are the type variables in the above signature.
Note
|
The leading tick is a convention derived from Standard ML, and
other functional languages derived from Standard ML, such as OCaml.
Readers who are familiar with Rust will also recognise this naming
convention from lifetime variables in Rust types. The val keyword to
declare functions is also taken from OCaml.
|
The type bits('m)
is a bitvector of length 'm
, and int('n)
is an
integer with the value 'n
. The result of this function will
therefore be a bitvector of length 'n * 'm
. We can also add
constraints on these types. Here we require that we are replicating
the input bitvector at least once with the 'n >= 1
constraint, and
that the input bitvector length is at least one with the 'm >= 1
constraint. Sail will check that all callers of this function are
guaranteed to satisfy these constraints.
Sail will also ensure that the output of our function has precisely
the length bits('n * 'm)
for all possible inputs (hence why the
keyword uses the mathematical forall quantifier).
A potential implementation of this function looks like:
function my_replicate_bits(n, xs) = {
var ys = zeros(n * length(xs));
foreach (i from 1 to n) {
ys = ys << length(xs);
ys = ys | zero_extend(xs, length(ys))
};
ys
}
Functions may also have implicit parameters, e.g. we can implement a zero extension function that implicitly picks up its result length from the calling context as follows:
val extz : forall 'n 'm, 'm >= 'n. (implicit('m), bits('n)) -> bits('m)
function extz(m, xs) = zero_extend(xs, m)
Implicit parameters are always integers, and they must appear first before any other parameters in the function type signature. The first argument can then just be omitted when calling the function, like so:
let xs: bits(32) = 0x0000_0000;
let ys: bits(64) = extz(xs);
Blocks
You may have noticed that in the definition of my_replicate_bits
above, there was no return
keyword. This is because unlike languages
such as C and C++, and more similar to languages like OCaml and Rust,
everything in Sail is an expression which evaluates to a value. A
block in Sail is simply a sequence of expressions surrounded by curly
braces {
and }
, and separated by semicolons. The value returned by
a block is the value returned by the last expressions, and likewise
the type of a block is determined by it’s final expressions, so
{ A; B; C }
, will evaluate to the value of C
after evaluating A
and B
in order. The expressions other than the
final expression in the block must have type unit
, which is
discussed in the following section. Within blocks we can declare
immutable variables using let
, and mutable variables using var
,
for example:
{
let x : int = 3;
var y : int = 2;
y = y + 1;
x + y
}
The above block has type int
and evaluates to the value 6
.
Note
|
For those familiar with Rust, a trailing semicolon in Sail does not change the semantics of the block and is purely optional. |
The unit type
The simplest type in Sail is the unit type unit
. It is a type with
a single member ()
. Rather than have functions that takes zero
arguments, we have functions that take a single unit
argument.
Similarly, rather than having functions that return no results, a
function with no meaningful return value can return ()
. The ()
notation reflects the fact that the unit type can be thought of as an
empty tuple (see Tuples).
In Sail unit
plays a similar role to void in C and C++, except
unlike void it is an ordinary type and can appear anywhere and be used
in generic functions.
The wikipedia page for the unit type, goes into further details on the difference between unit and void.
Numeric types and bits
Sail has three basic numeric types, int
, nat
, and range
. The
type int
which we have already seen above is an arbitrary precision
mathematical integer. Likewise, nat
is an arbitrary precision
natural number. The type range('n, 'm)
is an inclusive range
between type variables 'n
and 'm
. For both int
and nat
we can
specify a type variable that constrains elements of the type to be
equal to the value of that type variable. In other words, the values
of type int('n)
are integers equal to 'n
. So
5 : int(5)
and similarly for any integer constant. These
types can often be used interchangeably provided certain constraints
are satisfied. For example, int('n)
is equivalent to
range('n, 'n)
and range('n, 'm)
can be converted into
int('n)
when 'n == 'm
.
Note
|
The colon operator : is used for type ascription, so x : Y
is read as x has type Y
|
Note that bit
isn’t a numeric type (i.e. it’s not
range(0, 1)
). This is intentional, as otherwise it would be
possible to write expressions like (1 : bit) + 5
which would end up being equivalent to
6 : range(5, 6)
. This kind of implicit casting from
bits to other numeric types would be highly undesirable. The bit
type itself is a two-element type with members bitzero
and bitone
.
In addition, we can write a numeric type that only contains a fixed
set of integers. The type {32, 64}
is a type that can only
contain the values 32
and 64
.
Note
|
In older Sail versions the numeric set type would have been
denoted {|32, 64|} . This syntax is now deprecated.
|
Bitvector literals
Bitvector literals in Sail are written as either 0x
followed by a
sequence of hexadecimal digits (as in 0x12FE
) or 0b
followed by a
sequence of binary digits (as in 0b1010100
). The bit length of a
hex literal is always four times the number of hexadecimal digits,
and the bit length of binary literal is always the exact number of
binary digits. So, 0x12FE
has bit length 16, and 0b1010100
has
bit length 7. To ensure bitvector logic
in specifications is precisely specified, we do not allow any kind of
implicit widening or truncation as might occur in C. To change the
length of a bitvector, explicit zero/sign extension and truncation
functions must be used. Underscores can be used in bitvector literals
to separate groups, where each group is typically 16 bits. For
example:
let large_bitvector : bits(64) = 0xFFFF_0000_1234_0000
We can make bitvectors as large as we need:
let even_larger_bitvector : bits(192) =
0xFFFF_FFFF_FFFF_FFFF_0000_0000_0000_0000_ABCD_ABCD_ABCD_ABCD
We can also write bitvectors very verbosely using bitzero
and
bitone
, like:
let v : bits(2) = [bitzero, bitzero]
The @
operator is used to concatenate bitvectors, for example:
let x = 0xFFFF;
let y = 0x0000;
assert(x @ y == 0xFFFF_0000);
For historical reasons the bit
type is not equal to bits(1)
, and
while this does simplify naively mapping the bits type into a (very
inefficient!) representation like bit list
in Isabelle or OCaml, it
might be something we reconsider in the future.
Sail allows two different types of bitvector orderings---increasing
(inc
) and decreasing (dec
). These two orderings are shown for the
bitvector 0b10110000 below.
For increasing (bit)vectors, the 0 index is the most significant bit and the indexing increases towards the least significant bit. Whereas for decreasing (bit)vectors the least significant bit is 0 indexed, and the indexing decreases from the most significant to the least significant bit. For this reason, increasing indexing is sometimes called `most significant bit is zero' or MSB0, while decreasing indexing is sometimes called `least significant bit is zero' or LSB0. While this vector ordering makes most sense for bitvectors (it is usually called bit-ordering), in Sail it applies to all vectors. A default ordering can be set using
default Order dec
and this should usually be done right at the beginning of a specification. This setting is global, and increasing and decreasing bitvectors can therefore never be mixed within the same specification!
In practice decreasing order is the almost universal standard and only POWER uses increasing order. All currently maintained Sail specifications use decreasing. You may run into issues with increasing bitvectors as the code to support these is effectively never exercised as a result.
Vectors
Sail has the built-in type vector
, which is a polymorphic type for
fixed-length vectors. For example, we could define a vector v
of
three integers as follows:
let v : vector(3, int) = [3, 2, 1]
The first argument of the vector type is a numeric expression representing the length of the vector, and the second is the type of the vector’s elements. As mentioned in the bitvector section, the ordering of bitvectors and vectors is always the same, so:
let a_generic_vector : vector(3, bit) = [bitzero, bitzero, bitone];
let a_bitvector : bits(3) = [bitzero, bitzero, bitone]; // 0b001
assert(a_generic_vector[0] == bitone);
assert(a_bitvector[0] == bitone)
Note that a generic vector of bits and a bitvector are not the same type, despite us being able to write them using the same syntax. This means you cannot write a function that is polymorphic over both generic vectors and bitvectors). This is because we typically want these types to have very different representations in our various Sail backends, and we don’t want to be forced into a compilation strategy that requires monomorphising such functions.
Accessing and updating vectors
A (bit)vector can be indexed by using the vector index notation. So, in the following code:
let v : vector(4, int) = [1, 2, 3, 4]
let a = v[0]
let b = v[3]
a
will be 4
, and b
will be 1
(we assume default Order dec
here). By default, Sail will statically check for out of bounds
errors, and will raise a type error if it cannot prove that all such
vector accesses are valid.
A vector v
can be sliced using the v[n .. m]
notation. The
indexes are always supplied with the index closest to the MSB being
given first, so we would take the bottom 32-bits of a decreasing
bitvector v
as v[31 .. 0]
, and the upper 32-bits of an
increasing bitvector as v[0 .. 31]
, i.e. the indexing
order for decreasing vectors decreases, and the indexing order for
increasing vectors increases.
A vector v
can have an index index using
[v with index = expression]
. Similarly, a
sub-range of v can be updated using
[v with n .. m = expression]
where the
order of the indexes is the same as described above for increasing and
decreasing vectors.
The list type
In addition to vectors, Sail also has list
as a built-in type. For
example:
let l : list(int) = [|1, 2, 3|]
The cons operator is ::
, so we could equally write:
let l : list(int) = 1 :: 2 :: 3 :: [||]
For those unfamiliar the word, 'cons' is derived from Lisp dialects, and has become standard functional programming jargon for such an operator — see en.wikipedia.org/wiki/Cons for more details.
Caution
|
The list type (plus the recursive functions typically used to
manipulate lists) does not work well with certain Sail targets, such
as the SMT and SystemVerilog backends. The vector type is almost
always preferable to the list type. The inclusion of the list type
(where we otherwise forbid recursive types) was perhaps a design
mistake.
|
Tuples
Sail has tuple types which represent heterogeneous sequences containing
values of different types. A tuple type (T1, T2, …)
has
values (x1, x2, …)
where x1 : T1
,
x2 : T2
and so on. A tuple must have 2 or more elements.
Some examples of tuples would be:
let tuple1 : (string, int) = ("Hello, World!", 3)
let tuple2 : (nat, nat, nat) = (1, 2, 3)
Note that while the function type (A, B) -> C
might
look like a function taking a single tuple argument, it is in fact a
function taking two arguments. If we wanted to write a function taking
a single tuple argument, we would instead write:
val single_tuple_argument : ((int, int)) -> unit
function single_tuple_argument(tuple) = {
let (x, y) = tuple;
print_int("x = ", x);
print_int("y = ", y);
}
which would be called as
single_tuple_argument((1, 2))
Note
|
This is because in Sail the function type is denoted
(A, B, …) -> C , but we allow the brackets to
be elided when the function has a single non-tuple argument so we can
write A -> B rather than (A) -> B .
|
Strings
Sail has a string
type, which is primarily used for error reporting
and debugging.
Caution
|
Sail is not a language designed for working with strings, and the semantics of ISA specifications should not depend on any logic involving strings. If you find yourself using strings for reasons other than printing or logging errors in a Sail specification, you should probably reconsider. |
A Sail string is any sequence of ASCII characters between double quotes. Backslash is used to introduce escape sequences, and the following escape sequences are supported:
-
\\
— Backslash -
\n
— Newline character -
\t
— Tab character -
\b
— Backspace character -
\r
— Carriage return -
\'
— Single quote (somewhat unnecessary, as single quotes are allowed in Sail strings) -
\"
— Double quote -
\DDD
— The character with decimal ASCII codeDDD
-
\xHH
— The character with hexadecimal ASCII codeHH
Multi-line strings can be written by escaping the newline character at the end of a line:
print_endline("Hello, \
World!");
// Is equivalent to
print_endline("Hello, World!")
User-defined types
Sail supports various kinds of user-defined types: structs, unions, and enums.
Structs
Like in C, structs are used to hold multiple values. Structs are
created using the struct
keyword followed by the name of the struct.
The data the struct contains is then defined following the =
symbol.
Each piece of data stored in a struct is called a field
, and each is
given a unique name that identifies that piece of data, and can have a
different type.
The following example struct defines three fields. The first contains
a bitvector of length 5 and is called field1
. The second, field2
contains an integer. The third, field3
contains a string.
struct My_struct = {
field1 : bits(5),
field2 : int,
field3 : string,
}
We can then construct struct values by using the struct
keyword and
providing values for all the fields. The individual fields can be
accessed using the .
operator. If the struct is mutable, we can also
the .
operator on the left-hand side of an assignment to assign to
the struct field. For immutable structs, we can update them using the
with
keyword, which will create a new struct value with some fields
changed.
let s : My_struct = struct {
field1 = 0b11111,
field2 = 5,
field3 = "test",
};
// Accessing a struct field, prints "field1 is 0b11111"
print_bits("field1 is ", s.field1);
// Updating struct fields, creating a new struct variable s2
var s2 = { s with field1 = 0b00000, field3 = "a string" };
// Prints "field1 is 0b00000"
print_bits("field1 is ", s2.field1);
// Prints "field2 is 5", as field2 was taken from 's' and not modified
print_int("field2 is ", s2.field2);
// Because s2 is mutable we can assign to it
s2.field3 = "some string";
// Prints "some string"
print_endline(s2.field3)
Structs can have user defined type parameters. The following example
has an integer parameter 'n
and a type parameter 'a
.
struct My_other_struct('n, 'a) = {
a_bitvector : bits('n),
something : 'a
}
The struct can then be instantiated with any length of bitvector for
the a_bitvector
field, and any data type for the something
field,
including another struct like the My_struct
type we defined earlier.
var s3 : My_other_struct(2, string) = struct {
a_bitvector = 0b00,
something = "a string"
};
var s4 : My_other_struct(32, My_struct) = struct {
a_bitvector = 0xFFFF_FFFF,
something = struct {
field1 = 0b11111,
field2 = 6,
field3 = "nested structs!",
}
};
s3.a_bitvector = 0b11;
// Note that once created, we cannot change a variable's type, so the following is forbidden:
// s3.a_bitvector = 0xFFFF_FFFF;
// Changing the type is also forbidden in a with expression:
// let s4 : My_other_struct(32, string) = { s3 with a_bitvector = 0xFFFF_FFFF };
// If structs are nested, then field accesses can also be nested
print_endline(s4.something.field3);
// and assignments
s4.something.field3 = "another string"
Enums
Sail enumerations are also much like their C counterparts. An enum is
declared using the enum
keyword followed by the name of the
enumeration type. The members of the enumeration are specified as a
comma-separated list of identifiers between curly braces, like so:
enum My_enum = {
Foo,
Bar,
Baz,
Quux,
}
The above style is best when there are a lot of elements in the
enumeration. We allow enums to be defined in a style similar to
Haskell, where the identifiers are separated by a vertical bar |
character, like so:
enum My_short_enum = A | B | C
This style is better when the enumeration is short and has few constructors, however specification authors may prefer to use the first style exclusively for consistency.
Sail will automatically generate functions for converting enumeration members into numbers, starting with 0 for the first element of the enumeration.
function enum_example1() = {
assert(num_of_My_enum(Foo) == 0);
assert(num_of_My_enum(Bar) == 1);
assert(num_of_My_enum(Baz) == 2);
assert(num_of_My_enum(Quux) == 3);
}
Likewise, there is a function for converting the other way
function enum_example2() = {
assert(My_enum_of_num(0) == Foo)
}
If the names of the generated functions are not ideal, then custom
names can be provided using the enum_number_conversions
attribute.
$[enum_number_conversions { from_enum = to_number, to_enum = from_number }]
enum Another_enum = {
Member1,
Member2,
}
function custom_conversions() -> unit = {
assert(to_number(Member1) == 0);
assert(from_number(1) == Member2);
}
The no_enum_number_conversions
attribute can be used to disable the
generation of these functions entirely.
Unions
Structs provide us a way to group multiple pieces of related data, for example
struct rectangle = {
width : int,
height : int,
}
defines a type rectangle
that has both a width and a height. The
other logical way to combine data is if we want to say we have some
data or some other data, like if we wanted to represent a shape that
could be a circle or a rectangle. In Sail, we use a union for this.
struct circle = { radius : int }
union shape = {
Rectangle : rectangle,
Circle : circle,
}
Note
|
While structs are ubiquitous in programming languages, the
equivalent feature for safely representing disjunctions of types is
often a second-class afterthought or even absent entirely. The
requirement to safely represent disjunctions of types is why C’s
union feature is something different to what we want here, which is
sometimes called tagged unions or variants to distinguish them
from the unsafe C construct. C++ is therefore a good example of a
language where structs are first class, but std::variant is
relegated to a library feature.
|
This defines a type with two constructors Rectangle
and Circle
.
Constructors are used like functions with a single argument, as so:
function example() = {
// Construct a shape using the Rectangle constructor
let r : shape = Rectangle(struct { width = 30, height = 20 });
// Construct a shape using the Circle constructor
// Note that we can allow the 'shape' type to be inferred
let c = Circle(struct { radius = 15 });
}
To destructure unions, we typically use pattern matching to handle each case in the union separately. This will be discussed in detail in Pattern matching.
Constructors in Sail always have a single argument. If we want a
constructor in a union type to contain no information, then we can use
the unit
type. For a constructor with multiple arguments we can use
a tuple type. We provide some syntax sugar that allows writing
constructor applications with either tuple or unit argument types as
if they were functions with multiple arguments or no arguments
respectively.
union example_unit_constructor = {
Foo : (int, string),
Bar : unit,
}
// Using a constructor with a tuple type
let x1 = Foo(2, "a string")
// Note that the above is equivalent to
let x2 = Foo((2, "a string"))
// Using a unit-type constructor
let y1 = Bar()
// Similarly, equivalent to
let y2 = Bar(())
In our shape example, we defined rectangle
and circle
as two
separate structs, but if we wanted we could declare those structs
inline
union shape2 = {
Rectangle2 : { width : int, height : int },
Circle2 : { radius : int },
}
Behind the scenes, Sail will just generate separate struct definitions, so we use this type in exactly the same was as we did previously.
As with structs, we can define polymorphic union types, like
union either('a, 'b) = {
Left : 'a,
Right : 'b,
}
Type synonyms
We have now described all the various types available in Sail.
However, in order to create more descriptive and self-documenting
specifications, we might want to give types new names that better
inform the reader of the intent behind the types. We can do this using
type synonyms, which are created with the type
keyword.
For example, if we have an architecture with 32 general purpose
registers, we might want to index them using a 5-bit type. Rather than
using bits(5)
directly, we create a synonym:
type regbits = bits(5)
We can now write functions that use this type:
register R1 : bits(32)
register R2 : bits(32)
function access_register(r : regbits) -> bits(32) = {
match r {
0b00000 => 0x0000_0000, // zero register
0b00001 => R1,
0b00010 => R2,
// and so on
}
}
Sail types also include numbers, as in bits(32)
above. We can
therefore write a type synonym for just such a number:
type xlen = 32
In which case, we could re-write our example as
register R1 : bits(xlen)
register R2 : bits(xlen)
function access_register(r : regbits) -> bits(xlen) = {
match r {
0b00000 => 0x0000_0000, // zero register
0b00001 => R1,
0b00010 => R2,
// and so on
}
}
Type synonyms are completely transparent to the Sail type-checker, so a type synonym can be used exactly the same as the original type.
We can also write type-synonyms that have arguments, for example:
type square('n, 'a) = vector('n, vector('n, 'a))
Type kinds
Throughout the previous section we have seen many different types. For
example we could have a type like list(int)
, or a type like
range(2, 5)
. The range
type takes two numbers as an arguments,
whereas list takes a type as an argument. We’ve also seen polymorphic
user-defined types like:
struct foo('n, 'a) = {
some_bits : bits('n),
some_a : 'a,
}
Which we could use as the type foo(32, string)
, or foo(16, int)
.
What stops us from writing foo(string, int)
or even range(int,
int)
or list(3)
however? Those should clearly be type errors, and
if they are type errors does that mean types have types? The answer
to this is yes, and we call the types of types kinds. Sail has three
different kinds of types, which we denote as Int
, Bool
, and
Type
. The Int
kind is for type-level integers, Bool
is for
type-level (boolean) constraints, and Type
is for ordinary types.
Type constructors therefore have types much like functions, with
list
having the type Type → Type
and range
having the type
(Int, Int) → Type
.
Note
|
From a type-theory perspective, one really can think of type
constructors as just functions at the type level. As functions are
applied to terms, type constructors are applied to types. This is
partly why we use parentheses for type constructor application in Sail — not only does it neatly sidestep the notoriously thorny parsing
problems of < and > in types (see Rust’s
infamous turbofish operator, or requiring extra spaces in C++ templates to
avoid ambiguities with >> ), but it really is the same concept when
viewed in the right way.
|
Sail allows writing the kinds explicitly in type definitions, so we could write:
struct foo('n : Int, 'a : Type) = {
some_bits : bits('n),
some_a : 'a,
}
but in practice we don’t need to do this, as Sail has kind-inference to avoid us having to think about this particular detail.
Pattern matching
Like most functional languages, and an increasing number of imperative
languages like Rust, Sail supports pattern matching via the match
keyword. For example:
let n : int = f();
match n {
1 => print_endline("1"),
2 => print_endline("2"),
3 => print_endline("3"),
_ => print_endline("wildcard"),
}
The match
keyword takes an expression and then branches by comparing
the matched value with a pattern. Each case in the match expression
takes the form <pattern> => <expression>
, separated by commas (a
trailing comma is allowed). The cases are checked sequentially from
top to bottom, and when the first pattern matches its expression will
be evaluated.
The concrete match statement syntax in Sail is inspired by the syntax used in Rust — but programmers coming from languages with no pattern matching features may be unfamiliar with the concept. One can think of the match statement like a super-powered switch statement in C. At its most basic level a match statement can function like a switch statement (except without any fall-through). As in the above example we can use match to compare an expression against a sequence of values like so:
match expression {
1 => print_endline("expression == 1"),
2 => {
// A match arm can have a single expression or a block
// (because blocks in Sail are also expressions!)
print_endline("expression == 2")
// Note that unlike in C, no 'break' is needed as there is no fallthrough
},
_ => print_endline("This wildcard pattern acts like default: in a C switch")
}
However the pattern in a match statement can also bind variables. In
the following example we match on a numeric expression x + y
, and if
it is equal to 1
we execute the first match arm. However, if that is
not the case the value of x + y
is bound to a new immutable variable
z
.
match x + y {
1 => print_endline("x + y == 1"),
z => {
// here z is a new variable defined as x + y.
print_int("x + y = ", z)
// z is only defined within the match arm
},
}
Finally, we can use patterns to destructure values — breaking them apart into their constituent parts. For example if we have a pair expression we can break it apart into the first value in the pair and the second, which can then be used as individual variables:
match pair : (int, int) {
(first, second) => {
// here we have split a pair into two variables first and second
print_int("first = ", first);
print_int("second = ", second)
}
}
These features can be combined, so if we had a pattern (first, 3)
in
the above example, the expression for that pattern would be executed
when the second element of the pair is equal to 3, and the first element
would be bound to the variable first
.
Sail will check match statements for exhaustiveness (meaning that the patterns in the match cover every possible value), and prints a warning if the overall match statement is not exhaustive. There are some limitations on the exhaustiveness checker which we will discuss further below.
Guards
What if we need to switch based on more complex logic than just the
structure and values of the expressions we are matching on? For this
matches in Sail support guards. A guard allows us to combine the
behavior of a match expression and the boolean logic of an if
expression — and the syntax is reflective of this, as we can use the
if
keyword to add extra logic to each match arm:
match x + y {
z if z > 0 => print_endline("y is greater than 0"),
z if z == 0 => print_endline("z is equal to 0"),
z => print_endline("z is less than 0"),
}
You may wonder — why not write z if z < 0
for the final case? Here
we run into one of the limitations of the exhaustiveness checker
mentioned above. Sail can only check the exhaustiveness of unguarded
clauses, meaning that the checker only considers cases without an if
guard. The z
pattern by itself is exhaustive, so the checker is
happy, but if we added a if z < 0
guard the checker would complain
that there are no unguarded patterns for it to look at. This may seem
suprising for such a simple case (we can easily see the three guards
would cover all cases!), however each guard clause could contain
arbitrarily complex logic potentially abstracted behind arbitrary
function calls, which the completeness checker cannot reason about.
We now describe all the things that can be matched on in Sail
Matching on literals
First, and as we have already seen, we can match on literal
values. These literal values can be ()
, bitvectors, the boolean values
true
and false
, numbers, and strings.
Matching on enumerations
Match can be used to match on possible values of an enum, like so:
match x {
A => print_endline("A"),
B => print_endline("B"),
C => print_endline("C")
}
Note that because Sail places no restrictions on the lexical structure of enumeration elements to differentiate them from ordinary identifiers, pattern matches on variables and enum elements can be somewhat ambiguous. Issues with this are usually caught by the exhaustiveness checker — it can warn you if removing an enum constructor breaks a pattern match.
Matching on tuples
We use match to destructure tuple types, for example:
let x: (int, int) = (2, 3);
match x {
(y, z) => print_endline("y = 2 and z = 3")
}
Matching on unions
Match can also be used to destructure tagged union constructors, for example using the option type from the Sail library.
$include <option.sail>
val g : unit -> option(int)
function match_union() -> unit = {
let x = g();
match x {
Some(n) => print_int("n = ", n),
None() => print_endline("got None()!"),
}
}
Note that like how calling a function with a unit argument can be done
as f()
rather than f(())
, matching on a constructor C
with a
unit type can be achieved by using C()
rather than C(())
.
Matching on lists
Sail allows lists to be destructured using patterns. There are two types of patterns for lists, cons patterns and list literal patterns. The cons pattern destructures lists into the first element (the head) and the rest of the list (the tail).
match ys {
x :: xs => print_endline("cons pattern"),
[||] => print_endline("empty list")
}
On the other hand, a list pattern matches on the entire list:
match ys {
[|1, 2, 3|] => print_endline("list pattern"),
_ => print_endline("wildcard")
}
Matching on structs
Match can also be used for structures, for example:
struct my_struct = {
field1 : bits(16),
field2 : int,
}
function match_struct(value : my_struct) -> unit = {
match value {
// match against literals in the struct fields
struct { field1 = 0x0000, field2 = 3 } => (),
// bind the struct field values to immutable variables
struct { field1 = x, field2 = y } => {
print_bits("value.field1 = ", x);
print_int("value.field2 = ", y)
}
}
}
We can omit fields from the match by using a wildcard _
in place of
some of the fields:
match value {
struct { field1 = x, _ } => {
print_bits("value.field1 = ", x)
}
}
Finally, if we want to create a variable with the same name as a
field, rather than typing struct { field_name = field_name, _ }
, we
can shorten this to just struct { field_name, _ }
, So the above
example is equivalent to:
match value {
struct { field1, _ } => {
print_bits("value.field1 = ", field1)
}
}
As patterns
Like OCaml, Sail also supports naming parts of patterns using the as
keyword. For example, in the above cons pattern example we could bind the
entire matched list to the zs
variable:
match (1 :: 2 :: ys) : list(int) {
x :: xs as zs => print_endline("cons with as pattern"),
[||] => print_endline("empty list"),
}
The as
pattern has lower precedence than any other keyword or
operator in a pattern, so in this example the pattern brackets as
(x :: xs) as zs
Automatic wildcard insertion
The various theorem provers that Sail can produce definitions for are often strict, and enforce that pattern matches are exhaustive. However, their pattern exhaustiveness checkers do not understand bitvectors in the same way Sail does. For example, Sail can tell that the following match is complete:
match (x, y) {
(0b1, _) => print_endline("1"),
(_, 0b0) => print_endline("2"),
(0b0, 0b1) => print_endline("3"),
}
When translating to a target without bitvector literals, this would be rewritten to use guards:
match (x, y) {
(t1, _) if t1 == 0b1 => print_endline("1"),
(_, t2) if t2 == 0b0 => print_endline("2"),
(t1, t2) if t1 == 0b0 & t2 == 0b1 => print_endline("3"),
}
Most targets that check pattern exhaustiveness share the same limitation as Sail — they only check match arms without guards, so they would not see that this match is complete. To avoid this, Sail will attempt to replace literal patterns with wildcards when possible, so the above will be rewritten to:
match (x, y) {
(0b1, _) => print_endline("1"),
(_, 0b0) => print_endline("2"),
(_, _) => print_endline("3"),
}
which is equivalent.
One can find situations where such wildcards cannot be inserted. For example:
function cannot_wildcard(x: option(bits(1)), y: option(bits(1))) -> unit = {
match (x, y) {
(Some(0b0), Some(_)) => print_endline("1"),
(Some(0b1), Some(0b0)) => print_endline("2"),
(Some(0b1), _) => print_endline("3"),
(Some(0b0), None()) => print_endline("4"),
(None(), _) => print_endline("5"),
}
}
Here the 0b1
literal in the (Some(0b1), _)
case would need to be
replaced for an exhaustiveness checker without bitvector literals to
check the case where x
and y
are both Some
, but this would
change the behavior when x
is Some
and y
is None
, hence a
wildcard cannot be inserted.
In this case Sail will print a warning explaining the problem:
Warning: Required literal ../examples/cannot_wildcard.sail:17.14-17:
17 | (Some(0b1), _) => print_endline("3"),
| ^-^
|
Sail cannot simplify the above pattern match:
This bitvector pattern literal must be kept, as it is required for Sail to show that the surrounding pattern match is complete.
When translated into prover targets (e.g. Lem, Coq) without native bitvector patterns, they may be unable to verify that the match covers all possible cases.
This warning should be heeded, and the match simplified otherwise the generated theorem prover definitions produced by Sail may be rejected by the prover.
Matching in let and function arguments
The match
statement isn’t the only place we can use patterns. We can
also use patterns in function arguments and with let
, for example:
struct S = {
field1 : int,
field2 : string,
}
function example1(s : S) -> unit = {
// Destructure 's' using let
let struct { field1, _ } = s;
print_int("field1 = ", field1)
}
// Destructure directly in the function argument
function example2(struct { field1, _ } : S) -> unit = {
print_int("field1 = ", field1)
}
Type patterns
In the previous section we saw as patterns, which allowed us bind additional variables for subpatterns. However, as patterns can also be used to bind type variables. For example:
// Some function that returns either 32 or 64 at runtime
val get_current_xlen : unit -> {32, 64}
register R : bits(32)
val example : int -> unit
function example() = {
let xlen as int('n) = get_current_xlen()
// Create a bitvector of length xlen
let bv : bits('n) = zero_extend(xlen, *R);
print_bits("bv = ", bv)
}
You can think of the as int('n)
as matching on the return type of
the get_current_xlen
rather than the value, and binding it’s length
to a new type variable 'n
, which we can subsequently use in types
later in our function. Note that even though we only know if xlen will
be 32 or 64 at runtime after the call to get_current_xlen, Sail is
still able to statically check all our bitvector accesses.
If a type only contains a single type variable (as int('n)
does),
then we allow omitting the type name and just using a variable as the
type pattern, for example the following would be equivalent to the
first line of example above:
let xlen as 'n = get_current_xlen();
If we want to give the variable xlen
and the type variable 'n
the
same name, we could go further and simplify to:
function example() = {
let 'xlen = get_current_xlen()
// Create a bitvector of length xlen
let bv : bits('xlen) = zero_extend(xlen, *R);
print_bits("bv = ", bv)
}
Here we can use xlen within the function as both a regular variable
xlen
and as a type variable 'xlen
.
Mutable variables
Bindings made using let
are always immutable, but Sail also allows
mutable variables. Mutable variables are created by using the var
keyword within a block.
{
var x : int = 3; // Create a new mutable variable x initialised to 3
x = 2 // Rebind it to the value 2
}
Like let-bound variables, mutable variables are lexically scoped, so they only exist within the block that declared them.
Technically, unless the --strict-var
option is used Sail also allows
mutable variables to be implicitly declared, like so:
{
x : int = 3 // Create a new mutable variable x initialised to 3
x = 2 // Rebind it to the value 2
}
This functions identically, just without the keyword. We consider this
deprecated and strongly encourage the use of the var
keyword going
forwards.
The assignment operator is the equality symbol, as in C and other programming languages. Sail supports a rich language of l-value forms, which can appear on the left of an assignment. These will be described in the next section.
One important thing to note is that Sail always infers the most specific type it can for variables, and in the presence of integer types with constraints, these types can be very specific. This is not a problem for immutable bindings, but can cause issues for mutable variables when explicit types are omitted. The following will not typecheck:
{
var x = 3;
x = 2;
}
The reason is that Sail (correctly) infers that x
has the type 'the
integer equal to 3', and therefore refuses to allow us to assign 2
to it (as it well should), because two is not equal to three. To avoid
this we must give an annotation with a less specific type like int
.
Assignment and l-values
It is common in ISA specifications to assign to complex l-values, e.g. to a subvector or named field of a bitvector register, or to an l-value computed with some auxiliary function, e.g. to select the appropriate register for the current execution model.
Sail has l-values that allow writing to individual elements of a (bit)vector:
var v : bits(8) = 0xFF;
v[0] = bitzero;
assert(v == 0xFE)
As well as sub-ranges of a (bit)vector:
var v : bits(8) = 0xFF;
v[3 .. 0] = 0x0; // assume default Order dec
assert(v == 0xF0)
We also have vector concatenation l-values, which work much like vector concatenation patterns
var v1 : bits(4) = 0xF;
var v2 : bits(4) = 0xF;
v1 @ v2 = 0xAB;
assert(v1 == 0xA & v2 == 0xB)
For structs we can write to an individual struct field as
// Assume S is a struct type with a single bits(8) field (called field)
var s : S = struct { field = 0xFF };
s.field = 0x00;
assert(s.field == 0x00)
We can also do multiple assignment using tuples, for example:
var (x, y) = (2, 3);
assert(x == 2 & x == 3)
// swap values
(y, x) = (x, y)
assert(x == 3 & x == 2)
Finally, we allow functions to appear in l-values. This is a very simple way to declare setter functions that look like custom l-values, for example:
memory(addr) = 0x0F
// is just syntactic sugar for
memory(addr, 0x0F)
This feature is commonly used when setting registers or memory that
have some additional semantics when they are read or written. We
commonly use the ad-hoc overloading feature to declare what appear to
be getter/setter pairs, so for the above example we could implement a
read_memory
function and a write_memory
function and overload them
both as memory
to allow us to write memory using
memory(addr) = data
and read memory as
data = memory(addr)
, for example:
val read_memory : bits(64) -> bits(8)
val write_memory : (bits(64), bits(8)) -> unit
overload memory = {read_memory, write_memory}
Bitfields
The following example creates a bitfield type called cr_type
and a
register CR
of that type. The underlying bitvector type (in this
case bits(8)
) must be specified as part of the bitfield declaration.
Note
|
The underlying bitvector type can be specified using a type
synonym, like xlenbits in sail-riscv.
|
bitfield cr_type : bits(8) = {
CR0 : 7 .. 4,
LT : 7,
GT : 6,
CR1 : 3 .. 2,
CR3 : 1 .. 0
}
register CR : cr_type
If the bitvector is decreasing then indexes for the fields must also be in decreasing order, and vice-versa for an increasing vector. The field definitions can be overlapping and do not need to be contiguous.
A bitfield generates a type that is simply a struct wrapper around the
underlying bitvector, with a single field called bits
containing
that bitvector. For the above example, this will be:
struct cr_type = {
bits : bits(8)
}
This representation is guaranteed, so it is expected that Sail code
will use the bits
field to access the underlying bits of the
bitfield as needed. The following function shows how the bits
contained in a bitfield can be accessed:
function bitfield_access_example() = {
// Rather than using numeric indices, the bitfield names are used
let cr0_field : bits(4) = CR[CR0];
// Bitfield accessors always return bitvector results
let lt_field : bits(1) = CR[LT]; // Note 'bits(1)' not 'bit'
// Can access the underlying bitvector using the bits field
let some_bits : bits(7) = CR.bits[6 .. 0];
}
Similarly, bitfields can be updated much like regular vectors just using the field names rather than numeric indices:
function bitfield_update_example() = {
// We can set fields on the CR register using their field names
CR[CR3] = 0b01;
// If we want to set the underlying bits
CR.bits[1 .. 0] = 0b01;
// We can even use vector update notation!
CR = [CR with CR3 = 0b01, LT = 0b1];
// Because of the representation, we could set the whole thing like:
CR = (struct { bits = 0x00 } : cr_type);
}
Older versions of Sail did not guarantee the underlying representation of the bitfield (because it tried to do clever things to optimise them). This meant that bitfields had to be accessed using getter and setter functions, like so:
function legacy_bitfields() = {
// Assigning to a field of a bitfield register
CR->CR3() = 0b01;
// '->' notation just means the setter takes the register by reference:
(ref CR).CR3() = 0b01;
// Assigning all the bits (now just 'CR.bits = 0x00')
CR->bits() = 0x00;
// Accessing a field
let cr0_field : bits(4) = CR.CR0();
// Updating a field
CR = update_CR3(CR, 0b01); // now '[ CR with CR3 = 0b01 ]'
// Construct a new CR bitfield
CR = Mk_cr_type(0x00);
}
The method like accessor syntax was (overly sweet) syntactic sugar for getter and setter functions following a specific naming convention that was generated by the bitfield. These functions are still generated for backwards compatibility, but we would advise against using them.
Note
|
Except perhaps for the Mk_cr_type function or equivalent for
other bitfields, which is still quite useful for creating bitfields.
|
Operators
Valid operators in Sail are sequences of the following non
alpha-numeric characters: !%&*+-./:<>=@^|#
. Additionally, any such
sequence may be suffixed by an underscore followed by any valid
identifier, so <=_u
or even <=_unsigned
are valid operator names.
Operators may be left, right, or non-associative, and there are 10
different precedence levels, ranging from 0 to 9, with 9 binding the
tightest. To declare the precedence of an operator, we use a fixity
declaration like:
infix 4 <=_u
For left or right associative operators, we’d use the keywords
infixl
or infixr
respectively. An operator can be used anywhere a
normal identifier could be used via the operator
keyword. As such,
the <=_u
operator can be defined as:
val operator <=_u : forall 'n. (bits('n), bits('n)) -> bool
function operator <=_u(x, y) = unsigned(x) <= unsigned(y)
Built-in precedences
The precedence of several common operators are built into Sail. These include all the operators that are used in type-level numeric expressions, as well as several common operations such as equality, division, and modulus. The precedences for these operators are summarised in the following table.
Precedence | Left associative | Non-associative | Right associative |
---|---|---|---|
9 |
|||
8 |
|
||
7 |
|
||
6 |
|
||
4 |
|
||
3 |
|
||
2 |
|
||
1 |
|||
0 |
Ad-hoc overloading
Sail has a flexible overloading mechanism using the overload keyword. For example:
overload foo = {bar, baz}
It takes an identifier name, and a list of other identifier names to overload that name with. When the overloaded name is seen in a Sail definition, the type-checker will try each of the overloads (that are in scope) in order from left to right until it finds one that causes the resulting expression to type-check correctly.
Multiple overload
declarations are permitted for the same
identifier, with each overload declaration after the first adding its
list of identifier names to the right of the overload list (so earlier
overload declarations take precedence over later ones). As such, we
could split every identifier from above example into its own
line like so:
overload foo = {bar}
overload foo = {baz}
Note that if an overload is defined in module B
using identifiers
provided by another module A
, then a module C
that requires only
B
will not see any of the identifiers from A
, unless it also
requires A
. See the section on modules for details. Note that this
means an overload cannot be used to 're-export' definitions provided by
another module.
As an example for how overloaded functions can be used, consider the
following example, where we define a function print_int
and a
function print_string
for printing integers and strings
respectively. We overload print
as either print_int
or
print_string
, so we can print either number such as 4, or strings
like "Hello, World!"
in the following main
function definition.
val print_int : int -> unit
val print_string : string -> unit
overload print = {print_int, print_string}
function main() : unit -> unit = {
print("Hello, World!");
print(4)
}
We can see that the overloading has had the desired effect by dumping
the type-checked AST to stdout using the following command
sail -ddump_tc_ast examples/overload.sail
. This will print the
following, which shows how the overloading has been resolved
function main () : unit = {
print_string("Hello, World!");
print_int(4)
}
This option can sometimes be quite useful for observing how overloading has been resolved. Since the overloadings are done in the order they are defined, it can be important to ensure that this order is correct. A common idiom in the standard library is to have versions of functions that guarantee more constraints about their output be overloaded with functions that accept more inputs but guarantee less about their results. For example, we might have two division functions:
val div1 : forall 'm 'n, 'n >= 0 & 'm > 0. (int('n), int('m)) -> {'o, 'o >= 0. int('o)}
val div2 : (int, int) -> option(int)
The first guarantees that if the first argument is greater than or equal to zero, and the second argument is greater than zero, then the result will be greater than or equal to zero. If we overload these definitions as
overload operator / = {div1, div2}
Then the first will be applied when the constraints on its inputs can be resolved, and therefore the guarantees on its output can be guaranteed, but the second will be used when this is not the case, and indeed, we will need to manually check for the division by zero case due to the option type. Notice that the return type can be different between different cases in the overload.
Mappings
Mappings are a feature of Sail that allow concise expression of bidirectional relationships between values that are common in ISA specifications: for example, bit-representations of an enum type, or the encoding-decoding of instructions.
They are defined similarly to functions, with a val
keyword to
specify the type and a definition, using a bidirectional arrow <->
rather than a function arrow ->
, and a separate mapping
definition.
val size_bits : word_width <-> bits(2)
mapping size_bits = {
BYTE <-> 0b00,
HALF <-> 0b01,
WORD <-> 0b10,
DOUBLE <-> 0b11
}
As a shorthand, you can also specify a mapping and its type simultaneously:
mapping size_bits2 : word_width <-> bits(2) = {
BYTE <-> 0b00,
HALF <-> 0b01,
WORD <-> 0b10,
DOUBLE <-> 0b11
}
Mappings are used simply by calling them as if they were functions: type inference will determine in which direction the mapping is applied. (This gives rise to the restriction that the types on either side of a mapping must be different.)
let width : word_width = size_bits(0b00);
let width : bits(2) = size_bits(BYTE);
Sometimes, because the subset of Sail allowed in bidirectional mapping clauses is quite restrictive, it can be useful to specify the forwards and backwards part of a mapping separately:
mapping size_bits3 : word_width <-> bits(2) = {
BYTE <-> 0b00,
HALF <-> 0b01,
WORD <-> 0b10,
forwards DOUBLE => 0b11, // forwards is left-to-right
backwards 0b11 => DOUBLE, // backwards is right-to-left
}
Sizeof and constraint
Sail allows for arbitrary type variables to be included within expressions. However, we can go slightly further than this, and include both arbitrary (type-level) numeric expressions in code, as well as type constraints. For example, if we have a function that takes two bitvectors as arguments, then there are several ways we could compute the sum of their lengths.
val f : forall 'n 'm. (bits('n), bits('m)) -> unit
function f(xs, ys) = {
let len = length(xs) + length(ys);
let len = 'n + 'm;
let len = sizeof('n + 'm);
()
}
Note that the second line is equivalent to
let len = sizeof('n) + sizeof('n)
There is also the constraint
keyword, which takes a type-level
constraint and allows it to be used as a boolean expression, so we
could write:
function f(xs, ys) = {
if constraint('n <= 'm) {
// Do something
}
}
rather than the equivalent test length(xs) <= length(ys)
.
This way of writing expressions can be succinct, and can also make it
very explicit what constraints will be generated during flow typing.
However, all the constraint and sizeof definitions must be re-written
to produce executable code, which can result in the generated theorem
prover output diverging (in appearance) somewhat from the source
input. In general, it is probably best to use sizeof
and
constraint
sparingly on type variables.
One common use for sizeof however, is to lower type-level integers down to the value level, for example:
// xlen is a type of kind 'Int'
type xlen = 64
val f : int -> unit
function xlen_example() = {
let v = sizeof(xlen);
f(v)
}
Exceptions
Perhaps surprisingly for a specification language, Sail has exception support. This is because exceptions as a language feature do sometimes appear in vendor ISA pseudocode (they are a feature in Arm’s ASL language), and such code would be very difficult to translate into Sail if Sail did not itself support exceptions. In practice Sail language-level exceptions end up being quite a nice tool for implementing processor exceptions in ISA specifications.
For exceptions we have two language features: throw
statements
and try
--catch
blocks. The throw keyword takes a value of
type exception
as an argument, which can be any user defined type
with that name. There is no built-in exception type, so to use
exceptions one must be set up on a per-project basis. Usually the
exception type will be a union, often a scattered union, which allows
for the exceptions to be declared throughout the specification as they
would be in OCaml, for example:
scattered union exception
union clause exception = Epair : (range(0, 255), range(0, 255))
union clause exception = Eunknown : string
function main() : unit -> unit = {
try {
throw(Eunknown("foo"))
} catch {
Eunknown(msg) => print_endline(msg),
_ => exit()
}
}
union clause exception = Eint : int
end exception
Scattered definitions
In a Sail specification, sometimes it is desirable to collect together the definitions relating to each machine instruction (or group thereof), e.g.~grouping the clauses of an AST type with the associated clauses of decode and execute functions, as in the A tutorial RISC-V style example section. Sail permits this with syntactic sugar for `scattered' definitions. Functions, mappings, unions, and enums can be scattered.
One begins a scattered definition by declaring the name and kind (either function or union) of the scattered definition, e.g.
// We can declare scattered types, for either unions or enums
scattered union U
scattered enum E
// For scattered functions and mappings, we have to provide a type signature with val
val foo : U -> bits(64)
scattered function foo
val bar : E <-> string
scattered mapping bar
This is then followed by clauses for either, which can be freely interleaved with other definitions. It is common to define both a scattered type (either union or enum), with a scattered function that operates on newly defined clauses of that type, as is shown below:
enum clause E = E_one
union clause U = U_one : bits(64)
function clause foo(U_one(x)) = x
mapping clause bar = E_one <-> "first member of E"
Finally the scattered definitions are ended with the end
keyword, like so:
end E
end U
end foo
end bar
Technically the end
keyword is not required, but it is good practice
to include it as it informs Sail that the clauses are now complete,
which forbids new clauses and means subsequent pattern completeness
checks no longer have to require extra wildcards to account for new
clauses being added.
Semantically, scattered definitions for types appear at the start of
their definition (note however, that this does not mean that a module
that requires just the start scattered union
definition can access
any constructors of a union defined in modules it does not require).
Scattered definitions for functions and mappings appear at the
end. A scattered function definition can be recursive, but mutually
recursive scattered function definitions should be avoided.
Preludes and default environment
By default Sail has almost no built-in types or functions, except for the primitive types described in this Chapter. This is because different vendor-pseudocodes have varying naming conventions and styles for even the most basic operators, so we aim to provide flexibility and avoid committing to any particular naming convention or set of built-ins. However, each Sail backend typically implements specific external names, so for a PowerPC ISA description one might have:
val EXTZ = "zero_extend" : ...
while for ARM, to mimic ASL, one would have
val ZeroExtend = "zero_extend" : ...
where each backend knows about the "zero_extend"
external name, but
the actual Sail functions are named appropriately for each vendor’s
pseudocode. As such each ISA spec written in Sail tends to have its
own prelude.
Modular Sail Specifications
Modules
Sail provides support for organizing large specifications into modules. Modules provide an access control mechanism, meaning a Sail definition in one module cannot access or use definitions provided by another module unless it explicitly requires the other module.
The module structure of a Sail project/specification is specified in a
separate .sail_project
file.
For a simple example, let’s assume we have two Sail files amod.sail
and
bmod.sail
:
amod.sail
val alfa : unit -> int
function alfa() = 3
bmod.sail
val bravo : unit -> unit
function bravo() = {
let x = alfa();
print_int("alfa returned: ", x)
}
We can use the following .sail_project
file:
simple_mod.sail_project
A {
files amod.sail
}
B {
requires A
files bmod.sail
}
This file defines two modules A
and B
, with module A
containing
the file amod.sail
and module B containing the file bmod.sail
.
Module B
requires module A
, so it can use the alfa
function
defined in A
. What would happen if we removed the requires line? We
would get the following error:
Type error:
../examples/bmod.sail:6.12-16:
6 | let x = alfa();
| ^--^
| Not in scope
|
| Try requiring module A to bring the following into scope for module B:
| ../examples/amod.sail:6.4-8:
| 6 |val alfa : unit -> int
| | ^--^ definition here in A
| ../examples/simple_mod_err.sail_project:5.0-1:
| 5 |B {
| |^ add 'requires A' within B here
This error tells us that alfa
is not in scope, but Sail knows it
exists as it has already checked module A
, so it points us at the
definition and suggests how we could resolve the error by adding the
requires line we just removed.
When using a .sail_project
file we do not have to pass all the files
on the command line, so we can invoke Sail simply as
sail simple_mod.sail_project
and it will know where to find amod.sail
and bmod.sail
relative to
the location of the project file.
A module can have more than one Sail file. These files are processed
sequentially in the order they are listed. This is exactly like what
happens when we pass multiple Sail files on the command line without a
.sail_project
file to define the module structure. A module can
therefore be thought of as a sequence of Sail files that is treated as
a single logical unit. As an example, we could add a third module to
our above file, which is comprised of three Sail files and depends on
A and B.
C {
requires A, B
files
foo.sail,
bar.sail,
baz.sail
}
Note that comments and trailing commas are allowed, and we could optionally delimit
the lists using [
and ]
, like so:
C {
// Require both our previous modules
requires [A, B]
/* Both types of Sail comments are allowed! */
files [
foo.sail,
bar.sail,
baz.sail,
]
}
If we wanted to we could define C
in a separate file, rather than
adding it to our previous file, and pass multiple project files to
Sail like so:
sail simple_mod.sail_project new_file_with_C.sail_project
These will be treated together as a single large project file. A use
case might be if you were defining an out-of-tree extension Xfoo
for
sail-riscv, you could do something like:
sail sail-riscv/riscv.sail_project src/Xfoo.sail_project
and the modules you define in Xfoo.sail_project
can require modules
from riscv.sail_project
, and also vice-versa, although it makes less
sense in this example.
Working with Makefiles
The --list-files
option can be used to list all the files within a
project file, which allows them to be used in a Makefile prerequisite.
As an example, to build the module examples in this very manual, we
use the rule below to generate documentation indexes (which our
Asciidoctor plugin consumes) for every .sail
file within a project
file.
.SECONDEXPANSION:
module_sail_doc/%.json: ../examples/%.sail_project $$(shell sail ../examples/%.sail_project -list_files)
mkdir -p module_sail_doc
sail --doc $(addprefix -doc_file ,$(shell sail $< --list-files)) --doc-embed plain --doc-bundle $(notdir $@) -o module_sail_doc $<
Conditional compilation within modules
We can use variables in our project files to control either file inclusion within a module or to control whether a module requires another or not. A variable can even contain a sequence of modules, that can then be used in a require statement, as shown in the following example:
variable ARCH = A64
variable ARCH_MODULES = if ARCH == A64 then arch64_only else []
arch_prelude {
files
prelude.sail,
if $ARCH == A32 then arch_xlen32.sail
else if $ARCH == A64 then [
arch_xlen64.sail,
arch_xlen64_helpers.sail
] else error("Invalid value for ARCH")
}
arch64_only {
requires arch_prelude
files
instructions64.sail
}
arch_instructions {
requires arch_prelude
files
instructions.sail
}
arch_end {
requires arch_instructions, $ARCH_MODULES
files
end.sail
}
Optional and default modules
Modules can be marked as either optional
or default
. Default
modules are those that form the base part of a specification, whereas
optional modules are intended to implement extensions which may or may
not be present. Default modules cannot require optional modules.
The Sail Grammar
This appendix contains a grammar for the Sail language that is automatically generated from the Menhir parser definition.
Note
|
This means that the grammar is stated in such a way that the parser generator can see it is free of LR shift/reduce and reduce/reduce conflicts, rather than being optimised for human readability. |
First, we need some lexical conventions:
-
ID
is any valid Sail identifier -
OPERATOR
is any valid Sail operator, as defined in Operators. -
TYPE_VARIABLE
is a valid Sail identifier with a leading single quote'
. -
NUMBER
is a non-empty sequence of decimal digits[0-9]+
. -
HEXADECIMAL_LITERAL
is0x[A-Ba-f0-9_]+
-
BINARY_LITERAL
is0b[0-1_]+
-
STRING_LITERAL
is a Sail string literal.
$[ATTRIBUTE]
and $LINE_DIRECTIVE
represent attributes and single
line directives. Examples of line directives would be things like
$include
, $ifdef
and so on. These start with a leading $
, are
followed by the directive name (which follows the same lexical rules
as a Sail identifier), is followed by one or more spaces, then
proceeds to the end of the line, with everything between the
identifier and the line ending being the line directive’s argument
(with leading and trailing whitespace removed). An attribute starts
with $[
and ends with ]
, and in between consists of an attribute
name, followed by at least one whitespace character, then any
arbitrary sequence of characters that does not contain ]
.
<id> ::= ID
| operator OPERATOR
| operator -
| operator |
| operator ^
| operator *
<op_no_caret> ::= OPERATOR
| -
| |
| *
| in
<op> ::= OPERATOR
| -
| |
| ^
| *
| in
<exp_op> ::= OPERATOR
| -
| |
| @
| ::
| ^
| *
<pat_op> ::= @
| ::
| ^
<typ_var> ::= TYPE_VARIABLE
<tyarg> ::= ( <typ_list> )
<prefix_typ_op> ::= epsilon
| 2^
| -
| *
<postfix_typ> ::= <atomic_typ>
<typ_no_caret> ::= <prefix_typ_op> <postfix_typ> (<op_no_caret> <prefix_typ_op> <postfix_typ>)*
<typ> ::= if <infix_typ> then <infix_typ> else <typ>
| <infix_typ>
<infix_typ> ::= <prefix_typ_op> <postfix_typ> (<op> <prefix_typ_op> <postfix_typ>)*
<atomic_typ> ::= <id>
| _
| <typ_var>
| <lit>
| dec
| inc
| <id> <tyarg>
| register ( <typ> )
| ( <typ> )
| ( <typ> , <typ_list> )
| { NUMBER (, NUMBER)* }
| { <kopt> . <typ> }
| { <kopt> , <typ> . <typ> }
<typ_list> ::= <typ> [,]
| <typ> , <typ_list>
<kind> ::= Int
| Type
| Order
| Bool
<kopt> ::= ( constant <typ_var> : <kind> )
| ( <typ_var> : <kind> )
| <typ_var>
<quantifier> ::= <kopt> , <typ>
| <kopt>
<effect> ::= <id>
<effect_set> ::= { <effect> (, <effect>)* }
| pure
<typschm> ::= <typ> -> <typ>
| forall <quantifier> . <typ> -> <typ>
| <typ> <-> <typ>
| forall <quantifier> . <typ> <-> <typ>
<pat1> ::= <atomic_pat> (<pat_op> <atomic_pat>)*
<pat> ::= <pat1>
| <attribute> <pat>
| <pat1> as <typ>
<pat_list> ::= <pat> [,]
| <pat> , <pat_list>
<atomic_pat> ::= _
| <lit>
| <id>
| <typ_var>
| <id> ()
| <id> [ NUMBER ]
| <id> [ NUMBER .. NUMBER ]
| <id> ( <pat_list> )
| <atomic_pat> : <typ_no_caret>
| ( <pat> )
| ( <pat> , <pat_list> )
| [ <pat_list> ]
| [| |]
| [| <pat_list> |]
| struct { <fpat> (, <fpat>)* }
<fpat> ::= <id> = <pat>
| <id>
| _
<lit> ::= true
| false
| ()
| NUMBER
| undefined
| bitzero
| bitone
| BINARY_LITERAL
| HEXADECIMAL_LITERAL
| STRING_LITERAL
<exp> ::= <exp0>
| <attribute> <exp>
| <exp0> = <exp>
| let <letbind> in <exp>
| var <atomic_exp> = <exp> in <exp>
| { <block> }
| return <exp>
| throw <exp>
| if <exp> then <exp> else <exp>
| if <exp> then <exp>
| match <exp> { <case_list> }
| try <exp> catch { <case_list> }
| foreach ( <id> ID <atomic_exp> ID <atomic_exp> by <atomic_exp> in <typ> ) <exp>
| foreach ( <id> ID <atomic_exp> ID <atomic_exp> by <atomic_exp> ) <exp>
| foreach ( <id> ID <atomic_exp> ID <atomic_exp> ) <exp>
| repeat [termination_measure { <exp> }] <exp> until <exp>
| while [termination_measure { <exp> }] <exp> do <exp>
<prefix_op> ::= epsilon
| 2^
| -
| *
<exp0> ::= <prefix_op> <atomic_exp> (<exp_op> <prefix_op> <atomic_exp>)*
<case> ::= <pat> => <exp>
| <pat> if <exp> => <exp>
<case_list> ::= <case>
| <case> ,
| <case> , <case_list>
<block> ::= <exp> [;]
| let <letbind> [;]
| let <letbind> ; <block>
| var <atomic_exp> = <exp> [;]
| var <atomic_exp> = <exp> ; <block>
| <exp> ; <block>
<letbind> ::= <pat> = <exp>
<atomic_exp> ::= <atomic_exp> : <atomic_typ>
| <lit>
| <id> -> <id> ()
| <id> -> <id> ( <exp_list> )
| <atomic_exp> . <id> ()
| <atomic_exp> . <id> ( <exp_list> )
| <atomic_exp> . <id>
| <id>
| <typ_var>
| ref <id>
| <id> ()
| <id> ( <exp_list> )
| sizeof ( <typ> )
| constraint ( <typ> )
| <atomic_exp> [ <exp> ]
| <atomic_exp> [ <exp> .. <exp> ]
| <atomic_exp> [ <exp> , <exp> ]
| struct { <fexp_exp_list> }
| { <exp> with <fexp_exp_list> }
| [ ]
| [ <exp_list> ]
| [ <exp> with <vector_update_list> ]
| [| |]
| [| <exp_list> |]
| ( <exp> )
| ( <exp> , <exp_list> )
<fexp_exp> ::= <atomic_exp> = <exp>
| <id>
<fexp_exp_list> ::= <fexp_exp>
| <fexp_exp> ,
| <fexp_exp> , <fexp_exp_list>
<exp_list> ::= <exp> [,]
| <exp> , <exp_list>
<vector_update> ::= <atomic_exp> = <exp>
| <atomic_exp> .. <atomic_exp> = <exp>
| <id>
<vector_update_list> ::= <vector_update> [,]
| <vector_update> , <vector_update_list>
<attribute_data_key_value> ::= ID = <attribute_data>
| STRING_LITERAL = <attribute_data>
<attribute_data> ::= { [<attribute_data_key_value> (, <attribute_data_key_value>)*] }
| NUMBER
| STRING_LITERAL
| ID
| true
| false
| [ [<attribute_data> (, <attribute_data>)*] ]
<attribute> ::= $[ATTRIBUTE] ]
| $[ATTRIBUTE] <attribute_data> ]
<funcl_annotation> ::= Private
| <attribute>
<funcl_patexp> ::= <pat> = <exp>
| ( <pat> if <exp> ) = <exp>
<funcl_patexp_typ> ::= <pat> = <exp>
| <pat> -> <typ> = <exp>
| forall <quantifier> . <pat> -> <typ> = <exp>
| ( <pat> if <exp> ) = <exp>
| ( <pat> if <exp> ) -> <typ> = <exp>
| forall <quantifier> . ( <pat> if <exp> ) -> <typ> = <exp>
<funcl> ::= <funcl_annotation> <id> <funcl_patexp>
| <id> <funcl_patexp>
<funcls> ::= <funcl_annotation> <id> <funcl_patexp_typ>
| <id> <funcl_patexp_typ>
| <funcl_annotation> <id> <funcl_patexp> and <funcl> (and <funcl>)*
| <id> <funcl_patexp> and <funcl> (and <funcl>)*
<funcl_typ> ::= forall <quantifier> . <typ>
| <typ>
<paren_index_range> ::= ( <paren_index_range> @ <paren_index_range> (@ <paren_index_range>)* )
| <atomic_index_range>
<atomic_index_range> ::= <typ>
| <typ> .. <typ>
| ( <typ> .. <typ> )
<r_id_def> ::= <id> : <paren_index_range> (@ <paren_index_range>)*
<r_def_body> ::= <r_id_def>
| <r_id_def> ,
| <r_id_def> , <r_def_body>
<param_kopt> ::= <typ_var> : <kind>
| <typ_var>
<typaram> ::= ( <param_kopt> (, <param_kopt>)* ) , <typ>
| ( <param_kopt> (, <param_kopt>)* )
<type_def> ::= type <id> <typaram> = <typ>
| type <id> = <typ>
| type <id> <typaram> -> <kind> = <typ>
| type <id> : <kind> = <typ>
| type <id> : <kind>
| struct <id> = { <struct_fields> }
| struct <id> <typaram> = { <struct_fields> }
| enum <id> = <id> (| <id>)*
| enum <id> = { <enum_comma> }
| enum <id> with <enum_functions> = { <enum_comma> }
| union <id> = { <type_unions> }
| union <id> <typaram> = { <type_unions> }
| bitfield <id> : <typ> = { <r_def_body> }
<enum_functions> ::= <id> -> <typ> , <enum_functions>
| <id> -> <typ> ,
| <id> -> <typ>
<enum_comma> ::= <id> [,]
| <id> => <exp> [,]
| <id> , <enum_comma>
| <id> => <exp> , <enum_comma>
<struct_field> ::= <id> : <typ>
<struct_fields> ::= <struct_field>
| <struct_field> ,
| <struct_field> , <struct_fields>
<type_union> ::= Private <type_union>
| <attribute> <type_union>
| <id> : <typ>
| <id> : { <struct_fields> }
<type_unions> ::= <type_union>
| <type_union> ,
| <type_union> , <type_unions>
<rec_measure> ::= { <pat> => <exp> }
<fun_def> ::= function [<rec_measure>] <funcls>
<mpat> ::= <atomic_mpat> (<pat_op> <atomic_mpat>)*
| <atomic_mpat> as <id>
<atomic_mpat> ::= <lit>
| <id>
| <id> [ NUMBER ]
| <id> [ NUMBER .. NUMBER ]
| <id> ()
| <id> ( <mpat> (, <mpat>)* )
| ( <mpat> )
| ( <mpat> , <mpat> (, <mpat>)* )
| [ <mpat> (, <mpat>)* ]
| [| |]
| [| <mpat> (, <mpat>)* |]
| <atomic_mpat> : <typ_no_caret>
| struct { <fmpat> (, <fmpat>)* }
<fmpat> ::= <id> = <mpat>
| <id>
<mpexp> ::= <mpat>
| <mpat> if <exp>
<mapcl> ::= <attribute> <mapcl>
| <mpexp> <-> <mpexp>
| <mpexp> => <exp>
| forwards <case>
| backwards <case>
<mapcl_list> ::= <mapcl> [,]
| <mapcl> , <mapcl_list>
<map_def> ::= mapping <id> = { <mapcl_list> }
| mapping <id> : <typschm> = { <mapcl_list> }
<let_def> ::= let <letbind>
| impure
| pure
<extern_binding> ::= <id> : STRING_LITERAL
| _ : STRING_LITERAL
<externs> ::= epsilon
| = STRING_LITERAL
| = { <extern_binding> (, <extern_binding>)* }
| = <pure_opt> STRING_LITERAL
| = <pure_opt> { <extern_binding> (, <extern_binding>)* }
<val_spec_def> ::= val STRING_LITERAL : <typschm>
| val <id> <externs> : <typschm>
<register_def> ::= register <id> : <typ>
| register <id> : <typ> = <exp>
<default_def> ::= default <kind> inc
| default <kind> dec
<scattered_def> ::= scattered enum <id>
| scattered union <id> <typaram>
| scattered union <id>
| scattered function <id>
| scattered mapping <id>
| scattered mapping <id> : <funcl_typ>
| enum clause <id> = <id>
| function clause <funcl>
| union clause <id> = <type_union>
| mapping clause <id> = <mapcl>
| end <id>
<loop_measure> ::= until <exp>
| repeat <exp>
| while <exp>
<subst> ::= <typ_var> = <typ>
| <id> = <id>
<instantiation_def> ::= instantiation <id>
| instantiation <id> with <subst> (, <subst>)*
<overload_def> ::= overload <id> = { <id> (, <id>)* }
| overload <id> = <id> (| <id>)*
<def_aux> ::= <fun_def>
| <map_def>
| FIXITY_DEF
| <val_spec_def>
| <instantiation_def>
| <type_def>
| <let_def>
| <register_def>
| <overload_def>
| <scattered_def>
| <default_def>
| constraint <typ>
| $LINE_DIRECTIVE
| termination_measure <id> <pat> = <exp>
| termination_measure <id> <loop_measure> (, <loop_measure>)*
<def> ::= Private <def>
| <attribute> <def>
| <def_aux>
References
[1] Intel, “Intel 64 and IA-32 Architectures Software Developer’s Manual.” software.intel.com/en-us/articles/intel-sdm , Dec. 2016.
[2] AMD, “AMD64 Architecture Programmer’s Manual Volume 1: Application Programming.” support.amd.com/TechDocs/24592.pdf , Oct. 2013.
[7] MIPS Technologies, Inc., “MIPS64 Architecture For Programmers. Volume II: The MIPS64 Instruction Set.” Jul. 2005.
[8] MIPS Technologies, Inc., “MIPS64 Architecture For Programmers. Volume III: The MIPS64 Privileged Resource Architecture.” Jul. 2005.
[9] D. P. Mulligan, S. Owens, K. E. Gray, T. Ridge, and P. Sewell, “Lem: reusable engineering of real-world semantics,” in Proceedings of ICFP 2014: the 19th ACM SIGPLAN International Conference on Functional Programming, 2014, pp. 175–188, doi: 10.1145/2628136.2628143.
[10] “Lem implementation.” bitbucket.org/Peter_Sewell/lem/ , 2017.
[11] R. N. M. Watson et al., “Bluespec Extensible RISC Implementation: BERI Hardware reference,” University of Cambridge, Computer Laboratory, UCAM-CL-TR-868, Apr. 2015. [Online]. Available: www.cl.cam.ac.uk/techreports/UCAM-CL-TR-868.pdf.
[12] J. Heinrich, MIPS R4000 Microprocessor User’s Manual (Second Edition). MIPS Technologies, Inc., 1994.
[13] MIPS Technologies, Inc., “MIPS32 Architecture For Programmers. Volume I: Introduction to the MIPS32 Architecture.” Mar. 2001.
[14] R. N. M. Watson et al., “Capability Hardware Enhanced RISC Instructions: CHERI Instruction-Set Architecture (Version 5),” University of Cambridge, Computer Laboratory, UCAM-CL-TR-891, Jun. 2016. [Online]. Available: www.cl.cam.ac.uk/techreports/UCAM-CL-TR-891.pdf.