The Strata DDM Manual
The Strata Dialect Definition Mechanism (DDM) is a set of tools for defining intermediate representations (IRs) for a wide variety of languages. The DDM combines features of a parser grammar and a datatype definition language to let developers quickly define intermediate representations that are easy to work with both programmatically and textually. Additionally, the DDM can automatically generate serialization and deserialization code for its dialects to enable distributed analysis across language and system boundaries.
The Strata DDM is implemented in Lean. Strata dialects and programs may be embedded in Lean or in standalone dialect files. In a future release, we will support working with Strata from the command line as well as other languages such as .Net-based language, JVM-based languages, Rust, Python and JavaScript/TypeScript.
Table of Contents
1. Strata Dialect Definitions
Dialects are the core mechanism in Strata used to declare and extend program
intermediate representations. A Strata dialect defines new syntactic categories,
operations, types, functions and metadata. Strata dialects are composable and
can be used to define extensions that can be used in multiple languages. There
is a builtin dialect, called Init that provides some basic declarations that
automatically are included in all other dialects.
1.1. Strata Dialect Concepts
Each Strata dialect introduces declarations that ultimately define the IR. There are five basic types of declarations:
-
A Syntactic Category declaration represents a concept in a Strata dialect definition such as an expression, type, statement, or binding. They are defined by operators.
-
A Operator declaration defines information that may be stored in a syntactic category, along with how it is textually presented. Operators have arguments that may reference categories or types, and use metadata to define how their arguments affect Strata typechecking.
-
A Type declaration declares a new builtin type for the dialect that all programs in a dialect may refer to. Types declarations do not introduce new syntactic categories in a dialect, but rather extend a dialect
Init.Typewith new types that may be referred to. This allows Strata to support type variables and polymorphism. -
A Function declaration introduces a new function into Strata's builtin expression syntactic category
Init.Expr. Functions have a user- definable syntax like operators, but can be polymorphic and are type checked after parsing. -
A Metadata declaration introduces a new attribute that may be attached to other Strata declarations. There are builtin metadata attributes used to control parsing and typechecking in dialects. Dialects may introduce new metadata to provide a mechanism for associating properties with declarations.
1.2. Strata Examples
As a simple example, we define a sequence of progressively more complex Strata dialects with Boolean and arithmetic expressions as well as a simple assertion language. Each example provides just enough detail to help readers understand the example code. The Dialect Language Reference contains additional detail on the commands.
dialect BoolDialect; // Introduce Boolean type type BoolType; // Introduce literals as constants. fn true_lit : BoolType => "true"; fn false_lit : BoolType => "false"; // Introduce basic Boolean operations. fn not_expr (a : BoolType) : BoolType => "-" a; fn and (a : BoolType, b : BoolType) : BoolType => @[prec(10), leftassoc] a " && " b; fn or (a : BoolType, b : BoolType) : BoolType => @[prec(8), leftassoc] a " || " b; fn imp (a : BoolType, b : BoolType) : BoolType => @[prec(8), leftassoc] a " ==> " b; // Introduce equality operations that work for arbitrary types. // The type is inferred. fn equal (tp : Type, a : tp, b : tp) : BoolType => @[prec(15)] a " == " b; fn not_equal (tp : Type, a : tp, b : tp) : BoolType => @[prec(15)] a " != " b;
We can then extend these operations with an Integer type and operations.
dialect Arith; import BoolDialect; type Int; fn neg_expr (a : Int) : Int => "- " a; fn add_expr (a : Int, b : Int) : Int => @[prec(25), leftassoc] a " + " b; fn sub_expr (a : Int, b : Int) : Int => @[prec(25), leftassoc] a " - " b; fn mul_expr (a : Int, b : Int) : Int => @[prec(30), leftassoc] a " * " b; fn exp_expr (a : Int, b : Int) : Int => @[prec(32), rightassoc] a " ^ " b; fn le (a : Int, b : Int) : BoolType => @[prec(15)] a " <= " b; fn lt (a : Int, b : Int) : BoolType => @[prec(15)] a " < " b; fn ge (a : Int, b : Int) : BoolType => @[prec(15)] a " >= " b; fn gt (a : Int, b : Int) : BoolType => @[prec(15)] a " > " b;
By itself, these dialects do not define a new language. To define a
language, one needs to extends a builtin Command category with
new commands. This is done with the following dialect that introduces
commands for assertions and defining functions:
dialect AssertLang;
import Arith; // Automatically imports BoolDialect dependency of Arith
// This introduces a new operator into the Command category.
op assert (b : BoolType) : Command => "assert " b ";";
// Introduce a category for arguments.
category ArgDecl;
@[declare(var, tp)]
op decl (var : Ident, tp : Type) : ArgDecl => var " : " tp;
category ArgDecls;
op decls (l : CommaSepBy ArgDecl) : ArgDecls => "(" l ")";
@[declareFn(name, args, tp)]
op defFun (name : Ident, args : ArgDecls, tp : Type, @[scope(args)] value : tp)
: Command => "def " name args "=" value ";";
// Now that we have binders, introduce quantifiers
fn forall_expr (args : ArgDecls, @[scope(args)] b : BoolType) : BoolType =>
"forall " args ", " b;
fn exists_expr (args : ArgDecls, @[scope(args)] b : BoolType) : BoolType =>
"exists " args ", " b;
With these command extensions, we are now ready to use the dialects in programs:
program AssertLang;
assert forall (a : Int), exists (b : Int), b > a;
// Assert Fermat's last theorem
assert forall (a : Int, b : Int, c : Int, n : Int),
a > 0 && b > 0 && c > 0 && n > 2 ==> a^n + b^n != c^n;
// Introduce function
def addMul(a : Int, b : Int, c : Int) = a * b + c;
assert forall (a : Int, b : Int, c : Int),
b >= c ==> addMul(a, b, b) >= addMul(a, c, c);
2. Dialect Language Reference
The Strata Dialect Definition Language is defined in the rest of this section.
The convention is to use monospace text for concrete syntax in the dialect definition
language and italic for identifiers and other expressions introduced as part of a
specific dialect.
Each dialect definition must begin with the dialect followed by the name of the dialect.
dialect name;.
Additional commands are defined in the following subsections.
2.1. Importing dialects
Imports bring declarations of another dialect into the current dialect being declared. This includes transitive imports of the dialect being imported.
import ident;
Imports the dialect ident.
2.2. Syntactic Categories
Syntactic categories are introduced by the category declaration:
category name;
Declares a new syntactic category name.
2.3. Operators
Operators extend syntactic categories with additional constructors for data. Each Strata operator has a name that is distinct from other names in the dialect, a list of arguments to the operator, the syntactic category being extended, and a syntax definition describing how to pretty-print the operator.
md op name(id1 : k1, id2 : k2, ...) : cat => syntax;
Declares a new operator name where
-
md is optional metadata (see Metadata).
-
(id1:k1,id2:k2,...)are optional bindings and may be omitted if the operator takes no arguments. Each identifier idN is the name of an argument while the kN annotation may refer to a syntax category or type. -
cat is the syntax category that is extended by this operator.
-
syntax is the syntax definition.
2.3.1. Examples
A simple operator is an assert operator that takes a Boolean expression and extends
a hypothetical Statement category.
op assert (b : BoolType) : Statement => "assert" b ";";
For expressions, Strata supports polymorphic types. An example operator that uses this is a polymorphic assignment operator in the Strata Core statement dialect.
op assign (tp : Type, v : Ident, e : tp) : Statement => v " := " e ";";
As an example, the following introduces an operator const that takes a single binding value
as argument and produce a Command. The scope and prec metadata attributes are
described in the Metadata.
@[scope(b)] op const (b : Binding) : Command => @[prec(10)] "const " b ";";
2.4. Types and Expressions
As mentioned, the builtin Init.Type and Init.Expr categories are not extended using
the op declaration. Instead one introduces new types with the type declaration and
extends the expression language with new functions using the fn declaration. The builtin
Init.Type and Init.Expr categories have some special support to simplify the
creation of typed IRs in Strata.
-
Strata supports type polymorphic functions and operations and has a support for type inference that allows it to automatically infer type arguments as long as an expression argument uses that type.
-
Unlike categories, user types may contain parameters and parameters may themselves be variables. This allows more general functions than can be supported by operators such as a polymorphic length function over lists.
-
Types use a standardized syntax that do not require syntax definitions for each type.
-
After parsing, expressions are type checked to ensure that the syntax is well formed.
2.4.1. Types
type name(id1 : Type, id2 : Type, ...);
Declares a new type with optional parameters with names given by the identifiers id1, id2, ...
The code below declares a type BoolType and a polymorphic Map type.
type BoolType; type Map (dom : Type, range : Type); -- Ex. Map Nat BoolType
2.4.2. Expressions
metadata fn name(id1 : k1, id2 : k2, ...) : type => syntax;
This introduces a new function
As an example, the code below introduces a polymorphic if expression.
fn if (tp : Type, c : BoolType, t : tp, f : tp) : tp => "if " c " then " t "else " f;
2.5. Metadata
The Strata Init dialect provides a builtin Init.Metadata category that allows metadata
to be declared and attached to other declarations in dialects. Predefined metadata attributes
are used in dialect definitions for
defining precedence and type checking, but additional metadata
attributes can be declared in dialects to build new capabilities on top of Strata. The goal
of this is to provide a user-controllable extension mechanism to Strata dialects so that
dialects themselves may be repurposed to new use cases.
metadata name (id1 : tp1, id2 : tp2, ... );
Declares a new metadata operation with the given name.
The type annotations in tp1 are currently restricted to Ident
(for referencing arguments), natural numbers (Num) and optional versions of
each (Option Ident and Option Num). More general support for metadata will
be added in a future release.
2.5.1. Syntax Definitions
How operations and functions are represented in Strata's textual format depends on syntax definitions. Each syntax definition is a sequence of tokens that describe the syntax of the operator or function.
-
A string literal such as
"assert "outputs the literal text as a token. Spacing should be used to ensure whitespace appears after a token. -
An argument name
argindicates the associated argument to the function or operator appears in output. The precedence can be set with the syntaxarg:prec. -
For multiline declarations, the function
indent(n,tokens)may be used. The tokens will appear indented when they appear on new lines.
There are three metadata annotations in the Strata dialect that may affect precedence in syntax definitions.
metadata prec(p : Nat); -- Controls the precedence of the outermost operator metadata leftassoc; -- Indicates operator is left-associative metadata rightassoc; -- Indicates operator is right associative.
As an example, a right associative implies operator can be defined with the syntax:
fn implies (p : Pred, q : Pred) : Pred => @[prec(20), rightassoc] p " ==> " q;
Without the rightassoc operator, this is equivalent to the definition:
fn implies (p : Pred, q : Pred) : Pred => @[prec(20)] p:20 " ==> " q:19;
2.6. Parsing, Type Checking and Pretty Printings
When parsing an operation or expression from source, there is an implicit variable context that contains bindings for variables in scope. Operators may transform this context while expressions are evaluated within a context. The changes to the context are controlled via metadata, and the Strata dialect makes three declarations for affecting this.
-- Specify result scope of argument used to evaluate -- argument or following code. metadata scope(name : Ident); -- Declares a new variable in the resulting scope. metadata declare(name : Ident, type : TypeOrCat); -- Declares a function in the resulting scope with name, -- arguments, and return type determined by the given arguments. metadata declareFn(name : Ident, args : Ident, type : Ident); -- Marks a list argument as requiring at least one element. metadata nonempty;
As an example, Strata Core's variable declaration syntax can be defined in Strata using the following code:
category Decl; -- Decl represents a single variable declaration that extends the context. @[declare(v, tp)] op var (v : Ident, tp : Type) : Decl => v ":" tp; category DeclList; @[scope(d)] op declAtom (d : Decl) : DeclList => d; @[scope(d)] op declPush (dl : DeclList, @[scope(dl)] d : Decl) : DeclList => dl "," d; category Statement; @[scope(dl)] op varStatement (dl : DeclList) : Statement => "var " dl ";";
2.6.1. Polymorphic Type Variables
The @[declareTVar] annotation allows polymorphic function declarations
where type parameters (like <a, b>)
need to be in scope when parsing parameter types and return types.
For example, function declarations in Strata.Core are defined as
the following:
category TypeVar;
@[declareTVar(name)]
op type_var (name : Ident) : TypeVar => name;
category TypeArgs;
@[scope(args)]
op type_args (args : CommaSepBy TypeVar) : TypeArgs => "<" args ">";
@[declareFn(name, b, r)]
op command_fndecl (name : Ident,
typeArgs : Option TypeArgs,
@[scope(typeArgs)] b : Bindings,
@[scope(typeArgs)] r : Type) : Command =>
"function " name typeArgs b ":" r ";";
This allows parsing declarations like function identity<a>(x: a): a.
2.7. The Init dialect
The Init dialect introduces a few syntactic categories that cannot currently
be defined in user definable dialects.
-
Init.TypeandInit.Exprrepresent types and expressions in Strata. Unlike other categories, these are extended through the specializedtypeandfncommands rather thanopdeclarations so that expressions can be type checked after parsing. See Types and Expressions for more details. -
Syntactic categories that in dialects other than
Initcurrently cannot take additional parameters. This support will be added in a future update, but to help users, theInitdeclares a few builtin parametric categories:-
Init.Option cdenotes an optional value ofc. Syntactically, eithercmay be read in or the empty string. -
Init.Seq cdenotes a sequence of values with categoryc. Each value ofcis separated by whitespace. When formatted, values are concatenated with no separator. -
Init.CommaSepBy cdenotes a comma-separated list of values of typec. When formatted, values are separated by,(comma followed by space). -
Init.SpaceSepBy cdenotes a space-separated list of values of typec. Parsing is identical toInit.Seq, but when formatted, values are separated by a single space. -
Init.SpacePrefixSepBy cdenotes a space-prefix-separated list of values of typec. Parsing is identical toInit.Seq, but when formatted, each value is prefixed with a space.
All list categories (
CommaSepBy,SpaceSepBy,SpacePrefixSepBy,Seq) can be marked with the@[nonempty]metadata attribute to require at least one element during parsing. For example:@[nonempty] items : SpaceSepBy Itemwill reject empty lists with a parse error. -
-
Parsing for primitive literals and identifiers cannot be directly in syntax definitions. To accommodate this, the
Initdialect introduces the syntactic categories for this:-
Init.Identrepresents identifiers. These are alphanumeric sequences that start with a letter that are not otherwise used as keywords in a dialect. -
Init.Strrepresents string literals. These are delimited by double quotes and use escaping for special characters. -
Init.Numrepresents numeric natural number literals.
-
2.8. Datatypes
The DDM has special support for defining dialects with algebraic datatypes (ADTs) similar to those found in functional programming languages. Datatypes allow one to define custom types with multiple constructors, each of which can have zero or more fields (constructor arguments).
Datatypes differ from other language constructs (e.g. types, operators), since they define several operations simultaneously. For example, an SMT datatype declaration defines (in addition to the type itself and the constructors) tester functions to identify to which constructor an instance belongs and accessor functions to extract the fields of a constructor. Dafny datatypes additionally produce an ordering relation, while Lean inductive types produce eliminator instances defining induction principles. The DDM enables automatic creation of (a subset of) these auxiliary functions via a configurable function template system.
2.8.1. Example
In the Strata Core dialect, the auxiliary functions are testers and accessors. That is, one can define the following datatype in Strata Core:
datatype IntList {
Nil(),
Cons(head: int, tail: IntList)
};
This declares a list type with two constructors (Nil and Cons) and two
fields (head and tail). The Core dialect automatically generates:
-
Constructors:
Nil : IntListandCons : int -> IntList -> IntList -
Testers:
IntList..isNil : IntList -> boolandIntList..isCons : IntList -> bool -
Accessors:
head : IntList -> intandtail : IntList -> IntList
2.8.2. Defining Datatype Syntax in a Dialect
To support datatypes in a dialect, you must define syntactic categories and operators with appropriate annotations. The Core dialect provides a complete example.
2.8.2.1. Constructor Syntax
Constructors define the variants of a datatype. Constructor fields are specified via Bindings like other function declarations.
category Constructor;
category ConstructorList;
@[constructor(name, fields)]
op constructor_mk (name : Ident, fields : Option (CommaSepBy Binding)) : Constructor =>
name "(" fields ")";
@[constructorListAtom(c)]
op constructorListAtom (c : Constructor) : ConstructorList => c;
@[constructorListPush(cl, c)]
op constructorListPush (cl : ConstructorList, c : Constructor) : ConstructorList =>
cl "," c;
The annotations:
-
@[constructor(name, fields)]marks this operation as a constructor definition, wherefieldsis aBindingsargument containing the constructor arguments -
@[constructorListAtom(c)]marks a single-element constructor list -
@[constructorListPush(cl, c)]marks an operation that extends a constructor list
2.8.2.2. Datatype Command
The main datatype declaration uses @[declareDatatype] to tie everything
together. It is best illustrated with an example (from the Core dialect):
@[declareDatatype(name, typeParams, constructors,
perConstructor([.datatype, .literal "..is", .constructor], [.datatype], .builtin "bool"),
perField([.field], [.datatype], .fieldType))]
op command_datatype (name : Ident,
typeParams : Option Bindings,
@[scopeDatatype(name, typeParams)] constructors : ConstructorList)
: Command =>
"datatype " name typeParams " {" constructors "}" ";\n";
@[declareDatatype] declares a datatype command operator given the datatype
name, the optional type parameters, the constructor list, and zero or more
function templates to expand. Note that the @[scopeDatatype]
annotation brings the datatype name and type parameters into scope when
parsing the constructors, enabling recursive type references.
2.8.2.3. Function Templates
Function templates specify patterns for generating auxiliary functions from datatype
declarations.
Currently there are two supported templates: perConstructor generates one
function per constructor, while perField generates one function per field
(note that the DDM enforces that fields are unique across all constructors in
a datatype).
perConstructor(namePattern, paramTypes, returnType)
perField(namePattern, paramTypes, returnType)
Each template has three components:
-
namePattern is an array of name parts:
.datatype,.constructor, or.literal "str" -
paramTypes is an array of type references for parameters
-
returnType is a type reference for the return type
Name patterns consist of the following components:
-
.datatype- Expands to the datatype name -
.constructor- Expands to the constructor name inperConstructor -
.field- Expands to the field name inperField -
.literal "str"- A literal string
Type references consist of the following components:
-
.datatype- The datatype type (e.g.,IntList) -
.fieldType- The type of the current field inperField -
.builtin "bool"- A built-in type from the dialect
2.8.2.3.1. Example Templates
The Strata Core dialect uses two templates:
The tester template generates IntList..isNil : IntList -> bool:
perConstructor([.datatype, .literal "..is", .constructor], [.datatype], .builtin "bool")
The accessor template generates head : IntList -> int:
perField([.field], [.datatype], .fieldType)
An alternative indexer template could generate IntList$idx$Nil : IntList -> int:
perConstructor([.datatype, .literal "$idx$", .constructor], [.datatype], .builtin "int")
3. Lean Integration
Strata provides two complementary ways to work with dialect ASTs in Lean.
The generic AST is a set of dialect-agnostic types that can represent
programs in any dialect. It is used by the DDM's parser, pretty-printer,
serializer, and other infrastructure that must operate uniformly across
dialects. The #strata_gen command takes the opposite approach: given a
specific dialect, it generates Lean inductive types with one constructor per
operator, enabling direct pattern matching and type-safe construction.
Conversion functions (toAst/ofAst) bridge the two representations.
3.1. The Generic AST
The generic AST is defined in Strata.DDM.AST. All types are parameterized
by an annotation type α (typically SourceRange for source locations).
The core types are listed below in the same order as the dialect concepts they represent:
-
SyntaxCatF α— a reference to a syntactic category, identified by its qualified name. -
OperationF α— an application of a dialect operator. Contains the operator's qualified name and an array of arguments. -
ArgF α— a single argument to an operation. This is a sum type whose variants cover all the kinds of data an operator can receive: nested operations, expressions, types, identifiers, numeric literals, string literals, decimal literals, byte arrays, category references, optional values, and sequences. -
TypeExprF α— a type expression. Variants cover dialect-defined types (ident), bound and polymorphic type variables (bvar,tvar), global type references (fvar), and function types (arrow). -
ExprF α— a typed expression. Expressions are represented in curried form: a head (fnfor a named dialect function,bvarfor a bound variable, orfvarfor a free variable) applied to arguments viaapp. Type arguments found bound and free variables are implicit and omitted.
For the common case where annotations are source locations, Strata defines abbreviations:
abbrev SyntaxCat := SyntaxCatF SourceRange abbrev Operation := OperationF SourceRange abbrev Arg := ArgF SourceRange abbrev TypeExpr := TypeExprF SourceRange abbrev Expr := ExprF SourceRange
3.1.1. How DDM Concepts Map to the Generic AST
The following table summarizes how the DDM concepts described in earlier sections are represented in the generic AST.
DDM Concept | Generic AST Type | Notes |
|---|---|---|
Syntactic category |
|
Identified by qualified name (e.g. |
Operator application |
|
Qualified name + array of |
Type expression |
| Concrete types, type variables, function types |
Expression |
|
Curried: head ( |
Identifier literal |
|
e.g. variable names from |
Numeric literal |
|
From |
String literal |
|
From |
|
|
Array of |
|
|
|
The generic AST is dialect-agnostic: operators are identified by their
QualifiedIdent (a dialect name paired with an operator name) rather than
by distinct Lean constructors. This uniformity makes it the natural
representation for DDM infrastructure such as the parser, formatter, and
serializer, but it means that working with a specific dialect requires
string-based matching on operator names.
3.2. The #strata_gen Command
The #strata_gen command bridges Strata dialect definitions and Lean by
automatically generating Lean inductive types and conversion functions from a
dialect definition. This lets developers work with dialect ASTs as ordinary
Lean values — constructing, pattern matching, and transforming them — while
retaining the ability to convert back to the generic AST representation
for serialization, pretty-printing, and cross-dialect interoperability.
3.2.1. Basic Syntax
import Strata.DDM.Integration.Lean namespace MyDialect #strata_gen MyDialect end MyDialect
#strata_gen requires import Strata.DDM.Integration.Lean and takes a single
argument: the name of a dialect that has already been defined (via
#dialect ... #end or #load_dialect). The command should typically be placed
inside a namespace block so that the generated types and functions are scoped
under that namespace.
3.2.1.1. Tracing
To see what #strata_gen generates, enable the Strata.generator trace
option immediately before the command:
set_option trace.Strata.generator true in #strata_gen MyDialect
This prints each generated definition (inductives, toAst, ofAst) and the
dependency groups used to order them. For example, running #strata_gen on the
AssertLang dialect defined earlier in this manual produces:
[Strata.generator] Generating AssertLangType [Strata.generator] Generating AssertLangType.toAst [Strata.generator] Generating AssertLangType.ofAst [Strata.generator] Generating ArgDecl [Strata.generator] Generating ArgDecl.toAst [Strata.generator] Generating ArgDecl.ofAst [Strata.generator] Generating ArgDecls [Strata.generator] Generating ArgDecls.toAst [Strata.generator] Generating ArgDecls.ofAst [Strata.generator] Generating Expr [Strata.generator] Generating Expr.toAst [Strata.generator] Generating Expr.ofAst [Strata.generator] Generating Command [Strata.generator] Generating Command.toAst [Strata.generator] Generating Command.ofAst [Strata.generator] Declarations group: [Init.Type] [Strata.generator] Declarations group: [AssertLang.ArgDecl] [Strata.generator] Declarations group: [AssertLang.ArgDecls] [Strata.generator] Declarations group: [Init.Expr] [Strata.generator] Declarations group: [Init.Command]
Each category produces an inductive type, a toAst function, and an ofAst
function. The "Declarations group" lines show the topologically sorted groups
in the order they are emitted to Lean.
3.2.2. What Gets Generated
For each syntactic category used by a dialect, #strata_gen generates four
things:
-
An inductive type parameterized by an annotation type
α. -
An
Inhabitedinstance (when possible). -
A
toAstfunction that converts from the generated type to Strata's generic AST representation (ExprF,TypeExprF, orOperationF). -
An
ofAstfunction that converts from the generic AST back to the generated type (returningOfAstM, an error monad).
Every generated constructor includes an ann : α field that carries
source-location or other annotation data through transformations.
3.2.2.1. Determining Which Categories Are Used
Not every category in a dialect's transitive imports needs code generated for
it. #strata_gen computes the set of used categories by starting from the
categories and types directly declared or referenced in the dialect and
following argument dependencies transitively. For example, if a dialect
declares only op assert (b : BoolType) : Command => ..., the generator will
produce types for Command, Expr (since BoolType is a type in
Init.Expr), and Type — but not for categories like Binding that are never
referenced.
Primitive categories (Init.Ident, Init.Num, Init.Str, etc.) map
directly to Lean types (String, Nat, String, etc.) and do not produce
their own inductive definitions.
3.2.2.2. Handling Mutually Recursive Categories
Categories can be mutually recursive. For example:
category MutA; category MutB; op opA (b : MutB) : MutA => "opA " b; op opB (a : MutA) : MutB => "opB " a;
#strata_gen uses Tarjan's algorithm to identify strongly connected
components — groups of categories that reference each other. Each group is
emitted as a Lean mutual ... end block so that the inductive types, toAst
functions, and ofAst functions can refer to one another. Groups are
processed in topological order so that a group's dependencies are always
defined before the group itself.
3.2.3. Example Walkthrough
Consider a small SQL dialect:
dialect SQL;
op create (name : Ident, c : CommaSepBy Ident) : Command =>
"CREATE " name "(" c ")" ";\n";
op drop (name : Ident) : Command =>
"DROP " name ";\n";
category Columns;
op colStar : Columns => "*";
op colList (c : CommaSepBy Ident) : Columns => c;
op select (col : Columns, table : Ident) : Command =>
"SELECT " col " FROM " table ";\n";
After #strata_gen SQL, the generator produces two inductive types.
Command gets one constructor per operation that targets it:
inductive Command (α : Type) : Type where | create (ann : α) (name : Ann String α) (c : Ann (Array String) α) : Command α | drop (ann : α) (name : Ann String α) : Command α | select (ann : α) (col : Columns α) (table : Ann String α) : Command α deriving Repr
Columns gets its own constructors:
inductive Columns (α : Type) : Type where | colStar (ann : α) : Columns α | colList (ann : α) (c : Ann (Array String) α) : Columns α deriving Repr
Notice that:
-
Identarguments becomeAnn String α(aStringcarrying an annotation). -
CommaSepBy Identarguments becomeAnn (Array String) α. -
The custom category
Columnsappears directly as a type inCommand.select.
The generated toAst function converts from the dialect-specific type to the
generic AST, and ofAst converts back:
-- Convert generated type → generic AST
Command.toAst : {α : Type} → [Inhabited α] → Command α → OperationF α
-- Convert generic AST → generated type
Command.ofAst : {α : Type} → [Inhabited α] → [Repr α]
→ OperationF α → OfAstM (Command α)
3.2.3.1. Expression and Type Categories
Dialects that declare types (via type) or functions (via fn) also get
generated Expr and Type inductives. These include builtin constructors
in addition to dialect-defined ones. For instance, the Arith dialect's
generated Expr type includes:
inductive Expr (α : Type) : Type where -- Builtin constructors (always present for Init.Expr) | fvar (ann : α) (idx : Nat) : Expr α | bvar (ann : α) (idx : Nat) : Expr α | app (ann : α) (fn : Expr α) (arg : Expr α) : Expr α -- Dialect-defined constructors | btrue (ann : α) : Expr α | bfalse (ann : α) : Expr α | intLit (ann : α) (n : Ann Nat α) : Expr α | add (ann : α) (x : Expr α) (y : Expr α) : Expr α | mul (ann : α) (x : Expr α) (y : Expr α) : Expr α -- ... (additional constructors omitted) deriving Repr
The fvar, bvar, and app constructors correspond to free variables, bound
variables (de Bruijn indices), and function application in Strata's core
expression model.
3.2.4. Post-Generation Patterns
After running #strata_gen, there are several common patterns for working
with the generated types.
3.2.4.1. Deriving Typeclass Instances
Generated types automatically derive Repr. Other instances can be derived
after generation:
deriving instance BEq for Command, Columns
For larger dialects with many categories, this can be done in bulk:
deriving instance BEq for SpecConstant, QualifiedIdent, Symbol, SMTSort, Term, Command
Support for automatically deriving additional instances during generation will be added in a future release.
3.2.4.2. Verifying Generated Types
Use #print and #check to inspect what was generated:
#print Command -- Shows inductive definition and constructors
#print Columns
#check Columns.colStar -- Columns.colStar : {α : Type} → α → Columns α
3.2.4.3. Round-Trip Testing
A simple round-trip test can verify that toAst and ofAst are inverses:
def testRoundTrip (e : Expr Unit) : Bool := match e |> Expr.toAst |> Expr.ofAst with | .error _ => false | .ok g => e == g #guard testRoundTrip <| Expr.btrue () #guard testRoundTrip <| Expr.app () (.fvar () 0) (.btrue ())
3.2.4.4. Dependent Dialects
When a dialect imports another, #strata_gen automatically resolves the full
transitive closure of dependencies. Each invocation generates its own set of
types — if ArithChecker imports Arith and uses Init.Expr, it will produce a
new ArithChecker.Expr that includes all of Arith's constructors. The two
Expr types are isomorphic but distinct:
namespace Arith #strata_gen Arith -- Generates Arith.Expr, Arith.ArithType, ... end Arith namespace ArithChecker #strata_gen ArithChecker -- Generates ArithChecker.Expr, ArithChecker.Command, ... end ArithChecker -- ArithChecker.Expr is isomorphic to Arith.Expr (same constructors) -- but they are separate types.
4. Command Line Use
This will be described once available.