Skip to main content

MEP 13. Algebraic Data Types and Match

FieldValue
MEP13
TitleAlgebraic Data Types and Match
AuthorMochi core
StatusDraft
TypeStandards Track
Created2026-05-08

Abstract

Mochi has product types (structs) and sum types (tagged unions). The grammar accepts both. Pattern matching exists but does not enforce exhaustiveness or irredundancy today. This MEP documents the typing rules, proposes the missing checks, and lists the fixtures that pin both the current behaviour and the target behaviour.

Motivation

Match exhaustiveness is the single largest class of type error that Mochi catches at runtime instead of at compile time. Adding a variant to a union and forgetting to update one of the matches on it produces a runtime crash that the type checker should have caught. This MEP closes that gap.

Specification

Surface

Mochi has two ADT shapes.

Product types: structs.

type Point {
x: int
y: int
}

Sum types: tagged unions.

type Shape =
Circle(r: float)
| Square(side: float)
| Rect(w: float, h: float)

Aliases:

type Id = int
type Name = string

The surface is documented in MEP 2 (grammar) and MEP 3 (AST). This MEP focuses on typing and pattern matching obligations.

Struct typing

A struct declaration introduces:

  • A StructType in the type environment under the declared name.
  • A constructor of type fun(field1: T1, ...): Name accessed as the type name in expression position.
  • A struct literal form Name{field1: e1, field2: e2}. Order is flexible; the literal matches by field name.

Rules:

  • All fields must be present in a literal. There are no defaults today.
  • A field type mismatch raises T008 or T026 depending on shape.
  • Field access s.f requires s : StructType{...} containing f. Otherwise T026 (unknown field) or T027 (not a struct).

Methods (type T { fun foo(...) }) attach to the struct via the Methods map and become callable as t.foo(...). The receiver is implicit.

Union typing

A union declaration introduces:

  • A UnionType{Name, Variants}.
  • One constructor function per variant. Circle(r: float) : Shape.
  • Each variant is also a StructType under its name with the variant's fields, so let c: Circle = Circle(1.0) parses but the canonical type is Shape.

Rules:

  • A variant constructor must be called with the declared arity and field types.
  • A bare variant name (no parens) is rejected today, although the grammar may allow it. Document and pin the behaviour.
  • Recursion in variants (type List = Cons(head: int, tail: List) | Nil) is allowed because variants resolve in a shared map during the type decl walk.

Match typing

match e { p1 => r1, ..., pn => rn }.

Pattern shapes accepted today:

  • Literal: 1, "x", true, false. Matches by equality. Binds nothing.
  • Identifier: matches anything, binds the name in the arm body.
  • Underscore: matches anything, binds nothing.
  • Variant call: Circle(r), Rect(w, h). Matches a value of the named variant and binds field positions to the listed identifiers. The number of bindings must match the variant arity.

Result type. The arm result type is the join of ExprType(ri). With the current loose join this often degrades to any when arms differ. Tightening the join to a least upper bound is part of the v0.11.0 work.

Exhaustiveness

Today: not enforced. A match on a Shape with only Circle(_) arms is accepted. The runtime raises an error when no arm matches.

Target. Compute coverage during the match check:

  • For a scrutinee of UnionType U:
    • Walk arms, mark each named variant covered.
    • A bare identifier or underscore arm marks the rest covered, but only if it is the last arm.
    • At the end, if any variant remains uncovered and there is no catch-all, raise an error with a list of missing variants.
  • For a scrutinee of any other type, exhaustiveness is not required. A wildcard or an identifier arm always covers the rest.

The required diagnostic message:

match is not exhaustive: missing cases for Square, Rect

Irredundancy

Detect arms that cannot match because an earlier arm subsumes them:

  • Two arms with the same literal pattern: error.
  • A wildcard before any variant arm: error (the variant arm is unreachable).
  • Two arms with the same variant pattern: error if the bindings are disjoint or if either arm has no guard. Since we do not have pattern guards yet, the duplicate case always fires the error.

Recursive ADTs

Support type List = Cons(head: int, tail: List) | Nil. Tests:

  • A literal Cons(1, Cons(2, Nil)) type checks.
  • Length defined as a recursive function over the list type checks.
  • A match on the list with both variants covered is exhaustive.
  • A match missing Nil is rejected.

Generic ADTs (future)

The grammar does not support type List<T> = Cons(head: T, tail: List<T>) | Nil today. The required pieces:

  • Generic type declaration syntax.
  • Generic constructor types parameterised by the type parameters.
  • Generic match patterns that accept a type argument list at the variant call.

We mark this as future work. The parser changes are non-trivial and MEP 12 (polymorphism) needs to land first.

Fixtures to author

Existing positive fixtures:

  • user_types_basic, user_type_selector, nested_type_decl, match_expr, union_alias, match_union_variant, match_literal_pattern, struct_field_access.

New for v0.11.0:

  • adt_struct_field_present_required (error, missing field).
  • adt_struct_extra_field (error, unknown field, T026).
  • adt_union_constructor_arity (error, exists as match_variant_arity).
  • adt_union_constructor_field_type (error).
  • adt_match_exhaustive_simple (valid).
  • adt_match_missing_one (error, after enforcement lands).
  • adt_match_missing_two (error, after enforcement lands; checks both names appear in the message).
  • adt_match_redundant_variant (error, after irredundancy lands).
  • adt_match_wildcard_before_variant (error, after irredundancy lands).
  • adt_match_wildcard_last_ok (valid).
  • adt_recursive_list_length (valid).
  • adt_recursive_match_missing_nil (error).

Rationale

Match exhaustiveness is the textbook benefit of sum types. Without it, a large part of the value of declaring a union evaporates. Irredundancy is the other half: redundant arms hide bugs.

Recursive types are necessary for any non-trivial use of unions. The shared-variants resolution path we already have makes this cheap to support.

We defer generic ADTs because they require both grammar and inference work. MEP 12 needs to land first.

Backwards Compatibility

Adding exhaustiveness checking is a breaking change for any program that relies on the runtime fall-through. Programs will need an explicit catch-all arm or coverage of every variant.

Irredundancy rejection is also a breaking change but a smaller one: the redundant arms it rejects could not have fired in the first place.

Reference Implementation

  • types/check.go:1179-1207 — union type construction with shared variants map.
  • types/check.go around lines 2209 to 2284 — match arm walk.
  • Existing fixtures: tests/types/valid/match_union_variant.mochi, tests/types/errors/match_variant_arity.mochi.

Open Questions

  • Pattern guards. case x => x > 0. Out of scope for v0.11.0; revisit if the test surface demands it.
  • Or-patterns. Circle(_) | Square(_) => .... Useful for catch-some-but-not-all cases. Defer.
  • Nested patterns. Cons(1, _) over recursive types. Already partially supported; needs fixtures.

References

  • Benjamin C. Pierce, Types and Programming Languages, chapters 11 and 20.
  • Luc Maranget, "Compiling pattern matching to good decision trees", 2008.

This document is placed in the public domain.