GEP-23


Metadata

Number: GEP-23
Title: Monadic comprehensions
Version: 4
Type: Feature
Status: Final
Comment: Delivered in Groovy 6.0 as an incubating feature via the DO macro; subject to refinement during the 6.x line
Leader: Paul King
Created: 2026-05-19
Last modification: 2026-05-22

Abstract: Monadic comprehensions

DO is a comprehension macro that rewrites a sequence of name-binding generators followed by a body into a chain of bind operations on a participating carrier type. It provides Scala-style for-comprehension and Haskell-style do-notation ergonomics for any type with monadic shape — Optional, Stream, CompletableFuture, Groovy’s Awaitable and DataflowVariable, common Functional Java and Vavr types, and user-defined carriers that opt in.

The following artefacts are introduced:

  • the DO macro in the macro library, performing a purely syntactic tree rewrite at compile time;

  • the @groovy.transform.Monadic annotation, by which a user type opts in and may declare non-conventional bind/map method names;

  • groovy.typecheckers.MonadicChecker — a type-checking extension that enforces the monadic shape under @CompileStatic/@TypeChecked and supplies the static types that flow through the rewritten chain;

  • groovy.typecheckers.MonadicShapeChecker — a sibling type-checking extension that lints hand-written flatMap/map chains over the same carrier set, independent of DO.

The proposal is deliberately narrow. It does not introduce higher-kinded types, a Functor/Applicative/Monad interface hierarchy, monad transformers, or automatic pure/unit lifting. It generalises a composition pattern Groovy already commits to for Awaitable so that the same notation is available to any carrier with the same shape.

This GEP specifies the language semantics. Worked, tutorial-style examples live in the language specification chapter on monadic comprehensions; this document is intentionally terse and prescriptive.

Motivation

Composing values that live inside a carrier (an Optional that may be absent, an Awaitable that will complete later, a validation result that may have failed) is normally expressed in one of two ways. Hand-written flatMap/then chains nest deeply and obscure data flow as the number of participants grows. For the asynchronous case, imperative async { … await x; await y; … } reads well but does not extend to non-async carriers.

Other languages addressed this with notation that lives one level above the underlying carrier: Scala’s for-comprehensions, Haskell’s do-notation, F#'s computation expressions. Each desugars a sequence of name-binding generators into a chain of bind operations; the notation is uniform across carriers and the carrier-specific behaviour is delivered entirely by the methods of the carrier itself.

Groovy already has the facilities to deliver this outcome without inventing an abstraction hierarchy: compile-time macros for the syntactic rewrite, and type-checking extensions (GEP-8) for teaching static compilation new structural rules without altering the type system. DO combines the two.

A side-by-side comparison with Scala for, Haskell do, Kotlin coroutine-based composition, and F# computation expressions is given in Comparison with related constructs below.

Specification

Surface syntax

DO takes a comma-separated list of generators followed by a closure body:

def result = DO(x in m1,
                y in f(x),
                z in g(x, y)) {
    body(x, y, z)
}

Each generator has the form name in expression. The bound name is in scope in the source expression of every subsequent generator and in the body. Every generator’s source expression, and the body, must evaluate to a value of the same carrier type.

The all-uppercase name follows the convention of the existing macro-library entries (SV, NV, and friends). The uppercase form signals to readers that the call is rewritten at compile time and is not an ordinary method invocation. DO is a contextual name and remains usable as an ordinary identifier elsewhere.

Desugaring

Every generator becomes a bind; the body is the innermost closure body. The rewrite is:

Source:    DO(x in m1) { body }
Expansion: m1 bound with { x -> body }

Source:    DO(x in m1, y in f(x)) { body }
Expansion: m1 bound with { x -> f(x) bound with { y -> body } }

Source:    DO(x in m1, y in f(x), z in g(x, y)) { body }
Expansion: m1 bound with { x -> f(x) bound with { y -> g(x, y) bound with { z -> body } } }

Because the macro expands before type information is available, it does not emit a carrier-specific method name directly; it emits calls to a bind dispatcher (see Runtime model). The body must itself yield a value of the carrier type — there is no implicit lifting of a plain value back into the carrier in this version (see Non-goals).

Short-circuiting is delivered by the carrier, not by the macro: an empty or failed carrier propagates through the chain and the body is never evaluated.
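As an illustration of the expansion rule, the three-generator form can be written by hand for java.util.Optional, whose bind is flatMap. This is a sketch of an equivalent chain, not the macro's actual output (which targets the bind dispatcher); the helpers parse and half are hypothetical stand-ins for m1, f and g.

```groovy
// Hypothetical helpers returning Optional, standing in for m1, f and g.
Optional<Integer> parse(String s) {
    s.isInteger() ? Optional.of(s.toInteger()) : Optional.empty()
}
Optional<Integer> half(int n) {
    n % 2 == 0 ? Optional.of(n.intdiv(2)) : Optional.empty()
}

// Hand-written equivalent of:
//   DO(x in parse(a), y in half(x), z in half(y)) { Optional.of(x + y + z) }
Optional<Integer> compose(String a) {
    parse(a).flatMap { x ->
        half(x).flatMap { y ->
            half(y).flatMap { z ->
                Optional.of(x + y + z)   // the body yields a carrier value
            }
        }
    }
}

assert compose('12') == Optional.of(21)     // 12 + 6 + 3
assert compose('10') == Optional.empty()    // half(5) is empty, so the body never runs
assert compose('abc') == Optional.empty()   // the first generator is already empty
```

The last two assertions show the short-circuit coming from Optional itself: no code in the chain tests for emptiness.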

The monadic shape

A type M qualifies as a carrier when it exposes a bind operation of the shape <B> M<B> bind(fn) where fn maps the element type to M<B>, and, for the map role, an analogous <B> M<B> map(fn). The structural check is generous about the surface and strict about the algebra: the function argument may be declared as java.util.function.Function, a Closure, or any other single-abstract-method interface; the generator closure is adapted to whichever the carrier declares.

The monad laws (left identity, right identity, associativity) are not enforced by the compiler. This matches the treatment of @groovy.transform.Reducer/@groovy.transform.Associative: structural participation, algebraic-law obligation on the implementer, intended to be backed by tests.
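Since the laws are left to the implementer's tests, a law check might look like the following sketch. It uses java.util.Optional as the carrier, with flatMap as bind and Optional.of as the unit; the sample functions f and g are hypothetical, and a real suite would cover more inputs.

```groovy
// unit lifts a plain value into the carrier; f and g are sample bind functions.
def unit = { v -> Optional.of(v) }
def f = { Integer n -> Optional.of(n + 1) }
def g = { Integer n -> n > 0 ? Optional.of(n * 2) : Optional.empty() }

for (m in [Optional.of(3), Optional.of(-3), Optional.empty()]) {
    // Right identity: binding with unit changes nothing.
    assert m.flatMap(unit) == m
    // Associativity: how the binds are nested does not matter.
    assert m.flatMap(f).flatMap(g) == m.flatMap { x -> f(x).flatMap(g) }
}

// Left identity: lifting a value and binding f equals applying f directly.
assert unit(3).flatMap(f) == f(3)
```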

Participating carriers

Participation is resolved in the following order, the first match winning:

  1. Standard allow-list (by type). A fixed table covers stdlib and Groovy-core carriers, including those whose method names diverge from the structural convention, matched by type:

    Carrier                                  Source     bind          map
    java.util.Optional                       JDK        flatMap       map
    java.util.stream.Stream                  JDK        flatMap       map
    java.util.concurrent.CompletableFuture   JDK        thenCompose   thenApply
    java.util.concurrent.CompletionStage     JDK        thenCompose   thenApply
    groovy.concurrent.Awaitable              Groovy 6   thenCompose   then
    groovy.concurrent.DataflowVariable       Groovy 6   thenCompose   then

    Note
    Awaitable and DataflowVariable bind via thenCompose; their then method is the map operation, not bind.
  2. Standard allow-list (by name). Common Functional Java carriers — fj.data.Option, fj.data.List, fj.data.Stream, fj.data.Validation and fj.P1 — are recognised by fully-qualified name using that library’s bind/map convention. Groovy takes no dependency on Functional Java; the names are matched reflectively and the generator closure is adapted to fj.F. fj.data.Either is not directly monadic in Functional Java (bind lives on its .right()/.left() projections) and is not a carrier.

    The Vavr control carriers — io.vavr.control.Option, io.vavr.control.Try, io.vavr.control.Either (right-biased), and io.vavr.control.Validation — are likewise recognised by fully-qualified name. Vavr’s carriers follow the structural flatMap/map convention, so they would also be picked up by the structural-match rule below; the explicit name entries are retained so that the names appear in the standard allow-list, the participation check succeeds without the structural probe, and MonadicChecker errors point readers at a documented carrier set when they get the shape wrong. As with Functional Java, Groovy takes no dependency on Vavr; the names are matched reflectively.

  3. Structural match. A type offering a single-argument flatMap (and, for the map role, map) qualifies without further declaration. This covers third-party libraries and user types that follow the convention.

  4. @Monadic opt-in. A type annotated @groovy.transform.Monadic participates even where its methods diverge from the structural convention. The annotation may declare alternative names:

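    import groovy.transform.Monadic
    import java.util.function.Function

    @Monadic(bind = 'chain', map = 'transform')
    class Result<A> {
        // bind role: named chain rather than the structural default flatMap
        <B> Result<B> chain(Function<A, Result<B>> f) { ... }
        // map role: named transform rather than the structural default map
        <B> Result<B> transform(Function<A, B> f) { ... }
    }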

    When both attributes are omitted the annotation merely opts the type in and the structural defaults (flatMap, map) apply. The annotation is matched by simple name, in the manner of @Reducer/@Associative.

@Monadic is a carrier-author concern, not a user-application concern. Application code composing values typically uses Optional, Awaitable, Validation, Try and so on directly — with participation either inferred structurally, registered by name in the core allow-list (stdlib, FJ, Vavr), or declared once by the carrier author. The @Monadic annotation, like @Reducer / @Associative / @Pure elsewhere in Groovy 6, is intended primarily to be written by library authors on the public types they own; user code is then verified against those declarations without acquiring annotations of its own. The user-configurable registration channel deferred to Groovy 7.0 fills the remaining gap — third-party carriers whose owners have not (yet) applied @Monadic and whose method names are non-structural.
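To make the carrier-author side concrete, here is a minimal sketch of a Result type with the non-conventional names chain and transform. The success/failure representation, the ok and fail factories, and the method bodies are all assumptions made for illustration; the annotation appears only as a comment so that the sketch runs on releases that predate @groovy.transform.Monadic.

```groovy
import java.util.function.Function

// Hypothetical carrier. Under Groovy 6 it would carry:
// @Monadic(bind = 'chain', map = 'transform')
class Result<A> {
    final A value        // set on success
    final String error   // set on failure

    private Result(A value, String error) { this.value = value; this.error = error }
    static <T> Result<T> ok(T value) { new Result<T>(value, null) }
    static <T> Result<T> fail(String error) { new Result<T>(null, error) }

    // bind: a failure short-circuits, a success feeds its value to f
    <B> Result<B> chain(Function<A, Result<B>> f) {
        error == null ? f.apply(value) : (Result<B>) this
    }

    // map: transform the value and re-wrap it
    <B> Result<B> transform(Function<A, B> f) {
        error == null ? ok(f.apply(value)) : (Result<B>) this
    }
}

def doubled = Result.ok(20).chain { x -> x > 0 ? Result.ok(x * 2) : Result.fail('non-positive') }
def r = doubled.transform { it + 2 }
assert r.value == 42 && r.error == null

// A failed carrier propagates untouched; the closure is never called.
def skipped = Result.fail('bad input').chain { x -> Result.ok(x * 2) }
assert skipped.error == 'bad input'
```

Application code composing Result values needs no annotation of its own; only the type's author declares the names.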

Behaviour under @CompileStatic

Under dynamic Groovy, the rewritten code is ordinary method dispatch; a value that does not respond to the resolved bind method fails at runtime with the usual MissingMethodException.

DO under static compilation: MonadicChecker

Under @CompileStatic/@TypeChecked, the groovy.typecheckers.MonadicChecker type-checking extension (GEP-8) is activated via the extensions member:

@CompileStatic(extensions = 'groovy.typecheckers.MonadicChecker')

The extension:

  • rejects, at compile time, any DO whose carrier fails all participation tests, with an error naming the offending type and the missing method shape;

  • types each generator’s bound name as the carrier’s element type, so the body type-checks;

  • restores the comprehension’s result type, so chained and nested use type-checks rather than degrading to Object;

  • for trusted carriers (allow-list or @Monadic), enforces the closure-return contract that the dispatcher’s erased (Object, Closure):Object signature hides from STC: a bind closure must yield the same carrier (catching a bare-value body and a cross-carrier body, including in nested DO), and a hand-written Comprehensions.map closure must not yield the same carrier (catching the M<M<T>> foot-gun). Structural-only carriers are not asserted against, matching the permissive treatment of participation.

Dependent generators (a later generator whose source expression uses an earlier bound name) and nested comprehensions are supported under static compilation.

Native chains: MonadicShapeChecker

A sibling extension groovy.typecheckers.MonadicShapeChecker lints hand-written flatMap/map/thenCompose/thenApply chains over the same carrier set. It is independent of DO; use it on codebases that mix or favour native chains.

@CompileStatic(extensions = 'groovy.typecheckers.MonadicShapeChecker')

It flags three high-confidence problems:

  • bind returning a non-carrier — e.g. Optional.flatMap { it + 1 }, where Groovy’s SAM coercion can let an Integer-returning closure slip past STC;

  • bind returning a different carrier — e.g. Stream.flatMap { Optional.of(it) };

  • map returning the same carrier — e.g. Optional.map { Optional.of(it) }, the classic M<M<T>> foot-gun.

Carriers and method-name conventions are read from the same registry as MonadicChecker; @Monadic-annotated types also participate. Calls routed through the bind/map dispatcher are skipped (that is MonadicChecker’s domain). A strict mode additionally flags chains whose function-return type cannot be statically resolved.

The two extensions are complementary but independent: MonadicChecker repairs erasure on the DO-macro dispatcher and asserts the dispatcher’s closure-return shape; MonadicShapeChecker asserts the same shape on native chains the dispatcher never sees. Code that mixes DO with native chains may opt into both.
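The third problem flagged by MonadicShapeChecker (map returning the same carrier) is easy to reproduce in dynamic Groovy, where nothing stops it. The sketch below shows the runtime symptom on java.util.Optional; the lookup closure is hypothetical.

```groovy
// A carrier-returning function: the natural argument for bind, not for map.
def lookup = { String key -> key == 'a' ? Optional.of(1) : Optional.empty() }

// map re-wraps whatever the function returns, so the carrier nests:
// the result is an Optional holding an Optional.
def nested = Optional.of('a').map(lookup)
assert nested.get() instanceof Optional

// bind (flatMap) is what the author almost certainly meant.
def flat = Optional.of('a').flatMap(lookup)
assert flat == Optional.of(1)
```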

Scoping, return, break, continue

The desugared form turns the body into a chain of closure bodies. The consequences are the same as for any closure-bodied rewrite, and the same as the @Parallel for-loop transform documents:

  • break and continue are not supported inside the body and produce a compile error. The corresponding monadic notion — short-circuiting — is delivered by the carrier and propagates naturally through the chain.

  • return inside a generator source expression or the body returns from the enclosing closure, following the standard Groovy closure rule.

  • Names bound by earlier generators are visible to later generators and to the body by closure capture.

  • The body closure must not declare parameters; generator names are already in scope.
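The return rule can be seen in a hand-written chain, which behaves the same way as the closures the rewrite produces. In this sketch the closure passed to flatMap stands in for a DO body; describe is a hypothetical method name.

```groovy
String describe(Optional<Integer> m) {
    def r = m.flatMap { x ->
        if (x < 0) return Optional.empty()   // leaves the closure only, not describe()
        Optional.of(x + 1)
    }
    // Execution always reaches this line, whichever branch the closure took.
    r.map { 'got ' + it }.orElse('nothing')
}

assert describe(Optional.of(1)) == 'got 2'
assert describe(Optional.of(-1)) == 'nothing'
assert describe(Optional.empty()) == 'nothing'
```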

Error semantics

An exception thrown inside a generator source expression or inside the body propagates through the bind chain according to the carrier’s own rules; the macro introduces no try/catch. For Awaitable and CompletableFuture the exception is captured into the resulting carrier and surfaces on await/join. For Optional and Stream the JDK semantics apply: Optional.flatMap runs the function synchronously, so the exception propagates directly to the caller, whereas Stream.flatMap is lazy and the exception surfaces only when a terminal operation runs. For @Monadic user types the exception path is whatever the type implements.
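The difference between carriers can be observed with plain JDK calls. The sketch below uses hand-written thenCompose and flatMap calls in place of DO, and shows only the CompletableFuture and Optional cases.

```groovy
import java.util.concurrent.CompletableFuture
import java.util.concurrent.CompletionException

// CompletableFuture: the exception is captured into the resulting future...
def failed = CompletableFuture.completedFuture(1).thenCompose { x ->
    throw new IllegalStateException('boom')
}
assert failed.isCompletedExceptionally()

// ...and surfaces on join, wrapped in a CompletionException.
def seen = null
try {
    failed.join()
} catch (CompletionException e) {
    seen = e.cause
}
assert seen instanceof IllegalStateException

// Optional: flatMap runs the function immediately, so the exception
// propagates straight to the caller.
def direct = null
try {
    Optional.of(1).flatMap { x -> throw new IllegalStateException('boom') }
} catch (IllegalStateException e) {
    direct = e
}
assert direct?.message == 'boom'
```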

Runtime model

The macro emits calls to a bind/map dispatcher rather than to a carrier-specific method, because at the point of expansion the carrier type is unknown. The dispatcher resolves the bind/map method per the participation rules and invokes it, adapting the generator closure to the declared functional-interface or Closure parameter.

The macro and the type-checking extension are compile-time only: the macro expands and vanishes, and the extension runs only during static compilation. The dispatcher is runtime support and resides in the core runtime, so a program compiled with DO requires only the core groovy jar at run time. The macro library and the type-checkers module are compile-time dependencies.
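The idea of a name-resolving dispatcher can be sketched in a few lines. This is not the implementation in org.apache.groovy.runtime.Comprehensions: the class name BindSketch, its lookup table, and the fallback logic are hypothetical, and the by-name and @Monadic rules are omitted.

```groovy
import java.util.concurrent.CompletableFuture
import java.util.concurrent.CompletionStage
import java.util.stream.Stream

// Hypothetical stand-in for the runtime dispatcher.
class BindSketch {
    // Type-keyed allow-list: carrier type -> bind method name.
    static final Map<Class, String> BIND_NAMES = [
        (Optional)       : 'flatMap',
        (Stream)         : 'flatMap',
        (CompletionStage): 'thenCompose',
    ]

    static Object bind(Object carrier, Closure fn) {
        // The first matching allow-list entry wins; otherwise use the structural name.
        def entry = BIND_NAMES.find { type, name -> type.isInstance(carrier) }
        String method = entry ? entry.value : 'flatMap'
        // Dynamic invocation; Groovy coerces the closure to the declared functional interface.
        carrier."$method"(fn)
    }
}

def viaOptional = BindSketch.bind(Optional.of(2)) { x -> Optional.of(x * 10) }
def viaFuture = BindSketch.bind(CompletableFuture.completedFuture(2)) { x ->
    CompletableFuture.completedFuture(x + 1)
}
assert viaOptional == Optional.of(20)
assert viaFuture.join() == 3
```

The same call site serves both carriers, which is why the macro can expand without knowing the carrier type.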

Relationship to the concurrency primitives

DO is a value-composition notation; it complements rather than competes with Groovy’s concurrency surface.

  • Awaitable / DataflowVariable are primary motivating carriers. Imperative async/await is the right tool when code reads as a sequence of dependent steps to run now; DO is the right tool when the composed Awaitable is the deliverable and will be combined further (raced, gathered, scheduled) before being awaited.

  • Parallel collections and @Parallel are deliberately not DO participants. collectParallel/collectManyParallel are map/flatMap with parallel semantics; emitting them from a comprehension would silently change the execution model. Collection participation, where offered, is sequential.

  • for await deserves a specific note because of the naming overlap. DO and for await operate at different layers: DO composes one result from a chain of dependent monadic steps and produces a single carrier value; for await consumes a stream-shaped source (a Flow.Publisher, an AsyncChannel, a generator) item by item. The two compose cleanly — a DO chain can live inside the body of a for await loop when each pulled item is itself worth composing further, and an Awaitable produced by DO can be one element of an upstream publisher consumed by for await — but neither is a substitute for the other. Pick DO when the question is "three dependent steps produce one answer"; pick for await when the question is "a stream of events to process as they arrive".

  • Generators, channels, actors and agents are stream or message-passing tools, orthogonal to DO. A composed Awaitable produced by DO may of course be consumed by an outer async/await, or yielded into a channel for downstream consumers.
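The "composed carrier as deliverable" case in the first bullet can be illustrated with CompletableFuture, which is on the standard allow-list and stands in here for Awaitable. The chain is written by hand in the shape DO would produce; price, tax and quote are hypothetical.

```groovy
import java.util.concurrent.CompletableFuture

// Hypothetical asynchronous steps.
CompletableFuture<Integer> price(String sku) { CompletableFuture.supplyAsync { sku.length() * 10 } }
CompletableFuture<Integer> tax(int amount) { CompletableFuture.supplyAsync { amount.intdiv(10) } }

// Hand-written equivalent of:
//   DO(p in price(sku), t in tax(p)) { CompletableFuture.completedFuture(p + t) }
CompletableFuture<Integer> quote(String sku) {
    price(sku).thenCompose { p ->
        tax(p).thenCompose { t -> CompletableFuture.completedFuture(p + t) }
    }
}

// The composed futures are values in their own right: combine them first, join once.
def total = quote('ab').thenCombine(quote('abc')) { a, b -> a + b }
assert total.join() == 55   // (20 + 2) + (30 + 3)
```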

The language specification’s tool-selection matrix carries a row for DO.

Comparison with related constructs

Monadic comprehensions occupy a design space shared with Scala’s for, Haskell’s do, F#'s computation expressions, and Kotlin’s coroutine-based composition. The table cross-references the load-bearing features of DO against the closest analogue in each.

Underlying abstraction
  Groovy DO: Allow-list (by type and by name) + structural duck typing + @Monadic; no abstraction hierarchy.
  Scala for: Name-based desugaring to flatMap/map/withFilter/foreach; no type-class.
  Haskell do: The Monad type class, backed by higher-kinded types.
  F# computation expr.: Builder methods (Bind, Return, …) on a user-supplied builder type.

Implicit pure/return
  Groovy DO: No. The body must yield a carrier value.
  Scala for: Inferred via the trailing yield form.
  Haskell do: Yes; return is a type-class member.
  F# computation expr.: Yes; Return on the builder.

Guards / filtering
  Groovy DO: No.
  Scala for: Yes; if guards desugar to withFilter.
  Haskell do: Yes; via MonadPlus.
  F# computation expr.: Yes; via builder methods.

Mixed-carrier composition
  Groovy DO: No; one carrier per DO. Nest for more.
  Scala for: No; transformers/libraries fill the gap.
  Haskell do: No; transformers fill the gap.
  F# computation expr.: Yes, via Bind overloads on the builder.

Carrier-shape enforcement
  Groovy DO: @CompileStatic via a type-checking extension; runtime MissingMethodException otherwise.
  Scala for: Compile-time via name resolution.
  Haskell do: Compile-time via the type class.
  F# computation expr.: Compile-time via builder-method resolution.

Carrier-method-name conventions
  Groovy DO: Structural (flatMap/map) + allow-list + @Monadic override.
  Scala for: Fixed names (flatMap/map/withFilter/foreach).
  Haskell do: Fixed type-class operations.
  F# computation expr.: Fixed builder-method names.

A few qualitative observations follow:

  • Abstraction commitment is the dividing line. Haskell and Scala invest in language-level abstraction so a single do/for form works generically. Groovy aligns most closely with Scala (no type-class, name-based desugaring, structural participation); the differences are the allow-list (for stdlib/third-party carriers with non-conventional names) and the @Monadic opt-in (for user types).

  • Implicit pure is an inference problem, not a design one. Scala and F# synthesise pure because they have target-typed inference; Groovy declines the inference in this version and requires the body to produce a carrier value explicitly.

  • @Monadic has no exact analogue. Scala does not need it because the conventional method names are universal in its ecosystem; F# does not need it because the builder is the opt-in. Groovy needs it because the conventional names are not universally followed across the JVM ecosystem, and structural matching alone is too narrow.

Cross-version evolution

Version (Year) Change

6.0 (2026) Initial release (incubating). The DO macro; the bind/map dispatcher and carrier registry in the core runtime; @groovy.transform.Monadic; the groovy.typecheckers.MonadicChecker type-checking extension (participation, inference, and closure-return shape under @CompileStatic); the sibling groovy.typecheckers.MonadicShapeChecker extension for native flatMap/map chains; the standard allow-list (Optional, Stream, CompletableFuture, CompletionStage, Awaitable, DataflowVariable) and the by-name recognition of common Functional Java and Vavr carriers (io.vavr.control.{Option, Try, Either, Validation}); general single-abstract-method coercion of the generator closure.

7.0 (TBD) Deferred from 6.0: user-configurable carrier registration, symmetric across the type checker and the runtime dispatcher. The 6.0 release ships with the built-in allow-list only; a configuration channel that lets applications and libraries extend the registry (covering both the runtime bind/map dispatcher and MonadicChecker/MonadicShapeChecker without divergence) is targeted for Groovy 7.0. No other spec-level changes are planned at the time of writing. This GEP is the canonical location to record future changes.

Non-goals and potential future extensions

The following are deliberately out of scope. Any of them would warrant a follow-up revision of this document or a successor GEP.

  • Higher-kinded types. No language-level support for abstracting over type constructors.

  • Implicit pure/unit lifting in the body. This version requires the body to produce a carrier value. A future revision may add it when target-type inference can be made to flow reliably through the rewritten chain.

  • Guard clauses (if filters inside DO). Not all carriers support filtering; adding guards would restrict the participant set or require carrier-specific filter desugaring.

  • Mixed-carrier comprehensions. Each DO works over a single carrier; nesting is the workaround.

  • Parallel desugaring. A form emitting collectParallel/collectManyParallel would change the execution model and is not offered.

  • Synthesis of flatMap/map from a single user-supplied bind via the @Monadic annotation, in the spirit of generated boilerplate.

  • A Functor/Applicative/Monad interface hierarchy in groovy.lang. Structural participation plus @Monadic covers the same ground without committing the language to a hierarchy that is awkward without higher-kinded types.

  • User-configurable carrier registration. The built-in allow-list ships in 6.0; a configuration channel symmetric across the checker and the runtime dispatcher is deferred until Groovy 7.0 (see Cross-version evolution). Until then, third-party carriers participate either by following the structural flatMap/map convention or by carrying @groovy.transform.Monadic; carriers with non-conventional method names that cannot be annotated (because the type is owned by another project) require a registry entry, which Groovy 6.0 only supports by extending the standard allow-list in core. The Vavr and Functional Java entries added in version 3 of this document are examples of that core-only path.

References

  • GEP-8: type-checking extensions, the mechanism the @CompileStatic rules rely on.

  • GEP-22: Traits — referenced for the structural-participation pattern and the spec-style format adopted here.

  • The language specification chapter on monadic comprehensions contains worked tutorial examples that complement this spec-only document.

  • The Groovy specification chapters on async/await, dataflow, and parallel collections — the concurrency carriers and the tool-selection matrix that DO extends.

  • Scala for-comprehensions and Haskell do-notation — the closest analogue notations.

  • Yallop & White, Lightweight Higher-Kinded Polymorphism — the formal basis for the higher-kinded-type simulations this proposal makes unnecessary for the common case.

Reference implementation

  • org.apache.groovy.macrolib.MacroLibGroovyMethods — hosts the DO macro method, registered alongside the other macro-library entries; parses the generator list, validates the name in expression shape, and emits the nested bind chain. Compile-time only.

  • org.apache.groovy.runtime.Comprehensions — the runtime bind/map dispatcher; resolves participation and invokes the carrier method, adapting the generator closure to the declared parameter type. Core runtime support.

  • org.apache.groovy.runtime.MonadicCarrierRegistry — the standard allow-list, both type-keyed and name-keyed; shared by the dispatcher and the type checker.

  • groovy.transform.Monadic — the opt-in annotation; a pure marker with optional bind/map string attributes and no AST transformation.

  • groovy.typecheckers.MonadicChecker — the type-checking extension that enforces the receiver and closure-return monadic shape and supplies static types under @CompileStatic.

  • groovy.typecheckers.MonadicShapeChecker — sibling type-checking extension that lints hand-written native flatMap/map chains against the same carrier set; independent of DO.

Public API:

  • groovy.transform.Monadic (since 6.0.0)

  • the DO macro, in the groovy-macro-library module (since 6.0.0)

  • groovy.typecheckers.MonadicChecker, in the groovy-typecheckers module (since 6.0.0)

  • groovy.typecheckers.MonadicShapeChecker, in the groovy-typecheckers module (since 6.0.0)

Representative JIRA issues

  • GROOVY-12021: Initial implementation of the DO monadic-comprehension macro and related type checkers.

Update history

4 (2026-05-22) Status moved from Draft to Final on merge of PR #2545 (GROOVY-12021). The DO macro, the bind/map dispatcher, the @Monadic annotation, the standard carrier registry (including the Functional Java and Vavr by-name entries), MonadicChecker and MonadicShapeChecker all ship in Groovy 6.0 as incubating. No spec-level changes from version 3; the user-configurable carrier registration channel remains deferred to Groovy 7.0.

3 (2026-05-21) Extends the standard by-name allow-list to recognise the Vavr control carriers (io.vavr.control.Option, io.vavr.control.Try, io.vavr.control.Either, io.vavr.control.Validation). Vavr’s carriers follow the structural flatMap/map convention, so the addition is a documentation and participation-test convenience rather than new dispatcher logic; Groovy takes no dependency on Vavr.

2 (2026-05-20) Adds closure-return shape enforcement to MonadicChecker (rejects bare-value and cross-carrier DO bodies for trusted carriers via the dispatcher); introduces the sibling groovy.typecheckers.MonadicShapeChecker extension for native flatMap/map/thenCompose/thenApply chains across the same carrier set.

1 (2026-05-19) Initial draft. Specifies the DO macro, the @Monadic annotation, the type-checking extension, the standard carrier allow-list (including by-name recognition of Functional Java), the runtime model and its dependency footprint, the relationship to the Groovy concurrency primitives, the comparison with Scala for, Haskell do, Kotlin coroutines and F# computation expressions, and the deliberate non-goals.