Circular Image

C.B. Poulsen

info

Please Note

19 records found

Journal article (2025) - Daniel A.A. Pelsmaeker, Aron Zwaan, Casper Bach, Arjan J. Mooij
Modern Integrated Development Environments (IDEs) offer automated refactorings to aid programmers in developing and maintaining software. However, implementing sound automated refactorings is challenging, as refactorings may inadvertently introduce name-binding errors or cause references to resolve to incorrect declarations. To address these issues, previous work by Schäfer et al. proposed replacing concrete references with locked references to separate binding preservation from transformation. Locked references vacuously resolve to a specific declaration, and after transformation must be replaced with concrete references that also resolve to that declaration. Synthesizing these references requires a faithful inverse of the name lookup functions of the underlying language. ...
Conference paper (2024) - Ivan Todorov, Casper Bach Poulsen
Expressive logics, such as the modal μ-calculus, can be used to specify and verify functional requirements of program models. While such verification is useful, a key challenge is to guarantee that the model being verified actually corresponds to the (typically effectful) program it is supposed to. We explore an approach that bridges this gap between effectful programming and functional requirement verification. Using dependently-typed programming in Agda, we develop an embedding of the modal μ-calculus for defining and verifying functional properties of possibly-non-terminating effectful programs which we represent in Agda using the coinductive free monad. ...
Conference paper (2024) - Luka Miljak, Casper Bach Poulsen, Rosilde Corvino
Software engineers should be able to apply massive code refactorings to maintain large legacy code bases. A key aspect of developing restructurings is matching and transforming code snippets using abstract syntax trees (ASTs). Matching on ASTs is typically done through AST patterns with holes. AST patterns can be extended to become metapatterns, which increase their expressivity. Metapattern examples include disjunctions, descendant patterns, and patches where we inline transformations into the pattern itself. Despite their expressivity, abstract syntax (meta)patterns can become verbose and require restructuring engineers to be intimately familiar with the data types that define the AST. A better approach is to use concrete syntax patterns, which allows us to denote our patterns in the syntax of the object language. Previous work has shown that we can use external black-box parsers of the object language to compile concrete syntax patterns for arbitrary languages. In this paper, we scale this compilation method to support concrete syntax metapatterns, which allows for a more declarative way of expressing restructurings. We evaluate this method through an implementation written in Kotlin. ...
Conference paper (2024) - Aron Zwaan, Casper Bach Poulsen
Many programming languages allow programmers to regulate accessibility; i.e., annotating a declaration with keywords such as export and private to indicate where it can be accessed. Despite the importance of name accessibility for, e.g., compilers, editor auto-completion and tooling, and automated refactorings, few existing type systems provide a formal account of name accessibility. We present a declarative, executable, and language-parametric model for name accessibility, which provides a formal specification of name accessibility in Java, C#, C++, Rust, and Eiffel. We achieve this by defining name accessibility as a predicate on resolution paths through scope graphs. Since scope graphs are a language-independent model of name resolution, our model provides a uniform approach to defining different accessibility policies for different languages. Our model is implemented in Statix, a logic language for executable type system specification using scope graphs. We evaluate its correctness on a test suite that compares it with the C#, Java, and Rust compilers, and show we can synthesize access modifiers in programs with holes accurately. ...
Conference paper (2023) - Casper Bach Poulsen, Aron Zwaan, Paul Hübner
An important aspect of type checking is name resolution — i.e., determining the types of names by resolving them to a matching declaration. For most languages, we can give typing rules that define name resolution in a way that abstracts from what order different units of code should be checked in. However, implementations of type checkers in practice typically use multiple phases to ensure that declarations of resolvable names are available before names are resolved. This gives rise to a gap between typing rules that abstract from order of type checking and multi-phased type checkers that rely on explicit ordering.

This paper introduces techniques that reduce this gap. First, we introduce a monadic interface for phased name resolution which detects and rejects type checking runs with name resolution phasing errors where names were wrongly resolved because some declarations were not available when they were supposed to be. Second, building on recent work by Gibbons et al., we use applicative functors to compositionally map abstract syntax trees onto (phased) monadic computations that represent typing constraints. These techniques reduce the gap between type checker implementations and typing rules in the sense that (1) both are given by compositional mappings over abstract syntax trees, and (2) type checker cases consist of computations that roughly correspond to typing rule premises, except these are composed using monadic combinators. We demonstrate our approach by implementing type checkers for Mini-ML with Damas-Hindley-Milner type inference, and LM, a toy module language with a challenging import resolution policy.
...
Conference paper (2023) - Casper Bach Poulsen
Substitution is a common and popular approach to implementing name binding in definitional interpreters. A common pitfall of implementing substitution functions is variable capture. The traditional approach to avoiding variable capture is to rename variables. However, traditional renaming makes for an inefficient interpretation strategy. Furthermore, for applications where partially-interpreted terms are user facing it can be confusing if names in uninterpreted parts of the program have been changed. In this paper we explore two techniques for implementing capture avoiding substitution in definitional interpreters to avoid renaming. ...

Modular Elaboration of Higher-Order Algebraic Effects

Journal article (2023) - Casper Bach Poulsen, Cas Van Der Rest
Algebraic effects and handlers is an increasingly popular approach to programming with effects. An attraction of the approach is its modularity: effectful programs are written against an interface of declared operations, which allows the implementation of these operations to be defined and refined without changing or recompiling programs written against the interface. However, higher-order operations (i.e., operations that take computations as arguments) break this modularity. While it is possible to encode higher-order operations by elaborating them into more primitive algebraic effects and handlers, such elaborations are typically not modular. In particular, operations defined by elaboration are typically not a part of any effect interface, so we cannot define and refine their implementation without changing or recompiling programs. To resolve this problem, a recent line of research focuses on developing new and improved effect handlers. In this paper we present a (surprisingly) simple alternative solution to the modularity problem with higher-order operations: we modularize the previously non-modular elaborations commonly used to encode higher-order operations. Our solution is as expressive as the state of the art in effects and handlers. ...
Conference paper (2023) - Luka Miljak, Casper Bach Poulsen, Flip van Spaendonck
The goal of automated refactoring is to reduce maintenance effort. To realize this, programmers need to be able to trust or manually check that refactorings actually preserve behavior. To allow programmers to focus on such checks, automated refactorings should preserve program well-typedness. However, historically automated refactorings in popular IDEs could break well-typedness. The reason is that modern languages have complex name binding semantics which makes it hard to guarantee well-typedness in general. In recent work, scope graphs have been proposed as a uniform model for name binding. The model supports complex name binding patterns, and its uniformity makes it attractive to consider for verifying that refactorings preserve well-typedness. This paper explores how to prove that refactorings preserve well-typedness, using scope graphs. We consider a simple refactoring for merging modules in a toy module language, and prove that this refactoring preserves well-typedness. We give a generic template for proving well-typedness preservation using scope graphs, and discuss how this template relates to refactorings more generally. ...
Code completion is an editor service in IDEs that proposes code fragments for the user to insert at the caret position in their code. Code completion should be sound and complete. It should be sound, such that it only proposes fragments that do not violate the syntactic and static semantic rules of the language. It should be complete, such that it proposes all valid fragments so that code completion can be used to construct all programs. To realize soundness and completeness, code completion should be informed by the language definition. In practice, the implementation of code completion is an additional effort in the implementation of a language.

In this paper, we develop a framework for language-parametric semantic code completion for statically typed programming languages based on their specification of syntax and static semantics, realizing the implementation of a code completion editor service with minimal additional effort. The framework builds on the SDF3 syntax definition formalism and the Statix static semantics specification language. The algorithm reinterprets the static semantics definition to find sound expansions of predicates and solutions to name resolution queries in scope graphs. This allows a search strategy to explore the solution space and synthesize completion proposals. The implementation of the strategy language and code completion algorithm extend the implementation of the Statix solver, and can be used for any language defined in Statix. We demonstrate soundness and completeness of the completion proposal synthesis, and evaluate its performance. ...
Specifying and mechanically verifying type safe programming languages requires significant effort. This effort can in theory be reduced by defining and reusing pre-verified, modular components. In practice, however, existing approaches to modular mechanical verification require many times as much specification code as plain, monolithic definitions. This makes it hard to develop new reusable components, and makes existing component specifications hard to grasp. We present an alternative approach based on intrinsically-typed interpreters, which reduces the size and complexity of modular specifications as compared to existing approaches. Furthermore, we introduce a new abstraction for safe-by-construction specification and composition of pre-verified type safe language components: language fragments. Language fragments are about as concise and easy to develop as plain, monolithic intrinsically-typed interpreters, but require about 10 times less code than previous approaches to modular mechanical verification of type safety. ...
Conference paper (2021) - Birthe van den Berg, Tom Schrijvers, Casper Bach Poulsen, Nicolas Wu
The development of programming languages can be quite complicated and costly. Hence, much effort has been devoted to the modular definition of language features that can be reused in various combinations to define new languages and experiment with their semantics. A notable outcome of these efforts is the algebra-based “datatypes à la carte” (DTC) approach. When combined with algebraic effects, DTC can model a wide range of common language features. Unfortunately, the current state of the art does not cover modular definitions of advanced control-flow mechanisms that defer execution to an appropriate point, such as call-by-name and call-by-need evaluation, as well as (multi-)staging. This paper defines latent effects, a generic class of such control-flow mechanisms. We demonstrate how function abstractions, lazy computations and a MetaML-like staging can all be expressed in a modular fashion using latent effects, and how they can be combined in various ways to obtain complex semantics. We provide a full Haskell implementation of our effects and handlers with a range of examples. ...
An intrinsically-typed definitional interpreter is a concise specification of dynamic semantics, that is executable and type safe by construction. Unfortunately, scaling intrinsically-typed definitional interpreters to more complicated object languages often results in definitions that are cluttered with manual proof work. For linearly-typed languages (including session-typed languages) one has to prove that the interpreter, as well as all the operations on semantic components, treat values linearly. We present new methods and tools that make it possible to implement intrinsically-typed definitional interpreters for linearly-typed languages in a way that hides the majority of the manual proof work. Inspired by separation logic, we develop reusable and composable abstractions for programming with linear operations using dependent types. Using these abstractions, we define interpreters for linear lambda calculi with strong references, concurrency, and session-typed communication in Agda ...

Sound scheduling of name resolution in type checkers derived from declarative specifications

There is a large gap between the specification of type systems and the implementation of their type checkers, which impedes reasoning about the soundness of the type checker with respect to the specification. A vision to close this gap is to automatically obtain type checkers from declarative programming language specifications. This moves the burden of proving correctness from a case-by-case basis for concrete languages to a single correctness proof for the specification language. This vision is obstructed by an aspect common to all programming languages: name resolution. Naming and scoping are pervasive and complex aspects of the static semantics of programming languages. Implementations of type checkers for languages with name binding features such as modules, imports, classes, and inheritance interleave collection of binding information (i.e., declarations, scoping structure, and imports) and querying that information. This requires scheduling those two aspects in such a way that query answers are stable-i.e., they are computed only after all relevant binding structure has been collected. Type checkers for concrete languages accomplish stability using language-specific knowledge about the type system. In this paper we give a language-independent characterization of necessary and sufficient conditions to guarantee stability of name and type queries during type checking in terms of critical edges in an incomplete scope graph. We use critical edges to give a formal small-step operational semantics to a declarative specification language for type systems, that achieves soundness by delaying queries that may depend on missing information. This yields type checkers for the specified languages that are sound by construction-i.e., they schedule queries so that the answers are stable, and only accept programs that are name-and type-correct according to the declarative language specification. We implement this approach, and evaluate it against specifications of a small module and record language, as well as subsets of Java and Scala. ...
Symbolic execution is a technique for automatic software validation and verification. New symbolic executors regularly appear for both existing and new languages and such symbolic executors are generally manually (re)implemented each time we want to support a new language. We propose to automatically generate symbolic executors from language definitions, and present a technique for mechanically (but as yet, manually) deriving a symbolic executor from a definitional interpreter. The idea is that language designers define their language as a monadic definitional interpreter, where the monad of the interpreter defines the meaning of branch points. Developing a symbolic executor for a language is a matter of changing the monadic interpretation of branch points. In this paper, we illustrate the technique on a language with recursive functions and pattern matching, and use the derived symbolic executor to automatically generate test cases for definitional interpreters implemented in our defined language. ...
Conference paper (2018) - Sven Keidel, Casper Poulsen, Sebastian Erdweg
Abstract interpretation is a technique for developing static analyses. Yet, proving abstract interpreters sound is challenging for interesting analyses, because of the high proof complexity and proof effort. To reduce complexity and effort, we propose a framework for abstract interpreters that makes their soundness proof compositional. Key to our approach is to capture the similarities between concrete and abstract interpreters in a single shared interpreter, parameterized over an arrow-based interface. In our framework, a soundness proof is reduced to proving reusable soundness lemmas over the concrete and abstract instances of this interface; the soundness of the overall interpreters follows from a generic theorem. To further reduce proof effort, we explore the relationship between soundness and parametricity. Parametricity not only provides us with useful guidelines for how to design non-leaky interfaces for shared interpreters, but also provides us soundness of shared pure functions as free theorems. We implemented our framework in Haskell and developed a k-CFA analysis for PCF and a tree-shape analysis for Stratego. We were able to prove both analyses sound compositionally with manageable complexity and effort, compared to a conventional soundness proof. ...
Journal article (2018) - Casper Bach Poulsen, Arjen Rouvoet, Andrew Tolmach, Robbert Krebbers, Eelco Visser
A definitional interpreter defines the semantics of an object language in terms of the (well-known) semantics of a host language, enabling understanding and validation of the semantics through execution. Combining a definitional interpreter with a separate type system requires a separate type safety proof. An alternative approach, at least for pure object languages, is to use a dependently-typed language to encode the object language type system in the definition of the abstract syntax. Using such intrinsically-typed abstract syntax definitions allows the host language type checker to verify automatically that the interpreter satisfies type safety. Does this approach scale to larger and more realistic object languages, and in particular to languages with mutable state and objects? In this paper, we describe and demonstrate techniques and libraries in Agda that successfully scale up intrinsically-typed definitional interpreters to handle rich object languages with non-trivial binding structures and mutable state. While the resulting interpreters are certainly more complex than the simply-typed lambda-calculus interpreter we start with, we claim that they still meet the goals of being concise, comprehensible, and executable, while guaranteeing type safety for more elaborate object languages. We make the following contributions: (1) A _dependent-passing style_ technique for hiding the weakening of indexed values as they propagate through monadic code. (2) An Agda library for programming with scope graphs and frames, which provides a uniform approach to dealing with name binding in intrinsically-typed interpreters. (3) Case studies of intrinsically-typed definitional interpreters for the simply-typed lambda-calculus with references (STLC+Ref) and for a large subset of Middleweight Java (MJ). ...
Scope graphs are a promising generic framework to model the binding structures of programming languages, bridging formalization and implementation, supporting the definition of type checkers and the automation of type safety proofs. However, previous work on scope graphs has been limited to simple, nominal type systems. In this paper, we show that viewing scopes as types enables us to model the internal structure of types in a range of non-simple type systems (including structural records and generic classes) using the generic representation of scopes. Further, we show that relations between such types can be expressed in terms of generalized scope graph queries. We extend scope graphs with scoped relations and queries. We introduce Statix, a new domain-specific meta-language for the specification of static semantics, based on scope graphs and constraints. We evaluate the scopes as types approach and the Statix design in case studies of the simply-typed lambda calculus with records, System F, and Featherweight Generic Java.Scope graphs are a promising generic framework to model the binding structures of programming languages, bridging formalization and implementation, supporting the definition of type checkers and the automation of type safety proofs. However, previous work on scope graphs has been limited to simple, nominal type systems. In this paper, we show that viewing scopes as types enables us to model the internal structure of types in a range of non-simple type systems (including structural records and generic classes) using the generic representation of scopes. Further, we show that relations between such types can be expressed in terms of generalized scope graph queries. We extend scope graphs with scoped relations and queries. We introduce Statix, a new domain-specific meta-language for the specification of static semantics, based on scope graphs and constraints. We evaluate the scopes as types approach and the Statix design in case studies of the simply-typed lambda calculus with records, System F, and Featherweight Generic Java. ...

A uniform model for memory layout in dynamic semantics

Conference paper (2016) - Casper Bach Poulsen, Pierre Néron, Andrew Tolmach, Eelco Visser
Semantic specifications do not make a systematic connection between the names and scopes in the static structure of a program and memory layout, and access during its execution. In this paper we introduce a systematic approach to the alignment of names in static semantics and memory in dynamic semantics, building on the scope graph framework for name resolution. We develop a uniform memory model consisting of frames that instantiate the scopes in the scope graph of a program. This provides a language-independent correspondence between static scopes and run-time memory layout, and between static resolution paths and run-time memory access paths. The approach scales to a range of binding features, supports straightforward type soundness proofs, and provides the basis for a language-independent specification of sound reachability-based garbage collection. ...

A Uniform Model for Memory Layout in Dynamic Semantics (Artifact)

Other (2016) - Casper Bach Poulsen, Pierre Neron, Andrew Tolmach, Eelco Visser
Our paper introduces a systematic approach to the alignment of names in the static structure of a program, and memory layout and access during its execution. We develop a uniform memory model consisting of frames that instantiate the scopes in the scope graph of a program. This provides a language-independent correspondence between static scopes and run-time memory layout, and between static resolution paths and run-time memory access paths. The approach scales to a range of binding features, supports straightforward type soundness proofs, and provides the basis for a language-independent specification of sound reachability-based garbage collection. This Coq artifact showcases how our uniform model for memory layout in dynamic semantics provides structure to type soundness proofs. The artifact contains type soundness proofs mechanized in Coq for (supersets of) all languages in the paper. The type soundness proofs rely on a language-independent framework formalizing scope graphs and frame heaps. ...