EV

E. Visser

info

Please Note

21 records found

Master thesis (2022) - T. Molendijk, E. Visser, C.B. Poulsen, A. van Deursen, A. Tolmach
The dynamic semantics of a programming language formally describe the runtime behavior of any given program. In this thesis, we present Dynamix, a meta-language for dynamic semantics. By writing a specification for a language in Dynamix, a compiler for the language can be derived automatically.

Dynamix specifications compile source programs to the Tim intermediate representation, a language-agnostic target IR designed to be able to be efficiently interpreted or compiled. Dynamix and Tim make use of the continuation-passing style to abstract over control flow, giving language developers fine-grained access to control flow primitives in their specification. A novel abstraction in Dynamix allows the construction of these CPS terms without the traditional friction involved. Dynamix is fully typed and integrated within the Spoofax language workbench. This allows language developers to interact directly with other parts of the workbench, including automatic type signature generation and the ability to query the results of static analysis.

Through case studies for miniStratego and ChocoPy with exceptions, we show that Dynamix is capable of succinctly representing a wide range of source language features and paradigms. Current performance is acceptable, with the foundations for a future efficient compiler for the Tim IR already in place. ...
Build systems speed up builds by reusing build step outputs from previous builds when possible.
This requires precise definitions of the dependencies for build steps.
PIE is a build system with precise dependencies, but its task definitions in Java are verbose.
The PIE DSL allows pipeline developers to write concise definitions of PIE tasks, but the PIE framework has evolved and the PIE DSL cannot express many tasks and projects.

This thesis presents PIE DSL 2, which improves on PIE DSL 1 in three areas.
It extends the language itself with a module system, generics and support for suppliers and injected values, which allows it to express more tasks within the DSL.
There are four improvements for the code base.
The first two are a specification of the static semantics in Statix and a new compiler backend that compiles to an AST instead of using string interpolation, both of which extend the features for the DSL that can be expressed.
The second pair is constructors for semantic types and tests, which improve development speed of the DSL.
The final area we improve is the user experience, which we improve by adding documentation for expressions and types in the PIE DSL.

We compare PIE DSL 2 to Java in a case study.
Only a single task can be expressed in the DSL, which means that the boilerplate is not reduced.
Furthermore, the Java ecosystem has better error detection except for nullability.
Finally, the PIE DSL is simpler than Java, but only when the full pipeline is supported by the DSL.
We conclude that the DSL is not better than Java for full projects yet, but for tasks that it can express it is a slight improvement over Java.
This leads to the hypothesis that it has potential to become better if it can express enough tasks.

Due to time constraints, the case study did not use the latest version of the DSL.
In theory the latest version of the DSL can express 11 of the 19 tasks, but this has not been verified experimentally.

Overall, this thesis makes improvements to the PIE DSL and its environment, but that has not translated to the DSL being better than Java. ...

The Use of a Domain-Specific Language for Route Finding in Digital Printing Systems

Master thesis (2022) - B. van Walraven, A.E. Zaidman, B.P. Ahrens, Marvin Brunner, J. Denkers, E. Visser
Digital printing systems allow for the production of a large variety of different products. Making production plans for all these different products is challenging. One of the challenging aspects of making these production plans is choosing the right sequence of machines, to produce the desired intent. This is challenging due to three aspects: the large number of interdependent variables in the problem instances, the variability of machines, and the search for the best solution from a large set of valid solutions. In this thesis, we implement and evaluate the use of a domain-specific language (DSL) called RSX (Routing Space eXploration), to assist in choosing a sequence of machines. We do this together with an industrial partner. For RSX we use a model-driven approach, and it can be used to model the devices, production steps, and product properties of the digital printing domain. It transforms those into a constraint model described in the MiniZinc language, which is used as input for a constraint solver. We present the implementation of the RSX language and MiniZinc constraint model, and we evaluate the language coverage, accuracy, and performance. From these evaluations, we conclude that RSX can be used to model a number of cases, which were characteristic in the context of our industrial partner. Furthermore, we conclude that RSX can compile and solve the evaluated cases in the order of a few seconds and that the implementation is accurate, such that it can be used as a proof of concept.
...
The Scannerless Generalized-LR (SGLR) parsing algorithm allows parsing of declarative and modular syntax definitions. However, SGLR is notorious for having low performance, negatively impacting its adoption in practice. This paper presents several performance optimizations for JSGLR2, which is an implementation of SGLR. All optimizations are implemented and evaluated in parallel, which is possible due to JSGLR2's modular architecture. The evaluation is performed using existing sources from three different languages. A combined speed-up of 9% up to 44% is achieved, improving the practicality of JSGLR2. ...
Master thesis (2021) - Maarten P. Sijm, J. Denkers, E. Visser, C. Bach Poulsen, T. van der Storm, P.D. Mosses
The Scannerless Generalized LR (SGLR) parsing algorithm supports the development of composed languages seamlessly but does not support incremental parsing. The Incremental Generalized LR (IGLR) parsing algorithm, on the other hand, does not support the seamless composition of languages. This thesis presents the Incremental Scannerless Generalized LR (ISGLR) parsing algorithm and investigates the effects of combining the SGLR and IGLR parsing algorithms. While the algorithmic differences are orthogonal, the fact that scannerless parsing relies on non-deterministic parsing for disambiguation has a negative impact on incrementality. Nonetheless, we show that the ISGLR parsing algorithm performs better than the batch SGLR parsing algorithm in typical scenarios. On average, the ISGLR parser can reuse 99% of a previous parse result. When parsing from scratch, the ISGLR parser has a 24% run time overhead compared to the SGLR parser, but when parsing incrementally for changes that are smaller than 1% of the input size on average, it has a 9× speedup. ...

Language-parametric Renaming in Spoofax

A refactoring is a program transformation that improves the design of the source code, while preserving its behavior. Most modern IDEs offer a number of automated refactorings as editor services. The Rename refactoring is the most-commonly applied refactoring and is used to change the identifier of a program entity such as a variable, a function, or a type. Correctly implementing refactorings is notoriously complex and these state-of-the-art implementations are known to be faulty and too restrictive. When developing a new programming language, it is both difficult and time-consuming to implement sound and complete automated refactoring transformations. Language-parametric definitions of refactorings that can be reused by instantiation with the syntax and semantics of a language, allow the development effort of refactorings to be amortized across language implementations. In this thesis, we developed a language-parametric Rename refactoring algorithm that works on an abstract model of a program's name binding structure. We implemented the algorithm in the Spoofax language workbench, building on the language-parametric representation of name binding with scope graphs and using generic traversals in the Stratego transformation language. We evaluated the algorithm with five different languages implemented in Spoofax, which uses both NaBL2 and Statix to declare their static semantics and name binding rules. As a result, Spoofax now provides an automated Rename refactoring that works for any language developed with the language workbench using NaBL2/Statix. ...
Master thesis (2021) - B.R.A. Bot, C.B. Poulsen, E. Visser, B. Ozkan
Compilers translate high-level source code into low-level machine code. To represent source code a compiler uses a language called the intermediate representation (IR). An IR for the compilation of functional languages is continuation-passing style (CPS). It provides convenient abstractions for both data flow and control flow. However, CPS conversion is hard to write and the transformations on CPS are untyped. In this thesis we develop an IR based on CPS using the command tree data structure. Command trees allow us to express compiler transformations typically, declaratively, and modularly. The monadic nature of command trees allows us to bind commands together in a succinct manner. We test the usefulness of the new IR by building two versions of the LamToWat compiler that translates the lamdba calculus into WebAssembly. The first version will use a CPS IR and the second version a command tree IR. ...

Using Parse Tree Repairing for Showing Safety and Completeness of Associativity and Priority Rules

Master thesis (2021) - L. Miljak, E. Visser, Robbert Krebbers, N. Yorke-Smith
Context-free grammars (CFGs) provide a well-known formalism for the specification of programming languages. They describe the structure of a program in terms of parse trees. One major issue of CFGs is ambiguity, where one sentence can sometimes have multiple different parse trees. Some formalisms like SDF3 or YACC allow annotating a grammar with disambiguation rules, such as priority or associativity. Disambiguation rules filter out certain parse trees, making a grammar less ambiguous. Giving a formal semantics for these disambiguation rules is still an ongoing research topic. In this thesis we verify an existing semantics for these rules by Souza Amorim and Visser (2019) for a subset of expression grammars. These grammars may contain infix, prefix, and postfix expressions. We verify the semantics by proving that it is both safe and complete. Safety states adding disambiguation rules will not change the underlying language of the grammar, meaning each sentence in the language will have at least one valid parse tree that does not get filtered out. Completeness guarantees that a grammar is unambiguous, meaning that each sentence in the language will have at most one valid parse tree that does not get filtered out. We have mechanized the proofs in the Coq Proof Assistant, increasing the confidence in their correctness. As part of the proofs, we also provide a verified implementation for disambiguation rules. ...

Designing and Using a Frame-Based Virtual Machine

This thesis introduces the FrameVM virtual machine and the Framed language. This language gives developers a target to compile to which concisely follows the scopes-as-frames model. This model allows language developers to derive the memory model based on the scope graphs. The core building blocks of Framed are frames, which contain all data including code. To demonstrate the viability of this model this paper also introduces a compiler from Scheme to Framed, focussing on complex control structures such as call-with-current-continuation and closures. As a result we aim to show that Framed is usable as a target language for compiling, even though it does not have a stack nor registers.
...
Master thesis (2021) - Jens de Waard, E. Visser, G. Gousios, R.J. Krebbers, S. Keidel
Abstract interpretation is a way of approximating the semantics of a computer program, in which we derive properties of those programs without actually performing the necessary computations for running the program, through the use of an abstract interpreter. To be able to trust the result of the abstract interpretation, we would to able to prove the soundness of the approximations of the interpreter. Previous work by Keidel et al. has shown that the soundness proofs of an entire abstract interpreter can be simplified by decomposing the interpreter by implementing concrete and abstract interpreters as instantiations of a generic interpreter. The goal of this thesis is to explore and implement mechanical proofs of soundness of such interpreters. To this end, we have used the interactive proof assistant Coq to implement a generic interpeter for a simple imperative language and instantiate it both concrete and abstract versions. The abstract interpreter is automatically proven sound via the use of Coq's automatic proof capabilities and typeclass system. Both the interpreted language and the used abstractions can be expanded to allow for more features. Soundness proofs can then be written for just the new components, those proofs will then be automatically resolved by Coq. ...
Static Analysis is of indispensable value for the robustness of software systems and the efficiency of developers. Moreover, many modern-day software systems are composed of interacting subsystems written in different programming languages. However, in most cases no static validation of these interactions is applied. In this thesis, we identify the Cross-Language Static Semantics Problem, which is defined as "How to provide a formal and executable specification of the static semantics of interactions between parts of a software system written in different languages?" We investigate current solutions to this problem, and propose criteria to which an all-encompassing solution to this problem must adhere. After that, we present a design pattern for the Statix meta-DSL for static semantics specification that allows to model loosely coupled, composable type system specifications. This pattern entails that the semantic concepts of a particular domain are encoded in an interface specification library, which is integrated in the type system of concrete languages. This allows controlled but automated composition of type systems. We show that, under some well-formedness criteria, the system provides correct results. A runtime, executing composed specifications, is implemented using PIE pipelines for partial incrementality, and integrated in the command-line interface and Eclipse IDE platforms, using the Spoofax 3 Framework. This allows using multi-language analysis in concrete projects. The design pattern, and the accompanying runtime are validated using two case studies. These case studies show that the approach is effective, even in a case where there is an impedance mismatch in the data models of the involved languages. ...
Doctoral thesis (2021) - A.J. Rouvoet, E. Visser, R.J. Krebbers
Programming language implementations bridge the gap between what the program developer sees and understands, and what the computer executes. Hence, it is crucial for the reliability of software that language implementations are correct. Correctness of an implementation is judged with respect to a criterion. In this thesis, we focus on the criterion type correctness, striking a balance between the difficulty of the assessment of the criterion and its usefulness to rule out errors throughout a programming language implementation. If both the front- and the back-end fulfill their role in maintaining the type contract between the programmer and the language implementation, then unexpected type errors will not occur when the program is executed. To verify type correctness throughout a language implementation, we want to establish it formally. That is, we aim to give a specification of program typing in a formal language, and to give a mathematical proof that every part of the language implementation satisfies the necessary property to make the whole implementation type-correct. Type checkers ought to be sound and only accept programs that are indeed typeable according to the specification of the language. Interpreters should be type safe, and reduce expressions to values of the same type. Program compilers should preserve well-typing when they transform programs. These properties are essential for implementations of typed programming languages, ensuring that the typing of the source program is a meaningful notion that can be trusted by the programmer to prevent certain errors from occurring during program execution. A conventional formal type- ...

Declarative dynamic semantics on a VM using scopes as frames

Over the years virtual machines (VMs) have been created to abstract over computer hardware. This simplified code generation and allowed for easy portability between hardware platforms. These VMs are however highly tailored to a particular runtime model. This improves the execution speed, but places restrictions on the types of languages that the VM supports. In this thesis the Frame VM was developed as a VM that supports many different types of languages in a principled way. Achieving this is done by basing the VM on language independent models of memory and control flow. Usage of the scopes-as-frames paradigm and control frames resulted in an instruction set that is relatively small at its core, but does allow for the construction of complex control flow. As an effect, many different programming languages can be compiled to the Frame VM. In addition to this VM, a Domain Specific Language (DSL) for executable semantics of programming languages was created. This language, Dynamix, allows for a modular approach to writing the semantics of a language. Additionally, Dynamix provides a meta-compiler that uses these semantics of a language to compile programs to the Frame VM. To validate the Frame VM, direct compilers for Rust and Prolog have been created in a student project and compilers for Scheme and Tiger were created using Dynamix. Using these semantics of Scheme and Tiger, it was possible to execute programs containing usage of call/cc and a suite of Tiger benchmark programs. Furthermore, the control flow of Tiger was extended with exceptions and generator functions. This extension did not require any changes to the existing semantics, showing the modularity of control achieved when using Dynamix and the Frame VM. ...
The Spoofax Testing Language (SPT) is the existing solution for testing in the Spoofax language workbench. It allows developers of domain specific languages to write their test cases declaratively. As it aims to be implementation agnostic, developers don't need to concern themselves with the details of the artifacts generated by Spoofax, and can write their tests before implementing their language. However, the previous implementation has become slow and unusable for larger test suites and can not be executed programatically. This means it can't be used for continuous integration and automated regression testing.
As Spoofax was redesigned to become more robust and platform independent, the previous SPT is no longer compatible. We took this opportunity to redesign SPT. In this thesis we will discuss the benefits of a testing approach like SPT, how far along it is on the path of testing any language, and what is required to make it usable by modern day developers. We will analyze the problems that SPT had to tackle, how it solved them, and which problems still remain.
Finally, we present and evaluate our new design and implementation to solve some of these remaining problems. We created a platform independent, real-time performant, easily extendable architecture that allows SPT to be used for automated tasks such as continuous integration and the automated grading of students' domain specific languages. ...

A Modular and Incremental Approach for Type Checking and Name Binding using Scope Graphs

Statix is a language which generates a type checker from a declarative specification. However, Statix is not fast enough for quick feedback in IDEs because it always has to reanalyze all files. In this thesis, we improve the analysis time of Statix by applying the ideas of separate compilation to create a model for incremental analysis. Statix uses a scope graph for representing the scoping and declarations of a project. We split the scope graph over multiple smaller modules which can be analyzed in isolation. Our model automatically detects dependencies between modules and supports creating scope graph diffs to determine modules affected by changes with high precision. The modules from our model can be solved in parallel, already yielding performance improvements of up to 40% compared to the original implementation. Finally, we implement an optimistic strategy with our model and show that it is effective whenever changes are small, reducing analysis time by up to 5 times for large projects. ...

Template programming for Siemens SCL

Programmable Logic Controllers (PLCs) are used to control large scale systems with thousands of I/O devices. Writing and maintaining the logic for each of these is a cumbersome task, which is well suited to be abstracted through templating. For this purpose, CERN developed the Unicos Application Builder (UAB). While UAB is successful at templating, it provides no guarantees over the validity of the outcome, leading to erroneous generated code. This is where SCL-T comes in. It builds on the foundation of UAB to facilitate meta-programming for Siemens’ SCL. Unlike its predecessor, it guarantees syntactic correctness and also draw conclusions regarding the semantic validity of the generated code. Its architecture has been designed in such a way that support for other PLC languages can be added using the same meta-language, reducing the cost of a having a meta-programming language tailored for a specific PLC language. ...
Master thesis (2018) - Olaf Maas, Eelco Visser, Peter Mosses, Jan S. Rellermeyer
Language Workbenches are instruments developers use to create new domain-specific languages. They provide tools to rapidly develop, test and deploy new languages. Currently, workbenches support deployment in desktop-based integrated development environments. Setting up these environments can be a hurdle for the often non-technical users of these languages. Web-Based IDEs could be a solution in this case, but workbenches are currently not able to deploy languages in these environments.

This work presents the first step towards language workbenches in Web IDEs by creating a language parametric runtime for the browser which serves as a back-end for Spoofax. Combined with an editor, this runtime is the basis for the generation of entirely client-side language playgrounds based on Spoofax specifications. For parsing, this runtime has similar performance characteristics as the existing Spoofax implementation. Code execution in this runtime can be used in environments where performance is not critical. ...
Master thesis (2018) - Daniël Pelsmaeker, Eelco Visser, Gabriël Konat, Sebastian Erdweg, Rafael Bidarra
Implementing the syntax and semantics of a new (domain-specific) programming language is a lot of work, which is worsened by the additional work needed to add support for the language to an editor such as Eclipse or VS Code. Lack of such support can impede language usability and adoption, as developers prefer different editors. However, supporting M editors for N languages requires M * N implementations to be built and maintained, which is known as the IDE portability problem. Portable editor services aim to reduce this to M + N implementations, which leads to the main question of this thesis: how can we make the editor services of languages portable across editors?
Language definitions made in the Spoofax language workbench can automatically expose their editor services in any editor that Spoofax supports. Therefore, we evaluate the portability of Spoofax Core, the editor-agnostic core of Spoofax, through an implementation of the workbench in the IntelliJ editor.
To get portability for editor services of languages in general, we first investigate how editor services can be added to the most popular editors, and explore their features, documentation, and API. From this, we derive a platform-agnostic model for portable editor services: AESI, the Adaptable Editor Services Interface. AESI describes the maximum set of common editor service features supported by the editors we investigated, while at the same time imposing minimal requirements upon any implementation of these editor services. We evaluate AESI by providing two language implementations, and adapting AESI to Eclipse, IntelliJ, and VS Code. Finally, we compare AESI with two other solutions to the IDE portability problem: LSP and Monto. ...
Master thesis (2018) - Tim Rensen, Eelco Visser, Nava Tintarev, Sebastian Erdweg
LeQuest develops interactive e-training modules to improve the competence regarding medical technology of medical professionals. The medical technology is analysed by LeQuest to develop training modules, but the analysis process and writing associated information can be performed more efficiently. This would reduce the required time and resources which could be invested in additional trainings and quality improvements. In the end, this will lead to an improvement regarding the patient's safety in health institutions. This work empirically evaluated the Spoofax Workbench by conducting an industrial case-study which consists of the design, implementation and evaluation of a domain-specific language (DSL). The LeQuest DSL is used as a tool for transforming the current analysis process into a more formalized process which does allow for objective observations, measurements and quantifiable information. Although the LeQuest DSL is not integrated in the current work-flow yet, the evaluation has shown that it is expected that the overall quality and efficiency of the analysis process will increase after the introduction of the DSL. ...
Master thesis (2018) - Martijn Dwars, Eelco Visser, Hendrik van Antwerpen
Testing is the most commonly used technique for raising confidence in the correctness of a piece of software, but constructing an effective test suite is expensive and prone to error. Property-based testing partly automates this process by testing whether a property holds for all randomly generated inputs, but its effectiveness depends upon the ability to automatically generate random test inputs. When using property-based testing to test a compiler backend, the problem becomes that of generating random programs that pass the parsing and analysis phase. We present SPG (SPoofax Generator), a language-parametric generator of random well-formed terms. We describe three experiments in which we evaluate the effectiveness of SPG at discovering different kinds of compiler bugs. Furthermore, we analyze why the generator fails to detect certain compiler bugs and provide several ideas for future work. The results show that random testing can be a cost-effective technique to find bugs in small programming languages (such as DSLs), but its application to practical programming languages requires further research. ...