SJ

S. Juhošová

info

Please Note

11 records found

An Analysis of jqwik Usage in Open-Source Repositories

Property-based testing (PBT) verifies software correctness by checking that specific properties hold across a wide variety of randomly generated inputs. Despite its apparent usefulness, we lack an overview of how PBT is utilized in the Java ecosystem. In this study, we investigated seven repositories using the jqwik framework. We analyzed 84 PBTs to understand the types of properties developers typically test, their use of generators and shrinkers, and the role of PBT in broader testing strategies. Our findings show that PBT remains a small part of most test suites, often representing less than 2% of the total number of tests. Developers tend to focus on mutation, invariant, and round trip as the properties they test, frequently using custom generators combined with filtering but never implementing custom shrinkers. We conclude that property-based testing is not being utilized to its full potential in Java projects and highlight areas for future research, including the impact of filtering, the potential of custom shrinkers, and the overall effectiveness of property-based testing. ...

In-depth study on how developers use Property-Based Testing in Python using Hypothesis

Property-based testing (PBT) allows developers to specify high-level properties that should hold for a range of inputs, which are then automatically generated by the testing framework. Despite its theoretical appeal, PBT is not widely used in the real world. To better understand how PBT is used in practice, we present a qualitative and quantitative study of 87 property-based tests written with the Hypothesis framework in seven widely used Python projects, including cpython, pandas, and jax.

Our analysis reveals that while PBTs are relatively rare in these repositories, they are often simple in structure and highly effective at expressing functional properties. The most common patterns include round-trip checks and comparisons against test oracles, which account for a significant portion of the test suite. We also observed a high rate of custom generator usage (39.1%), but no use of custom shrinking, suggesting strong defaults in Hypothesis. The dataset was variable in stylistic and structural choices, ranging from single-assertion tests to complex hardware-dependent cases.

This study provides new insights into the practical use of PBT in Python, expands on prior quantitative work with qualitative findings, and identifies concrete implications for improving testing frameworks and educational resources. We conclude with recommendations for supporting broader adoption of PBT and outline directions for future research, including cross-language comparison, automated annotation, and temporal analysis of test evolution. ...

A case study of the ‘quickcheck‘ crate used in open source repositories

Property-based testing (PBT) is a method of verifying software correctness in which a property, a statement about the behavior of the program which should always hold true, is verified with a large number of arbitrary inputs. While it is a powerful method, properties can be complex and fail on obscure inputs, making them difficult to reason about. We qualitatively analyze a large number of property-based tests (PBTs) written with Quickcheck for Rust from public repositories, in order to gather insights about how PBT is utilized in practice. We find that the majority of analyzed PBTs verify by comparing against a known correct implementation, composing inverse operations to form an identity, or asserting a desired contract about a system's state. These findings offer valuable direction to the development of PBT, aiming to tailor PBT frameworks to developer needs and make them more natural to work with. ...

An Analysis of QuickCheck usage in Open-Source Haskell Projects

Property-Based Testing (PBT) with QuickCheck has become a cornerstone of reliable software development in Haskell, yet there is little systematic understanding of how developers employ it in production quality libraries. In this study, we perform an empirical analysis of QuickCheck usage in nine representative open-source Haskell projects (aeson, attoparsec, containers, hashable, lens, megaparsec, pandoc-type, text and vector). We extract thousands of
QuickCheck properties and manually sample 217 of them to classify property types such as invariants, roundtrip checks, algorithm correctness, and idempotence. We also analyze all extracted properties to identify patterns in generator and shrinking strategies. Our Results show that invariants and test-oracle tests dominate the sampled properties, while custom generators are used in 55.8% of all properties and custom shrinkers in only 25%. We discuss how these patterns vary across different domains (data structures, text processing, optics, hashing) and highlight the tension between default and custom test infrastructure. Finally, we reflect on the implications for tool support, education, and future research particularly automatic generator inference and improved shrinking utilities to lower
the barrier to effective PBT in large codebases. ...
Proof assistants are tools that allow programmers to write formal proofs. However, despite the improvements made in recent years, there is still a limited number of published user studies in improving the usability of dependently-typed proof assistants. This is especially true for investigating the effects of enhancing error messages on novice programmers, even though it is a popular topic for imperative languages.

This thesis aimed to help fill this gap in knowledge by using Agda as a research vector. We implemented hint enhancements for the error messages displayed upon three common mistakes: forgetting whitespace, using confusable Unicode characters, and supplying too few arguments to a function. A between-participants user study was then conducted with 70 students to determine the effects that these error messages had on the usability of Agda.

Results showed statistically significant improvements in the number of compiling submissions, and the rated "helpfulness" of the error messages (when compared to the original messages without hints). However, we were not able to determine if there was any statistically significant impact on the overall speed at which students resolved the errors. Furthermore, while we identified decreases in the ratings of "incorrect" hints, they did not appear to significantly influence the success rate, or time taken to fix an error. We concluded that enhancing error messages with hints is a promising avenue for improving the usability of dependently-typed proof assistants. ...

Advancing Type-Checker Reliability with the Correct-by-Construction Approach for a Toy Language with Checked Exceptions

Bachelor thesis (2024) - M.A. Kicior, J.G.H. Cockx, S. Juhošová, T. Durieux
The Correct-by-Construction (CbC) programming paradigm has gained increasing attention, particularly with the rise of dependently typed languages. The CbC approach is often characterized as rigid, rule-based construction process making it suitable for critical infrastructure like type-checkers, which are prone to bugs as they become more complex. However, the specific advantages and disadvantages of using the CbC approach for developing type-checkers in comparison to traditional programming languages remain unclear. Therefore, we investigate the development of a type-checker for a toy programming language extended with checked exceptions using the dependently typed programming language Agda. The results show that the CbC approach, combined with dependently typed languages, is highly effective for the very precise task of type-checker development. Despite the steep learning curve associated with these languages, this method offers notable benefits in ensuring the correctness and reliability of type-checkers. We conclude that this approach is a viable strategy for similar projects in the future. ...
Bachelor thesis (2024) - M. Ristić, J.G.H. Cockx, S. Juhošová, T. Durieux
Type-checkers are used to verify certain attributes of programs are correct. Avoiding bugs in type-checkers is especially important when accepting faulty programs has serious real-world consequences. Correct-by-construction programming aims to prevent such bugs by embedding proofs of a program's correctness within the program itself. However, this approach introduces different challenges, such as added complexity. Whether the benefits of correct-by-construction type-checkers outweigh these challenges for more complex language features remains uncertain. This paper investigates correct-by-construction type-checking for a toy language with polymorphic algebraic data types and pattern matching. We do this by implementing a type-checker in the dependently typed language Agda. We show that this approach guarantees a type-checker that does not accept ill-typed terms. Furthermore, we reflect on the challenges of this approach and argue that this approach should be used when a guarantee of correctness is required. ...

Typechecking records with depth and width subtyping

Bachelor thesis (2024) - K.J. Ciaś, J.G.H. Cockx, S. Juhošová, T. Durieux
Typecheckers help avoid bugs in code by catching errors early. Their implementation can, however, be incorrect, leading to inconsistencies in their operation. This research explores how we can use Agda and correct-by-construction programming to create a typechecker guaranteed to be correct in its implementation. For this purpose, I based a toy language on the simply typed lambda calculus extended with records and subtyping. The resulting typechecker is proven to be sound and complete with respect to the typing and subtyping rules of the toy language. This paper compares the correct-by-construction method to existing typecheckers. The new approach offers a greater degree of trust in its implementation but comes at the cost of being more demanding to develop and maintain. ...
Bachelor thesis (2024) - V. Szabó, J.G.H. Cockx, S. Juhošová, T. Durieux
As programming languages become ever more sophisticated, there is growing interest in powerful and expressive type systems. Substructural type systems increase expressiveness by allowing the programmer to reason about the number of times and the order in which variables can be used. This can be used to model different kinds of memory allocation and to construct more expressive interfaces. However, implementing complex type systems requires complex type checkers – this complexity increases the chances of bugs in the type checker itself, which could lead to invalid programs being accepted or valid programs being rejected. The consequences of such bugs can range from programmer frustration to critical errors in production systems.

In this thesis, we develop a correct-by-construction type checker for a toy language with a substructural type system. We use Agda’s dependent type system to intrinsically ensure the soundness and completeness of the type checker. We discuss the advantages and disadvantages of the correct-by-construction approach and find that its complexity likely restricts it to specific use cases. ...

An evaluation of Correct-by-Construction programming in Agda for bug-free type inference algorithms

Bachelor thesis (2024) - V. Pikand, S. Juhošová, J.G.H. Cockx, T. Durieux
Static type systems ensure code correctness by aligning implementations with defined type signatures. Despite their benefits in preventing common errors, complex type systems increase the likelihood of bugs within type checkers. Correct-by-Construction (CbC) programming offers a solution by using precise types to create intrinsically verified type checkers, ensuring soundness and potentially completeness. This paper evaluates the use of CbC programming for implementing type inference for the simply typed λ-calculus, focusing on Hindley-Milner (HM) and bidirectional type inference. Implementations in Agda, a dependently typed language, reveal that while CbC programming can eliminate bugs, it introduces significant complexity. HM type inference proves challenging in terms of soundness and completeness, whereas bidirectional type inference is easier to implement but still complex to prove complete. The study highlights the trade-offs of CbC programming, suggesting it is more suited for research and in-depth understanding rather than pragmatic programming. ...