J.B. Katzy

2 records found

Master thesis (2022) - J.B. Katzy, M. Finavaro Aniche, S.A.M. Mir
We explored the effect of augmenting a standard language model's architecture (BERT) with a structural component based on the Abstract Syntax Trees (ASTs) of the source code. We created a universal abstract syntax tree structure that can be applied to multiple languages, enabling the model to work in a multilingual setting. We adapted the general graph transformer architecture to function as the structural component of the transformer. Furthermore, we extended Embeddings from Language Models (ELMo) style embeddings to work in a multilingual setting on incomplete source code. The final results showed that the multilingual setting was beneficial to achieving higher-quality embeddings for the embedding model; for the transformer model, however, monolingual models performed better in most cases. The addition of ASTs increased performance in the best-performing models on all languages, while also reducing the need for a pre-training task to achieve the best performance. Compared with their baseline counterparts, the largest average increases in performance on the test set were 3.0% for a Java model, 1.1% for a Julia model, and 5.7% for a CPP model. ...
Bachelor thesis (2018) - Jonathan Katzy, Tim Rietveld, Jaap-Jan van der Steeg, Erik Wiegel, Birna van Riemsdijk, Huijuan Wang, Stefan Dorresteijn, Roel Bloo, Catholijn Jonker
As Machine Learning becomes more accessible to small businesses, thanks to the rapid advance in computing power, smaller start-ups such as Sjauf (a ride-sharing start-up) are becoming interested in implementing Machine Learning solutions in their products. Sjauf needed a system that could automatically tell its customers how much a certain trip would cost them. Using this information, multiple models were developed and integrated into an ensemble. The ensemble, as well as the models it comprises, was then used for price prediction. This project is a proof of concept showing that Machine Learning is capable of solving this problem in real time.

After researching state-of-the-art Machine Learning models for price recommendation, the architecture of the system was designed. The supplied data was preprocessed, after which a custom Genetic Algorithm was developed for optimising models and ensembles. After validation on real-life company data, a comparison using empirical metrics was conducted. These metrics show that a bagging ensemble is the most efficient and accurate model for this purpose. The bagging ensemble outperformed the currently implemented functions whilst adhering to the set boundaries on response times. Lastly, recommendations are made to the company, with an overview of potential future work on this subject.
...