J.W. Nelen
Please Note
This page displays the records of the person named above and is not linked to a unique person identifier. This record may need to be merged to a profile.
2 records found
With the advent of large language models (LLMs), developing solutions for Natural Language Processing (NLP) tasks has become more approachable. However, these models are opaque, which presents several challenges in areas such as prompt engineering, quality assessment, and error analysis. Explainability methods can have several potential benefits, such as improving accuracy, increasing trust, and assessing quality. However, limited research exists on how explainability techniques can be applied to LLMs in practice, particularly using human-centred methodologies. Therefore, this study takes a user-centred approach, investigating the needs and challenges of the NLP data scientist and developing an explainability tool to address these needs. A formative study, combined with relevant literature, was conducted to deepen our understanding of the user. Its observations were used to derive requirements and a design for a tool tailored to the user’s specific needs, followed by a proof-of-concept implementation. User satisfaction was assessed through practical interviews with a fairness dataset, providing insights into the usefulness and usability of the explanation techniques and the tool. The tool implements three explanation techniques: uncertainty, token-level feature attribution, and contrastive explanations. These can be viewed in a web application separate from the Python development environment, making it easy to interact with. Other key features are that it can be easily integrated into the user’s existing workflow, is usable in practice, and can be presented to different stakeholders within the project. The evaluation concluded that the tool fits the workflow and does indeed help the NLP data scientist to understand the model.
However, the evaluation also showed that the explainability techniques did not provide the insights needed to achieve the user’s main goals: improving the model’s accuracy and making the error analysis actionable. More research is needed to determine which other explainability techniques could provide insights that lead to objectively better performance of these models. Finally, more explainability techniques should be developed that focus not on debugging the model but on revealing its behaviour, thus providing a better understanding of how to improve it.
Bachelor thesis
(2021)
M.P.C. van der Werf, J.W. Nelen, T.R.D. van Graft, J.M. Nederlof, C.C.S. Liem
Bluetick offers a juridical research platform that enables lawyers to search for cases and jurisprudence efficiently. Most Dutch legal alternatives are still old-fashioned search engines. Bluetick wants to move towards a zero-search approach in which the system learns the user's preferences and provides them with recommendations. For the user, this means spending less time searching for cases while still finding all the relevant material. To reach this goal of zero-search, the quality of the recommendations must be high. Therefore, improvements in this area are believed to result in a more lucrative product.
This report describes the process of improving the version of the recommender system that was already implemented by Bluetick. The main contributions are evaluated by their effect on the recommender system, and their role in creating a more maintainable, extensible and transparent product.
The first contribution of the team was a refactor of the old system. Using classes and interfaces, the new version makes it easier to do advanced computations on the results, while the interface makes it easier for Bluetick to add additional parts on which recommendations can be based. Secondly, similar to many existing webshops, the new system provides the user with insight into why items are recommended. Lastly, the user is now able to provide the system with relevant law articles at the start, so that the recommender system can give recommendations before the first search.