W.S. Volkers
Please Note
This page displays the records of the person named above and is not linked to a unique person identifier. This record may need to be merged to a profile.
2 records found
Individualizing mechanical ventilation treatment regimes remains a challenge in the intensive care unit (ICU). Reinforcement Learning (RL) offers the potential to improve patient outcomes and reduce mortality risk, by optimizing ventilation treatment regimes. We focus on the Offline RL setting, using Offline Policy Evaluation (OPE), specifically importance sampling (IS), to evaluate policies learned from observational data. Using a running example, we illustrate how a large difference between the learned policy and actual clinical behavior (behavior policy) limits the reliability of IS-based OPE. To assess this reliability, we use the Effective Sample Size (ESS) as a diagnostic. To achieve reliable evaluation, we apply policy shaping, by incorporating a divergence constraint in the policy learning objective, aiming to reduce the difference between the evaluation and behavior policy. We consider both a Kullback-Leibler (KL) divergence constraint and introduce a new constraint, the ESS divergence. Since effective OPE relies on an accurate estimate of the true behavior policy, we address how such an estimate is acquired. Various classifiers for estimating the behavior policy are systematically evaluated, focusing on both discrimination and calibration performance. Empirical results show the difficulty of learning policies that outperform existing clinical practices and generalize well to unseen patients. Although policy shaping improves the reliability of policy evaluations, no policies that consistently outperform clinician practice were found. The KL divergence constraint generalized better to unseen patients than the ESS divergence, which achieved large ESS without actually reducing the difference between the evaluation and behavior policy. 
We underscore the necessity of a cautious approach to applying RL in healthcare, and advocate that assessing OPE reliability and behavior policy calibration becomes standard practice, to ensure that only effective and reliable RL policies are considered for real-world clinical trials.
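As an illustration of the ESS diagnostic mentioned in the abstract, the following is a minimal sketch and not the thesis's implementation. It assumes per-step action probabilities are available for both the evaluation and the behavior policy, uses Kish's formula as one common definition of ESS, and uses weighted importance sampling as one IS variant. All function names and the toy numbers are hypothetical.

```python
import numpy as np

def trajectory_weights(pi_e, pi_b):
    """Per-trajectory importance weights: product over time of pi_e / pi_b.

    pi_e, pi_b: arrays of shape (n_trajectories, horizon) with the probability
    each policy assigns to the action that was actually taken.
    """
    ratios = np.asarray(pi_e, dtype=float) / np.asarray(pi_b, dtype=float)
    return np.prod(ratios, axis=1)

def effective_sample_size(weights):
    """Kish effective sample size: (sum of w)^2 / (sum of w^2)."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / np.sum(w ** 2)

def weighted_is_estimate(weights, returns):
    """Weighted (self-normalized) importance sampling estimate of policy value."""
    w = np.asarray(weights, dtype=float)
    return np.sum(w * np.asarray(returns, dtype=float)) / w.sum()

# Toy data (assumed): 4 trajectories, 2 steps, behavior policy uniform over 2 actions.
pi_b = np.full((4, 2), 0.5)
# An evaluation policy that agrees with the observed actions in only one trajectory.
pi_e_far = np.array([[0.9, 0.9], [0.1, 0.1], [0.1, 0.1], [0.1, 0.1]])
returns = np.array([1.0, 0.0, 0.0, 0.0])

w_same = trajectory_weights(pi_b, pi_b)     # identical policies: all weights equal 1
w_far = trajectory_weights(pi_e_far, pi_b)  # one trajectory dominates the weights

print("ESS, identical policies:", effective_sample_size(w_same))
print("ESS, divergent policies:", effective_sample_size(w_far))
print("WIS value estimate:", weighted_is_estimate(w_far, returns))
```

In this toy case the divergent policy's estimate rests almost entirely on a single trajectory, which is the situation a low ESS is meant to flag.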
B2B Customer Insight Tool
Automated Data Analytics to improve the Deal Analytics workflow
Bachelor thesis
(2020)
T.J. Langhout, S.A.J. van Leeuwen, C.I. Ort, W.S. Volkers, D.H.J. Epema, M. Kerkhof, L. Gunneweg, T. Boevink
The Deal Analytics group of PricewaterhouseCoopers Amsterdam has requested a tool for automatising the business-to-business customer analysis. This analysis was performed manually, which left room for performance improvement. This report discusses how a product was developed that automates the analyses. After two weeks of initial research, a complete system was designed and implemented in the subsequent nine weeks. The tool consists of two distinct parts: a front-end and a back-end. The front-end allows users to customise the analysis to their own preferences, and communicates with the back-end to efficiently perform the analysis. With the help of user evaluations, the front-end has been designed such that it is usable by any PwC employee within the Deals branch. The back-end uses data analysis techniques and machine learning to analyse customer behaviour. Strong points and growth opportunities of a company are found using techniques such as customer segmentation, regression analysis, and cross-sell analysis. The product has been tested using a variety of techniques to ensure that the software does not crash on unexpected input. The final product is evaluated based on the requirements, design goals and success criteria set at the start of the project and can be considered successful.