M. Loog

85 records found

Learning curves show the expected performance with respect to training set size. This is often used to evaluate and compare models, tune hyper-parameters and determine how much data is needed for a specific performance. However, the distributional properties of performance are fr ...
Dementia risk scores are commonly used tools to estimate the risk of developing Alzheimer's disease and dementia. We lack an overview of what risk scores are used for, what is claimed they ought to be used for, and whether they are suitable for these applications. To address this ...
Neural networks are typically initialized such that the hidden pre-activations’ theoretical variance remains constant to avoid the vanishing and exploding gradient problem. This condition is necessary to train very deep networks, but numerous analyses show this to be insufficient ...
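The initialization condition described above can be checked empirically. A minimal sketch (my own illustration, not the paper's analysis), assuming a plain fully connected ReLU network with He initialization, where the pre-activation variance indeed stays roughly constant with depth rather than vanishing or exploding:

```python
import numpy as np

# Propagate random inputs through a deep ReLU net and track the
# pre-activation variance at every layer. With He initialization,
# std = sqrt(2 / fan_in), the variance stays roughly constant.
rng = np.random.default_rng(0)
width, depth, n_samples = 512, 20, 1000

h = rng.standard_normal((n_samples, width))
variances = []
for _ in range(depth):
    W = rng.standard_normal((width, width)) * np.sqrt(2.0 / width)  # He init
    z = h @ W               # pre-activation
    variances.append(z.var())
    h = np.maximum(z, 0.0)  # ReLU

# variances[0] and variances[-1] are of the same order (≈ 2 here),
# i.e. the variance neither vanishes nor explodes with depth.
```

With a naive standard-normal initialization instead, the recorded variances would grow by a factor of roughly the layer width per layer, which is exactly the exploding-gradient regime the condition is meant to avoid.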
Learning curves describe how the performance of a model evolves with increasing training data. Although more data is generally expected to improve model performance, in practice models can exhibit non-monotonic behavior where additional data leads to performance degradation. Samp ...
Introduction: Dynamic survival analysis has become an effective approach for predicting time-to-event outcomes based on longitudinal data in neurology, cognitive health, and other health-related domains. With advancements in machine learning, several new methods have been introdu ...
Learning curves depict how a model’s expected performance changes with varying training set sizes, unlike training curves, which show a gradient-based model’s performance as a function of training epochs. Extrapolating learning curves can be useful for determining the performance gai ...
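As a concrete illustration of what such a learning curve is (an illustrative setup of my own, not taken from any of the papers above): train a simple nearest-mean classifier on growing training sets of synthetic two-class Gaussian data and record the average test error per training set size:

```python
import numpy as np

# Empirical learning curve: average test error of a nearest-mean
# classifier as a function of the training set size.
rng = np.random.default_rng(1)

def sample(n):
    # Balanced two-class Gaussian data, classes shifted by 1.5 per dim.
    y = np.repeat([0, 1], n // 2)
    X = rng.standard_normal((n, 2)) + 1.5 * y[:, None]
    return X, y

X_test, y_test = sample(2000)
sizes = [4, 8, 16, 32, 64, 128, 256]
curve = []
for n in sizes:
    errs = []
    for _ in range(50):  # average over independent training draws
        X, y = sample(n)
        m0, m1 = X[y == 0].mean(0), X[y == 1].mean(0)
        pred = (np.linalg.norm(X_test - m1, axis=1)
                < np.linalg.norm(X_test - m0, axis=1)).astype(int)
        errs.append((pred != y_test).mean())
    curve.append(float(np.mean(errs)))

# curve decreases toward the Bayes error as the training size grows
```

The averaging over training draws matters: a learning curve is an expectation over training sets of a given size, not the trajectory of a single run.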

Also for k-means

More data does not imply better performance

Arguably, a desirable feature of a learner is that its performance gets better with an increasing amount of training data, at least in expectation. This issue has received renewed attention in recent years and some curious and surprising findings have been reported. In essence ...

Social Processes

Self-supervised Meta-learning Over Conversational Groups for Forecasting Nonverbal Social Cues

Free-standing social conversations constitute a yet underexplored setting for human behavior forecasting. While the task of predicting pedestrian trajectories has received much recent attention, an intrinsic difference between these settings is how groups form and disband. Eviden ...

Percolate

An Exponential Family JIVE Model to Design DNA-Based Predictors of Drug Response

Motivation: Anti-cancer drugs may elicit resistance or sensitivity through mechanisms which involve several genomic layers. Nevertheless, we have demonstrated that gene expression contains most of the predictive capacity compared to the remaining omic data types. Unfortunately, t ...
Many methods for Model-based Reinforcement learning (MBRL) in Markov decision processes (MDPs) provide guarantees for both the accuracy of the model they can deliver and the learning efficiency. At the same time, state abstraction techniques allow for a reduction of the size of a ...

LCDB 1.0

An Extensive Learning Curves Database for Classification Tasks

The use of learning curves for decision making in supervised machine learning is standard practice, yet understanding of their behavior is rather limited. To facilitate a deepening of our knowledge, we introduce the Learning Curve Database (LCDB), which contains empirical learnin ...
Estimating uncertainty of machine learning models is essential to assess the quality of the predictions that these models provide. However, there are several factors that influence the quality of uncertainty estimates, one of which is the amount of model misspecification. Model m ...
Learning curves provide insight into the dependence of a learner's generalization performance on the training set size. This important tool can be used for model selection, to predict the effect of more training data, and to reduce the computational complexity of model training a ...
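One common way to predict the effect of more training data, as mentioned above, is to fit a parametric model to the observed part of the curve and extrapolate. A hedged sketch assuming the widely used power-law form err(n) ≈ a·n^(−b); the names a, b and the synthetic measurements are my own:

```python
import numpy as np

# Fit err(n) = a * n^(-b) by linear least squares in log-log space,
# then extrapolate to a larger training set size.
rng = np.random.default_rng(2)

sizes = np.array([16, 32, 64, 128, 256])
# synthetic noisy error measurements following a 1/sqrt(n) power law
errors = (0.9 + 0.01 * rng.standard_normal(5)) * sizes ** -0.5

slope, intercept = np.polyfit(np.log(sizes), np.log(errors), 1)
# slope ≈ -b (here ≈ -0.5), intercept ≈ log(a)
predict = lambda n: np.exp(intercept) * n ** slope

# predict(1024) extrapolates the error well beyond the measured sizes
```

Whether such an extrapolation is trustworthy depends on the curve actually being power-law shaped in the extrapolated regime, which, as the non-monotonicity results above show, is not guaranteed.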
Model-based reinforcement learning methods are promising since they can increase sample efficiency while simultaneously improving generalizability. Learning can also be made more efficient through state abstraction, which delivers more compact models. Model-based reinforcement le ...
We illustrate the detrimental effects, such as overconfident decisions, that exponential behavior can have in methods like classical LDA and logistic regression. We then show how polynomiality can remedy the situation. This, among others, leads purposefully to random-level perform ...
Semi-supervised learning is the learning setting in which we have both labeled and unlabeled data at our disposal. This survey covers theoretical results for this setting and maps out the benefits of unlabeled data in classification and regression tasks. Most methods that use unl ...
Though much effort has been spent on designing new active learning algorithms, little attention has been paid to the initialization problem of active learning, i.e., how to find a set of labeled samples which contains at least one instance per category. This work identifies the i ...
Preclinical models have been the workhorse of cancer research, producing massive amounts of drug response data. Unfortunately, translating response biomarkers derived from these datasets to human tumors has proven to be particularly challenging. To address this challenge, we deve ...
Consider a domain-adaptive supervised learning setting, where a classifier learns from labeled data in a source domain and unlabeled data in a target domain to predict the corresponding target labels. If the classifier’s assumption on the relationship between domains (e.g. covari ...
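One classical instance of such an assumption on the relationship between domains is covariate shift, which can be handled by importance weighting. A sketch under the simplifying assumption that both input densities are known Gaussians (all names and the toy loss are my own, not the paper's method):

```python
import numpy as np

# Under covariate shift, reweighting source losses by the density
# ratio p_target(x) / p_source(x) makes the weighted source risk an
# unbiased estimate of the target risk.
rng = np.random.default_rng(3)
n = 100_000
x_src = rng.normal(0.0, 1.0, n)  # source inputs ~ N(0, 1)
x_tgt = rng.normal(1.0, 1.0, n)  # target inputs ~ N(1, 1)

def loss(x):
    # some fixed per-example loss; p(y|x) is shared across domains
    return (x - 1.0) ** 2

def ratio(x):
    # p_tgt(x) / p_src(x) for the two unit-variance Gaussians: exp(x - 1/2)
    return np.exp(-0.5 * (x - 1.0) ** 2 + 0.5 * x ** 2)

src_risk = np.mean(loss(x_src))                  # naive source estimate
weighted_src = np.mean(ratio(x_src) * loss(x_src))  # importance-weighted
target_risk = np.mean(loss(x_tgt))               # ground truth on target

# weighted_src ≈ target_risk, while the naive src_risk is far off
```

In practice the density ratio is unknown and must itself be estimated, and the weights can have high variance when the domains overlap poorly; both caveats are central to why such assumptions can fail.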
We investigate to which extent one can recover class probabilities within the empirical risk minimization (ERM) paradigm. We extend existing results and emphasize the tight relations between empirical risk minimization and class probability estimation. Following previous literatu ...
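The link between ERM and class probability estimation can be made concrete with the logistic loss, a proper scoring rule: minimizing it over a well-specified model recovers the true posterior P(y=1|x). A small sketch of this general fact (my illustration, not the paper's construction):

```python
import numpy as np

# Generate labels from a known logistic posterior, then minimize the
# empirical logistic risk; the fit recovers the true parameters and
# hence the true class probabilities.
rng = np.random.default_rng(4)
n = 20_000
x = rng.standard_normal(n)
p_true = 1.0 / (1.0 + np.exp(-(2.0 * x - 0.5)))  # true P(y=1 | x)
y = (rng.random(n) < p_true).astype(float)

# gradient descent on the empirical logistic risk of sigmoid(w*x + b)
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))
    w -= 0.5 * np.mean((p - y) * x)
    b -= 0.5 * np.mean(p - y)

# (w, b) converges to roughly (2.0, -0.5), the generating parameters
```

The same recovery fails for losses that are not proper scoring rules (e.g. the hinge loss), which is one reason the relation between ERM and probability estimation is worth mapping out carefully.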