On Sample-Wise Strict Monotonicity with a Gradient Update

Conference Paper (2026)
Author(s)

O. Taylan Turan (TU Delft - Pattern Recognition and Bioinformatics)

Marco Loog (Radboud University Nijmegen)

David M.J. Tax (TU Delft - Pattern Recognition and Bioinformatics)

Research Group
Pattern Recognition and Bioinformatics
DOI
https://doi.org/10.1007/978-3-032-23833-7_6 (final published version)
Publication Year
2026
Language
English
Pages (from-to)
72-83
Publisher
Springer
ISBN (print)
978-3-032-23832-0
ISBN (electronic)
978-3-032-23833-7
Event
24th International Symposium on Intelligent Data Analysis, IDA 2026 (2026-04-22 - 2026-04-24), Leiden, Netherlands
Reuse Rights

Other than for strictly personal use, it is not permitted to download, forward or distribute the text or part of it, without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license such as Creative Commons.

Abstract

Learning curves describe how the performance of a model evolves with increasing training data. Although more data is generally expected to improve model performance, in practice models can exhibit non-monotonic behavior where additional data leads to performance degradation. Sample-wise double descent is one particular example. We address the question of how a learner can have a provably monotone learning curve. For isotropic Gaussian covariates under a Gaussian noise model and a linear predictor, we prove that a single step of steepest descent guarantees sample-wise monotonicity in the learning curve, if the step size does not exceed an upper bound. Furthermore, we present a practical procedure that ensures monotonicity without explicit regularization or cross-validation, using initialization from the previous training set size. Experiments on real-world datasets demonstrate that this method achieves monotone behavior and improved sample efficiency compared to ordinary least squares and optimally regularized ridge regression. We also explore extensions to binary classification, where monotonicity depends on the chosen performance metric. While our guarantees are derived under simplifying assumptions, they provide both theoretical and practical insights for constructing monotone learners and for understanding and mitigating sample-wise double descent behavior.
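The procedure sketched in the abstract (a single steepest-descent step per sample size, warm-started from the weights obtained at the previous size) can be illustrated as follows. This is a minimal sketch, not the paper's implementation: the variable names, the default step size of 1/L (with L the largest eigenvalue of the empirical Hessian, a conservative choice within the usual descent regime for quadratic losses), and the synthetic data are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_step_update(w_prev, X, y, lr=None):
    """One step of steepest descent on the squared loss (illustrative only).

    The paper's exact step-size upper bound and setting (isotropic
    Gaussian covariates, Gaussian noise, linear predictor) are
    simplified here. By default lr = 1/L, where L is the largest
    eigenvalue of the empirical Hessian X^T X / n.
    """
    n = len(y)
    H = X.T @ X / n                      # empirical Hessian of 0.5 * MSE
    if lr is None:
        lr = 1.0 / max(np.linalg.eigvalsh(H)[-1], 1e-12)
    grad = X.T @ (X @ w_prev - y) / n    # gradient of 0.5 * MSE
    return w_prev - lr * grad

# Warm-started learning curve: at each training set size, reuse the
# weights from the previous (smaller) training set as initialization.
d = 5
w_true = rng.standard_normal(d)
X_all = rng.standard_normal((200, d))
y_all = X_all @ w_true + 0.1 * rng.standard_normal(200)

w = np.zeros(d)
risks = []
for n in range(10, 201, 10):
    w = one_step_update(w, X_all[:n], y_all[:n])
    risks.append(float(np.mean((X_all @ w - y_all) ** 2)))
```

Note that without the warm start (i.e., restarting from zero at every sample size), a single gradient step barely learns anything; reusing the previous solution is what lets the estimate keep improving as data accumulates.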

Files

978-3-032-23833-7_6.pdf
(pdf | 0.707 Mb)
Taverne

File under embargo until 18-10-2026