Alexandra Neagu

2 records found

An Experimental Journey into How Depth Shapes Generalisation in Vision Models

Convolutional neural networks (CNNs) trained on RGB images (red, green, blue channels) often exhibit sharp performance degradation under distribution shifts, as they tend to rely on superficial appearance cues such as background or texture. While depth information is known to provide complementary geometric signals that can improve robustness, most existing approaches assume access to ground-truth depth or rely on complex RGB-D architectures, limiting their applicability in practice.

In this work, we investigate whether estimated depth, obtained from a monocular RGB image, can serve as a simple and effective auxiliary signal to improve out-of-distribution (OOD) generalisation in standard CNN classifiers. Using both controlled toy experiments and real-world evaluations on the NICO++ benchmark, we compare RGB-only models against RGB-D variants that incorporate a single predicted depth channel via minimal fusion. Our results show that pseudo-depth consistently reduces OOD performance gaps across multiple CNN backbones, without degrading in-distribution accuracy. We further demonstrate that these gains persist under moderate corruption of the depth signal and disappear when geometric structure is entirely removed, indicating that the improvements stem from meaningful geometric information rather than the mere presence of an additional input channel. Finally, we analyse these effects through class-resolved confusion matrices and qualitative input-level examples, showing that depth specifically attenuates structured semantic confusions under domain shift.
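As a rough illustration of what "minimal fusion" of a single predicted depth channel could look like, the sketch below stacks a normalised pseudo-depth map onto an RGB image as a fourth channel, builds a control input in which the depth values are kept but their spatial structure is destroyed, and computes an OOD gap as in-distribution accuracy minus OOD accuracy. The abstract does not specify the fusion scheme, the normalisation, or the form of the control, so the function names (`fuse_rgbd`, `shuffle_depth`, `ood_gap`), the min-max rescaling, and the pixel-shuffling control are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def fuse_rgbd(rgb, depth):
    """Stack a depth map onto an RGB image as a fourth channel (assumed early fusion).

    rgb:   array of shape (H, W, 3)
    depth: array of shape (H, W), e.g. the output of a monocular depth estimator
    """
    rgb = np.asarray(rgb, dtype=np.float32)
    depth = np.asarray(depth, dtype=np.float32)
    if rgb.shape[:2] != depth.shape:
        raise ValueError("rgb and depth must share height and width")

    # Assumption: min-max rescale depth to [0, 1]; a constant map becomes zeros.
    d_min, d_max = depth.min(), depth.max()
    span = d_max - d_min
    norm = (depth - d_min) / span if span > 0 else np.zeros_like(depth)

    # Result has shape (H, W, 4): three colour channels plus one depth channel.
    return np.concatenate([rgb, norm[..., None]], axis=-1)


def shuffle_depth(depth, rng):
    """Hypothetical control: permute pixels so the value distribution is
    preserved while the geometric (spatial) structure is removed."""
    flat = np.asarray(depth).ravel().copy()
    rng.shuffle(flat)
    return flat.reshape(np.shape(depth))


def ood_gap(id_accuracy, ood_accuracy):
    """OOD performance gap, taken here as in-distribution minus OOD accuracy."""
    return id_accuracy - ood_accuracy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rgb = rng.random((4, 4, 3))
    depth = np.arange(16, dtype=np.float32).reshape(4, 4)

    fused = fuse_rgbd(rgb, depth)
    control = fuse_rgbd(rgb, shuffle_depth(depth, rng))
    print(fused.shape, control.shape)
```

A classifier consuming such inputs would need a first convolutional layer that accepts four input channels instead of three; how that layer is initialised is a further design choice the abstract leaves open.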

Taken together, our findings suggest that even imperfect, predicted depth can act as a lightweight geometric inductive bias, helping CNN classifiers move away from brittle appearance-based shortcuts and toward more robust representations under domain shift.

https://gitlab.ewi.tudelft.nl/in5000/janvangemert/alexandraioana ...

Natural language processing (NLP) has advanced significantly in recent years through the development of large language models (LLMs) such as BERT and ChatGPT. These models have shown remarkable abilities across a range of NLP tasks. However, harnessing their potential effectively requires careful prompt engineering and a thorough understanding of their limitations.

Additionally, LLMs have attracted attention in the educational domain for their potential to enhance learning and teaching experiences, particularly in fostering the development of computational thinking skills.
This paper explores how NLP and prompt engineering techniques can be used to generate successful solutions to coding problems after initial failures. It also examines how NLP techniques might be applied in teaching and learning practices involving LLMs, along with their potential drawbacks in this context. ...