Authored

t-EVA

Time-Efficient t-SNE Video Annotation

Video understanding has received more attention in the past few years due to the availability of several large-scale video datasets. However, annotating large-scale video datasets is cost-intensive. In this work, we propose a time-efficient video annotation method using spatio-t ...

Hallucination In Object Detection

A Study In Visual Part Verification

We show that object detectors can hallucinate and detect missing objects; potentially even accurately localized at their expected, but non-existing, position. This is particularly problematic for applications that rely on visual part verification: detecting if an object part is p ...

PUNet

Temporal Action Proposal Generation With Positive Unlabeled Learning Using Key Frame Annotations

Popular approaches to classifying action segments in long, realistic, untrimmed videos start with high quality action proposals. Current action proposal methods based on deep learning are trained on labeled video segments. Obtaining annotated segments for untrimmed videos is time ...

On translation invariance in CNNs

Convolutional layers can exploit absolute spatial location

In this paper we challenge the common assumption that convolutional layers in modern CNNs are translation invariant. We show that CNNs can and will exploit the absolute spatial location by learning filters that respond exclusively to particular absolute locations by exploiting im ...

Contributed

Architectural Innovations for Efficient Denoising and Classification

A Manual vs. Neural Architecture Search Comparison

In this paper, we combine image denoising and classification, aiming to enhance human perception of noisy images captured by edge devices, like security cameras. Since edge devices have little computational power, we also optimize for efficiency by proposing a novel architecture ...

One model, denoise them all!

A Comprehensive Investigation of Denoising Transfer Learning

Deep convolutional neural networks (CNNs) have achieved current state-of-the-art in image denoising, but require large datasets for training. Their performance remains limited on smaller real-noise datasets. In this paper, we investigate robust deep learning denoising using trans ...

Tilting at windmills

Data augmentation for deep pose estimation does not help with occlusions

Occlusion degrades the performance of human pose estimation. In this paper, we introduce targeted keypoint and body part occlusion attacks. The effects of the attacks are systematically analyzed on the best-performing methods. In addition, we propose occlusion specific data augme ...

Combining denoising and object detection

An analysis to provide insights in combining denoising with object detection

Automated imaging systems, critical in domains like medical imaging, autonomous driving, and security, experience noise from camera sensors and electronic circuits in bad or dark lighting conditions. This impacts downstream tasks, including object detection. However, an analysis ...

Object Roughly There: CAM-based Weakly Supervised Object Detection

Reducing the labelling efforts for deep learned object detectors

Highly performing object detectors require large training datasets, which entail class and bounding box annotations. To reduce the labelling effort of curating such datasets, Weakly Supervised Object Detection is concerned with training object detectors from only class labels. Th ...

Effects of adding unlabeled training data through pseudo-labeling

Reducing labeling effort for deep learned object detectors

Pseudo-labeling involves training models on a small amount of labeled data and then using those models' predictions on unlabeled data as labels for further training, which therefore decreases the required labeling effort. In this paper, we investigate the effects of pseudo-labeli ...
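
The pseudo-labeling loop described in this abstract can be illustrated with a minimal sketch. It is not the paper's method: the toy 1-D nearest-centroid "model", the function names, and the `min_margin` confidence filter are all assumptions made for illustration, whereas the paper concerns deep object detectors.

```python
from statistics import mean

def fit_centroids(points, labels):
    """Toy stand-in for model training: one centroid per class (1-D data)."""
    return {c: mean(p for p, l in zip(points, labels) if l == c)
            for c in set(labels)}

def predict(centroids, x):
    """Return (label, margin); margin is how much closer the best centroid
    is than the runner-up. Assumes at least two classes."""
    ranked = sorted(centroids, key=lambda c: abs(x - centroids[c]))
    best, second = ranked[0], ranked[1]
    margin = abs(x - centroids[second]) - abs(x - centroids[best])
    return best, margin

def pseudo_label(labeled_x, labeled_y, unlabeled_x, min_margin=1.0):
    """Train on labeled data, label the unlabeled data with the model's own
    predictions, then retrain on the union.

    min_margin is a hypothetical confidence filter (an assumption, not taken
    from the abstract): low-margin predictions are not used as labels.
    """
    model = fit_centroids(labeled_x, labeled_y)
    xs, ys = list(labeled_x), list(labeled_y)
    for x in unlabeled_x:
        label, margin = predict(model, x)
        if margin >= min_margin:
            xs.append(x)
            ys.append(label)
    return fit_centroids(xs, ys), xs, ys

# Usage: four labeled points, three unlabeled ones.
model, xs, ys = pseudo_label(
    [0.0, 1.0, 9.0, 10.0], ["a", "a", "b", "b"], [2.0, 5.0, 8.0]
)
```

In this toy run the point at 5.0 lies exactly between the two initial centroids, so it receives no pseudo-label, while 2.0 and 8.0 are added to the training set before retraining.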

The effect of grouping classes into hierarchical structures for object detection

Reducing labelling effort for deep learned object detectors

A way to reduce labelling effort and improve accuracy for object detection is class grouping. In this research, we experiment with creating hierarchical tree structures of grouped classes (super-classes). Our objective is to find out what the effects are of grouping classes in te ...

Identifying Labeling Errors Without Access to Ground Truth

Exploring Ensemble Methods for Error Detection and Rectification

Object detection heavily relies on accurate annotations, which are costly to obtain but crucial for model performance. Annotation errors can severely impact the reliability of detection models. In response to this challenge, we introduce EnsembAudit (EA), a novel framework design ...

Object detectors, much like humans, perform worse on small objects than on large ones. Because of this, the object size distribution of a dataset influences the average precision a network achieves on that dataset. Therefore, the object size/precision curve of a network might be a ...

This thesis presents a novel self-supervised approach to learning visual representations from videos containing human actions. Our approach tackles the complex problem of learning without the need for labeled data by exploring to what extent the ideas successfully used for images ...

Creating big datasets is often difficult or expensive, which leads people to augment their datasets with rendered images. This often fails to significantly improve accuracy due to a difference in distribution between real and rendered datasets. This paper shows that the gap betwee ...

A good action proposal method should generate proposals with high recall and high temporal overlap with the ground truth. The quality of the proposals relies on the labeled data available during training. Obtaining labeled data for untrimmed videos is a time-consuming, expensive and e ...