J.C. van Gemert

79 records found

HAVANA

Hierarchical Stochastic Neighbor Embedding for Accelerated Video ANnotAtions

Video annotation is a critical and time-consuming task in computer vision research and applications. This paper presents a novel annotation pipeline that uses pre-extracted features and dimensionality reduction to accelerate the temporal video annotation process. Our approach use ...

MSD

A Benchmark Dataset for Floor Plan Generation of Building Complexes

Diverse and realistic floor plan data are essential for the development of useful computer-aided methods in architectural design. Today’s large-scale floor plan datasets predominantly feature simple floor plan layouts, typically representing single-apartment dwellings only. To co ...

CleanUMamba

A Compact Mamba Network for Speech Denoising using Channel Pruning

This paper presents CleanUMamba, a time-domain neural network architecture designed for real-time causal audio denoising directly applied to raw waveforms. CleanUMamba leverages a U-Net encoder-decoder structure, incorporating the Mamba state-space model in the bottleneck layer. ...
Quantitative cardiac magnetic resonance imaging (MRI) is an increasingly important diagnostic tool for cardiovascular diseases. Yet, co-registration of all baseline images within the quantitative MRI sequence is essential for the accuracy and precision of quantitative maps. Howev ...
Objects can take up an arbitrary number of pixels in an image: objects come in different sizes, and photographs of these objects may be taken at various distances from the camera. These pixel size variations are problematic for CNNs, causing them to learn separate filters for scal ...

Learn & drop

Fast learning of CNNs based on layer dropping

This paper proposes a new method to improve the training efficiency of deep convolutional neural networks. During training, the method evaluates scores to measure how much each layer’s parameters change and whether the layer will continue learning or not. Based on these scores, t ...
Many real-world applications, from sport analysis to surveillance, benefit from automatic long-term action recognition. In the current deep learning paradigm for automatic action recognition, it is imperative that models are trained and tested on datasets and tasks that evaluate ...
Activity progress prediction aims to estimate what percentage of an activity has been completed. Currently, this is done with machine learning approaches trained and evaluated on complicated and realistic video datasets. The videos in these datasets vary drastically in length and ...

SSIG

A Visually-Guided Graph Edit Distance for Floor Plan Similarity

We propose a simple yet effective metric that measures structural similarity between visual instances of architectural floor plans, without the need for learning. Qualitatively, our experiments show that the retrieval results are similar to deeply learned methods. Effectively com ...
Objective: Myasthenia gravis (MG) is an autoimmune disease leading to fatigable muscle weakness. Extra-ocular and bulbar muscles are most commonly affected. We aimed to investigate whether facial weakness can be quantified automatically and used for diagnosis and disease monitori ...
Strawberries are profitable fruits, yet they have a short shelf life. Therefore, it is crucial to anticipate their quality and harvest them at the best time, which is vital not only for finding the appropriate market but also for minimizing food and economic waste. To this end, n ...

Objects do not disappear

Video object detection by single-frame object location anticipation

Objects in videos are typically characterized by continuous smooth motion. We exploit continuous smooth motion in three ways. 1) Improved accuracy by using object motion as an additional source of supervision, which we obtain by anticipating object locations from a static keyfram ...
Binary Neural Networks (BNNs) are compact and efficient by using binary weights instead of real-valued weights. Current BNNs use latent real-valued weights during training, where hyper-parameters are inherited from real-valued networks. The interpretation of several of these hype ...
In temporal action localization, given an input video, the goal is to predict which actions it contains, where they begin, and where they end. Training and testing current state-of-the-art deep learning models requires access to large amounts of data and computational power. How ...
In this paper we show how Group Equivariant Convolutional Neural Networks use subsampling to learn to break equivariance to the rotation and reflection symmetries. We focus on the 2D rotations and reflections and investigate the impact of the broken equivariance on network perfor ...

Video BagNet

Short temporal receptive fields increase robustness in long-term action recognition

Previous work on long-term video action recognition relies on deep 3D-convolutional models that have a large temporal receptive field (RF). We argue that these models are not always the best choice for temporal modeling in videos. A large temporal receptive field allows the model ...
Equivariance w.r.t. geometric transformations in neural networks improves data efficiency, parameter efficiency and robustness to out-of-domain perspective shifts. When equivariance is not designed into a neural network, the network can still learn equivariant functions from the ...
Literature on medical imaging segmentation claims that hybrid UNet models containing both Transformer and convolutional blocks perform better than purely convolutional UNet models. This recently touted success of hybrid Transformers warrants an investigation into which of its com ...
Deep learning algorithms are increasingly employed at the edge. However, edge devices are resource constrained and thus require efficient deployment of deep neural networks. Pruning methods are a key tool for edge deployment as they can improve storage, compute, memory bandwidth, ...

LAB

Learnable Activation Binarizer for Binary Neural Networks

Binary Neural Networks (BNNs) are receiving an upsurge of attention for bringing power-hungry deep learning towards edge devices. The traditional wisdom in this space is to employ sign(.) for binarizing feature maps. We argue and illustrate that sign(.) is a uniqueness bottlenec ...