Authored

1 record found

In this paper we show how Group Equivariant Convolutional Neural Networks use subsampling to learn to break equivariance to the rotation and reflection symmetries. We focus on the 2D rotations and reflections and investigate the impact of the broken equivariance on network perfor ...

Contributed

19 records found

Benchmarking Data and Computational Efficiency of ActionFormer on Temporal Action Localization Tasks

Analysing the Performance and Generalizability of ActionFormer in Resource-constrained Environments

In temporal action localization, given an input video, the goal is to predict which actions it contains, where they begin and where they end. Training and testing current state-of-the-art deep learning models is done assuming access to large amounts of data and computational pow ...

Efficient Video Action Recognition

How well does TriDet perform and generalize in a limited compute power and data setting?

In temporal action localization, given an input video, the goal is to predict the action that is present in the video, along with its temporal boundaries. Several powerful models have been proposed throughout the years, with transformer-based models achieving state-of-the-art per ...

Algal Bloom Forecasting using Remote Sensing

Discovering the most predictive data modalities for Algal Bloom Forecasting

An algal bloom is defined as a rapid increase in the abundance of common algae (phytoplankton) in water bodies, and it can occur when certain environmental factors combine. If the algae populations grow out of control, such algal blooms become problematic and cause damage ...

Efficient Temporal Action Localization via Vision-Language Modelling

An Empirical Study on the STALE Model's Efficiency and Generalizability in Resource-constrained Environments

Temporal Action Localization (TAL) aims to localize the start and end times of actions in untrimmed videos and classify the corresponding action types. TAL plays an important role in understanding video. Existing TAL approaches heavily rely on deep learning and require large-scal ...

TemporalMaxer Performance in the Face of Constraint: A Study in Temporal Action Localization

A Comprehensive Analysis on the Adaptability of TemporalMaxer in Resource-Scarce Environments

This paper presents an analysis of the data and compute efficiency of the TemporalMaxer deep learning model in the context of temporal action localization (TAL), which involves accurately detecting the start and end times of specific video actions. The study explores the performa ...

Using and Abusing Equivariance

Investigating Differences between Exact and Approximate Equivariance in Computer Vision

In this work we show how Group Equivariant Convolutional Neural Networks use subsampling to learn to break equivariance to their symmetries. We focus on the 2D roto-translation group and investigate the impact of broken equivariance on network performance. We show that changing ...

Algal Bloom Forecasting in a Classification and Regression Setting

Implementing a UNet Architecture to evaluate the differences between both settings

Forecasting algal blooms using remote sensing data is less labour-intensive and has better coverage in time and space than direct water sampling. The paper implements a deep learning technique, the UNet Architecture, to predict the chlorophyll concentration, which is a good ind ...
Data collection by means of crowdsourcing can be costly or produce inaccurate results. Methods have been proposed for solving these problems. However, it remains unclear what methods work best in scenarios with multiple similar objects of interest present in the same image, which ...
Data collection and annotation have proven to be a bottleneck for computer vision applications. When faced with the task of data creation, alternative methods to traditional data collection should be considered, as time and cost may be reduced significantly. We introduce three ...
The term "Algal Bloom" refers to the accumulation of algae in a confined geological space. Algal blooms may harm human health and negatively affect ecological systems around the area. Thus, forecasting algal blooms could mitigate the environmental and socio-economic damages. Particular ...
This research presents a method for forecasting algal blooms using remote sensing with spatially and temporally sparse satellite data. The method involves the use of multiple interpolation methods to interpolate the sparse input data. The approach is shown to be effective in pred ...
The natural world is long-tailed: rare classes are observed orders of magnitude less frequently than common ones, leading to highly imbalanced data where rare classes can have only handfuls of examples. Learning from few examples is a known challenge for deep-learning-based clas ...
In the field of ecology, camera traps are important tools to collect information on the wildlife of certain areas. The problem that arises with many camera traps is that they can collect more images than a human can realistically go through all by themselves. To help classify thes ...
This research explores the possibility of improving an existing method by making (part of) it learnable. The work that this research extends added prior knowledge to a Convolutional Neural Network (CNN) to improve its performance when dealing with an illumination shift. The m ...
This research paper analyses the effect that using frequency information can have on object detectors. The latter are complex networks that learn information about objects from images and are then able to predict the location of these objects in new, unseen images. There are, how ...
Camera traps are used around the world to provide data on species, population sizes and how species are interacting. However, this creates a lot of work in identifying which animal was actually spotted near the camera. Attempts have been made to use deep learning to identify anima ...
Wheat is a widely used ingredient for food products. To increase the production and quality of wheat, the density of 'wheat heads' in a farm can be studied. Accurately locating wheat heads in images can be challenging. A lot of work has taken place in supervised semantic segmentatio ...
Color Invariant Convolution (CIConv) is a learnable Convolutional Neural Network (CNN) layer that reduces the distribution shift between the source and target set in the CNN under an illumination-based domain shift. We explore the semantic segmentation performance for day-night do ...
The benchmarks for the accuracy of the best performing object detectors to date are usually based on homogeneous datasets, including objects such as vehicles, people, animals and foods. This excludes a whole set of scenarios containing small, cluttered and rotated objects. This p ...