Authored

1 record found

Color is a crucial visual cue readily exploited by Convolutional Neural Networks (CNNs) for object recognition. However, CNNs struggle if there is data imbalance between color variations introduced by accidental recording conditions. Color invariance addresses this issue but does ...

Contributed

19 records found

Benchmarking Data and Computational Efficiency of ActionFormer on Temporal Action Localization Tasks

Analysing the Performance and Generalizability of ActionFormer in Resource-constrained Environments

In temporal action localization, given an input video, the goal is to predict which actions it contains, where they begin and where they end. Training and testing current state-of-the-art deep learning models is done assuming access to large amounts of data and computational pow ...

Efficient Video Action Recognition

How well does TriDet perform and generalize in a limited compute power and data setting?

In temporal action localization, given an input video, the goal is to predict the action that is present in the video, along with its temporal boundaries. Several powerful models have been proposed throughout the years, with transformer-based models achieving state-of-the-art per ...

Algal Bloom Forecasting

Classical Machine Learning versus Deep-Learning

The aim of this paper is to find out which Machine Learning (ML) model best predicts the concentration of Chlorophyll-a in the Palmar lake in Uruguay. Currently, there are no such models to predict the growth in this lake. The algorithms which will be compared in this paper are a ...

Weight Swapping

A new method for Supervised Domain Adaptation in Computer Vision using Discrete Optimization

Training Convolutional Neural Network (CNN) models is difficult when there is a lack of labeled training data and no unlabeled data is available. A popular method for this is domain adaptation where the weights of a pre-trained CNN model are transferred to the problem setup. The ...

Algal Bloom Forecasting using Remote Sensing

Discovering the most predictive data modalities for Algal Bloom Forecasting

An algal bloom is defined as a rapid increase in common algae (phytoplankton) abundance in water bodies and it can occur when a group of certain environmental factors is combined. If the algae populations grow out of control, such algal blooms become problematic and cause damage ...

Efficient Temporal Action Localization via Vision-Language Modelling

An Empirical Study on the STALE Model's Efficiency and Generalizability in Resource-constrained Environments

Temporal Action Localization (TAL) aims to localize the start and end times of actions in untrimmed videos and classify the corresponding action types. TAL plays an important role in understanding video. Existing TAL approaches heavily rely on deep learning and require large-scal ...

TemporalMaxer Performance in the Face of Constraint: A Study in Temporal Action Localization

A Comprehensive Analysis on the Adaptability of TemporalMaxer in Resource-Scarce Environments

This paper presents an analysis of the data and compute efficiency of the TemporalMaxer deep learning model in the context of temporal action localization (TAL), which involves accurately detecting the start and end times of specific video actions. The study explores the performa ...

Efficient Temporal Action Localization model development practices

A review and analysis of models and a guide of best methods

Temporal Action Localization (TAL) is an important problem in computer vision with uses in video surveillance and recommendation, healthcare, entertainment, and human-computer interaction. Being an inherently data-heavy process, TAL has been bound by the availability of computing ...

Algal Bloom Forecasting in a Classification and Regression Setting

Implementing a UNet Architecture to evaluate the differences between both settings

Forecasting algal blooms using remote sensing data is less labour-intensive and has better coverage in time and space than direct water sampling. The paper implements a deep learning technique, the UNet Architecture, to predict the chlorophyll concentration, which is a good ind ...
The term "Algal Bloom" refers to the accumulation of algae in a confined geological space. They may harm human health and negatively affect ecological systems around the area. Thus, forecasting algal blooms could mitigate the environmental and socio-economical damages. Particular ...
This research presents a method for forecasting algal blooms using remote sensing with spatially and temporally sparse satellite data. The method involves the use of multiple interpolation methods to interpolate the sparse input data. The approach is shown to be effective in pred ...
In real-life scenarios, objects of the same category vary widely in size and are not always placed at a fixed distance from the camera. As a result, objects occupy an arbitrary number of pixels in the image. Vanilla CNNs are by design only transl ...
Convolutional Neural Networks (CNNs) benefit from fine-grained details in high-resolution images, but these images are not always easily available as data collection can be expensive or time-consuming. Transfer learning pre-trains models on data from a related domain before fine- ...
The natural world is long-tailed: rare classes are observed orders of magnitude less frequently than common ones, leading to highly imbalanced data where rare classes can have only handfuls of examples. Learning from few examples is a known challenge for deep learning based clas ...
In the field of ecology, camera traps are important tools for collecting information on the wildlife of certain areas. The problem with many camera traps is that they can collect more images than a human can realistically go through alone. To help classify thes ...
Camera traps are used around the world to provide data on species, population sizes and how species are interacting. However, this creates a lot of work in identifying which animal was actually spotted near the camera. Attempts have been made to use deep learning to identify anima ...
To alleviate lower classification performance on rare classes in imbalanced datasets, a possible solution is to augment the underrepresented classes with synthetic samples. Domain adaptation can be incorporated in a classifier to decrease the domain discrepancy between real and s ...
Aside from developing methods to embed the equivariant priors into the architectures, one can also study how the networks learn equivariant properties. In this work, we conduct a study on the influence of different factors on learned equivariance. We propose a method to quantify ...
Location information is essential for the ViT model. Image data has three types of location information: absolute location, relative direction, and relative distance. Various position embedding methods have been used to introduce location information to the ViT model. Some exist ...