D. Barokas Profeta


Please Note

This page displays the records of the person named above and is not linked to a unique person identifier. This record may need to be merged to a profile.

Conference paper (1)

Master thesis (1)

2 records found

MoReSo: A DNN Framework Expediting Content-based Video Image Retrieval (CBVIR)

Conference paper (2024) - Sinian Li, Doruk Barokas Profeta, Justin Dauwels

With the exponential growth of video data, individuals, particularly scholars in the fields of history and sociology, are increasingly reliant on video materials. However, locating specific frames within videos remains laborious and time-consuming. Advanced machine learning-assisted video processing techniques have emerged, including text-based video search, video summarization, real-time object detection, and person re-identification. Distinct from these, the main challenge of retrieving video frames based on given visual content is how to pinpoint the instance occurrences efficiently and accurately. To expedite the process while maintaining retrieval performance, we propose a two-stage approach combining KeyFrame Extraction (KFE) and Content-based Image Retrieval (CBIR), underpinned by a DNN-empowered framework called MoReSo. Our innovations include 1) the integration of improved statistical features with dynamic clustering in the KFE stage and 2) the development of the MoReSo framework, which consists of MobileNet and ResNet backbones with an SOA layer to jointly represent video frames, achieving a 2.67x increase in efficiency compared to existing solutions. Our framework is evaluated on the annotated EHM Historical Database, provided by digital history researchers, and on the widely used Oxford and Paris image retrieval benchmark datasets. The experimental results show that the proposed framework and scheme excel among other models in the CBVIR task. We make our code available for further exploration through our GitHub repository, which contains the implementation of our model and a CBVIR system with a GUI prototype. ...
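As a rough illustration of the KFE idea of grouping consecutive frames and keeping one representative per group, the sketch below uses a simple greedy sequential clustering. It is not the authors' method: the function name `extract_keyframes`, the fixed distance `threshold`, the Euclidean metric, and the per-frame feature vectors are all assumptions made for this example, and the abstract does not specify the statistical features or the clustering rule actually used.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def extract_keyframes(features, threshold):
    """Return one keyframe index per cluster of consecutive similar frames.

    Hypothetical sketch: a frame joins the current cluster when its distance
    to the running centroid is at most `threshold`; otherwise it opens a new
    cluster. The keyframe is the member closest to the final centroid.
    """
    clusters = []
    for i, feat in enumerate(features):
        if clusters and euclidean(feat, clusters[-1]["centroid"]) <= threshold:
            cluster = clusters[-1]
            cluster["members"].append(i)
            n = len(cluster["members"])
            # Incremental mean update of the cluster centroid.
            cluster["centroid"] = [
                m + (x - m) / n for m, x in zip(cluster["centroid"], feat)
            ]
        else:
            clusters.append({"members": [i], "centroid": list(feat)})

    return [
        min(c["members"], key=lambda j: euclidean(features[j], c["centroid"]))
        for c in clusters
    ]


if __name__ == "__main__":
    # Toy per-frame features: three similar frames, then a scene change.
    frames = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.0], [5.0, 5.0]]
    print(extract_keyframes(frames, threshold=1.0))
```

In this sketch the number of clusters is not fixed in advance but follows from the threshold, which is one plausible reading of "dynamic" clustering; only the selected keyframes would then be passed on to the retrieval stage.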

Efficient Content-Based Image Retrieval from Videos using Compact Deep Learning Networks with Re-ranking

Master thesis (2023) - D. Barokas Profeta, J.H.G. Dauwels

The rise of streaming and video technologies has underscored the significance of efficient access to and navigation of digital content, particularly for scholars in fields like history and art. Scholars actively seek streamlined approaches to index, retrieve, and explore digital content, with a focus on locating specific instances. Searching for specific instances in video is a complex process that requires analyzing video sequences and identifying the relevant video segments. Advanced techniques and algorithms are necessary to ensure effective content-based retrieval of the required information.

In response to the escalating demand for accurate and swift access to relevant visual data within the vast spectrum of video resources, our research has been dedicated to developing novel, efficient content-based image retrieval methods tailored for videos by integrating deep learning methodologies. Our system contains two components: keyframe extraction and content-based image retrieval. Keyframe extraction identifies significant frames within videos, while content-based image retrieval retrieves frames similar to a query image through feature extraction and ranking.

A unique aspect of our research lies in the exploration and analysis of a diverse range of feature extraction techniques derived from compact deep learning networks. We have compared our proposed method with state-of-the-art retrieval systems, evaluating performance in terms of both accuracy and speed. Our method uses compact deep learning network features in the initial ranking stage to shortlist candidate frames, and subsequently re-ranks the shortlist using a larger network. This approach aims to deliver efficiency without compromising retrieval accuracy. ...
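The rank-then-re-rank scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the function name `two_stage_retrieve`, the use of cosine similarity, the `shortlist_size` and `top_k` parameters, and the precomputed feature lists are all hypothetical. In practice the "small" and "large" features would come from a compact and a larger network respectively.

```python
import math


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def two_stage_retrieve(query_small, query_large, small_feats, large_feats,
                       shortlist_size, top_k):
    """Return indices of the top_k frames after shortlist and re-ranking.

    Stage 1 ranks every frame with cheap (compact-network) features and
    keeps the best `shortlist_size`. Stage 2 re-ranks only that shortlist
    with more expensive (larger-network) features.
    """
    # Stage 1: initial ranking over all frames using compact features.
    initial = sorted(
        range(len(small_feats)),
        key=lambda i: cosine(query_small, small_feats[i]),
        reverse=True,
    )
    shortlist = initial[:shortlist_size]

    # Stage 2: re-rank the shortlist using the larger network's features.
    reranked = sorted(
        shortlist,
        key=lambda i: cosine(query_large, large_feats[i]),
        reverse=True,
    )
    return reranked[:top_k]


if __name__ == "__main__":
    # Toy features for four frames (values are made up for illustration).
    small = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.2]]
    large = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0],
             [0.0, 0.0, 1.0], [0.9, 0.1, 0.0]]
    print(two_stage_retrieve([1.0, 0.0], [1.0, 0.0, 0.0],
                             small, large, shortlist_size=3, top_k=2))
```

The efficiency argument is that the expensive comparison in stage 2 touches only `shortlist_size` frames rather than the whole collection, so the cost of the larger network is bounded regardless of how many frames are indexed.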