Underwater detection has garnered increasing attention in recent years due to its broad and impactful applications in marine ecological research, underwater structural inspection, archaeological exploration, and deep-sea resource extraction. However, despite the proliferation of research in this domain, a comprehensive methodology that addresses both object identification and localization in underwater scenarios remains absent. Existing studies tend to treat these two tasks separately, often omitting the practical implementation details necessary for real-world deployment. This fragmented approach limits the effectiveness and adaptability of underwater detection systems, particularly in dynamic or unpredictable marine conditions.
To bridge this gap, this report presents a detailed exploration of a camera-based framework for simultaneous object identification and localization in underwater environments. The proposed system uses a Region-based Convolutional Neural Network (RCNN) for object identification, offering a favorable trade-off between precision and computational efficiency. RCNN's architecture enables it to handle the complex visual features typically present in underwater imagery. For the localization task, two complementary strategies are employed: the Metric3D depth estimation algorithm, which uses learned monocular cues to infer depth maps with high accuracy, and a geometry-based method rooted in camera imaging principles, which estimates object distance from the intrinsic and extrinsic camera parameters. Of the two, Metric3D is more computationally expensive, but it yields more robust results and is easier to apply.
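To make the geometry-based strategy concrete, the following is a minimal sketch of distance estimation under an ideal pinhole camera model. It is an illustration under stated assumptions, not the report's implementation: the names `Intrinsics`, `estimate_distance`, and `localize` are hypothetical, the object's physical height is assumed to be known in advance, the bounding box is assumed to come from the detector, and lens distortion and refraction at the camera housing are ignored.

```python
from dataclasses import dataclass


@dataclass
class Intrinsics:
    """Pinhole camera intrinsics in pixels (hypothetical container)."""
    fx: float  # focal length along x
    fy: float  # focal length along y
    cx: float  # principal point x
    cy: float  # principal point y


def estimate_distance(box_height_px: float, real_height_m: float, fy: float) -> float:
    """Distance along the optical axis from similar triangles: Z = fy * H / h."""
    if box_height_px <= 0:
        raise ValueError("bounding-box height must be positive")
    return fy * real_height_m / box_height_px


def localize(box, real_height_m: float, k: Intrinsics):
    """Return (X, Y, Z) in metres in the camera frame.

    box is (x_min, y_min, x_max, y_max) in pixels. The box centre is
    back-projected through the pinhole model at the estimated depth Z.
    """
    x_min, y_min, x_max, y_max = box
    z = estimate_distance(y_max - y_min, real_height_m, k.fy)
    u = (x_min + x_max) / 2.0
    v = (y_min + y_max) / 2.0
    x = (u - k.cx) * z / k.fx
    y = (v - k.cy) * z / k.fy
    return x, y, z


if __name__ == "__main__":
    # Assumed example values: 640x480 image, 800 px focal length,
    # an object known to be 0.5 m tall, detected 80 px tall.
    cam = Intrinsics(fx=800.0, fy=800.0, cx=320.0, cy=240.0)
    print(localize((400, 200, 480, 280), 0.5, cam))
```

This approach needs only a bounding box and a known object size, which is why it is cheap compared with dense monocular depth estimation; the trade-off is that it depends on accurate calibration and on the size prior being correct.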
Experimental evaluations demonstrate that the proposed integrated approach achieves robust performance in various underwater conditions. The RCNN consistently delivers accurate object classifications, while the localization strategies offer flexibility and reliability depending on the computational and environmental constraints.
Overall, this research offers a novel and practical solution for real-time underwater object detection by unifying identification and localization. The proposed system enables safer navigation, more precise manipulation, and greater situational awareness. By addressing the methodological gaps in the existing literature and emphasizing real-world applicability, this work advances research on intelligent underwater operation and automation.