Gathering a Machine Learning dataset for object detection from a satellite-platform

On bandwidth-efficient gathering of a Machine Learning dataset for Object Detection with Faster-RCNN from a satellite-platform

Abstract

In recent years, small Earth Observation (EO) satellites have become increasingly capable of taking high-resolution images at high sample rates. These images contain valuable information for sectors such as agriculture and the military, and they can also carry important information about the climate and climate change. Sending these images to Earth requires a large amount of downlink bandwidth. This calls for large, heavy power and communication modules, which in turn make satellites larger and more expensive, both to produce and to launch. As a result, satellites already do not send all the information they gather: being able to transmit only 2 minutes' worth of data per orbit (approx. 90 minutes) is not out of the ordinary. As more and more satellites transmit data towards Earth, communication is expected to become even more power-intensive (or even more limited), since the (theoretically) available bandwidth per satellite is reduced. A shift towards a different approach is therefore necessary, and smarter ways to get the relevant information to Earth have to be developed.

In contrast with the "common knowledge" often applied in the field of object detection, using the highest possible image quality does not yield the best-trained network when gathering a dataset of satellite images. The main constraint here is the bandwidth available for transmitting images, whereas "normally" the largest constraint is the number of man-hours spent on annotation. Compressing training images with JPEG-XR quality level 2 while gathering them results in a better "bandwidth-efficiency"; in the dataset used for this research this holds at least up to 30000 images. It was also found that when more bandwidth is available, and thus more images can be added to the training set, the optimal amount of compression tends to decrease. This result is in line with the finding that the "final accuracy" (the predicted accuracy of a model trained on an infinite number of images) of the models tends to improve with better image quality. From this it can be concluded that, for the optimal approach, the training images should be compressed as far as possible at the start of training, after which the amount of compression should be decreased as the mission progresses and more cumulative bandwidth becomes available.
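The trade-off described in the abstract can be illustrated with a minimal Python sketch. It is not the method used in the thesis: the power-law learning curve, the class and function names, and every number below (image sizes, final accuracies, curve parameters) are hypothetical assumptions, chosen only to show how a fixed downlink budget can favour strong compression early on and lighter compression once more cumulative bandwidth is available.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompressionLevel:
    """All fields are hypothetical illustration values, not thesis results."""
    name: str
    bytes_per_image: float  # assumed average size of one compressed image
    final_accuracy: float   # assumed accuracy for an infinite training set
    scale: float            # assumed learning-curve scale
    decay: float            # assumed learning-curve exponent


def images_for_budget(level, budget_bytes):
    """Number of whole images that fit in the cumulative downlink budget."""
    return int(budget_bytes // level.bytes_per_image)


def predicted_accuracy(level, n_images):
    """Assumed power-law learning curve: final_accuracy - scale * n^(-decay)."""
    if n_images <= 0:
        return 0.0
    return max(0.0, level.final_accuracy - level.scale * n_images ** (-level.decay))


def best_level(levels, budget_bytes):
    """Pick the compression level with the highest predicted accuracy."""
    return max(
        levels,
        key=lambda lv: predicted_accuracy(lv, images_for_budget(lv, budget_bytes)),
    )


# Hypothetical levels: stronger compression means smaller images but a
# lower accuracy ceiling.
LEVELS = [
    CompressionLevel("strong", 50_000, 0.70, 2.0, 0.5),
    CompressionLevel("light", 500_000, 0.80, 2.0, 0.5),
]

if __name__ == "__main__":
    for budget in (5e7, 1e10):  # early mission vs. late mission, in bytes
        print(budget, best_level(LEVELS, budget).name)
```

With these assumed values, the small budget selects the strongly compressed level because ten times as many images fit in the same bandwidth, while the large budget selects the lightly compressed level because its higher accuracy ceiling dominates once both training sets are large.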

Files

Master_Thesis_Version_1.pdf
(.pdf | 111 Mb)
- Embargo expired on 08-04-2022