What is the potential of using Sentinel-1 and Sentinel-2 data to map farmer-led irrigated agriculture with machine learning?


A case study in Central Mozambique


Abstract

Smallholder farmers cultivate more than 75% of the available agricultural land in Africa, which makes this form of agriculture crucial to the global food supply. At present, little is known about smallholder agricultural and related irrigation practices, yet the increasing availability and accessibility of remotely sensed data provides significant opportunities to assess their current state. However, the characteristic landscapes of smallholder agriculture complicate the identification of land use by remote sensing: the cultivated plots are small and consist of intercropping systems with dynamic spatio-temporal practices concerning planting, irrigation and harvesting.
This research aims to provide insight into the usefulness of passive Sentinel-2 Level-1C optical data and active Sentinel-1 SAR data for land use classification of these complex landscapes, with a focus on irrigated agriculture, using a case study in Central Mozambique. For this purpose, open-source code was written that uses openly available satellite data from Google Earth Engine to carry out a supervised image classification, with confusion matrices as the accuracy assessment method.
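As an illustration of how a confusion matrix is turned into accuracy figures, the following is a minimal sketch in plain Python. It is not the code written for this research; the class labels, the ten example pixels and all function names are hypothetical and chosen only to show the arithmetic.

```python
# Minimal sketch of accuracy assessment with a confusion matrix.
# All labels and pixel values below are hypothetical illustration data.

def confusion_matrix(reference, predicted, classes):
    """Rows are reference classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for ref, pred in zip(reference, predicted):
        matrix[index[ref]][index[pred]] += 1
    return matrix

def overall_accuracy(matrix):
    """Share of all pixels on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def producers_accuracy(matrix, i):
    """Share of reference pixels of class i that were found (row-wise)."""
    return matrix[i][i] / sum(matrix[i])

def users_accuracy(matrix, i):
    """Share of pixels mapped as class i that really are class i (column-wise)."""
    return matrix[i][i] / sum(row[i] for row in matrix)

classes = ["irrigated", "non_irrigated"]
reference = ["irrigated"] * 4 + ["non_irrigated"] * 6
# Two non-irrigated pixels are wrongly mapped as irrigated.
predicted = ["irrigated"] * 6 + ["non_irrigated"] * 4

matrix = confusion_matrix(reference, predicted, classes)
oa = overall_accuracy(matrix)
pa_irrigated = producers_accuracy(matrix, 0)
ua_irrigated = users_accuracy(matrix, 0)
```

In this toy case every irrigated reference pixel is found, yet only four of the six pixels mapped as irrigated are correct, so the irrigated area is overestimated even though the overall accuracy looks reasonable.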
The results of this research appear to show that a nonparametric random forest (RF) classifier (overall accuracy of 88.0%) is preferred over a parametric maximum likelihood (ML) classifier (overall accuracy of 85.0%) for processing these highly variable data, and that classifications based on the chlorophyll-sensitive Red Edge and SWIR bands provide the highest overall accuracies (>88.0%). However, the classifier overestimates the irrigated area by a factor of 1.5 in the first irrigation season and by a factor of 3 in the second. The opportunistic sampling method appears to cause inflated accuracy outcomes and an optimistic bias towards classification of the main class in training. Spectral analysis of the temporal behavior of various S-2 bands does not provide insight into the underlying mechanisms on which the algorithm bases its classification, although it does appear that irrigated agriculture can be identified with S-2 data on the basis of an increase in vegetation biomass and that the classifier benefits from the additional information provided by multiple bands.
Sentinel-1 SAR data appear to have potential for identifying irrigation. Time series of the VV backscatter signal show a difference between the irrigated class and the non-irrigated and light seasonal vegetation classes in irrigation season 2. However, high standard deviations reflect the high intra-class variability of the data, and classifications in this period do not exceed an overall accuracy of 64.1%. According to the confusion matrices, the main confusion comes from classes that are often identified as irrigated although they are not, which overestimates the irrigated area, as with the S-2 data. The results of this research show that the method and data collections used do not provide accurate information for the intended classification goal.
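The effect of an unbalanced sample on accuracy figures can be shown with a small sketch. The labels and counts are hypothetical, and the balanced accuracy (the mean of the per-class recalls) is used here only as a contrast to overall accuracy; it is not a metric reported in this research.

```python
# Minimal sketch: overall accuracy versus balanced accuracy on an
# unbalanced reference set. Labels and counts are hypothetical.

def fraction_correct(reference, predicted):
    """Overall accuracy: share of samples whose prediction matches the reference."""
    hits = sum(ref == pred for ref, pred in zip(reference, predicted))
    return hits / len(reference)

def balanced_accuracy(reference, predicted):
    """Mean of the per-class recalls, so every class counts equally."""
    recalls = []
    for name in sorted(set(reference)):
        n_ref = sum(ref == name for ref in reference)
        n_hit = sum(ref == name and pred == name
                    for ref, pred in zip(reference, predicted))
        recalls.append(n_hit / n_ref)
    return sum(recalls) / len(recalls)

# 90 samples of the main class, 10 of a minor class.
reference = ["main_class"] * 90 + ["minor_class"] * 10
# A degenerate classifier that always predicts the main class.
predicted = ["main_class"] * 100

oa = fraction_correct(reference, predicted)
ba = balanced_accuracy(reference, predicted)
```

Here the overall accuracy is 0.9 while the balanced accuracy is 0.5, which shows how a sample dominated by one class can make a classifier that never detects the minor class look accurate.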
This research demonstrates in several ways the complexity of supervised image classification in complex agricultural landscapes: the unbalanced and variable reference data for the different land uses, which often cover only a few satellite pixels, make it difficult to identify land class characteristics from which the classifier can derive information. Sentinel-1 data, as added and used in this research, offer no additional insights. Therefore, in order to improve the identification of farmer-led irrigated agriculture in Manica, other approaches to smart agriculture, such as citizen science, can be explored in addition to satellite data. Further research is recommended in the field of using S-1 and S-2 data for the classification of complex agricultural landscapes. This may include more advanced methods of image classification and accuracy assessment with imbalanced datasets, such as the use of a weighted confusion matrix for accuracy assessment, or the use of spatial-spectral instead of pixelwise random forest algorithms. These algorithms seem to be better at handling the spatial dependencies and intrinsic heterogeneity that are characteristic of these complex agricultural landscapes. Lastly, it is strongly recommended to assess the use of speckle filters when using SAR data for small target objects. These filters use a buffer zone (window) a few pixels in size, which is about the same size as the target objects. The main challenge will be to balance the need for speckle reduction against the preservation of class-specific information.
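The trade-off between speckle reduction and information preservation can be illustrated with a minimal sketch. It uses a plain boxcar (moving average) filter as a stand-in for a speckle filter; this is an assumption made for simplicity and not a filter evaluated in this research. The 5 x 5 image and its values are hypothetical.

```python
# Minimal sketch: a boxcar (moving average) filter applied to a target
# that is one pixel in size. Image values are hypothetical.

def boxcar_filter(image, radius=1):
    """Replace each pixel by the mean of its (2*radius+1)^2 window.

    At the image edges the window is clipped to the pixels that exist.
    """
    rows, cols = len(image), len(image[0])
    filtered = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            filtered[r][c] = sum(window) / len(window)
    return filtered

# Uniform background with a single bright pixel standing in for a small plot.
image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 9.0

smoothed = boxcar_filter(image, radius=1)
```

With a 3 x 3 window the value of the single target pixel drops from 9.0 to 1.0 and its signal is spread over the neighbouring pixels, which is the loss of class-specific information that has to be weighed against the reduction in speckle.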