While this is a general consideration in most cases, the rise of AI and machine learning algorithms has fostered a belief that they work as a silver bullet and perform well even when the input conditions are relaxed. This is not the case.
Modern AI algorithms have proven successful when trained on billions of data points of everyday objects. Remote sensing of invasive species with aerial vehicles, by contrast, is a niche application where far fewer data points are available, and even fewer reliable data points that can serve as ground truth are at hand to develop these models.
There are billions of images on the internet taken from the perspective of a mobile phone. The image of a dog taken with a mobile phone, for example, makes it easy to figure out what a dog looks like: the features of most dogs appear similar, even across the various differences between breeds.
By contrast, only very few freely available images show certain weeds in various landscapes, and even fewer come with the labels required to train modern AI algorithms.
These considerations are important when using models as black-box predictors for weed detection.
Specifically, the following points should be documented during the data collection process and, where possible, replicated when collecting new data. They can generally be clustered into two groups:
- Appearance: consistency in how plants appear in the UAV images
  - Species growth stage: if the initial models were developed during a flowering phase, the model will learn to pick up on those distinguishing features. If new data is collected at early growth stages without flowers, there is no guarantee of comparable performance.
  - Lighting conditions: while AI model development usually attempts to control for these, good quality images (by visual inspection: not too dark, not too bright) commonly allow for more reliable results.
- Viewpoint: consistency in the relative position of the UAV and the weeds
  - Sensor angle: the angle at which the sensor points at the ground (oblique or nadir, i.e. vertical) determines the point of view. Plants look very different from directly above than from an angle.
  - Ground resolution: if the ground resolution of the original flight was high, the same ground resolution (which depends on the sensor resolution and the flying height; see the sketch after this list) should be replicated. Collecting at a higher resolution is preferable, since images can be digitally downsampled to match the original; however, there is no guarantee of performance when images are fed to the model at a higher resolution than the training data, and even less so at lower resolutions.
  - Sensor model: focal length, aperture, filters, and lens distortion all influence how the scene is represented. While they can largely be neglected, they should not differ too much: if the original data was collected with a normal zoom lens, images collected with a fish-eye lens are unlikely to yield optimal results unless manually rectified.
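Since the ground resolution item above depends on only a few parameters, it can be computed directly from the standard ground sampling distance (GSD) relation. The following Python sketch is illustrative: the function names are hypothetical, and the example sensor values (13.2 mm sensor width, 8.8 mm focal length, 5472 px image width, typical of a small consumer UAV camera) are assumptions, not a hardware recommendation.

```python
def ground_sampling_distance(flight_height_m: float,
                             focal_length_mm: float,
                             sensor_width_mm: float,
                             image_width_px: int) -> float:
    """Ground sampling distance in cm/pixel for a nadir-pointing camera."""
    return (sensor_width_mm * flight_height_m * 100.0) / (focal_length_mm * image_width_px)


def height_for_gsd(target_gsd_cm: float,
                   focal_length_mm: float,
                   sensor_width_mm: float,
                   image_width_px: int) -> float:
    """Flying height (m) needed to reproduce a target GSD with a given camera."""
    return (target_gsd_cm * focal_length_mm * image_width_px) / (sensor_width_mm * 100.0)


# Example: assumed camera values flown at 50 m altitude.
gsd = ground_sampling_distance(50.0, 8.8, 13.2, 5472)
print(f"GSD: {gsd:.2f} cm/px")  # ~1.37 cm/px

# Inverting the relation recovers the flying height for that same GSD.
print(f"Height: {height_for_gsd(gsd, 8.8, 13.2, 5472):.1f} m")  # 50.0 m
```

When replicating an original flight with a different camera, solving for the flying height as above is usually the practical lever, since the sensor parameters are fixed.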
While these properties cannot always be held equal, being aware of any deviations helps in interpreting possible failure modes of post-processed data.
A rule of thumb: if plants can be identified by visual inspection of the collected images, and the images look roughly like the original training data, results should be reasonable.
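Part of that visual check can be automated. As a rough proxy for the "not too dark, not too bright" criterion under lighting conditions above, a minimal sketch (assuming Pillow and NumPy are available; the thresholds and folder name are illustrative assumptions, to be tuned against the original training data):

```python
from pathlib import Path

import numpy as np
from PIL import Image


def brightness_ok(path: str, low: float = 40.0, high: float = 215.0) -> bool:
    """Flag images whose mean grayscale brightness (0-255 scale) falls outside
    a plausible exposure range. Thresholds are illustrative, not validated."""
    gray = np.asarray(Image.open(path).convert("L"), dtype=np.float32)
    return low <= gray.mean() <= high


# Example: screen a folder of newly collected UAV images before inference.
flagged = [p for p in Path("new_flight").glob("*.jpg") if not brightness_ok(str(p))]
print(f"{len(flagged)} images flagged for manual review")
```

Such a check only catches gross exposure problems; it does not replace inspecting a sample of images against the original training data.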
Some useful publications for understanding remote sensing and deep learning processes on a more technical level are included below for those wanting to explore further:
Remote Sensing:
- T. Hoeser and C. Kuenzer, “Object Detection and Image Segmentation with Deep Learning on Earth Observation Data: A Review-Part I: Evolution and Recent Trends,” Remote Sens., vol. 12, no. 10, Art. no. 10, Jan. 2020, doi: 10.3390/rs12101667.
- T. Hoeser, F. Bachofer, and C. Kuenzer, “Object Detection and Image Segmentation with Deep Learning on Earth Observation Data: A Review—Part II: Applications,” Remote Sens., vol. 12, no. 18, Art. no. 18, Jan. 2020, doi: 10.3390/rs12183053.
- A. E. Maxwell, T. A. Warner, and L. A. Guillén, “Accuracy Assessment in Convolutional Neural Network-Based Deep Learning Remote Sensing Studies—Part 1: Literature Review,” Remote Sens., vol. 13, no. 13, Art. no. 13, Jan. 2021, doi: 10.3390/rs13132450.
- A. E. Maxwell, T. A. Warner, and L. A. Guillén, “Accuracy Assessment in Convolutional Neural Network-Based Deep Learning Remote Sensing Studies—Part 2: Recommendations and Best Practices,” Remote Sens., vol. 13, no. 13, Art. no. 13, Jan. 2021, doi: 10.3390/rs13132591.
General Deep Learning:
- J. Jordan, “Building machine learning products: a problem well-defined is a problem half-solved,” Jeremy Jordan, Sep. 22, 2019. https://www.jeremyjordan.me/ml-requirements/
- J. Jordan, “Organizing machine learning projects: project management guidelines,” Jeremy Jordan, Sep. 02, 2018. https://www.jeremyjordan.me/ml-projects-guide/
- R. S. Geiger et al., “‘Garbage in, garbage out’ revisited: What do machine learning application papers report about human-labeled training data?,” Quant. Sci. Stud., vol. 2, no. 3, pp. 795–827, Nov. 2021, doi: 10.1162/qss_a_00144.