Contact person
Olof Mogren
Senior Researcher
Coffee is one of the most popular and valuable beverages in the world, with an estimated global market value of over 100 billion US dollars. However, coffee production faces many challenges, such as climate change, pests, diseases, and market fluctuations.
One of the most serious threats to coffee crops is coffee berry disease (CBD), a fungal infection that causes the berries to rot and fall off the plant. CBD can reduce the yield and quality of coffee by up to 80%, and affects millions of smallholder farmers in Africa, Asia, and Latin America.
To prevent and control CBD, farmers need to monitor their plants regularly and apply fungicides or biological agents. However, this can be costly, time-consuming, and environmentally harmful. Moreover, farmers may not have access to reliable information or tools to diagnose CBD accurately and in time. This is where machine learning, a branch of artificial intelligence that enables computers to learn from data and make predictions, can offer a solution.
In a research project at RISE, we have developed an efficient machine-learning-based solution for detecting CBD in images of coffee plants. The project aims to provide a low-cost, scalable, and accurate tool that farmers can use to monitor their coffee plants and take appropriate action to prevent or treat CBD.
The image dataset was collected by our collaborating partner, Mpendakazi Agribusiness, in Tanzania. Annotation is one of the most challenging and time-consuming tasks in machine learning; for detecting and counting objects in images, it requires human experts to manually draw bounding boxes around the objects in each image. To reduce this burden, the project uses a novel method that combines weak and strong labels. Weak labels are point labels that mark the center of an object, while strong labels are box labels that enclose the entire object. The project has incorporated open-set detectors, machine learning models that can detect arbitrary objects without task-specific training, to generate proposals for ground truth bounding boxes in each image. The human annotators then only need to add point labels for the remaining objects in each image, which is much faster and easier than drawing boxes.
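To make the mixed labelling idea concrete, here is a minimal sketch in Python of how box proposals and point labels for one image could be stored together. The names `Box`, `ImageLabels`, and `build_mixed_labels` are hypothetical and not taken from the project, and the rule that a click inside an accepted proposal is redundant is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in pixel coordinates


@dataclass
class Box:
    """Strong label: an axis-aligned box that encloses the whole object."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max

    def center(self) -> Point:
        # The point label that this box would reduce to.
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


@dataclass
class ImageLabels:
    boxes: List[Box]      # strong labels, e.g. accepted detector proposals
    points: List[Point]   # weak labels, e.g. annotator clicks


def build_mixed_labels(proposals: List[Box], clicks: List[Point]) -> ImageLabels:
    """Keep every proposal as a strong label and keep only the clicks
    that no proposal already covers (assumed redundancy rule)."""
    remaining = [p for p in clicks if not any(b.contains(*p) for b in proposals)]
    return ImageLabels(boxes=list(proposals), points=remaining)


labels = build_mixed_labels(
    proposals=[Box(10, 10, 50, 50)],
    clicks=[(30, 30), (80, 90)],
)
```

In this example the click at (30, 30) falls inside the proposed box and is dropped, while the click at (80, 90) is kept as a weak label for an object the detector missed.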
We implemented machine learning models that learn from the annotated images and predict the bounding boxes and labels of CBD in new images. The work is based on the YOLOv8 framework, a state-of-the-art model for object detection, which was modified to work with the mixed weak and strong labels. The project has investigated two approaches for this task: Point-guided loss suppression (PLS) and mixed Point-Teaching (MPT).
The PLS model is a simple adaptation of YOLOv8 that uses a loss function which penalizes the model for predicting boxes that do not match the point labels. The MPT framework consists of two models: a teacher model that generates boxes, and a student model that uses those boxes as pseudo labels during training. Pseudo labels are synthetic labels produced by the teacher to guide the learning of the student. The MPT framework aims to leverage the complementary strengths of the two models and improve their performance.
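The two ideas can be illustrated with a toy sketch in plain Python. This is not the project's implementation: the real models operate inside the YOLOv8 training loop with differentiable losses, whereas the functions below (`point_mismatch_penalty`, `pseudo_boxes_from_teacher`, and the `min_conf` threshold are all hypothetical names and values) only show the matching logic on lists of boxes and points.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Point = Tuple[float, float]              # (x, y)
Prediction = Tuple[Box, float]           # (box, confidence in [0, 1])


def contains(box: Box, point: Point) -> bool:
    x_min, y_min, x_max, y_max = box
    x, y = point
    return x_min <= x <= x_max and y_min <= y <= y_max


def point_mismatch_penalty(preds: List[Prediction], points: List[Point]) -> float:
    """PLS-style toy term (assumed form): sum the confidence of every
    predicted box that contains none of the point labels."""
    return sum(
        conf
        for box, conf in preds
        if not any(contains(box, p) for p in points)
    )


def pseudo_boxes_from_teacher(
    teacher_preds: List[Prediction],
    points: List[Point],
    min_conf: float = 0.5,  # assumed threshold, not from the project
) -> List[Box]:
    """MPT-style toy step (assumed form): for each point label, take the
    most confident teacher box that contains it as a pseudo box label."""
    pseudo = []
    for p in points:
        candidates = [
            (box, conf)
            for box, conf in teacher_preds
            if conf >= min_conf and contains(box, p)
        ]
        if candidates:
            best_box, _ = max(candidates, key=lambda c: c[1])
            pseudo.append(best_box)
    return pseudo


preds = [((0, 0, 10, 10), 0.9), ((20, 20, 30, 30), 0.4), ((0, 0, 12, 12), 0.6)]
penalty = point_mismatch_penalty(preds, points=[(5, 5)])
pseudo = pseudo_boxes_from_teacher(preds, points=[(5, 5)])
```

Here only the box at (20, 20, 30, 30) contains no point label, so it alone contributes to the penalty, and the teacher's most confident box around the point (5, 5) becomes the pseudo label a student model would train on.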
The performance of the developed models was evaluated on a test split of the collected CBD dataset. The models were compared with the baseline YOLOv8 model and the semi-supervised YOLOv8 model, which uses only box labels for training.
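The text does not state which metrics were used. As an assumption, the sketch below shows a common way to score an object detector on a test split: predicted boxes are matched to ground truth boxes by intersection over union (IoU), and precision and recall are computed from the matches. The function names and the 0.5 threshold are illustrative choices, not the project's evaluation code.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Box],
    truths: List[Box],
    iou_threshold: float = 0.5,  # assumed threshold
) -> Tuple[float, float]:
    """Greedy one-to-one matching. `preds` is assumed to be sorted by
    descending confidence; each ground truth box is matched at most once."""
    unmatched = list(truths)
    true_positives = 0
    for p in preds:
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= iou_threshold:
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(truths) if truths else 0.0
    return precision, recall


precision, recall = precision_recall(
    preds=[(0, 0, 10, 10), (50, 50, 60, 60)],
    truths=[(1, 1, 10, 10), (20, 20, 30, 30)],
)
```

In this example one of the two predictions overlaps a ground truth box well enough to count as a match, so both precision and recall come out at one half.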
The evaluation results show that the PLS model gives a slight improvement in performance on the CBD dataset, compared to the semi-supervised model. The MPT framework generally performs worse, only performing above the baseline in a few cases. The exact efficiency of using point labels is difficult to determine, but the results indicate that there are potential use cases where annotating points is more efficient than boxes, especially with further development of the models.
The research project at RISE has developed an efficient machine learning solution for detecting CBD in images of coffee plants, and compared different methods of annotating datasets with weak and strong labels. At this early stage, the project has made several contributions with implications for the fields of machine learning and agriculture, such as:
The project has so far identified a number of activities which will be investigated in future work:
The research project at RISE demonstrates how machine learning can help coffee farmers deal with coffee berry disease, and how this technology can be further developed to benefit agriculture in general. The project hopes to inspire more research and innovation in this field, and to contribute to the global goals of food security, poverty reduction, and environmental protection.
Machine learning for coffee berry disease
Completed
Coordinator
2 years