Abstract
Fruit detection plays a vital role in robotic harvesting platforms. However, natural-scene attributes such as illumination variation, occlusion by branches and leaves, clustered tomatoes, and shading, as well as the double scene that combines image-augmented and natural-scene images, make fruit detection a difficult task. To address these problems, improved YOLOv3 models, collectively termed the Tomato detection models and comprising YOLODenseNet and YOLOMixNet, were applied. YOLODenseNet incorporates a DenseNet backbone, while the backbone of YOLOMixNet combines DarkNet and DenseNet. With spatial pyramid pooling (SPP), a feature pyramid network (FPN), the complete IoU (CIoU) loss, and the Mish activation function incorporated into both models, the test accuracy of YOLODenseNet (98.3%) and YOLOMixNet (98.4%) on the natural scene surpassed that of YOLOv3 (96.1%) and YOLOv4 (97.6%), though not YOLOv4 under the double scene. Furthermore, the detection speed of YOLOMixNet (47.4 FPS) was close to that of YOLOv4 (48.9 FPS). Finally, the Tomato detection models showed reliability, better generalization, and high promise for real-time harvesting robots.
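Two of the components named above, the Mish activation and the CIoU loss, can be written down compactly. The following Python sketch is an illustration of the published formulas for these two operations, not the paper's own implementation (which runs inside the YOLO training loop on tensors); a small epsilon is added to avoid division by zero for identical boxes.

```python
import math

def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(math.log(1.0 + math.exp(x)))

def ciou_loss(box_a, box_b):
    """CIoU loss between two boxes given as (x_min, y_min, x_max, y_max):
    1 - IoU + centre-distance penalty + aspect-ratio penalty."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection and union areas
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    iou = inter / (area_a + area_b - inter)
    # Squared distance between box centres
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    # Squared diagonal of the smallest enclosing box
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2
    # Aspect-ratio consistency term and its trade-off weight
    v = (4 / math.pi ** 2) * (math.atan((ax2 - ax1) / (ay2 - ay1))
                              - math.atan((bx2 - bx1) / (by2 - by1))) ** 2
    alpha = v / (1 - iou + v + 1e-9)
    return 1 - iou + rho2 / c2 + alpha * v
```

For two identical boxes the loss is zero; as boxes drift apart or their aspect ratios diverge, the centre-distance and aspect-ratio terms grow, which is what gives CIoU its faster, better-conditioned regression compared with plain IoU loss.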
Abbreviations
- SVM: Support Vector Machine
- CNN: Convolutional Neural Network
- YOLO: You Only Look Once
- SSD: Single Shot Multi-Box Detector
- RPN: Region Proposal Network
- AP: Average Precision
- FPS: Frames Per Second
- DenseNet: Dense Convolutional Network
- C-Bbox: Circular Bounding Box
- IoU: Intersection over Union
- CIoU: Complete IoU loss
- SPP: Spatial Pyramid Pooling
- GPU: Graphics Processing Unit
- FPN: Feature Pyramid Network
- GT: Ground Truth
- PANet: Path Aggregation Network
- ReLU: Rectified Linear Unit
- NMS: Non-Maximum Suppression
- DIoU-NMS: Distance IoU NMS
- LWYS: Label What You See
- FDL: Front Detection Layers
- TP: True Positive
- FN: False Negative
- FP: False Positive
- P–R curves: Precision–Recall curves
- AUC: Area Under Curve
Acknowledgements
I wish to acknowledge Prof. Zheng Decong, Wang Zhe, and the entire staff of the Institute of Engineering for their advice and support during this research work.
Funding
This research work was supported by the Natural Science Foundation of Shanxi Province and Shanxi Agricultural University, China under Grant No. 2020BQ34.
Ethics declarations
Conflicts of interest
The author declares no conflicts of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Lawal, O.M. Development of tomato detection model for robotic platform using deep learning. Multimed Tools Appl 80, 26751–26772 (2021). https://doi.org/10.1007/s11042-021-10933-w