Development of tomato detection model for robotic platform using deep learning

Published in: Multimedia Tools and Applications

Abstract

Fruit detection plays a vital role in robotic harvesting platforms. However, natural-scene attributes such as illumination variation, branch and leaf occlusion, tomato clusters, and shading, as well as a double scene that combines image augmentation with the natural scene, make fruit detection a difficult task. To address these problems, improved YOLOv3 models, termed tomato detection models, were developed: YOLODenseNet, which adopts a DenseNet backbone, and YOLOMixNet, whose backbone combines DarkNet and DenseNet. With spatial pyramid pooling (SPP), a feature pyramid network (FPN), complete IoU (CIoU) loss, and the Mish activation function incorporated into both models, the tested accuracy of YOLODenseNet (98.3%) and YOLOMixNet (98.4%) on the natural scene exceeded that of YOLOv3 (96.1%) and YOLOv4 (97.6%), though not that of YOLOv4 under the double scene. Furthermore, the detection speed of YOLOMixNet (47.4 FPS) was close to that of YOLOv4 (48.9 FPS). Overall, the tomato detection models showed reliability, better generalization, and a high prospect for real-time harvesting robots.
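Two of the components named in the abstract, the Mish activation function (Misra, 2019) and the complete IoU (CIoU) loss (Zheng et al., 2019), can be written down compactly from their published definitions. The sketch below is an illustration of those definitions in plain Python, not the authors' implementation; the function names and the (x1, y1, x2, y2) box convention are assumptions for this example.

```python
import math

def mish(x):
    """Mish activation: x * tanh(softplus(x)) -- smooth and non-monotonic,
    used here in place of ReLU/LeakyReLU inside the detection backbone."""
    return x * math.tanh(math.log1p(math.exp(x)))

def ciou_loss(box_a, box_b):
    """CIoU loss between two axis-aligned boxes given as (x1, y1, x2, y2).
    Extends 1 - IoU with a center-distance penalty and an
    aspect-ratio consistency penalty."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Plain IoU from intersection and union areas
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    iou = inter / (area_a + area_b - inter)
    # Squared distance between box centers
    rho2 = ((ax1 + ax2) - (bx1 + bx2)) ** 2 / 4 \
         + ((ay1 + ay2) - (by1 + by2)) ** 2 / 4
    # Squared diagonal of the smallest enclosing box
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2
    # Aspect-ratio consistency term v and its trade-off weight alpha
    v = (4 / math.pi ** 2) * (math.atan((ax2 - ax1) / (ay2 - ay1))
                              - math.atan((bx2 - bx1) / (by2 - by1))) ** 2
    alpha = v / (1 - iou + v + 1e-9)
    return 1 - iou + rho2 / c2 + alpha * v
```

Unlike plain IoU loss, CIoU still produces a useful gradient when predicted and ground-truth boxes do not overlap, since the center-distance term keeps pulling the boxes together; for identical boxes the loss is zero.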




Abbreviations

SVM: Support Vector Machine
CNN: Convolutional Neural Network
YOLO: You Only Look Once
SSD: Single Shot MultiBox Detector
RPN: Region Proposal Network
AP: Average Precision
FPS: Frames Per Second
DenseNet: Dense Convolutional Network
C-Bbox: Circular Bounding Box
IoU: Intersection over Union
CIoU: Complete IoU Loss
SPP: Spatial Pyramid Pooling
GPU: Graphics Processing Unit
FPN: Feature Pyramid Network
GT: Ground Truth
PANet: Path Aggregation Network
ReLU: Rectified Linear Unit
NMS: Non-Maximum Suppression
DIoU-NMS: Distance IoU NMS
LWYS: Label What You See
FDL: Front Detection Layers
TP: True Positive
FN: False Negative
FP: False Positive
P–R curves: Precision–Recall curves
AUC: Area Under Curve


Acknowledgements

I wish to thank Prof. Zheng Decong, Wang Zhe, and the entire staff of the Institute of Engineering for their advice and support during this research work.

Funding

This research work was supported by the Natural Science Foundation of Shanxi Province and Shanxi Agricultural University, China under Grant No. 2020BQ34.

Author information


Corresponding author

Correspondence to Olarewaju Mubashiru Lawal.

Ethics declarations

Conflicts of interest

The author declares no conflicts of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Lawal, O.M. Development of tomato detection model for robotic platform using deep learning. Multimed Tools Appl 80, 26751–26772 (2021). https://doi.org/10.1007/s11042-021-10933-w

