A lightweight and real-time responsive framework for various visual tasks via neural architecture search

Wang, Zixiao; Wang, Jiansu; Li, Shuo; Yang, Jiadi; Xing, Tianzhang

doi:10.1007/s42486-024-00157-w

Regular Paper
Published: 21 May 2024

CCF Transactions on Pervasive Computing and Interaction

Zixiao Wang¹, Jiansu Wang¹, Shuo Li¹, Jiadi Yang¹ & Tianzhang Xing¹ (ORCID: orcid.org/0000-0001-7526-7269)

Abstract

With the enhanced capabilities of edge devices in processing images and video streams, novel deep vision applications are rapidly emerging. To support such applications, lightweight neural networks have proven effective, and existing solutions often adjust the model across various dimensions to meet specific application requirements. However, as each application varies in accuracy, latency, and memory usage requirements, a single lightweight technology cannot satisfy all these diverse metrics. Additionally, processing video stream tasks on mobile devices often falls short of achieving real-time performance. Therefore, this paper proposes a lightweight and real-time framework based on neural architecture search, termed LRNAS. We developed a lightweight network search algorithm employing evolutionary strategies and multi-objective optimization to personalize designs for diverse vision tasks. To reduce inference latency further, we designed a video frame filtering strategy. This strategy uses motion vectors and inertial sensors to filter out redundant video frames. We conducted experiments on two public datasets and one custom dataset, demonstrating LRNAS’s effectiveness in enhancing mobile deep vision application performance.
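
The two ideas named in the abstract, multi-objective selection among candidate networks and motion-based frame filtering, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Candidate` fields, the `pareto_front` and `keep_frame` helpers, and both thresholds are hypothetical names and values chosen only for illustration. The sketch assumes each candidate architecture has already been measured for error, latency, and memory, and that a per-frame mean motion-vector magnitude and a gyroscope reading are available.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical searched architecture with three measured objectives."""
    name: str
    error: float       # 1 - accuracy, lower is better
    latency_ms: float  # inference latency, lower is better
    memory_mb: float   # memory footprint, lower is better

    def objectives(self) -> Tuple[float, float, float]:
        return (self.error, self.latency_ms, self.memory_mb)


def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if it is no worse on every objective and better on one."""
    pairs = list(zip(a.objectives(), b.objectives()))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


def pareto_front(population: Sequence[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates."""
    return [
        c for c in population
        if not any(dominates(o, c) for o in population if o is not c)
    ]


def keep_frame(mean_motion_px: float, gyro_rad_s: float,
               motion_thresh: float = 1.0, gyro_thresh: float = 0.05) -> bool:
    """Keep a frame only if the scene or the camera moved noticeably.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    return mean_motion_px >= motion_thresh or gyro_rad_s >= gyro_thresh


if __name__ == "__main__":
    population = [
        Candidate("a", error=0.10, latency_ms=30.0, memory_mb=20.0),
        Candidate("b", error=0.12, latency_ms=35.0, memory_mb=25.0),
        Candidate("c", error=0.08, latency_ms=50.0, memory_mb=40.0),
    ]
    print([c.name for c in pareto_front(population)])
    print(keep_frame(mean_motion_px=0.2, gyro_rad_s=0.01))
```

In an evolutionary search, a filter of this kind would typically be applied to each generation so that only non-dominated candidates survive; an application could then pick the survivor that best matches its own accuracy, latency, and memory budget.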

Data availability

The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request.

Author information

Authors and Affiliations

School of Information Science and Technology, Northwest University, Xi'an, 710027, China
Zixiao Wang, Jiansu Wang, Shuo Li, Jiadi Yang & Tianzhang Xing

Corresponding author

Correspondence to Tianzhang Xing.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, Z., Wang, J., Li, S. et al. A lightweight and real-time responsive framework for various visual tasks via neural architecture search. CCF Trans. Pervasive Comp. Interact. (2024). https://doi.org/10.1007/s42486-024-00157-w

Received: 30 January 2024
Accepted: 22 April 2024
Published: 21 May 2024
DOI: https://doi.org/10.1007/s42486-024-00157-w
