
A lightweight and real-time responsive framework for various visual tasks via neural architecture search

  • Regular Paper
  • Published in: CCF Transactions on Pervasive Computing and Interaction

Abstract

With the enhanced image- and video-processing capabilities of edge devices, novel deep vision applications are rapidly emerging. Lightweight neural networks have proven effective in supporting such applications, and existing solutions typically adjust a model along various dimensions to meet specific application requirements. However, because applications differ in their accuracy, latency, and memory requirements, no single lightweighting technique can satisfy all of these metrics at once. Moreover, video-stream processing on mobile devices often falls short of real-time performance. This paper therefore proposes LRNAS, a lightweight and real-time framework based on neural architecture search. We developed a lightweight network search algorithm that combines evolutionary strategies with multi-objective optimization to tailor architectures to diverse vision tasks. To further reduce inference latency, we designed a video frame filtering strategy that uses motion vectors and inertial sensor readings to discard redundant frames. We conducted experiments on two public datasets and one custom dataset, demonstrating LRNAS's effectiveness in improving the performance of mobile deep vision applications.
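To make the two mechanisms named in the abstract concrete, the sketch below illustrates (a) Pareto-dominance-based selection of the kind used in multi-objective evolutionary search and (b) frame filtering driven by codec motion vectors and gyroscope readings. The function names, objective layout, and thresholds (`dominates`, `pareto_front`, `should_skip_frame`, `mv_threshold`, `gyro_threshold`) are illustrative assumptions for this minimal sketch, not LRNAS's actual implementation.

```python
import numpy as np


def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (all objectives minimised)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return bool(np.all(a <= b) and np.any(a < b))


def pareto_front(objectives):
    """Indices of non-dominated rows in an (N, M) matrix of objective values,
    e.g. columns = (validation error, measured latency, memory footprint)."""
    objectives = np.asarray(objectives, dtype=float)
    front = []
    for i, cand in enumerate(objectives):
        # Keep a candidate only if no other candidate dominates it.
        if not any(dominates(other, cand)
                   for j, other in enumerate(objectives) if j != i):
            front.append(i)
    return front


def should_skip_frame(motion_vectors, gyro_rates,
                      mv_threshold=0.5, gyro_threshold=0.05):
    """Treat a frame as redundant when both the codec's block motion vectors
    (shape (N, 2)) and the device gyroscope's angular rates (shape (3,), rad/s)
    indicate little scene or camera motion; thresholds are placeholders."""
    mean_motion = np.linalg.norm(motion_vectors, axis=1).mean()
    camera_motion = np.linalg.norm(gyro_rates)
    return bool(mean_motion < mv_threshold and camera_motion < gyro_threshold)
```

In such a setup, `pareto_front` would be applied to candidate architectures evaluated on objectives like error, latency, and memory, while `should_skip_frame` would run on each decoded frame before the searched network is invoked.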




Data availability

The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request.


Author information

Corresponding author

Correspondence to Tianzhang Xing.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wang, Z., Wang, J., Li, S. et al. A lightweight and real-time responsive framework for various visual tasks via neural architecture search. CCF Trans. Pervasive Comp. Interact. (2024). https://doi.org/10.1007/s42486-024-00157-w


  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s42486-024-00157-w

Keywords
