Abstract
Analyzing the robustness of neural networks is crucial for trusting them. The vast majority of existing works focus on networks’ robustness in \(\epsilon \)-ball neighborhoods, but these cannot capture complex robustness specifications. We propose MaRVeL, a system for computing maximal non-uniform robust specifications that maximize a target norm. The main idea is to employ oracle-guided numerical optimization, thereby combining the efficiency of a numerical optimizer with the accuracy of a non-differentiable robustness verifier, which acts as the oracle. The optimizer iteratively submits candidate specifications to the verifier, which in turn returns the inputs closest to the decision boundaries. The optimizer then computes the gradients at these inputs to guide its search toward the directions in which the specification can expand while remaining robust. We evaluate MaRVeL on several datasets and classifiers and show that its specifications are 5.1x larger than those of prior works. On a two-dimensional dataset, we show that the average diameter of its specifications is 93% of the optimal average diameter, whereas that of prior works’ specifications is only 26%.
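To make the oracle-guided loop concrete, the following is a minimal sketch, assuming a hypothetical `verify_oracle(l, u)` interface that either certifies the candidate box \([l, u]\) as robust or returns the inputs in it that are closest to the decision boundaries, and a network `net` that maps a flat input vector to a logits vector. It illustrates the interplay between optimizer and verifier described above, not the paper's exact algorithm:

```python
import torch

def expand_specification(x, label, net, verify_oracle, step=1e-2, iters=50):
    """Sketch of an oracle-guided search for a non-uniform box [l, u] around x.

    Assumptions (hypothetical, for illustration): `verify_oracle(l, u)` returns
    (robust, boundary_points); `net` maps a flat input vector to a logits vector.
    """
    l, u = x.clone(), x.clone()            # start from the degenerate box {x}
    grow = torch.ones_like(x)              # per-dimension growth rate
    for _ in range(iters):
        # Candidate specification: expand every bound by the current growth rate.
        cand_l, cand_u = l - step * grow, u + step * grow
        robust, boundary_points = verify_oracle(cand_l, cand_u)
        if robust:
            l, u = cand_l, cand_u          # the whole candidate box is robust
            continue
        # The candidate crossed a decision boundary. Use the gradient of the
        # classification margin at each returned boundary point to identify
        # sensitive dimensions and slow the expansion along them.
        for p in boundary_points:
            p = p.detach().clone().requires_grad_(True)
            logits = net(p)
            margin = logits[label] - logits.topk(2).values[1]
            margin.backward()
            sensitivity = p.grad.abs()
            grow = grow / (1.0 + sensitivity)   # expand less where sensitive
    return l, u
```

The key point the sketch tries to capture is that the boundary points returned by the verifier supply gradient information to the optimizer, even though the verifier itself is not differentiable.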
Notes
- 1.
There is an edge case where \(l_i=u_i\), in which case the gradient is zero. There are standard corrections to guarantee that the gradient is monotonically increasing. For example, for the \(L_1\) norm, which is the one currently supported in our implementation, the correction replaces the zero gradient by 1 (for \(u_i\)) or \(-1\) (for \(l_i\)); a minimal sketch of this correction is given after the notes.
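As a concrete illustration of the correction in Note 1, here is a minimal sketch assuming the objective is the \(L_1\) size of the box, \(\sum_i |u_i - l_i|\), whose raw subgradient \(\mathrm{sign}(u_i - l_i)\) vanishes exactly when \(l_i = u_i\) (the helper function and its name are hypothetical):

```python
import torch

def corrected_l1_grad(l, u):
    # Raw (sub)gradient of |u_i - l_i| w.r.t. u_i is sign(u_i - l_i),
    # which is zero in the edge case l_i = u_i.
    s = torch.sign(u - l)
    # Correction from Note 1: replace the zero with +1 (for u_i) and
    # -1 (for l_i), so the optimizer can still move the bounds apart.
    grad_u = torch.where(s == 0, torch.ones_like(s), s)
    grad_l = torch.where(s == 0, -torch.ones_like(s), -s)
    return grad_l, grad_u
```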
Acknowledgements
We thank the reviewers for their feedback. This research was supported by the Israel Science Foundation (grant No. 2605/20).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kabaha, A., Drachsler-Cohen, D. (2023). Maximal Robust Neural Network Specifications via Oracle-Guided Numerical Optimization. In: Dragoi, C., Emmi, M., Wang, J. (eds) Verification, Model Checking, and Abstract Interpretation. VMCAI 2023. Lecture Notes in Computer Science, vol 13881. Springer, Cham. https://doi.org/10.1007/978-3-031-24950-1_10
DOI: https://doi.org/10.1007/978-3-031-24950-1_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24949-5
Online ISBN: 978-3-031-24950-1
eBook Packages: Computer Science, Computer Science (R0)