Toward Understanding and Testing Deep Learning Information Flow in Deep Learning-Based Android Apps
Authors: Jie Zhang, Qianyu Guo, Tieyi Zhang, Zhiyong Feng, Xiaohong Li
Abstract:
The popularity of mobile devices and the development of artificial intelligence (AI) have led to the widespread adoption of deep learning (DL) in Android apps. Compared with traditional Android apps (traditional apps), deep learning-based Android apps (DL-based apps) rely on more third-party application programming interfaces (APIs) to complete complex DL inference tasks. However, existing methods for detecting sensitive information leakage in Android apps (e.g., FlowDroid) cannot be applied directly to DL-based apps, because they have difficulty detecting third-party APIs. To solve this problem, we design DLtrace, a new static information flow analysis tool that can effectively recognize third-party APIs. With our proposed trace and detection algorithms, DLtrace can also efficiently detect privacy leaks caused by sensitive APIs in DL-based apps. Additionally, we propose two formal definitions to handle the polymorphism and anonymous inner-class problems that are common in Android static analyzers. Using DLtrace, we summarize the non-sequential characteristics of DL inference tasks in DL-based apps and the specific functionalities that DL models provide for such apps. We conduct an empirical assessment with DLtrace on 208 popular DL-based apps in the wild and find that 26.0% of the apps suffer from sensitive information leakage. Furthermore, DLtrace outperforms FlowDroid in detecting and identifying third-party APIs. The experimental results demonstrate that DLtrace extends FlowDroid in understanding DL-based apps and detecting security issues therein.
Keywords: Mobile computing, deep learning apps, sensitive information, static analysis.