Yuhong Nan - Purdue University
Semantics-Driven, Learning-Based Privacy Discovery in Mobile Apps
Feb 26, 2020
Abstract
A long-standing challenge in analyzing information leaks within mobile apps is to automatically identify the code operating on sensitive data. Because all existing solutions rely on system APIs (e.g., those returning the IMEI or GPS location) or on features of user interfaces (UI), content from app servers, such as a user's Facebook profile or payment history, falls through the cracks.
In this talk, I will introduce ClueFinder, a novel semantics-driven solution for automatic discovery of sensitive user data, including data from the server side. ClueFinder utilizes natural language processing (NLP) to automatically locate the program elements (variables, methods, etc.) of interest, and then performs a learning-based program structure analysis to accurately identify those that actually carry sensitive content. Using this new technique, we analyzed over 400k popular apps, an unprecedented scale for this type of research. Our findings bring to light the pervasiveness of information leaks and the channels through which the leaks happen, including unintentional over-sharing across libraries and aggressive data acquisition behaviors.
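To make the first stage more concrete, here is a minimal sketch of how program identifiers might be screened for privacy-related words. It is not ClueFinder's implementation: the keyword list and the function names `split_identifier` and `find_candidates` are assumptions made for illustration, and plain keyword matching stands in for the NLP analysis described in the talk. The learning-based structure analysis that filters these candidates is not shown.

```python
import re

# Hypothetical keyword list, chosen only for illustration.
SENSITIVE_KEYWORDS = {
    "password", "email", "phone", "address",
    "profile", "payment", "birthday", "location",
}


def split_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase tokens."""
    tokens = []
    # First break on underscores and other non-alphanumeric characters.
    for part in re.split(r"[_\W]+", name):
        # Then break on case boundaries, keeping acronyms such as "HTTP" whole.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return [t.lower() for t in tokens]


def find_candidates(identifiers):
    """Return the identifiers whose tokens include a sensitive keyword."""
    return [
        name for name in identifiers
        if SENSITIVE_KEYWORDS & set(split_identifier(name))
    ]


if __name__ == "__main__":
    names = ["getUserEmail", "payment_history", "retryCount", "mBtnOk"]
    print(find_candidates(names))
```

A screen like this would flag many identifiers that never hold user data (a constant label, for example), which is presumably why the abstract pairs the language-based step with a second analysis of program structure.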
About the Speaker
Dr. Yuhong Nan is a Post-Doctoral Research Associate at Purdue University. He earned his Ph.D. from the School of Computer Science at Fudan University, China, and received the 2018 ACM SIGSAC China Doctoral Dissertation Award. His research interests span privacy leakage detection in mobile and IoT platforms, security enhancement for IoT systems, and cyber-attack investigation with audit logs.