Keynote (WoRMA)

last update: 29 April, 2022

Open Challenges of Malware Detection under Concept Drift

Prof. Gang Wang
University of Illinois at Urbana-Champaign
United States of America
Abstract: The security community increasingly uses machine learning (ML) for malware detection because it scales to large numbers of files and captures patterns that are difficult to describe explicitly. However, deploying an ML-based malware detector is challenging in practice due to concept drift. As the behaviors of malware and goodware constantly evolve, the resulting shift in their data distributions often leads to serious errors in deployed detectors. This constant evolution also adds to the pressure of labeling new malware variants for model updating, which is already an expensive process.
In this talk, I will introduce our recent exploration of the challenges posed by malware concept drift and potential solutions. I will first discuss the problem of detecting drifting samples to proactively inform ML detectors when not to make decisions. We explore the idea of self-supervision for drift detection and design corresponding explanation methods to make sense of the detected concept drift. Second, to facilitate malware labeling and model updating, I will share our recent results from combining cheap unsupervised methods with the existing limited/biased labels to generate higher-quality labels. Finally, I will discuss the emerging threat of poisoning and backdoor attacks that exploit the dynamic updating process of malware detectors, and potential directions for making this process more robust.
Speaker’s Bio: Gang Wang is an Assistant Professor of Computer Science at the University of Illinois at Urbana-Champaign. He obtained his Ph.D. from UC Santa Barbara in 2016, and a B.E. from Tsinghua University in 2010. Before joining the University of Illinois, he worked as an Assistant Professor at Virginia Tech from 2016 to 2019. His research interests are Security and Privacy, Data Mining, and Internet Measurements. His work primarily takes a data-driven approach to address emerging security threats in massive communication systems, crowdsourcing systems, mobile applications, and enterprise networks. He is a recipient of the NSF CAREER Award (2018), Amazon Research Award (2021), Google Faculty Research Award (2017), and Best Paper Awards from IMWUT 2019, ACM CCS 2018, and SIGMETRICS 2013. His projects have been covered by various media outlets such as MIT Technology Review, The New York Times, Boston Globe, and ACM TechNews.