Hoa Khanh Dam Homepage

AI for Software Quality & Cybersecurity

Cyberattacks are among the greatest threats to national security, the economy and society. Many of them work by exploiting vulnerabilities in software applications. For instance, the WannaCry ransomware attack, which exploited a vulnerability in Microsoft Windows systems, hit medical emergency rooms across the UK and significantly disrupted medical procedures for many patients. Existing techniques and tools for security analysis can no longer cope with the significant increase in the size and complexity of applications, and a massive number of exploitable vulnerabilities have been reported in recent years.

The rise of Artificial Intelligence (AI), empowered by the growth and availability of big data, breakthroughs in AI algorithms (e.g. deep learning), and significantly increased computational power, is potentially a game changer in the fight against software vulnerabilities. We are working towards AI-powered automated solutions which: (1) detect vulnerabilities while code is being written and immediately alert the software engineers to them; and (2) recommend patches to fix those vulnerabilities. By tackling vulnerabilities this early in the software lifecycle, the project aims to stop them from entering the codebase in the first place, avoiding the much higher cost of fixing them later.

Ongoing projects:

Security vulnerability prediction: We have developed a deep learning model, based on Long Short-Term Memory (LSTM), that automatically learns both semantic and syntactic features of code for predicting software vulnerabilities. Our model is able to detect a wide range of vulnerabilities such as log forging, information leak, unreleased resource, denial of service, race condition, cross-site scripting, command injection, privacy violation and header manipulation.

Hoa Khanh Dam, Truyen Tran, Trang Pham, Shien Wee Ng, John Grundy, Aditya Ghose, Automatic feature learning for predicting vulnerable software components, IEEE Transactions on Software Engineering. DOI: 10.1109/TSE.2018.2881961. Preprint available here.
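
To make the idea of learning features from code with an LSTM more concrete, here is a minimal sketch in plain Python. It is not the model from the paper above: the tokenizer, the `TinyLSTM` class, its dimensions and its random (untrained) weights are all hypothetical and chosen only for illustration. It shows the general technique of reading code as a token sequence, updating an LSTM state per token, and averaging the hidden states into one fixed-length feature vector that a classifier could consume.

```python
import math
import random
import re


def tokenize(source):
    """Split source code into identifier, number, and single-symbol tokens."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTM:
    """Untrained single-layer LSTM with random weights (illustration only)."""

    def __init__(self, vocab, embed_dim=8, hidden_dim=6, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.index = {tok: i for i, tok in enumerate(sorted(vocab))}

        def matrix(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]

        # One embedding row per known token, plus a final row for unknown tokens.
        self.embed = matrix(len(self.index) + 1, embed_dim)
        # One weight matrix per gate: input (i), forget (f), output (o), candidate (c).
        self.weights = {g: matrix(hidden_dim, embed_dim + hidden_dim) for g in "ifoc"}
        self.bias = {g: [0.0] * hidden_dim for g in "ifoc"}

    def _gate(self, name, xh):
        """Affine transform of the concatenated [embedding, previous hidden state]."""
        return [sum(w * v for w, v in zip(row, xh)) + b
                for row, b in zip(self.weights[name], self.bias[name])]

    def encode(self, tokens):
        """Run the LSTM over the tokens and mean-pool the hidden states."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        states = []
        for tok in tokens:
            x = self.embed[self.index.get(tok, len(self.index))]
            xh = x + h  # list concatenation
            i = [sigmoid(v) for v in self._gate("i", xh)]
            f = [sigmoid(v) for v in self._gate("f", xh)]
            o = [sigmoid(v) for v in self._gate("o", xh)]
            g = [math.tanh(v) for v in self._gate("c", xh)]
            c = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
            h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
            states.append(h)
        n = max(len(states), 1)
        return [sum(s[k] for s in states) / n for k in range(self.hidden_dim)]


# Hypothetical input: a Java-like fragment that passes user input to a shell.
snippet = 'String cmd = request.getParameter("cmd"); Runtime.getRuntime().exec(cmd);'
tokens = tokenize(snippet)
model = TinyLSTM(vocab=set(tokens))
features = model.encode(tokens)
print(len(tokens), "tokens ->", len(features), "features")
```

In a real system the weights would be learned from a large code corpus rather than drawn at random, and the pooled vector would be passed to a classifier trained on files labelled as vulnerable or clean.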

Defect prediction: We have developed deep learning models (based on Tree-LSTM and CNN) for predicting defects in large codebases and in code changes (e.g. commits).

Thong Hoang, Hoa Khanh Dam, Yasutaka Kamei, David Lo and Naoyasu Ubayashi, DeepJIT: An End-To-End Deep Learning Framework for Just-In-Time Defect Prediction, Proceedings of the 16th International Conference on Mining Software Repositories (MSR 2019), co-located with ICSE 2019, to appear.
Hoa Khanh Dam, Trang Pham, Shien Wee Ng, Truyen Tran, John Grundy, Aditya Ghose, Taeksu Kim and Chul-Joo Kim, Lessons learned from using a deep tree-based model for software defect prediction in practice, Proceedings of the 16th International Conference on Mining Software Repositories (MSR 2019), co-located with ICSE 2019, to appear.