AI for Software Quality & Cybersecurity

Cyberattacks are among the greatest existential threats to national security, the economy and society, and they often work by exploiting vulnerabilities in software applications. For instance, the WannaCry ransomware attack, which exploited a vulnerability in Microsoft Windows systems, hit medical emergency rooms across the UK and significantly disrupted medical procedures for many patients. Existing techniques and tools for security analysis can no longer cope with the significant increase in the size and complexity of applications, and a massive number of exploitable vulnerabilities have been reported in recent years.

The rise of Artificial Intelligence (AI), driven by the growth and availability of big data, breakthroughs in AI algorithms (e.g. deep learning), and significantly increased computational power, is potentially a game changer in the fight against software vulnerabilities. We are working towards AI-powered automated solutions which: (1) detect vulnerabilities while code is being written and alert software engineers to them immediately; and (2) recommend patches to fix those vulnerabilities. By tackling vulnerabilities this early in the software lifecycle, the project aims to stop them from entering the code at the point where it is written, avoiding the much higher cost of fixing them later.

Ongoing projects:
  1. Security vulnerability prediction: We have developed a deep learning model based on Long Short-Term Memory (LSTM) networks that automatically learns both semantic and syntactic features of code in order to predict software vulnerabilities. The model can detect a wide range of vulnerability types, such as log forging, information leak, unreleased resource, denial of service, race condition, cross-site scripting, command injection, privacy violation and header manipulation.

  2. Defect prediction: We have developed deep learning models (based on Tree-LSTM and CNN) for predicting defects in large codebases and also in code changes (e.g. commits).
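
To give a feel for how an LSTM can read source code, here is a minimal sketch in plain Python. It is an illustration only, not the project's actual model: the tokenizer, the class name `TinyLSTMClassifier`, the layer sizes and the two example snippets are all hypothetical, and the weights are random and untrained, so the score it prints carries no meaning. It only shows the data flow: code is split into tokens, tokens become integer ids, and an LSTM consumes the ids one by one before a final unit turns the last hidden state into a score between 0 and 1.

```python
import math
import random
import re

# Hypothetical tokenizer: identifiers, numbers, or any single non-space symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")
GATES = ("input", "forget", "output", "candidate")


def tokenize(code):
    return TOKEN_RE.findall(code)


def build_vocab(snippets):
    """Map each distinct token to an integer id; id 0 is reserved for unknown tokens."""
    vocab = {"<unk>": 0}
    for snippet in snippets:
        for tok in tokenize(snippet):
            vocab.setdefault(tok, len(vocab))
    return vocab


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class TinyLSTMClassifier:
    """Untrained single-layer LSTM followed by a logistic output unit (assumed design)."""

    def __init__(self, vocab_size, embed_dim=8, hidden_dim=8, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.embed = rand_matrix(vocab_size, embed_dim, rng)
        # One weight matrix and one bias vector per gate.
        self.W = {g: rand_matrix(hidden_dim, embed_dim + hidden_dim, rng) for g in GATES}
        self.b = {g: [0.0] * hidden_dim for g in GATES}
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]

    def forward(self, token_ids):
        h = [0.0] * self.hidden_dim  # hidden state
        c = [0.0] * self.hidden_dim  # cell state
        for tid in token_ids:
            x = self.embed[tid] + h  # concatenate token embedding and previous hidden state
            pre = {g: [a + b for a, b in zip(matvec(self.W[g], x), self.b[g])] for g in GATES}
            i_gate = [sigmoid(v) for v in pre["input"]]
            f_gate = [sigmoid(v) for v in pre["forget"]]
            o_gate = [sigmoid(v) for v in pre["output"]]
            cand = [math.tanh(v) for v in pre["candidate"]]
            # Standard LSTM update: forget part of the old cell state, add gated new content.
            c = [f * cj + i * g for f, cj, i, g in zip(f_gate, c, i_gate, cand)]
            h = [o * math.tanh(cj) for o, cj in zip(o_gate, c)]
        # The final hidden state summarises the whole token sequence.
        return sigmoid(sum(w * hj for w, hj in zip(self.w_out, h)))


# Made-up example snippets, used only to build a vocabulary.
snippets = [
    'String q = "SELECT * FROM users WHERE id=" + userId;',
    "int total = a + b;",
]
vocab = build_vocab(snippets)
model = TinyLSTMClassifier(len(vocab))
ids = [vocab.get(tok, 0) for tok in tokenize(snippets[0])]
score = model.forward(ids)
print(f"untrained score: {score:.3f}")
```

In a real system the weights would be learned from labelled examples of vulnerable and non-vulnerable code, and a deep learning framework would replace the hand-written loops; the structure of the computation would stay broadly the same.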