Rethinking The Future of Cybersecurity Vulnerability Disclosure
Why we need to stop giving AI models a head start on developing and reverse-engineering security vulnerabilities
By Jack Kengott
If you are looking for malicious code, you don't actually have to go searching on the dark web. This fact often surprises those who haven't spent much time working in offensive security. The truth is, descriptions and code for thousands of exploits are freely and openly available on the internet—easily reachable with a quick Google search. For years, sites like ExploitDB and the MITRE Corporation's Common Vulnerabilities and Exposures (CVE) repository have been used by security researchers to openly share details of security vulnerabilities and proof-of-concept (PoC) exploit code. This kind of open disclosure is a key pillar in security frameworks used by ethical hackers and defense teams alike to keep one step ahead of cyber adversaries.
In the last decade, the equilibrium between defense teams and malicious actors has been relatively stable, in part thanks to these practices. Researchers—either on defense teams responding to incidents or through active ethical hacking—discover and disclose vulnerabilities, and technology vendors like Microsoft or Oracle develop fixes. Coordinated and responsible disclosure frameworks function well because new exploits are discovered at a manageable rate, making patching possible without much lag time. As a result, though cyber-attacks are common, they aren't so common as to impact average citizens in their day-to-day lives.
Considering that cyber defenders start at a great disadvantage, this is a remarkable outcome. Attackers need only find one vulnerability in an otherwise secure system to break through defenses. In other words, defenders might keep 9,999 doors locked, but leave one unlocked and it is all for naught. This is known as the “Defender's Dilemma,” and for the last decade, AI has largely helped keep defense teams in the game despite this disadvantage. So far, AI capabilities have aligned well with the needs of defense teams. Security Operations Centers (SOCs) need to identify malicious patterns in massive amounts of alerts, and models are well suited for this task. Because there is a global shortage of defense analysts to parse the mountains of alerts produced in cyberspace, this use of AI has helped keep defenders' heads above water.
As new generative AI capabilities are developed, however, the cyber balance might shift back toward malicious actors through the creation of adversarial AI models that synthesize and test exploits automatically. This will turn defenders' shields into adversaries' weapons. Open disclosure frameworks, which were established to help cyber defenders detect and remediate vulnerabilities, will almost certainly be used as fuel for models that synthesize zero-day exploits or reverse-engineer exploits after vulnerability disclosure. Further, only a small group of adversaries will have the resources necessary to build the most powerful models. If the cybersecurity community wants to slow these capabilities, it must adapt its disclosure frameworks as it moves into an AI-centric cyberspace, even if this creates tension with cherished values.
Taking Stock of Disclosure Frameworks and Practices
To understand how adversarial AI poses risks to cyber defense teams, one must understand how and why the cybersecurity industry discloses vulnerabilities today, and what happens after vulnerabilities are disclosed to relevant vendors and made public.
Cybersecurity Disclosure Models
Vulnerabilities are mostly reported by two groups: offensive security researchers, who test software for security bugs, and cyber response teams, which observe and enumerate attacks in the field. In both cases, the reporter makes a decision about disclosure:
No Disclosure—Choose not to disclose the vulnerability at all. In this scenario, a fix will only be implemented if another researcher rediscovers the same vulnerability; until then, the flaw remains out of public view and open to exploitation.
Public Disclosure—Release the vulnerability publicly without prior notification to the affected technology's vendor, accepting any legal ramifications that may come with such a disclosure. This gives adversaries access to vulnerability details at the same time as the public, and it often occurs when a technology vendor is unresponsive to vulnerability reports.
Responsible/Coordinated Disclosure—Disclose the vulnerability directly to the vendor first, giving a set amount of time before disclosing publicly (Google’s offensive research team, Project Zero, allows 90 days). This gives vendors valuable time to produce a fix before disclosure is publicly released. It also notifies the broader community of the risks associated with the vulnerability in a reasonable timeframe.
The industry's preferred practice is “Responsible” or “Coordinated” disclosure. In theory, this model gives a vendor time to develop fixes for its vulnerable product, called patches, so users will have a secure product version available by the time the vulnerability is made public. The impending public disclosure puts pressure on the organization to improve its security posture, and the researcher gets a reputational boost for discovering a new vulnerability. Everyone wins. Right?
What Happens After Disclosure?
The moment a disclosure is made public, the race is on for defense teams to update their vulnerable systems, usually with a security “patch” (assuming a patch is already released). Unfortunately, patching is not as easy as it may sound. In 2023, Verizon’s year-end study found that the average time to patch vulnerabilities is about 49 days, and depending on the type of technology, it can end up taking much longer. This means adversaries have a variable window to attack while a vulnerability is public.
Because public notification typically includes useful information for defense teams such as affected version(s), vulnerability descriptions, and even Proof of Concept (PoC) code, adversaries may have the means to exploit immediately. Accordingly, whether PoC code is included is often a point of negotiation in responsible and coordinated disclosures. If example PoC code is present, adversaries have a loaded gun, ready to point and shoot at their target. If not, adversaries need to understand the vulnerability from its description and reverse-engineer an exploit. Depending on the complexity of the vulnerability, this can take considerable effort. The goal of the adversary is to shrink the time from disclosure to reliable exploitation. The next section discusses how emerging AI capabilities are helping adversaries in this goal.
Cyber Adversaries and AI Models
Because AI models can be fed open-source security disclosures, it is important to understand how AI will be used to leverage this data. The confluence of expanded AI capabilities and the prevailing cybersecurity frameworks has created a new and unique risk to the security outlook.
Generative Offensive Security Models
It is no secret that AI models can help identify new attack vectors for adversaries. In 2024, researchers used GPT-4 to autonomously hack websites without human feedback or previous knowledge of vulnerabilities present. In 2017, Chinese researchers created a tool called “SemFuzz” which generated PoC code using only CVE descriptions. Aided by SemFuzz, the researchers triggered 18 vulnerabilities and uncovered two previously undisclosed vulnerabilities. These are only two examples, and capabilities have been expanding into more and more stages of the “cyber kill chain.”
With the expansion of AI model capabilities through new training techniques and enhanced computing power, adversarial models will certainly become more powerful. Vulnerabilities will be generated and reverse-engineered faster, and the links between vulnerabilities will be formed automatically. This, in turn, will give defenders less time to patch disclosed vulnerabilities, and less time to respond once exploits have put adversaries inside target networks. As capabilities are linked together, defenders’ margins for error will become smaller and smaller.
Sourcing Training Data for Offensive Models
AI models depend on training data, and how developers acquire training data is becoming a topic of great public interest. Notably, OpenAI publicly states that the internet is one of its primary sources of training data. This has wide-ranging implications, but in the cybersecurity space, it means that all previous disclosures—including confirmed malicious code—are potentially being used for training future generative AI models.
As a business with legal liability and public relations considerations, OpenAI has incentives to act as responsibly as possible, but not all organizations share these incentives. When considering the possibility of malicious actors using the internet to train models, an important question arises: What currently prevents malicious groups from collecting swaths of vulnerability data from the internet? The answer is virtually nothing. Scraping the internet is logistically simple, and even if these groups cared about the law, there are as yet few legal hurdles to clear.
Further, current frameworks provide potential models with excellent data to use for training. We know models require clean, well-labeled data, and under today's open disclosure frameworks, there will be no shortage of such datasets. Disclosure datasets are categorized with extremely high fidelity and date back to 1999, with annual disclosures increasing every year. The annual total of published CVEs has nearly doubled from ~14.6k in 2017 to ~28.9k in 2023, and Google's Mandiant expects to see more zero-day exploits, and thus CVEs, in 2024 than ever before. It is undeniable that these datasets are ready-made for model building, and they will only continue to expand.
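To make the point concrete, the sketch below shows roughly how little effort bulk collection takes: a short script that pages through NIST's public National Vulnerability Database (NVD) API and accumulates labeled CVE descriptions. This is a minimal sketch, not production tooling; the endpoint, parameters, and response fields reflect the NVD API 2.0 as publicly documented and should be treated as assumptions that may change.

import time
import requests

# Minimal sketch: bulk-collecting public CVE records from NIST's NVD API.
# Endpoint, parameters, and response fields reflect the publicly documented
# NVD API 2.0 and should be treated as assumptions that may change.
NVD_API = "https://services.nvd.nist.gov/rest/json/cves/2.0"

def fetch_cve_page(start_index, page_size=2000):
    """Fetch one page of CVE records (the API caps a page at 2,000 results)."""
    resp = requests.get(
        NVD_API,
        params={"startIndex": start_index, "resultsPerPage": page_size},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

def collect_cve_descriptions(max_records=10000):
    """Accumulate (CVE ID, English description) pairs, exactly the kind of
    clean, labeled text a model builder would want."""
    records, start = [], 0
    while start < max_records:
        page = fetch_cve_page(start)
        items = page.get("vulnerabilities", [])
        if not items:
            break
        for item in items:
            cve = item.get("cve", {})
            description = next(
                (d["value"] for d in cve.get("descriptions", []) if d.get("lang") == "en"),
                "",
            )
            records.append({"id": cve.get("id"), "description": description})
        start += len(items)
        if start >= page.get("totalResults", 0):
            break
        time.sleep(6)  # polite pacing; unauthenticated access is rate-limited
    return records

Nothing in this loop requires credentials, special tooling, or meaningful cost, which is precisely the problem.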
In short, even the legally compliant actors have few sources of friction when collecting training data. Actors that aren’t beholden to courts or morality will use everything they can get their hands on—including the fast-growing, and well-categorized datasets of vulnerability disclosures.
Who Will Control Offensive AI Models?
In the short term, access to the most advanced AI models will largely be limited by financial resources. With the cost of the latest training run for GPT-4 estimated at $30–$90 million, and Microsoft's 2023 investment of $10 billion in OpenAI, operating and training the most powerful models requires significant capital. Therefore, any actor interested in building a powerful offensive AI model will likely require the magnitude of resources normally associated with nation-state-backed groups. Unfortunately, there is no shortage of adversarial groups with access to such resources.
Advanced Persistent Threats (APTs) are organized groups of adversaries, typically associated with nation-states, which target specific sectors and technologies. Famous APTs have been associated with states like North Korea (Lazarus Group—APT38), Russia (Fancy Bear—APT28), and China (Double Dragon—APT41). These groups have spearheaded attacks on private companies, state infrastructure, and journalists globally. In alignment with this past activity, we should expect that groups like these are already building and training models with the financial support of their associated governments.
In a world where all exploit details are posted online, only a small handful of groups will have access to the resources required to build super-charged adversarial AI models. Five years ago, most security researchers didn't need to consider that their vulnerability disclosures could be aiding regimes with whom they do not align. As new AI capabilities come into view, it is becoming clear that they should, and must, take an interest in what their research may be used for in the future, and by whom.
Moving Forward
As we consider shielding potentially dangerous data from super-powered (and super-consolidated) AI models, it will be crucial to install more safeguards into disclosure frameworks. The solutions in this space are all messy—there won’t be a fix that both avoids harm and keeps open principles intact as they are now. Further, solutions will likely move towards more centralized institutional oversight and strict access control—notions that will surely generate resistance in a community that highly values decentralization, open-source principles, and anonymity.
Revisiting Open Disclosure Principles
Organizations like the UK's National Cyber Security Centre (NCSC) and the US Cybersecurity and Infrastructure Security Agency (CISA) have released AI development safety frameworks. However, there is limited guidance on how to protect sensitive and potentially dangerous data from future use in training models. The most obvious way to keep vulnerability details from being scraped off the open internet is to keep them off the open internet in the first place. In this world, a centralized institution might intake disclosures and determine access based on risk, scope, and other relevant factors. Some aspects of disclosures, like descriptive detail or PoC exploits, could be provided to verified entities, but most details would remain out of public view.
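As a purely illustrative sketch (the tier names, record fields, and entitlements below are hypothetical, not an existing standard), such an intake body might store each disclosure once and release different views of it depending on how the requester has been vetted:

from dataclasses import dataclass
from enum import IntEnum

# Hypothetical tiering scheme for a centralized disclosure body; the tiers
# and fields are illustrative assumptions, not an existing framework.

class AccessTier(IntEnum):
    PUBLIC = 0      # anyone: the flaw exists and which product is affected
    VERIFIED = 1    # vetted defenders: technical detail and affected versions
    RESTRICTED = 2  # case-by-case approval: proof-of-concept exploit code

@dataclass
class DisclosureRecord:
    cve_id: str
    summary: str            # released at PUBLIC
    technical_detail: str   # released at VERIFIED
    poc_code: str           # released at RESTRICTED

    def view_for(self, tier):
        """Return only the fields the requester's tier is entitled to see."""
        view = {"cve_id": self.cve_id, "summary": self.summary}
        if tier >= AccessTier.VERIFIED:
            view["technical_detail"] = self.technical_detail
        if tier >= AccessTier.RESTRICTED:
            view["poc_code"] = self.poc_code
        return view

Under a scheme like this, an unverified request would learn only that a flaw exists in a given product, while a vetted defense team could pull the detail it needs to triage.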
An initiative to restrict open access would, ironically, require transparency and international collaboration. It must be clear exactly who has access and why access has been granted. This would necessitate both strong private-public collaboration and scalable access control so that practices are adopted in the community and disclosures are functional for defense teams.
Private-Public Collaboration for Disclosure
The good news is that private-public collaboration frameworks are well-established in the cyber community. Information Sharing and Analysis Centers (ISACs) are nonprofits that centralize information on cyber threats by industry, and bodies like NIST's National Cybersecurity Center of Excellence (NCCoE) help develop widely adopted security standards. In fact, the MITRE organization already manages CVE ID assignments and works closely with researchers throughout disclosure. Any of these types of organizations could prove to be uniquely positioned to intake and distribute vulnerability disclosures.
Clearly, any partnership between the state and industry could put the involved state at an advantage. It should be acknowledged that both MITRE and NIST are associated with the U.S. government, and that the U.S. government has offensive cyber capabilities that would be bolstered by maintaining exclusive access to vulnerability details. Many would have justified concerns about handing any single government such an advantage. For the adoption of such a framework to be successful and widespread, sufficient transparency measures must be put in place to build trust in the community that equal and fair access is granted and respected.
Access Control for Disclosure Details
Questions surrounding access control will be central to a more closed disclosure model. Overly permissive access would leave exploit details as freely available as they are today, while highly restrictive policies could slow adoption in the community and leave defense teams scrambling. There is no clear answer here, and striking the right balance will take trial and error. In the meantime, basic monitoring should make clear when and if any large-scale scraping is taking place, and from where. Luckily, this kind of monitoring is already a strength of the defense community.
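A minimal sketch of what that monitoring might look like, assuming a hypothetical access log of (client ID, timestamp, record ID) entries emitted by a disclosure portal:

from collections import defaultdict

def flag_bulk_harvesters(access_log, window_hours=24, threshold=500):
    """Flag clients that pull an unusually large number of distinct disclosure
    records within any single time window. The window and threshold here are
    arbitrary placeholders, not recommendations."""
    buckets = defaultdict(set)  # (client_id, window index) -> distinct record IDs
    for client_id, timestamp, record_id in access_log:
        window = int(timestamp // (window_hours * 3600))
        buckets[(client_id, window)].add(record_id)
    return sorted({client for (client, _), records in buckets.items()
                   if len(records) > threshold})

Flagged clients would then be candidates for manual review or throttling; the point is only that spotting bulk harvesting is routine log analysis, not a capability defenders would need to invent.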
It is also worth acknowledging that the industry at large will certainly suffer from more restrictive access to disclosures. Open disclosure helps defense teams quickly understand attack vectors and how they might best mitigate risk before patches arrive—a vulnerability that requires authentication is very different from one that can be exploited remotely without credentials. If this level of detail is available only to specific groups, it will be more difficult for security teams to triage security fixes as vulnerabilities pile up. In an ideal world, a disclosure system would allow broad enough access to be useful to most analysts while also building barriers to large-scale harvesting of vulnerability information. At present, any barrier to automated scraping would be a considerable improvement, so initial efforts need not be overly ambitious to make progress.
Conclusion
AI is not new to the cybersecurity industry—defense teams have been touting AI-powered tools like SentinelOne, Cylance, and Darktrace for years. What is new is the ability for cyber adversaries to build models that use data freely available on the internet to generate exploits far faster than their human counterparts could.
Given the current state of industry disclosure frameworks and of adversarial AI capabilities, it is time to make concessions in our open disclosure models as we move to protect potentially dangerous data from falling into the wrong hands. These concessions should not be taken lightly, as the values behind open disclosure have also built the collaborative, open, and high-functioning cybersecurity community we have today. Frankly, these values are what I admire most about the cybersecurity community, and they are what make solutions that contradict them so uncomfortable to propose. Unfortunately, under the current circumstances, we need change to maintain an egalitarian and stable cybersecurity landscape.