Cyber threat intelligence must evolve to protect AI systems

Cyber threat intelligence must evolve to protect AI systems
Representative Image. Credit: ChatGPT

Artificial intelligence (AI) systems rely on complex pipelines of training data, machine learning models, software libraries, and deployment platforms. Each of these components can become a potential attack surface for cybercriminals seeking to manipulate outcomes, steal sensitive models, or disrupt operations.

The study Cyber Threat Intelligence for Artificial Intelligence Systems, published as a research preprint on arXiv, investigates how cybersecurity practices known as cyber threat intelligence must evolve to detect, analyze, and mitigate threats targeting artificial intelligence systems.

Cyber threat intelligence, commonly known as CTI, involves collecting and analyzing information about cyber threats, including attacker behavior, vulnerabilities, indicators of compromise, and attack techniques. Organizations use this intelligence to anticipate potential attacks and strengthen defenses before incidents occur. However, most existing CTI frameworks were built to protect traditional IT systems such as networks, servers, and applications. AI systems introduce new components and vulnerabilities that fall outside these traditional models.

Why AI systems introduce new cybersecurity risks

AI changes the structure of digital systems in ways that fundamentally alter the cybersecurity landscape. Instead of focusing solely on software and infrastructure vulnerabilities, attackers can target the underlying components that enable AI systems to function.

Training datasets represent one of the most critical assets in AI development. These datasets teach machine learning models how to recognize patterns and make decisions. If attackers manipulate this data, they can poison the model during training, causing it to produce incorrect outputs or biased predictions. Such attacks may remain hidden until the system is deployed, making them particularly difficult to detect.

Machine learning models themselves also present unique attack opportunities. Adversaries can insert hidden backdoors during development, allowing them to trigger specific behaviors later using specially crafted inputs. Other attacks involve extracting proprietary models through repeated queries, enabling competitors or attackers to replicate valuable intellectual property.

Adversarial examples represent another major vulnerability. These are carefully designed inputs that appear normal to humans but cause AI models to misclassify objects or make incorrect decisions. In safety-critical systems such as autonomous vehicles or medical diagnostics, such manipulation could have serious real-world consequences.

The researchers highlight that AI systems often operate as complex pipelines rather than isolated applications. These pipelines include data collection, data preparation, model development, evaluation, deployment, and ongoing updates. Each stage introduces additional points where attackers can interfere with the system.

Another growing concern involves prompt injection attacks targeting large language models. In these attacks, malicious instructions are embedded into prompts or external data sources to manipulate the responses generated by AI systems. As language models become widely integrated into enterprise tools, search engines, and digital assistants, prompt injection is emerging as a major cybersecurity challenge.

The study also notes that the increasing accessibility of AI frameworks is lowering the barrier for cybercriminals. Attackers can now use machine learning tools to automate malware creation, generate realistic phishing messages, and develop sophisticated social engineering campaigns. Deepfake technologies can produce convincing audio and video content that can deceive individuals, organizations, and even automated verification systems.

Building a cyber threat intelligence framework for AI

To address these risks, the researchers propose adapting cyber threat intelligence frameworks to focus specifically on artificial intelligence systems. The goal is to create a knowledge base that documents vulnerabilities, incidents, and attacker strategies related to AI technologies.

The study identifies several types of data sources that can support this effort. One category includes vulnerability-focused repositories that document weaknesses in AI systems. Databases such as the AI Vulnerability Database collect reports describing technical flaws that could be exploited by attackers. These repositories function similarly to vulnerability databases used in traditional cybersecurity but are tailored to the unique characteristics of machine learning systems.

Another important category includes incident-focused databases that track real-world failures or harmful outcomes involving AI technologies. One of the most prominent examples is the AI Incident Database, a community-driven repository that collects reports describing incidents in which AI systems contributed to harm or risk.

These incidents span a wide range of domains, including transportation, law enforcement, finance, and social media. Some reports describe autonomous vehicles involved in fatal accidents, while others document financial disruptions caused by algorithmic trading systems. Facial recognition technologies have also been linked to wrongful arrests in certain cases.

By collecting detailed metadata about these events, incident databases allow researchers and security analysts to identify patterns in how AI systems fail or are misused. This information can help organizations anticipate potential risks and develop safeguards before similar incidents occur.

The researchers also examine adversary-focused frameworks that describe how attackers target AI systems. One of the most significant frameworks in this area is MITRE ATLAS, which maps the tactics and techniques used in attacks against machine learning systems. The framework builds on earlier cybersecurity models that classify attacker behavior but adapts them to the specific characteristics of AI technologies.

These frameworks document stages of an attack that may include reconnaissance of machine learning artifacts, gaining access to systems through APIs, inserting malicious code into AI components, and manipulating model outputs. Attackers may also attempt to maintain long-term access by inserting hidden backdoors or escalating privileges within AI platforms.

By combining vulnerability data, incident reports, and attacker behavior frameworks, cyber threat intelligence systems can create a more comprehensive picture of AI-related threats.

Toward stronger protection for AI systems

A key challenge identified in the study is the lack of standardized methods for categorizing vulnerabilities in AI systems. Traditional cybersecurity relies on well-established frameworks for classifying software weaknesses, but comparable standards for machine learning technologies are still emerging.

Some researchers have begun developing taxonomies that categorize vulnerabilities according to where they occur in the AI lifecycle. These classifications distinguish between vulnerabilities introduced during development, training, or deployment. They also assess the potential impact of an attack on critical attributes such as accuracy, fairness, reliability, safety, and privacy.

Another important aspect of AI threat intelligence involves defining indicators of compromise specific to machine learning systems. In traditional cybersecurity, indicators of compromise include artifacts such as suspicious file hashes, malicious IP addresses, or unusual network traffic patterns.

For AI systems, indicators of compromise may include altered training datasets, suspicious model weights, modified training scripts, or malicious model repositories. Detecting these indicators requires new analytical techniques capable of examining complex AI artifacts.

The study highlights several approaches that could enable this capability. One method involves deep hashing, which converts machine learning models or datasets into compact digital fingerprints. These fingerprints allow security systems to compare new models with known malicious examples and identify similarities even if the files have been slightly modified.

Another approach adapts similarity hashing techniques used in malware detection. These methods generate similarity digests that enable rapid comparisons across large collections of files. By applying these techniques to AI models and datasets, analysts can identify clusters of suspicious artifacts within large repositories.

More advanced techniques involve semantic hashing methods designed to preserve meaningful patterns in complex data structures. These methods enable security systems to detect similarities between AI assets while remaining resilient to minor modifications.

Fuzzy hashing techniques also play a role by identifying partial matches between files rather than requiring exact duplicates. Combining multiple hashing methods can improve detection rates and allow security tools to identify modified or polymorphic AI artifacts more effectively.

According to the researchers, the ultimate goal is to build a comprehensive cyber threat intelligence ecosystem for artificial intelligence systems. Such a system would allow security tools to scan AI models before deployment, monitor deployed systems for suspicious activity, and analyze incidents using historical threat data.

Threat intelligence knowledge bases could also help organizations understand which vulnerabilities are most likely to affect specific types of models, frameworks, or deployment environments. By linking attack techniques with real-world incidents, security teams could identify patterns that signal emerging threats.

Despite the progress documented in the study, the researchers emphasize that the field remains in its early stages. Existing databases and frameworks provide valuable starting points but are often incomplete or inconsistent. Many repositories contain only a limited number of documented vulnerabilities or incidents, highlighting the need for greater collaboration between researchers, industry, and governments.

Future work will likely focus on developing new indicators of compromise tailored to AI systems and improving methods for detecting tampered models or datasets. Researchers may also explore signals that indicate an AI model has been compromised, such as unusual outputs, changes in training data distributions, or unexpected system behavior.

  • FIRST PUBLISHED IN:
  • Devdiscourse
Give Feedback