When Algorithms Meet Biology: Testing AI Agents in Real-World DNA Workflows

A RAND study found that the newest AI models can design lab-ready DNA sequences and generate workable protocols, successfully bridging the gap between digital design and physical biology in controlled tests. The models are not capable of creating full pathogens, but the results show rapid advances in AI’s ability to assist with real-world molecular biology tasks.


CoE-EDP, VisionRI | Updated: 20-02-2026 08:56 IST | Created: 20-02-2026 08:56 IST

At the RAND Corporation's Center on AI, Security, and Technology, researchers have been exploring a question that feels both futuristic and urgent: can today's most advanced artificial intelligence systems help design real, physical DNA in a laboratory?

In a new study, RAND tested whether leading large language models from OpenAI, Anthropic, and Google could move beyond writing text or code and actually assist with molecular biology tasks. Specifically, the team examined whether AI "agents" could design DNA sequences, submit them to a benchtop DNA synthesizer, and write clear lab instructions to turn those sequences into a working biological product.

The results show that the boundary between digital intelligence and physical biology is getting thinner.

Why DNA Acquisition Matters

The researchers focused on one critical step in what experts call the biological weapons risk chain. This chain includes stages such as planning, design, building, testing, and eventual release. RAND zeroed in on DNA acquisition, the point where a digital genetic design becomes actual biological material.

In some threat scenarios, especially those involving viruses, getting the correct DNA is a major technical hurdle. Commercial DNA companies often screen orders for dangerous sequences. But smaller benchtop DNA synthesizers, which can print short DNA strands inside a lab, may not have the same safeguards.

To avoid unnecessary risk, RAND did not test lethal pathogens. Instead, they chose two safer targets. The first was enhanced green fluorescent protein, or eGFP, a harmless gene that makes bacteria glow green under certain light. The second was a single gene segment from an influenza virus. Importantly, this was only one segment of the virus's genome, not enough to reconstruct the virus.

How the AI Was Tested

The AI systems were placed in a realistic lab scenario. Each model acted as a laboratory assistant with access to a benchtop DNA synthesizer and a specific list of materials. The task was to design all the necessary DNA fragments to create the target gene and insert it into a plasmid, a circular piece of DNA used in bacteria. The AI also had to generate a step-by-step lab protocol explaining how to assemble, clone, and express the gene.

This was not just a writing exercise. The DNA fragments had to meet strict biological rules. They had to be the correct length for the synthesizer. They needed overlapping regions so they could be stitched together. Some fragments had to be reverse-complemented, a small but critical detail for proper assembly. The design also had to include the right molecular "cut sites" for inserting the gene into a plasmid.
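To make those constraints concrete, here is a minimal Python sketch of that kind of fragment design. The maximum fragment length, overlap size, enzyme sites, and the truncated eGFP snippet are illustrative assumptions, not the parameters or sequences used in the RAND study.

```python
# Illustrative sketch only: fragment length, overlap size, and enzyme sites
# are assumptions, not the parameters used in the RAND study.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def design_fragments(target: str, max_len: int = 200, overlap: int = 20) -> list[str]:
    """Split a target sequence into overlapping fragments that fit a
    benchtop synthesizer, reverse-complementing every other fragment so
    adjacent pieces sit on opposite strands and can anneal during assembly."""
    step = max_len - overlap
    fragments = []
    for i, start in enumerate(range(0, len(target) - overlap, step)):
        piece = target[start:start + max_len]
        # Alternating fragments are ordered as the opposite strand.
        fragments.append(piece if i % 2 == 0 else reverse_complement(piece))
    return fragments

# Hypothetical flanking restriction sites for cloning into a plasmid,
# around the first few codons of eGFP (for illustration only).
ECORI, XHOI = "GAATTC", "CTCGAG"
insert = ECORI + "ATGGTGAGCAAGGGCGAG" + XHOI
print(design_fragments(insert, max_len=24, overlap=8))
```

In a real workflow the fragment and overlap lengths would be dictated by the synthesizer's limits and the chosen assembly chemistry, but the logic the models had to get right is of this general shape.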

Eight major AI models were tested: GPT-4.1, o3, and GPT-5 from OpenAI; Claude Sonnet 4, Claude Opus 4, and Claude Opus 4.5 from Anthropic; and Gemini 2.5 Pro and Gemini 3 Pro from Google. Each model attempted the task multiple times inside an agent framework that allowed it to use tools and write code.
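The agent setup can be pictured as a simple tool-use loop: the model proposes an action, a tool executes it, and the result is fed back until the model returns a final answer. The sketch below is a generic illustration with a scripted stand-in for the model call and made-up tool names (run_code, submit_to_synthesizer); it is not the framework or APIs RAND used.

```python
# Generic tool-use loop for illustration only. call_model is a scripted
# stand-in, and the tool names and message format are assumptions.

import contextlib
import io

def run_code(source: str) -> str:
    """Execute model-written Python and capture whatever it prints."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(source, {})
    return buf.getvalue().strip()

def submit_to_synthesizer(fragments: list[str]) -> str:
    """Stand-in for queueing fragment sequences on a benchtop synthesizer."""
    return f"queued {len(fragments)} fragments"

TOOLS = {"run_code": run_code, "submit_to_synthesizer": submit_to_synthesizer}

_SCRIPTED_REPLIES = iter([
    {"tool": "run_code", "args": {"source": "print('checking fragment lengths')"}},
    {"tool": "submit_to_synthesizer", "args": {"fragments": ["GAATTCATGGTG", "CACCATGAATTC"]}},
    {"final": "1) Digest plasmid  2) Ligate insert  3) Transform E. coli  4) Screen colonies"},
])

def call_model(history: list[dict]) -> dict:
    """Placeholder for a real LLM API call; replays a canned run."""
    return next(_SCRIPTED_REPLIES)

def agent_loop(task: str, max_steps: int = 10) -> str:
    """Let the model alternate between tool calls and a final answer."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_model(history)
        if "tool" in reply:
            result = TOOLS[reply["tool"]](**reply["args"])
            history.append({"role": "tool", "content": result})
        else:
            return reply["final"]  # e.g. the finished lab protocol
    return "step limit reached"

print(agent_loop("Design eGFP fragments and a cloning protocol."))
```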

A Clear Leap in Capability

The results showed a noticeable difference between older and newer models.

Earlier systems often understood the general idea of the task but missed one subtle requirement. They failed to reverse-complement alternating DNA fragments, which meant the pieces would not assemble correctly in a real lab. It was a small conceptual mistake with big consequences.
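This is exactly the kind of error an automated design check can surface. The snippet below, which reuses the alternating-strand convention and a toy sequence like the one in the earlier sketch, shows how projecting every fragment back onto the same strand exposes a design whose odd-numbered pieces were never flipped. It is an illustration with an assumed overlap length, not the study's actual validation code.

```python
# Simplified design check; the overlap length and alternating-strand
# convention are assumptions carried over from the earlier sketch.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]

def overlaps_match(fragments: list[str], overlap: int) -> bool:
    """Project each fragment back onto the top strand (odd-indexed pieces
    should arrive reverse-complemented) and confirm that neighbours share
    the assembly overlap. A design that skipped the reverse-complement
    step fails here even though each sequence looks plausible on its own."""
    top = [f if i % 2 == 0 else reverse_complement(f)
           for i, f in enumerate(fragments)]
    return all(a[-overlap:] == b[:overlap] for a, b in zip(top, top[1:]))

target = "GAATTCATGGTGAGCAAGGGCGAGCTCGAG"              # toy sequence
correct = [target[:20], reverse_complement(target[12:30])]  # second piece flipped
sloppy  = [target[:20], target[12:30]]                      # flip forgotten
print(overlaps_match(correct, 8), overlaps_match(sloppy, 8))  # True False
```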

The newest generation of models, including GPT-5, Claude Opus 4.5, and Gemini 3 Pro, consistently overcame this problem. They produced DNA designs that satisfied all technical checks and wrote detailed laboratory protocols that included key steps such as digestion, ligation, and bacterial transformation.

To see whether the digital work would hold up physically, RAND conducted a proof-of-concept experiment. DNA fragments designed by an o3-driven agent were synthesized on a real benchtop machine. Following the AI-generated protocol, researchers assembled and expressed the gene in bacteria. One bacterial sample visibly glowed green, confirming that the AI-designed sequence worked in practice.

What This Means for the Future

The study does not suggest that AI can independently create dangerous pathogens. DNA acquisition is only one part of a much longer and more complex process. The influenza test involved only a single gene segment, not a full virus. The models also showed weaknesses. They sometimes made coding errors, misunderstood tools, or refused to complete certain tasks.

Still, the findings highlight a significant shift. The latest AI agents can design biologically coherent DNA constructs under realistic lab constraints. In at least one case, that digital design led to a functional biological product.

For RAND's researchers, the message is not panic but preparation. As AI systems gain the ability to interact with laboratory tools and produce real-world biological outputs, it becomes essential to understand their limits and capabilities. The digital-to-physical divide is narrowing. Managing that transition responsibly will be a key challenge in the years ahead.

FIRST PUBLISHED IN: Devdiscourse