When AI doesn’t work, why do governments still deploy it?


CO-EDP, VisionRI | Updated: 19-02-2026 12:28 IST | Created: 19-02-2026 12:28 IST

In an era where digital transformation is both inevitable and urgent, public institutions are racing to modernize decision-making systems and embrace emerging technologies. Amid this shift, generative artificial intelligence (genAI) has emerged as a promising tool for faster processing, improved efficiency, and streamlined administrative workflows across government agencies.

However, new research suggests that the survival of AI projects in government may depend less on technical performance than on how the projects are justified inside organizations. In an ethnographic study published in Big Data & Society, researchers analyze how a Finnish public agency continued developing a generative AI decision-support tool despite repeated technical shortcomings and inconsistent results.

The findings of the study, titled AI Innovation at the Boundaries: Justifying a Generative AI Decision Support Tool, reveal how organizational narratives, boundary management and carefully constructed justification frames sustained the project even when performance failed to meet expectations.

When generative AI meets bureaucratic complexity

The AI system under development was intended to address a practical and pressing problem. Claims specialists within the organization faced a growing volume of fragmented, frequently revised guidance documents governing administrative decision-making. Locating the correct and up-to-date information required significant time and cognitive effort, creating inefficiencies and increasing the risk of inconsistency.

To streamline this process, the innovation team developed a chatbot-style decision-support tool built on a large language model. The system integrated a database of internal documents, a search engine and a response generator capable of producing synthesized answers linked to relevant guidelines. In theory, the tool would allow staff to query the system in natural language and receive concise, document-based responses.
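The article does not disclose the agency's implementation, but the description (a document database, a search engine and a response generator that links answers to guidelines) matches a standard retrieval-augmented generation pipeline. Below is a minimal Python sketch of that pattern; the toy keyword retriever, the stubbed model and all names are illustrative assumptions, not details from the study.

```python
# Minimal sketch of a retrieval-augmented decision-support pipeline of the
# kind the article describes: index internal guidance documents, retrieve
# the most relevant ones for a natural-language query, and ask a language
# model to synthesize an answer that cites them. Everything here is a toy:
# a real system would use embeddings or a search engine, not word overlap.

from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str   # internal guideline identifier
    text: str     # guideline content


def score(query: str, doc: Document) -> int:
    """Toy relevance score: how many query words appear in the document."""
    return len(set(query.lower().split()) & set(doc.text.lower().split()))


def retrieve(query: str, corpus: list[Document], k: int = 3) -> list[Document]:
    """Return the k highest-scoring documents for the query."""
    return sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]


def build_prompt(query: str, docs: list[Document]) -> str:
    """Ask the model to answer only from the retrieved guidelines,
    citing them by id so staff can check the source."""
    context = "\n\n".join(f"[{d.doc_id}] {d.text}" for d in docs)
    return ("Answer using ONLY the guidelines below and cite their ids.\n\n"
            f"{context}\n\nQuestion: {query}\nAnswer:")


def answer(query: str, corpus: list[Document], llm) -> str:
    """Full pipeline: retrieve, prompt, generate. `llm` is any callable
    mapping a prompt string to a completion string (stubbed below)."""
    return llm(build_prompt(query, retrieve(query, corpus)))


if __name__ == "__main__":
    corpus = [
        Document("G-12", "Housing allowance claims require proof of rent."),
        Document("G-47", "Revised income limits apply to housing allowance."),
    ]
    fake_llm = lambda prompt: "Proof of rent is required [G-12]."  # stub model
    print(answer("What is needed for a housing allowance claim?", corpus, fake_llm))
```

The design point the article emphasizes, answers linked back to the relevant guidelines, appears here as the citation requirement baked into the prompt.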

In practice, the tool repeatedly struggled with accuracy, precision and consistency. Testing phases revealed unreliable outputs, incomplete information retrieval and occasional hallucination-like responses. Despite multiple iterations and five major experimental cycles, the system failed to reach a level of dependable performance that would typically justify full deployment.
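The paper's test protocol is not reproduced in the article, but the failure modes it names (inconsistency, incomplete retrieval, hallucination-like responses) can be probed with very simple harnesses. A hypothetical sketch, assuming the tool is exposed as a callable and that answers cite guideline ids in square brackets as in the sketch above:

```python
# Two crude reliability probes for the failure modes the testing phases
# surfaced: does the tool answer the same question the same way, and do
# its citations point at guidelines that actually exist? Names, formats
# and thresholds are illustrative assumptions, not the study's protocol.

import itertools
import re
from collections import Counter
from typing import Callable


def consistency(ask: Callable[[], str], runs: int = 5) -> float:
    """Fraction of runs returning the modal answer; 1.0 means the tool
    always answers an identical question identically."""
    answers = [ask() for _ in range(runs)]
    return Counter(answers).most_common(1)[0][1] / runs


def grounded_citations(response: str, known_ids: set[str]) -> bool:
    """True if every [id] cited in the response names a real guideline,
    a rough proxy for 'no hallucinated sources'."""
    return set(re.findall(r"\[([^\]]+)\]", response)) <= known_ids


if __name__ == "__main__":
    # Stub tool that alternates between two answers, as an unstable
    # probabilistic system might.
    replies = itertools.cycle(["Proof of rent is required [G-12].",
                               "See guideline [G-99]."])
    print(consistency(lambda: next(replies), runs=4))           # 0.5
    print(grounded_citations("See [G-99].", {"G-12", "G-47"}))  # False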

Yet the project did not collapse. Instead, it moved forward.

The study shifts attention from technical evaluation to organizational dynamics. Rather than asking whether the AI tool worked well enough, the authors examine how it continued to be framed as valuable and worthy of further development.

The power of justification frames

The authors identify nine distinct justification frames that sustained the innovation. These frames operated in three broad categories: tool-oriented, process-oriented and ideology-oriented.

Tool-oriented frames focused on the promised benefits of the AI system. Industrial justifications emphasized efficiency gains, faster processing times and increased consistency in decision-making. Market-oriented arguments highlighted potential cost savings and long-term resource optimization. Vitalist justifications stressed reduced cognitive burden and improved employee well-being. Civic justifications framed the tool as promoting fairness, equal treatment and better public service. Fame-oriented narratives portrayed the system as innovative and widely desired within the organization.

Together, these frames created a powerful image of the tool as not only technically promising but socially necessary. Even when performance fell short, the narrative of eventual success remained intact.

Shortcomings were often reframed as learning opportunities or as temporary obstacles typical of emerging technologies. Claims of high performance circulated internally, sometimes without robust verification, reinforcing confidence in progress. In some cases, limitations were attributed to user misunderstanding or unrealistic expectations rather than to structural weaknesses in the model.

Process-oriented frames further reinforced legitimacy. The innovation team highlighted agile development methods, design thinking practices and multidisciplinary collaboration. Iterative experimentation was framed as evidence of methodological rigor rather than as a sign of instability. Failures were normalized within a culture that celebrated rapid prototyping and continuous improvement.

Instead of treating unsuccessful tests as reasons to halt development, experiments became mechanisms for justifying further iterations. The narrative of agility and flexibility reframed setbacks as indicators of progress along an innovation journey.

Ideology-oriented frames anchored the project within broader societal narratives. Generative AI was presented as an inevitable technological transformation that public institutions must embrace to remain relevant. The project was positioned as part of a wider modernization effort aimed at improving citizens' experiences and strengthening welfare services.

These frames collectively formed what the authors describe as a justificatory package. Rather than operating in isolation, the nine frames reinforced one another, shielding the AI tool from cancellation even when technical reliability remained questionable.

Boundary work and the politics of innovation

The study also introduces the concept of boundary work to explain how the innovation team navigated organizational dynamics. The AI project unfolded at the intersection of multiple boundaries: between innovators and senior management, between innovation units and frontline staff, and between public administration and private-sector consultants.

Collaborative boundary work involved building alliances with management and external design consultants. Competitive boundary work distinguished the innovation team from frontline workers, who were sometimes characterized as resistant or lacking digital literacy. Configurational boundary work redefined roles and responsibilities to align stakeholders around the innovation agenda.

When technical failures emerged, responsibility was often redistributed across these boundaries. Instead of questioning the viability of the generative AI tool itself, attention shifted toward user training, workflow adjustments or institutional culture. In this way, adaptation pressures moved away from the technology and toward organizational actors.

Generative AI's inherent opacity further complicated accountability. Large language models operate as black boxes, producing outputs through probabilistic inference rather than transparent rule-based logic. That opacity allows outputs to be reinterpreted and reframed; ambiguity itself becomes a resource for sustaining the project.

Over time, the AI tool evolved into what the authors characterize as an obligatory passage point within the organization. Alternative technological solutions received diminishing attention as the generative AI project gathered institutional momentum. The idea that modernization required AI adoption became embedded in managerial discourse.

The Nordic welfare state context amplified certain justification frames. Appeals to fairness, equal treatment and improved citizen services resonated strongly within a public sector ethos. Efficiency arguments combined with well-being narratives to create a morally compelling case for continued investment.

By the time the project reached piloting stages, it had accumulated sufficient organizational support to move forward despite unresolved technical issues.

Implications for AI governance in the public sector

The study challenges conventional innovation metrics that focus narrowly on technical performance. In public administration, generative AI adoption may hinge as much on narrative construction and organizational politics as on accuracy benchmarks.

This has significant consequences for AI governance. If projects can survive despite persistent shortcomings, oversight mechanisms must extend beyond performance evaluation to include scrutiny of justificatory narratives and boundary practices.

The research also highlights the difficulty of critically assessing generative AI systems. Their flexibility and opacity create interpretive space in which failure can be reframed rather than confronted. In highly complex bureaucratic settings, where guidelines and policies are themselves dynamic, defining acceptable performance thresholds becomes even more challenging.

The authors do not argue against generative AI adoption; instead, they illuminate the social processes that sustain it. By tracing how justification frames accumulate and reinforce one another, the study offers a roadmap for understanding how AI innovations gain durability within institutions.

FIRST PUBLISHED IN: Devdiscourse