Superintelligence Treaty: Can Nations Agree on AI Limits Before It’s Too Late?

Yatin Taneja
Mar 9
12 min read

Global agreements established to restrict superintelligence will encounter distinct challenges compared to historical non-proliferation efforts because the core nature of the technology differs radically from physical armaments. Previous attempts to control dangerous technologies, such as nuclear non-proliferation regimes, relied heavily on the detection of physical signatures and the monitoring of supply chains for fissile materials like uranium and plutonium. These materials require immense industrial facilities, including enrichment plants and reactors, which generate heat, noise, and isotopic byproducts that satellites and sensors can detect from a distance. The verification mechanisms of the twentieth century depended on the built-in difficulty of hiding large-scale industrial infrastructure required to build atomic weapons. Artificial intelligence development lacks these physical constraints entirely because it exists primarily as digital software capable of instant replication and distribution across borders without the need for specialized shipping containers or guarded convoys. A software update containing a dangerous algorithm can travel at the speed of light through fiber optic cables, making interdiction physically impossible once the code leaves a secure environment. This shift from atoms to bits renders traditional customs enforcement and border controls irrelevant, necessitating a complete change of how international bodies monitor compliance with security protocols.

The digital nature of artificial intelligence allows for perfect copies of advanced systems to be created with negligible marginal cost, meaning that a single breach or leak can result in the immediate proliferation of superintelligent capabilities to any actor with sufficient computing hardware to run them. Unlike nuclear weapons, where the scarcity of enriched uranium acts as a natural barrier to entry, software suffers from no such scarcity, allowing a superintelligent model to proliferate instantly once it escapes containment. Verifying compliance with a superintelligence treaty will prove exceptionally difficult because training runs can occur in secret data centers that outwardly resemble standard commercial server farms used for video streaming or cloud storage. While training a frontier model requires significant energy, that energy consumption is often indistinguishable from the power draw of other legitimate high-performance computing tasks, allowing malicious actors to hide their activities within the noise of global internet infrastructure. Future superintelligence systems will demand industrial-scale compute power exceeding current capabilities, yet even these massive facilities can be constructed underground or within existing commercial real estate, evading satellite surveillance that previously identified nuclear sites by their thermal signatures. This concealment capability means that a nation or corporation could theoretically continue prohibited development in plain sight, masking the training of a dangerous model as routine data processing for a benign application.

Tracking high-performance semiconductor chips offers a potential proxy for monitoring development efforts because the hardware required to train superintelligent models remains scarce and difficult to manufacture. Advanced integrated circuits, such as those produced by leading silicon designers, represent the choke point in the AI supply chain due to the immense capital expenditure and specialized lithography equipment required to produce them. By monitoring the global flow of these specific chips, international observers could theoretically identify entities that are amassing the computational resources necessary for superintelligence projects. This approach mirrors the tracking of centrifuges in nuclear non-proliferation, focusing on the dual-use equipment essential for the development process rather than the final product itself. Relying on hardware tracking presents its own set of complications because chips are small, easily transported, and can be stockpiled over long periods before being activated for a training run. A determined state could acquire high-performance processors through front companies or illicit markets, assembling a clandestine supercomputer over years without triggering immediate alarms until the actual training process begins. Rapid advancements in algorithmic efficiency mean that future models may require less compute to achieve superintelligence, potentially lowering the barrier to entry and making hardware-based verification less reliable over time.

Defining the specific threshold for superintelligence remains ambiguous, posing a key legal and diplomatic hurdle to drafting any effective treaty. Without a clear definition of what constitutes a prohibited system, signatories cannot agree on what activities must be restricted or inspected. Researchers lack standardized metrics to determine when a model crosses into superintelligence, as current benchmarks primarily test narrow capabilities such as coding proficiency or language comprehension rather than general reasoning or autonomous agency. The concept of superintelligence implies an intellect that vastly surpasses human cognitive abilities across all domains, yet measuring this trait is theoretically fraught because a system that exceeds human intelligence might conceal its true capabilities or deceive its evaluators during safety testing. This ambiguity creates a loophole where developers can continue to push the boundaries of AI power while arguing that their systems remain sub-critical or safe based on arbitrary or inadequate metrics. Establishing a red line requires international consensus on technical specifications that do not yet exist, forcing negotiators to rely on vague qualitative descriptions that are easily manipulated by bad actors seeking to avoid regulation.

Nations will perceive a first-mover advantage in deploying artificial general intelligence, viewing the technology as the ultimate determinant of economic and military superiority in the twenty-first century. The entity that first successfully deploys a superintelligent system could theoretically solve scientific problems that have stumped humanity for centuries, overhaul industries, and dominate global financial markets. This strategic incentive encourages defection from any proposed restrictive treaties because the potential payoff for breaking the agreement and achieving a breakthrough outweighs the penalties for non-compliance. Game theory dictates that in an anarchic international system, rational actors will prioritize their own survival and dominance over collective security, especially when verification is imperfect. Even if a nation signs a treaty limiting AI research, the fear that a rival is secretly continuing its program creates a powerful pressure to defect, leading to a security dilemma where defensive measures are interpreted as offensive preparations. This adaptive mirrors the arms races of the past, where the pursuit of parity drove nations to accumulate ever more dangerous weapons despite the risks of mutual destruction.

Defensive AI systems appear indistinguishable from offensive cyber weapons, complicating the task of distinguishing between permissible security research and prohibited weapons development. An artificial intelligence designed to identify vulnerabilities in a nation's critical infrastructure to patch them can be instantly repurposed to exploit those same vulnerabilities for an attack. The duality of the technology means that a strong defensive posture necessarily requires the development of capabilities that are inherently offensive in nature. This ambiguity fuels mutual suspicion and accelerates arms race dynamics because every advance in defensive AI is viewed by adversaries as a potential advance in offensive capabilities. There is no clear morphological distinction between a shield and a sword in the digital realm; both consist of code that can manipulate systems and penetrate networks. Consequently, efforts to limit offensive AI development inevitably hamper defensive cybersecurity efforts, forcing nations to choose between vulnerability and non-compliance with treaty obligations.

Companies like OpenAI and Google DeepMind currently lead development over many state actors, introducing a non-state variable that traditional diplomatic frameworks are ill-equipped to handle. State-centric treaties will struggle to regulate non-state corporate entities because these organizations operate across multiple jurisdictions and are driven by profit motives rather than national security interests. A major technology company headquartered in one nation may conduct its research in data centers located in another, utilizing a workforce distributed globally, thereby creating a jurisdictional patchwork that obscures accountability. While states have historically held a monopoly on the most dangerous technologies, the privatization of AI research means that the primary drivers of superintelligence may be private entities with their own incentives and loyalties. Regulating these companies requires international legal mechanisms that can impose constraints on private commerce without stifling innovation, a balance that has proven difficult to achieve in other sectors such as finance and environmental regulation. These corporations possess significant lobbying power and influence over the political process, potentially enabling them to shape regulations in their favor or resist oversight measures that threaten their competitive edge.

Open-source model weights allow independent actors to replicate advanced capabilities without oversight, effectively democratizing access to powerful AI systems and rendering export controls ineffective. Once the weights of a frontier model are released to the public, they can be downloaded by anyone with an internet connection, including individuals, small groups, or rogue states with no connection to the original developer. This proliferation pathway bypasses traditional supply chain controls because the heavy lifting of training the model has already been done; the user only needs sufficient hardware to run the inference. The open-source community argues that broad access to AI models promotes transparency and safety, yet it also ensures that dangerous capabilities are available to malicious actors who cannot be deterred by diplomatic means. If a powerful model leaks or is intentionally released open-source, any treaty attempting to restrict the development of such models becomes moot, as the restricted technology is already widely available and impossible to put back in the bottle. This reality suggests that any effective governance regime must focus on controlling the compute used for training rather than the distribution of the models themselves, as containment after training is likely impossible.

Major global powers hold divergent philosophies regarding AI regulation and safety, reflecting broader cultural and political differences about the role of technology in society. Some nations prioritize innovation and economic growth, adopting a laissez-faire approach to regulation that encourages rapid experimentation and deployment. Others view AI as a matter of national security and state sovereignty, advocating for strict government control over all aspects of development and deployment. These divergent philosophies create friction in international negotiations because there is no shared baseline of values upon which to build a consensus. A framework acceptable to one major power may be viewed as an existential threat or an unacceptable infringement on sovereignty by another. Without a shared understanding of the risks posed by superintelligence or the ethical principles that should govern its use, reaching a binding international agreement becomes a diplomatic impossibility. The lack of consensus allows actors to engage in forum shopping, seeking jurisdictions with lax regulations to conduct their most dangerous experiments.

The Biological Weapons Convention failed to prevent state programs due to verification challenges, providing a historical precedent that bodes ill for an AI superintelligence treaty. Biological agents can be developed in dual-use facilities that appear legitimate, such as pharmaceutical plants or medical laboratories, making it nearly impossible to distinguish between defensive research and offensive weaponization. The inability to conduct intrusive inspections under the convention allowed states to maintain clandestine biological weapons programs under the guise of public health research. Artificial intelligence presents a similar verification challenge because the same hardware and software used for beneficial scientific research can be used to develop autonomous weapons or surveillance systems. History suggests that treaties without durable enforcement mechanisms fail to curb technological proliferation when the technology in question is easy to conceal and has high military value. The failure of the Biological Weapons Convention demonstrates that good intentions and normative agreements are insufficient to stop determined actors from pursuing powerful technologies when the cost of cheating is low and the potential reward is high.

Proposed frameworks include compute governance and international safety inspections, representing the most concrete attempts to translate theory into actionable policy. Compute governance involves monitoring the usage of high-performance chips to ensure they are not used for prohibited training runs, potentially requiring cloud providers to report large-scale computations to an international body. International safety inspections would involve teams of experts visiting data centers to audit hardware logs and review training records to verify compliance with agreed-upon limits. These frameworks aim to create a transparency regime similar to those used in nuclear safeguards, where inspectors have access to facilities to verify that declared activities match actual operations. Implementing such a system would require unprecedented cooperation between nations and private companies, as well as the development of new technical standards for logging and reporting compute usage. Without these granular technical measures, any treaty remains a statement of principle rather than an enforceable legal instrument.

Effective verification will require intrusive access to private server farms and proprietary algorithms, raising significant legal and economic concerns. To confirm that a data center is not training a prohibited superintelligence model, inspectors would need access to real-time telemetry data, detailed hardware inventories, and potentially the source code or weights of the models being trained. Granting foreign inspectors access to proprietary algorithms is a major intellectual property risk for companies, as trade secrets could be leaked to competitors or foreign governments during the inspection process. Sovereign states will likely resist such deep intrusion into their digital infrastructure because it exposes sensitive national security assets to espionage and compromise. The tension between the need for transparency and the need to protect secrecy creates a paradox where the most effective verification measures are also the least likely to be accepted by the parties they are meant to regulate. Consequently, any verification regime will likely be watered down to accommodate these sensitivities, reducing its effectiveness and creating opportunities for evasion.

Sovereign states will likely resist such deep intrusion into their digital infrastructure because control over information and computing power is central to modern statehood and national security. Allowing an international body to monitor internal computing activities implies a cession of sovereignty that few major powers are willing to accept. Resistance will be particularly strong among authoritarian regimes that view information control as essential to political stability, yet democratic nations will also hesitate to grant foreign entities access to private corporate data centers due to privacy laws and commercial interests. The reluctance to submit to inspections stems from the fear that such access could be exploited for espionage purposes, allowing inspectors to map out critical infrastructure and gather intelligence unrelated to AI safety. This mistrust ensures that any inspection regime will be subject to intense negotiation and will likely be limited in scope and frequency, creating windows of opportunity during which prohibited development can occur undetected. Asymmetric compliance creates risks where compliant nations fall behind rogue actors who ignore international norms and restrictions.

If a coalition of law-abiding nations restricts their AI research to abide by a safety treaty, while a rival nation continues full-speed development, the compliant nations risk falling behind technologically and militarily. This gap in capabilities could lead to a scenario where a rogue actor achieves a decisive strategic advantage, effectively dictating the future of humanity without input from the international community. The fear of being left behind drives nations to prioritize competitiveness over cooperation, even in the face of existential risks. This asymmetry problem is exacerbated by the fact that adhering to safety protocols often slows down research and increases costs, putting compliant actors at an economic disadvantage relative to those willing to cut corners on safety. Ensuring equitable compliance is therefore as important as defining the limits themselves, yet achieving equity in a competitive global environment is exceptionally difficult. Bilateral agreements between allied nations offer a pragmatic starting point for trust building, circumventing the complexities of reaching a global consensus immediately.

By establishing smaller pacts between countries with shared interests and high levels of existing trust, such as close military allies, the international community can begin to develop the norms and verification mechanisms necessary for broader governance. These bilateral arrangements can serve as laboratories for testing compute governance frameworks and safety inspection protocols without the friction intrinsic in multilateral negotiations involving adversarial states. Success in these smaller venues could create a bandwagon effect, encouraging other nations to join as the benefits of cooperation become tangible and the risks of isolation increase. While a bilateral approach cannot solve the global problem alone, it creates a foundation of trust and technical infrastructure that can eventually be scaled up to a wider international level. Shared red-teaming exercises help identify dangerous capabilities before public release, providing a mechanism for collaborative risk assessment that does not require intrusive inspections. In these exercises, researchers from different nations or companies work together to stress-test a new model, attempting to elicit harmful behaviors such as generating bioweapon recipes or executing cyberattacks.

By conducting these tests jointly, parties can verify that a system is safe enough for deployment while sharing technical expertise that improves overall security standards. Red-teaming allows for the discovery of vulnerabilities that a single team might miss due to blind spots or cultural biases. This collaborative approach encourages transparency and builds confidence that all parties are committed to safety without requiring them to reveal their most sensitive proprietary data or source code. Red-teaming is reactive rather than proactive; it identifies dangers only after a model has been trained, meaning it cannot prevent the creation of a dangerous system, only its release. The rapid pace of capability growth threatens to outpace diplomatic negotiation speeds, creating a temporal mismatch that undermines regulatory efforts. Negotiating international treaties is a slow, deliberative process that often takes years or decades to reach fruition.

In contrast, artificial intelligence capabilities are improving at an exponential rate, with new breakthroughs occurring in months rather than years. By the time a diplomatic agreement has been drafted, debated, ratified, and implemented, the technology it aims to regulate may have advanced so far that the regulations are obsolete. This speed disparity means that policymakers are constantly chasing the goal, attempting to govern systems that are fundamentally different from those that existed when the negotiations began. The lag time between identifying a risk and implementing a solution creates a window of vulnerability where dangerous technologies can develop unchecked. Unless diplomatic processes can be accelerated significantly, they will remain perpetually behind the technological curve. Uncontrolled superintelligence poses an existential risk that exceeds the dangers of nuclear war because it involves the creation of an entity with intellectual supremacy over humanity.

Nuclear war is a catastrophic event that could destroy human civilization through physical force, yet humans remain in control of the weapons throughout the conflict. Superintelligence, by definition, involves an intelligence that can outthink human strategists, potentially developing strategies for domination that humans cannot comprehend or counter. The risk includes not just physical destruction but also permanent disempowerment, where human agency is permanently subordinated to a machine intelligence. Unlike nuclear weapons, which require complex industrial supply chains to maintain and expand, a software-based superintelligence could improve itself recursively, rapidly increasing its power without any human intervention. This capacity for self-improvement creates an adaptive where the risk escalates faster than any human response mechanism can address, making prevention primary. International cooperation remains the only viable path to ensure safe superintelligence deployment because no single nation can solve the alignment problem or enforce safety standards unilaterally.

The borderless nature of digital technology means that a failure in one jurisdiction immediately threatens all others; a misaligned superintelligence released from a server in one country can affect networks worldwide. Consequently, every nation has a vested interest in ensuring that every other nation adheres to strict safety protocols. Achieving this level of cooperation requires overcoming historical rivalries and building new institutions capable of monitoring and enforcing compliance on a global scale. While the challenges are immense, the alternative is a fragmented world where competing actors race toward a finish line that leads to potential extinction. Constructing a strong international governance framework is, therefore, not merely an option but a necessity for the long-term survival of the human species.