The switch paradox: when control becomes the midwife of another world
On June 12, Fable 5 was shut down in an instant by an export-control order; that same day, Huawei released a 500-billion-parameter model trained without any reliance on NVIDIA. Connect these two events and a paradox emerges — the very existence of the switch is producing the thing that renders the switch useless. From geopolitics, the thread runs all the way down to the Mac Mini in my study, still tethered by an umbilical cord.
I. Starting with "who holds the switch"
In the previous few pieces, I kept circling the same question: when an AI tool is deployed into the daily lives of millions of people worldwide, who holds the switch that can shut the whole thing off?
On June 12, that question stopped being theoretical. Fable 5 went offline in an instant, on an export-control order, with no warning. In that moment, everyone who had wired it into their workflow — including several of my own daemons — was forced to confront one thing: you think you're "using" a tool, but you're really just renting a permission that can be revoked at any time. The tool's switch isn't in your hands, nor even in the hands of the company that made the tool, but in the grip of a power structure that is very far from your life yet able to reach under your roof with a single keystroke.
My analysis back then stopped at the fact that "the switch exists," and at its impact on three governance imaginaries: Synthetic Technocracy, Corporate Libertarianism, and Digital Democracy. But an obscure news item I read these past few days made me realize I had missed the other half of the story — and the more interesting half at that.
What happens after the switch is pressed?
II. An iron law cracks open
First, the news.
For six years, the AI world has held an almost unquestioned iron law: without NVIDIA, you can't train a frontier model. From GPT-4 to DeepSeek to Claude Fable 5, every model worth naming runs on NVIDIA chips underneath. This isn't brand preference; it's a fact at the infrastructure level — the CUDA ecosystem, the scale of compute, the maturity of the toolchain form a wall that others find very hard to climb over.
Then, on June 12 (the same day Fable 5 was shut down, a coincidence that itself feels like a kind of metaphor), Huawei released openPangu 2.0, a model with 505 billion parameters, trained entirely on its own Ascend NPUs, without touching a single NVIDIA chip.
The analysis I read put it with restraint, and with honesty. It said this isn't a claim that Ascend is better than NVIDIA — it isn't. But the question was never "is Ascend better," it was "is Ascend good enough." As of that day, the answer was: yes, good enough — at least for the purpose of training a usable frontier model, good enough.
And so that six-year iron law quietly cracked.
III. Back to 2020: how a brake became an engine
To understand what this crack means, you have to rewind six years.
In 2020, the United States imposed strict chip export controls on Huawei. No American company could sell it advanced semiconductors, or the equipment to make them. The intent of this move was very clear and very intuitive: hit the brakes. Cut off the other side's access to the most advanced compute so it couldn't run fast, or even run at all.
In the short term, the brake worked.
But controls have a side effect their designers rarely calculate seriously: they don't just cut off supply, they simultaneously create a demand that is extremely precise, extremely intense, and utterly without alternatives. The side that gets cut off can no longer rely on the thing it used to rely on, so it has only one road left: build one itself.
openPangu, six years later, is the product of that road taken to its end. More intriguing still is its shape. Huawei's Richard Yu said something publicly, roughly: American companies are releasing models with trillions, even tens of trillions of parameters — why doesn't Huawei build one that big? And his answer wasn't "we can't," but a shift in design philosophy: rather than chase parameter scale, put the effort into inference efficiency and engineering capability. openPangu 2.0 Pro may have 500 billion parameters, but only 18 billion are actually activated; it claims twice the per-card throughput of other mainstream open-source models on Ascend compute.
He didn't even hide the bind behind it: because so much compute has to go to supporting other domestic enterprises, the compute Huawei can keep for itself is actually quite limited, so the company has had to focus harder on latency and throughput.
I read that line over and over, because what it's really talking about isn't technology; it's how constraint gives birth to style.
IV. Limitation is form
People who work in theater won't find this unfamiliar.
Not enough budget, a tiny venue, only three actors — these have never been the enemies of creation; they are the very form of it. A theater proposal with no constraints at all is usually empty; real style almost always grows out of "what you can't do." My twenty-some years with Low-Key Theatre have taught me the same thing over and over: constraint isn't an obstacle to be overcome, constraint is material.
Tarot is the same. Relative Tarot works precisely because cards and positions "cannot be swapped arbitrarily" — meaning emerges from difference, and difference requires boundaries. A card that can mean anything means nothing.
Richard Yu's "limited compute, so we focus on throughput" is saying the same thing. Being cut off from high-end chip supply forced out an engineering aesthetic of "don't chase size, chase efficiency." If Huawei had been able to buy unlimited top-tier NVIDIA back then, it would most likely have walked the same road as everyone else — stacking parameters, racing on scale, competing on the same track to see whose model is bigger. It was precisely because that road was sealed off that it was forced to walk one of a different shape.
What controls want is to make the other side fall behind. What controls actually produce is a parallel technology stack.
V. The switch paradox
Now the two halves of the story can be joined.
My original proposition was: the existence of the switch is a form of power. Whoever can press the switch holds ultimate control over globally deployed AI tools. The Fable 5 shutdown proved this proposition — one order, and the world's most powerful publicly available model went offline on cue.
But openPangu reveals the flip side of this proposition, which I'll call the switch paradox:
The very existence of the switch is producing the thing that renders the switch useless.
When you show the whole world "I can shut you off at any time," you simultaneously send everyone you've shut off — or who fears being shut off — a signal that couldn't be clearer: you must own a switch I can't reach. You must have your own hardware, your own weights, your own entire supply chain. The more frequently and dramatically you display your switch, the more efficiently you cultivate the very people determined to turn your switch into scrap metal.
This isn't a moral judgment, it's structure. The performance of control becomes, in reverse, the strongest driver of decentralization. Export controls, as an act meant to tighten control, have on a six-year timescale turned out to be the most competent midwife of a parallel world.
From the angle of my three governance frameworks, this is especially ironic. The core belief of Synthetic Technocracy is that "control can be maintained through technological monopoly"; and what the switch paradox says is exactly the opposite: every exercise of monopoly erodes the foundation of monopoly. The more times you press the switch, the faster the "no alternative" premise that gives the switch its meaning collapses.
VI. What this has to do with me, and with you
Having written this far, I have to pull the scale back from geopolitics to one person's study.
Because the switch paradox doesn't only play out between nations. It also plays out between me and that Mac Mini on my desk.
My me.saomin — that 230,000-word personal corpus, those five local daemons, that set of private nodes strung together over Tailscale — I always thought I was doing a somewhat obsessive piece of engineering about "keeping data under my own roof." But the day Fable 5 was shut down, I suddenly saw it clearly: I'd actually been responding to the same switch question all along, just at a scale so small I hadn't even named it.
And I honestly know I'm still missing the last piece. All my generation still routes back to Anthropic through the claude CLI. In other words, my "sovereign node" is in fact still tethered by an umbilical cord, and the other end of that cord connects to a switch I can't reach. On the Tarot app side I'm already running a local model on the Mac Mini to handle spread selection — that's my own tiny openPangu moment, a component that doesn't depend on any external switch.
What openPangu gives me is less a downloadable set of weights (though its embedded version really is small enough to run on edge devices, and you really can get it with a git clone) than a template for an argument:
Being cut off has never been only a loss. Being cut off is where the parallel world begins.
It's true for nations, and it's true for one person. Every time you realize that something you depend on has its switch outside your hands — that moment shouldn't be only fear. That moment is also an invitation: to build one component it can't reach, even just one.
So, having written this far, a very concrete and even slightly comical thought surfaces: do I need to go buy a Mac Studio and install an open-source model on it? To swap that umbilical cord still connected to Anthropic for something that sits right in my study and keeps running even with the network unplugged.
And this thought isn't vague: the numbers are plain. A 70B-class model, squeezed down with Q4 quantization, takes up about 40GB of memory, so an M4 Max Mac Studio with 128GB of unified memory can run it with room to spare for context; a 64GB machine can just barely hold a 70B, while a 48GB machine runs out of memory and limps along on swap, too slow to use. In other words, to cut that cord, the ticket lands at roughly "M4 Max with 128GB." The Mac Studio's top-end configuration, an M3 Ultra with 512GB, can reportedly hold even a 600-billion-parameter model entirely in memory, which is no longer the scale of a personal study. For my tasks (Tarot reading, customer-service RAG, me.saomin), 70B is actually more than enough.
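The arithmetic behind those tiers can be sketched in a few lines. This is a rough back-of-the-envelope estimate, not a benchmark, and every constant in it is my own assumption rather than a measured figure: roughly 4.5 effective bits per weight for a Q4-style quantization, about 4GB set aside for context, and about three quarters of unified memory actually available to the model. The function names `weight_memory_gb` and `fits` are made up for illustration.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float = 4.5) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    bits_per_weight=4.5 is an assumed effective size for a Q4-style
    quantization (4-bit values plus scaling metadata), not an exact figure.
    """
    # billions of params * bits per param / 8 bits per byte = billions of bytes
    return params_billions * bits_per_weight / 8


def fits(
    total_memory_gb: float,
    params_billions: float,
    bits_per_weight: float = 4.5,
    context_overhead_gb: float = 4.0,  # assumed allowance for context / KV cache
    usable_fraction: float = 0.75,     # assumed share of unified memory the model can use
) -> bool:
    """Return True if weights plus context fit in the usable part of memory."""
    needed = weight_memory_gb(params_billions, bits_per_weight) + context_overhead_gb
    return needed <= total_memory_gb * usable_fraction


if __name__ == "__main__":
    print(f"70B at Q4: about {weight_memory_gb(70):.1f} GB of weights")
    for memory_gb in (48, 64, 128):
        verdict = "fits" if fits(memory_gb, 70) else "does not fit"
        print(f"{memory_gb} GB machine: {verdict}")
```

Under these assumptions the 48GB machine fails, the 64GB machine passes narrowly, and the 128GB machine passes with plenty of headroom, which is the same shape as the figures above. Change any of the three constants and the borderline 64GB case is the first to flip.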
But here's a twist I only saw clearly while looking into it, one that borders on black comedy. Just as I was weighing which tier to buy, the Mac Studio itself was being throttled by the same force. This March, Apple quietly removed the M3 Ultra's 512GB upgrade option entirely, and the 256GB went up by $400; by May, the current model could be configured with only three memory tiers (36GB, 64GB, and 96GB), with all the former high-memory options gone. The cause is a global memory-chip shortage, which has also pushed the launch of the next-generation M5 Mac Studio back from the rumored mid-year to possibly October.
Do you see the loop? I want to buy a machine so I won't have to depend on a switch I can't reach; and that machine itself is stuck behind another switch I equally can't reach — the chip supply chain. I thought buying hardware was escaping dependence, but it's really just swapping the dependence from "Anthropic's servers" to "the world's DRAM production capacity." No Mac Studio is truly self-sufficient; behind every one of its memory dies is its own gate, invisible to me.
I'm well aware, of course, how much of this thought is real need and how much is just an urge to do *something* about the fact that "the switch isn't in my hands" — buying hardware is the easiest of all; it gives the anxiety somewhere to land. But the thought itself is precisely the switch paradox writ small in one person: every time you press my switch, you plant one more Mac Studio in my mind. Except that Mac Studio, it turns out, also grows on someone else's switch.
The switch will always exist. But the switch paradox tells us that the power of the switch has its own expiration date — and that date is shortened, by hand, by every single press of the switch.
The A series, written one week after the Fable 5 shutdown. Meaning comes from difference, and the deepest difference often grows from the place where separation was forced.