Apple’s Private Cloud Compute Goes Multi-Cloud With Google and NVIDIA

Apple is extending Private Cloud Compute to Google Cloud with NVIDIA GPUs, making confidential inference, attestation, and public verification central to its AI privacy claim.
NVIDIA says its confidential computing GPUs will support Apple Private Cloud Compute workloads running on Google Cloud.

Apple is taking Private Cloud Compute outside its own data centers for the first time, using Google Cloud and NVIDIA GPUs to run the largest server model in its new Apple Foundation Models family. The move, detailed by Apple’s security and machine-learning teams after WWDC 2026, is not a simple cloud-capacity deal. Apple is trying to keep its core PCC privacy promises while adding third-party infrastructure powerful enough for more demanding Apple Intelligence requests.

The model driving the change is AFM 3 Cloud Pro, Apple’s most capable server-side model for complex reasoning and agentic tool use. Apple says its third-generation foundation models were custom-built with Google, drawing on technologies behind Gemini, but the company is presenting AFM 3 as Apple’s own model family. NVIDIA’s role is narrower and more infrastructure-specific: Cloud Pro is optimized for NVIDIA GPUs, unlike Apple’s other third-generation models, which are purpose-built for Apple silicon.

That split makes this one of Apple’s more consequential AI architecture decisions. Apple has long framed Apple Intelligence around on-device processing and tightly controlled cloud inference. AFM 3 Cloud Pro shows the limit of that strategy: the heaviest assistant tasks need GPU infrastructure Apple does not fully own. The privacy question now is whether Apple’s verification model can survive that shift.

The Model Family Explains The Cloud Deal

Apple describes five third-generation foundation models. Two run on device. AFM 3 Core is a 3-billion-parameter dense model. AFM 3 Core Advanced is a 20-billion-parameter sparse model for Apple’s most capable hardware; it activates only 1 billion to 4 billion parameters per request (5 to 20 percent of the total), storing the full model in flash and loading selected experts into memory.

The server side has three models: AFM 3 Cloud for general server-side inference, ADM 3 Cloud for image generation and editing, and AFM 3 Cloud Pro for the hardest reasoning and tool-use workloads. AFM 3 Cloud and ADM 3 Cloud still run on Apple silicon. Cloud Pro is the exception, which is why Google Cloud and NVIDIA enter the picture.

Apple’s technical overview adds useful detail beyond the vendor names. The company says the new models were trained from a shared foundation and then specialized for their target architectures. It also says AFM 3 Core Advanced uses prompt-level expert selection rather than token-by-token swapping, an important practical choice for consumer hardware, where NAND-to-DRAM bandwidth is too low to swap experts on every token as conventional mixture-of-experts inference would.
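To see why prompt-level selection matters on bandwidth-limited hardware, consider a rough Python sketch. It is not Apple’s implementation: the router scores, the rule of summing scores across the prompt, and both function names are assumptions made for illustration. The sketch picks experts once per prompt, then counts how many flash-to-memory loads a per-token top-k policy would need for the same input.

```python
from collections import Counter


def select_experts_for_prompt(router_scores, k):
    """Pick k experts once for the whole prompt (hypothetical rule: sum per-token scores)."""
    totals = Counter()
    for token_scores in router_scores:
        for expert_id, score in enumerate(token_scores):
            totals[expert_id] += score
    # Highest total first; ties broken by lower expert id.
    ranked = sorted(totals, key=lambda e: (-totals[e], e))
    return sorted(ranked[:k])


def count_token_level_loads(router_scores, k):
    """Count expert loads if the top-k experts were re-picked for every token,
    with only k experts resident in memory at a time."""
    resident = set()
    loads = 0
    for token_scores in router_scores:
        ranked = sorted(range(len(token_scores)), key=lambda e: (-token_scores[e], e))
        wanted = set(ranked[:k])
        loads += len(wanted - resident)  # experts that must be fetched from flash
        resident = wanted
    return loads


# Illustrative scores: 4 tokens, 4 experts (made-up numbers).
scores = [
    [0.9, 0.1, 0.0, 0.0],
    [0.0, 0.8, 0.2, 0.0],
    [0.7, 0.0, 0.0, 0.3],
    [0.0, 0.1, 0.9, 0.0],
]
print(select_experts_for_prompt(scores, 2))  # [0, 2]: two loads for the whole prompt
print(count_token_level_loads(scores, 2))    # 7 loads when experts change per token
```

In this toy case the prompt-level policy loads two experts once, while per-token switching triggers seven loads across four tokens. The gap would grow with prompt length, which is the trade-off the article describes.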

What Changes In Private Cloud Compute

Private Cloud Compute was introduced in 2024 as Apple’s answer to a hard privacy problem: some AI requests are too large for a phone or laptop, but sending personal context to a conventional cloud service would undermine Apple’s device-first pitch. The original version ran on Apple silicon in Apple-controlled data centers, with software builds that devices could verify before sending sensitive requests.

The Google Cloud deployment keeps Apple’s stated PCC requirements: stateless computation, enforceable guarantees, no privileged runtime access, non-targetability, and verifiable transparency. The hardware stack is different. Apple says PCC on Google Cloud uses NVIDIA Confidential Computing with NVIDIA GPUs, Intel CPUs with TDX, and Google’s Titan chip. NVIDIA has separately framed the work around confidential inference on Blackwell-class GPUs.

The key difference from ordinary rented GPU capacity is control. Apple says it retains control of PCC software, that devices will trust only Apple-approved PCC builds, and that all binaries will be published for public inspection. Google supplies the facility and part of the trusted hardware environment. Apple is trying to keep the request path governed by its own software, attestation, and transparency systems.

The Verification Burden Gets Heavier

Apple’s strongest claims are technical, not rhetorical. For the Google Cloud PCC fleet, it says it will maintain a cryptographically verifiable, append-only ledger of participating hardware. For components that could leak user data if compromised, software attestation is anchored in at least two independent vendor roots of trust. The aim is to keep any single supplier or operator from becoming a decisive point of failure.
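The two ideas in that paragraph can be illustrated with a minimal Python sketch. It assumes a simple hash chain for the ledger and a bare count of distinct vendors for the root-of-trust rule; Apple has not published its ledger format here, real transparency logs often use Merkle trees, and real attestation verifies signatures rather than vendor labels. All names and record fields below are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def entry_hash(prev_hash, record):
    """Hash a record together with the previous entry's hash."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + payload).hexdigest()


def append(ledger, record):
    """Add a record; each entry commits to everything before it."""
    prev = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify_chain(ledger):
    """Recompute every hash; any edit to an earlier entry breaks the chain."""
    prev = GENESIS
    for entry in ledger:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


def has_independent_roots(endorsing_vendors, minimum=2):
    """Simplified policy check: at least `minimum` distinct vendors endorse the node."""
    return len(set(endorsing_vendors)) >= minimum


ledger = []
append(ledger, {"node": "node-a", "vendors": ["vendor-x", "vendor-y"]})
append(ledger, {"node": "node-b", "vendors": ["vendor-x", "vendor-y"]})
print(verify_chain(ledger))  # True

ledger[0]["record"]["node"] = "tampered"
print(verify_chain(ledger))  # False: the stored hash no longer matches
```

The point of the append-only structure is that rewriting history is detectable by anyone who recomputes the hashes, which is what makes a public ledger of hardware useful to outside auditors.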

The inference path is also broken into constrained pieces. Apple says initial network parsing happens in a dedicated process with its own namespace. Shared inference software is recycled after a short time to live. Attested keys are held in a separate confidential virtual machine isolated from external inputs. Those design choices are aimed at privileged-access and supply-chain risks that ordinary confidential VMs do not automatically solve.

The remaining test is external verification. Apple says it will publish PCC binaries, provide research tooling, and give researchers access to live PCC nodes in research mode through the Apple Security Bounty Program. The Google Cloud version will ramp toward the full protection set during the summer preview, with more detail promised in an updated PCC Security Guide later this year.

What This Means For Apple Intelligence

For users, the practical result is that more Apple Intelligence work can run on larger cloud models without Apple conceding that every advanced request must leave its privacy framework. Siri, Visual Intelligence, dictation, image tools, and multi-step app actions can be routed across a tiered model stack: local models when possible, Apple-silicon PCC for many server tasks, and AFM 3 Cloud Pro on Google Cloud and NVIDIA GPUs for the most demanding cases.
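The tiering can be pictured as a simple decision rule. The Python sketch below is purely illustrative: Apple has not described its routing logic in this detail, and the two boolean inputs and the `route` function are hypothetical stand-ins for whatever signals the system actually uses.

```python
from enum import Enum


class Tier(Enum):
    ON_DEVICE = "on-device model"
    PCC_APPLE_SILICON = "PCC on Apple silicon"
    PCC_CLOUD_PRO = "AFM 3 Cloud Pro on Google Cloud with NVIDIA GPUs"


def route(fits_on_device, needs_heavy_reasoning):
    """Prefer local execution, then Apple-silicon PCC, then Cloud Pro for the hardest work."""
    if fits_on_device:
        return Tier.ON_DEVICE
    if needs_heavy_reasoning:
        return Tier.PCC_CLOUD_PRO
    return Tier.PCC_APPLE_SILICON


print(route(fits_on_device=True, needs_heavy_reasoning=False).value)
print(route(fits_on_device=False, needs_heavy_reasoning=True).value)
```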

That also makes Apple’s privacy language more precise. Private does not mean every request stays on the device. It means Apple is trying to make each remote request auditable, ephemeral, and constrained enough that neither Apple nor the infrastructure provider can use it like ordinary cloud data. That is a higher bar than a policy promise, but it is still a bar that has to be tested in public.

The next useful evidence will come from the updated PCC Security Guide, the promised research access, and shipping features that show when Cloud Pro is actually used. Until then, the importance of the announcement is clear: Apple has accepted that its most capable AI features need third-party cloud infrastructure, and it is betting that cryptographic attestation, public binaries, hardware ledgers, and researcher access can keep that infrastructure inside the PCC trust model.
