Browsing Tag

AI Inference

4 posts

AI inference infrastructure, model serving, token generation, inference clouds, latency, throughput, and production AI workloads.

4 min

Etched’s $1B Sohu Backlog Turns AI Inference Into the Next Chip Fight

Etched says it has raised $800 million, signed more than $1 billion in customer contracts, and started production of its Sohu-based inference racks. The startup’s transformer-specialized chip is a serious bet that AI’s next hardware fight will be won on serving models, not just training them.

Akshay

June 30, 2026

4 min

OpenAI’s Jalapeño Chip Puts Inference Costs at the Center of the AI Race

OpenAI and Broadcom unveiled Jalapeño, OpenAI’s first custom inference accelerator for large language models. The chip is less about replacing Nvidia overnight than about controlling the cost, latency, and supply of the compute that runs products like ChatGPT, Codex, and the API.

Akshay

June 24, 2026

4 min

Qualcomm’s Modular Deal Is a $3.9 Billion Bet on AI Software Portability

Qualcomm agreed to acquire Modular in a nearly $4 billion stock deal, giving its AI data center push a software layer built around portable model deployment. The move is aimed at a practical bottleneck in AI infrastructure: making models run efficiently across CPUs, GPUs, NPUs, and custom accelerators without locking developers into one hardware stack.

Akshay

June 24, 2026

4 min

Groq’s $650M Raise Makes AI Inference the New Cloud Fight

Groq raised $650 million to expand its AI inference cloud, with 13 data centers, more than five million developers, NVIDIA LPX integration, and a 200 MW capacity target by the end of 2027. The deal shows why serving AI models is becoming its own infrastructure market, separate from the training race.

Akshay

June 23, 2026
