OpenAI Unveils Jalapeño, a Custom Chip Designed to Cut LLM Inference Costs
Jalapeño is OpenAI’s first custom silicon, built in partnership with Broadcom and fabricated by Taiwan’s TSMC. The chip is engineered specifically for LLM inference and is meant to replace the high‑margin Nvidia GPUs that currently dominate OpenAI’s data‑center budget.
The case for a dedicated chip is clear in the numbers. Nvidia’s high‑end processors can command a profit margin of about 75%, while OpenAI’s own operating margin is roughly 33%, or 33 cents on the dollar. Last year, running ChatGPT servers cost the company US$8.4 billion, and the expense is projected to climb to US$14 billion this year as the platform serves 900 million weekly users. Over the next eight years, OpenAI has earmarked about US$1.4 trillion for computing power, an average of roughly US$175 billion a year, or about seven times its current US$25 billion annual revenue.
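The arithmetic behind these figures can be checked with a short script. This is only a back-of-the-envelope sketch: it uses the dollar amounts quoted above and assumes, purely for illustration, that the US$1.4 trillion is spread evenly over the eight years and that the 900 million weekly users figure holds for the whole year. The function and variable names are hypothetical.

```python
# Figures as quoted in the article, in US dollars.
SERVING_COST_LAST_YEAR = 8.4e9
SERVING_COST_THIS_YEAR = 14e9      # projected
WEEKLY_USERS = 900e6
COMPUTE_COMMITMENT = 1.4e12        # earmarked over eight years
COMMITMENT_YEARS = 8
ANNUAL_REVENUE = 25e9


def summarize():
    """Derive a few ratios from the quoted figures (illustrative only)."""
    # Year-over-year growth in the cost of running ChatGPT servers.
    cost_growth = SERVING_COST_THIS_YEAR / SERVING_COST_LAST_YEAR - 1
    # Assumption: the weekly user count stays flat for the full year.
    cost_per_weekly_user = SERVING_COST_THIS_YEAR / WEEKLY_USERS
    # Assumption: the commitment is spread evenly across the eight years.
    annual_compute = COMPUTE_COMMITMENT / COMMITMENT_YEARS
    # How many years of current revenue one year of compute would consume.
    revenue_multiple = annual_compute / ANNUAL_REVENUE
    return {
        "cost_growth": cost_growth,
        "cost_per_weekly_user": cost_per_weekly_user,
        "annual_compute": annual_compute,
        "revenue_multiple": revenue_multiple,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.2f}")
```

Under those assumptions, the projected serving bill grows by about two thirds in a year, and the average annual compute commitment is several times current revenue, which is the gap a cheaper inference chip is meant to narrow.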
Unlike a general‑purpose accelerator, Jalapeño is a tightly focused “Intelligence Processor.” OpenAI supplied the core architectural design, drawing on its model roadmaps and serving systems. Broadcom handled silicon engineering and integrated high‑performance networking, while TSMC produced the wafers and Celestica assembled the board and rack systems.
Early lab samples already run frontier workloads, including an unreleased GPT‑5.3‑Codex‑Spark model, hitting target production frequency and power. Richard Ho, head of OpenAI’s hardware program, explained that the architecture reduces data movement, bringing realized utilization closer to theoretical peak performance. “Unlike legacy AI accelerators, the design balances compute, memory, and networking resources to address data‑movement bottlenecks in interactive LLM serving,” he said.
The chip embeds Broadcom’s Tomahawk networking silicon, enabling processors to communicate across large, clustered data‑center environments. Broadcom CEO Hock Tan confirmed that the rollout will scale alongside infrastructure partners, including Microsoft, to prepare for gigawatt‑scale data‑center integration.
By moving into custom silicon, OpenAI is shifting from a software‑centric model to a vertically integrated infrastructure company. The full‑stack strategy covers chip architecture, software kernels, memory systems, network scheduling, and the application layer, creating a continuous operational flywheel: more efficient infrastructure lowers the cost of training and serving models, which in turn drives user volume and revenue that can be reinvested in the next generation of custom infrastructure.
OpenAI is entering a landscape where competitors have spent nearly a decade building proprietary hardware. Google began deploying Tensor Processing Units (TPUs) in 2015 and now controls roughly a quarter of global AI computing capacity outside Nvidia’s supply chain. Amazon has shipped over one million custom chips, while Meta and Microsoft continue to scale their own infrastructure.
Greg Brockman, president and co‑founder of OpenAI, said Jalapeño is part of the company’s long‑term full‑stack infrastructure strategy to make compute more abundant. To close the timeline gap, OpenAI accelerated the development phase; the chip transitioned from a blank‑slate design to manufacturing tape‑out in just nine months. The engineering teams used OpenAI’s own language models to automate and optimize portions of the hardware design process.
Initial deployment of the Jalapeño chip into data centers is scheduled to begin by the end of 2026. The launch marks a significant step in OpenAI’s effort to control the cost of AI infrastructure and to maintain competitiveness in a market dominated by large hardware vendors.
The announcement follows a broader trend of AI companies investing heavily in custom silicon. OpenAI’s move aligns with its history of rapid scaling and its commitment to delivering high‑performance, cost‑effective AI services.
As the AI industry continues to grow, the Jalapeño processor exemplifies how software‑centric firms are turning to hardware innovation to sustain long‑term growth and manage operating expenses. The chip’s scalability depends on close collaboration between AI firms and established semiconductor and networking companies: Broadcom’s networking silicon and TSMC’s manufacturing capacity are both critical.
OpenAI’s investment in custom silicon is part of a broader strategy to reduce reliance on third‑party hardware and to create a more efficient, vertically integrated AI ecosystem. The company’s next steps will likely involve scaling production, integrating the chip into its data‑center infrastructure, and exploring further hardware‑software co‑design opportunities.
In summary, OpenAI’s Jalapeño chip marks a milestone in the company’s pursuit of cost‑effective, high‑performance AI inference. The collaboration with Broadcom, manufacturing by TSMC, and deployment plans with Microsoft position OpenAI to better manage the financial burden of running large language models and to maintain its competitive edge in the AI market.