Google creates custom chips for training Apple's AI models and its own chatbot, Gemini.

At Google's headquarters in Mountain View, California, hundreds of server racks hum across several aisles, performing tasks far less common than running the world's dominant search engine or executing workloads for Google Cloud's millions of customers.

Here, Google is running tests on its own microchips, called Tensor Processing Units, or TPUs.

Although TPUs were originally designed for internal workloads, Google has made them available to cloud customers since 2018. Recently, Apple disclosed that it uses TPUs to train the AI models behind Apple Intelligence. Google also relies on TPUs to train and run its Gemini chatbot.
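
For cloud customers, targeting a TPU can be as simple as running a TPU-aware framework on a Cloud TPU virtual machine. Below is a minimal, hypothetical sketch using JAX, Google's open-source numerical computing library; it assumes a Cloud TPU VM with JAX's TPU support installed, and it falls back to CPU or GPU on other hardware.

```python
# A minimal, hypothetical sketch of how a cloud customer might target TPUs
# with JAX. Assumes a Cloud TPU VM with JAX's TPU support installed;
# elsewhere, JAX falls back to CPU or GPU automatically.
import jax
import jax.numpy as jnp

print(jax.devices())  # on a TPU VM, this lists TPU devices

@jax.jit  # compile through XLA for whatever accelerator backs the runtime
def predict(weights, inputs):
    return jnp.tanh(inputs @ weights)

weights = jnp.ones((512, 256))
inputs = jnp.ones((8, 512))
print(predict(weights, inputs).shape)  # (8, 256)
```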

Google has bucked the widespread belief that all AI and large language models are trained on Nvidia chips, Futurum Group CEO Daniel Newman said in a recent interview. Newman has covered Google's custom cloud chips since they launched in 2015.

Google was the first cloud provider to make custom AI chips, debuting its TPU in 2015. Amazon Web Services announced its first cloud AI chip, Inferentia, three years later. And Microsoft's first custom AI chip, Maia, wasn't announced until the end of 2023.

Despite being first in AI chips, Google has not achieved the top spot in the generative AI race. The company has faced criticism for its botched product releases, and Gemini was released more than a year after OpenAI's ChatGPT.

Despite this, Google Cloud has gained traction thanks in part to its AI offerings. In the latest quarter, Google parent Alphabet reported a 29% jump in cloud revenue, which surpassed $10 billion in a quarter for the first time.

"Google's AI prowess has made it stand out in the cloud era, with TPU being a significant factor in its rise to parity with other clouds and even surpassing them in some eyes," Newman stated.

'A simple but powerful thought experiment'

In July, Google's head of custom cloud chips, Amin Vahdat, gave CNBC the first on-camera tour of the company's chip lab. Vahdat has been with Google since it first considered making chips in 2014.

"The idea originated from a basic yet potent thought experiment," Vahdat stated. "Several leads at the company posed the query: What would transpire if Google users desired to engage with Google through voice for merely 30 seconds daily? Additionally, how much computational power would be necessary to sustain our users?"

The conclusion: Google would need to double the number of computers in its data centers. So the group looked for a better solution.

Google realized it could build custom hardware, Tensor Processing Units, to support that need far more efficiently, as much as 100 times more efficiently than general-purpose hardware, Vahdat said.

Google's TPU and Video Coding Unit are application-specific integrated circuits, or ASICs, each built for a single purpose: the TPU for AI, and the Video Coding Unit for video processing.

Like Apple, Google also puts custom silicon in its devices, including the Tensor G4 chip in the Pixel 9 and the A1 chip in the Pixel Buds Pro 2.

But the TPU is what set Google apart when it launched in 2015. It still dominates among custom cloud AI accelerators, holding 58% of that market, according to The Futurum Group.

The term "tensor" in AI refers to the large-scale matrix multiplications that occur quickly for advanced applications, as coined by Google.

With its second TPU in 2018, Google broadened the focus from inference to training AI models and opened the chips to cloud customers, who can run workloads on them alongside market-leading chips such as Nvidia's GPUs.
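
The distinction matters for hardware: inference is a single forward pass through a model, while training also computes gradients and updates parameters, multiplying the compute per example. A toy sketch of that difference, again in JAX and purely illustrative:

```python
# Illustrative only: inference is one forward pass, while training adds
# a backward pass for gradients plus a parameter update.
import jax
import jax.numpy as jnp

def loss(w, x, y):
    pred = jnp.tanh(x @ w)          # the forward pass: this alone is inference
    return jnp.mean((pred - y) ** 2)

w = jnp.zeros((4, 1))
x, y = jnp.ones((8, 4)), jnp.ones((8, 1))

grads = jax.grad(loss)(w, x, y)     # backward pass, needed only for training
w = w - 0.1 * grads                 # gradient-descent parameter update
print(float(loss(w, x, y)))
```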

According to Stacy Rasgon, senior analyst covering semiconductors at Bernstein Research, GPUs are more programmable and flexible, but they've been in short supply.

Nvidia's stock has soared due to the AI boom, making it a $3 trillion company in June, surpassing Alphabet and competing with Apple and Microsoft for the title of the world's most valuable public company.

Newman stated that while specialty AI accelerators are not as flexible or powerful as Nvidia's platform, the market is eager to see if anyone can compete in that space.

The real test will come when Apple's AI features, built on models trained with Google's TPUs, roll out on iPhones and Macs.

Broadcom and TSMC

Google's upcoming sixth-generation TPU, Trillium, is expected to be released later this year.

"Rasgon stated that it is expensive and requires a significant amount of scale, making it inaccessible to everyone. However, hyperscalers have the necessary resources, money, and scale to pursue this path."

That's why Google partners with Broadcom, a chip developer, to design its AI chips; the process is complex and costly even for hyperscalers. Broadcom says it has spent more than $3 billion making these partnerships work.

"Google provides compute for AI chips, while Broadcom handles the peripheral tasks such as I/O and SerDes, as well as the packaging."

The final design is then sent off for manufacturing at a fab, primarily one owned by Taiwan Semiconductor Manufacturing Company, the world's largest chipmaker, which produces 92% of the world's most advanced semiconductors.

Google is preparing for the possibility of a geopolitical conflict between China and Taiwan, Vahdat said, while hoping it never comes to pass.

The White House is distributing $52 billion in CHIPS Act funding to companies building fabs in the U.S., with the largest portions going to Intel, TSMC, and Samsung. The goal is to hedge against exactly these supply risks.

Processors and power

Risks or not, Google is pressing ahead: its first general-purpose CPU, Axion, will be available by the end of the year.

""With the addition of the CPU, we can now run our internal services such as BigQuery, Spanner, YouTube advertising, and more on Axion," Vahdat stated."

Google is late to the CPU game: Amazon launched its Graviton processor in 2018, and Microsoft announced its own CPU in November.

Vahdat explained that Google's focus has been on delivering value to customers through its TPU, video coding units, and networking, and that the time was right for the company to develop a CPU.

These processors from non-chipmakers, including Google's, are made possible by Arm chip architecture, a more customizable, power-efficient alternative that's gaining traction over the traditional x86 model. Power efficiency is crucial: by 2027, AI servers are projected to consume as much power every year as a country like Argentina. Google's latest environmental report showed its emissions rose nearly 50% from 2019 to 2023, partly due to data center growth for powering AI.

"If these chips were not efficient, the numbers would have ended up in a different location," Vahdat stated. "We are dedicated to reducing carbon emissions from our infrastructure around the clock, working towards zero."

Servers used to train and run AI also require large amounts of water for cooling. That's why, starting with its third-generation TPU, Google has used direct-to-chip cooling, which consumes far less water. It's also the method Nvidia is using to cool its latest Blackwell GPUs.

Despite obstacles involving geopolitics, power, and water, Google remains committed to its generative AI tools and to designing its own chips.

"Hardware will play a crucial role in this phenomenon, as Vahdat stated, 'I've never seen anything like this and no sign of it slowing down quite yet.'"

by Katie Tarasov
