IBM Announces Telum II Processor and Spyre Accelerator
Christian Jacobi, IBM Fellow and CTO of IBM Systems Development, describes how the Telum II processor and Spyre accelerator will help scale artificial intelligence capabilities on the mainframe
When it comes to artificial intelligence (AI), keeping up means scaling up, and IBM recently announced new hardware advances to help mainframe clients do just that.
IBM Telum II Chip and Spyre Accelerator
The IBM Telum II processor and complementary Spyre accelerator promise to enhance the use of AI directly on the mainframe as organizations increasingly apply the technology to transactions. The unveiling, at the Hot Chips 2024 conference in Palo Alto, California, last month, marked “a big, big milestone in a very large project,” says Christian Jacobi, IBM Fellow and CTO of IBM Systems Development. “We have 800 or so engineers working on a chip like this.”
In product literature, IBM says the Telum II brings a “40% increase in cache and a coherently attached data processing unit (DPU), potentially improving performance by up to 70% across key system components.” The Spyre accelerator, for its part, was designed to provide additional AI compute when paired with the Telum II, which will be incorporated into a new, yet-to-be-announced system set for release next year, Jacobi explains.
AI on the Mainframe
The original Telum microprocessor brought AI for transaction processing to the IBM mainframe for the first time as part of the IBM z16 system in 2022. Altogether, IBM mainframes handle 70% of the world’s transactions, according to the company.
“IBM Z and LinuxONE mainframe systems are important cornerstones in the IT infrastructure of many of the largest enterprises and governments. We run massive amounts of data and transaction volume through those systems,” Jacobi says.
He adds that bringing AI directly onto the mainframe increases efficiency by eliminating the data movement that would be required if bits had to be sent to a separate computer for AI tasks. “And that is just incredibly energy efficient compared to taking that data, sending it across a network to a separate computer that has a bunch of GPUs that then perform that work,” Jacobi says. “Worldwide, we’re seeing a strong push for energy efficiency and a big concern around energy efficiency of AI.”
With the Spyre accelerator, large language models (LLMs) can now be used directly on the mainframe. “We did not have that before,” Jacobi says.
“We designed Spyre as a dedicated accelerator optimized for these kinds of models,” he says. Combining large and small models “is what we call ensemble AI, where you use a small model for a first pass and then you do a second pass on a larger model, but not with all the transactions, only a subset of the transactions.”
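The two-pass pattern Jacobi describes can be illustrated with a minimal Python sketch. It is not IBM's implementation: the `Transaction` type, the toy scoring functions and the `low`/`high` thresholds are all hypothetical, chosen only to show how a cheap first pass can limit the larger model to a subset of transactions.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Transaction:
    tx_id: str
    amount: float


@dataclass
class Decision:
    tx_id: str
    score: float      # final risk score in [0, 1]
    escalated: bool   # True if the larger model was consulted


# A scorer maps a transaction to a risk score in [0, 1].
Scorer = Callable[[Transaction], float]


def ensemble_score(
    transactions: Iterable[Transaction],
    small_model: Scorer,
    large_model: Scorer,
    low: float = 0.2,   # hypothetical threshold: below this, treat as clearly low risk
    high: float = 0.8,  # hypothetical threshold: above this, treat as clearly high risk
) -> List[Decision]:
    """Score every transaction with the small model; send only the
    uncertain ones (low < score < high) to the large model."""
    decisions = []
    for tx in transactions:
        score = small_model(tx)
        escalated = low < score < high
        if escalated:
            score = large_model(tx)  # second pass, subset only
        decisions.append(Decision(tx.tx_id, score, escalated))
    return decisions


# Toy stand-ins for real models (assumptions for illustration only).
def toy_small_model(tx: Transaction) -> float:
    return min(tx.amount / 10000.0, 1.0)


def toy_large_model(tx: Transaction) -> float:
    return 0.9 if tx.amount > 4000 else 0.1


if __name__ == "__main__":
    batch = [
        Transaction("a", 100.0),
        Transaction("b", 5000.0),
        Transaction("c", 9500.0),
    ]
    for d in ensemble_score(batch, toy_small_model, toy_large_model):
        print(d)
```

In this sketch only transaction "b" falls in the uncertain band and reaches the larger model; the other two are decided by the small model alone, which is the cost saving the ensemble approach is after.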
He adds that, beyond security, other use cases for AI on the mainframe include code modernization and enhanced understanding of complex code and the behavior of systems.
A Surge in AI Demand
As IBM rolls out its latest advances, it expects AI-driven computational needs to surge in the coming years. Mainframe clients are at various stages of integrating AI into their transactions. “It’s not everything and everywhere, but many of the advanced clients certainly are now pushing the technology pretty hard,” Jacobi says.
The first Telum chip enabled “AI inferencing at the speed of a transaction—like checking for fraud during a credit card swipe—to IBM Z,” notes an IBM blog post announcing the Spyre accelerator. Jacobi explains that the eight-core Telum II improves on its predecessor by enabling any core to use not only its local AI accelerator but also the pooled AI capacity of the eight Telum chips in a processor drawer.
Advanced technologies like the Telum II chip will help satisfy what Goldman Sachs projects to be an AI-driven 160% rise in power demand at data centers by 2030. Meanwhile, the consulting firm McKinsey credits interest in generative AI for “skyrocketing” demand for computational power.
As AI technology grows, its uses evolve. “We’re seeing AI not only being the chatbot where you ask questions, but it’s getting agentic and doing things,” Jacobi says. For an organization, this could include tasks ranging from travel reimbursements to medical image analysis.
The transaction use case is “not the poster child of AI right now,” Jacobi says. “Not a lot of people are writing papers about it, but there’s an incredible amount of value for clients still to harvest that by adding that into the transactions.” As a business-based example, IBM aims to save clients millions of dollars by helping them enlist AI in the fight against transaction fraud.
The technology is also being used for cybersecurity in general. “We know that our clients are adopting AI capabilities for cyber defense and intrusion detection and reaction,” Jacobi says.
Clients know bad actors also have AI capabilities. “The bad guys aren’t stupid and know how to use AI and get better and better. And so, it’s an arms race,” Jacobi says. “Every tool that gets invented gets used on both sides, and AI is no different.”