
Spyre, Power11, and How to Beat the Odds in Enterprise AI

Sebastian Lehrig, worldwide AI on IBM Power team leader, explains how the Spyre accelerator and an integrated AI software stack are vital in a world where 95% of enterprise AI projects fail

TechChannel AI

It’s been three months since IBM’s Spyre AI accelerator became generally available for Power11 servers, but much of the IBM Power community still has a lot to learn about the adapter IBM is positioning as a key component of its AI strategy. On Feb. 26, Sebastian Lehrig, worldwide AI on IBM Power team leader, appeared on a Virtual User Group webinar to explain how IBM is seeking to shape enterprise AI with Spyre and Power11.

Rather than treating AI as an external service bolted onto enterprise applications, Lehrig described a platform approach where AI is engineered directly into the infrastructure that already runs mission‑critical workloads. By combining innovations in the Power11 processor, the Spyre AI accelerator, Red Hat’s AI software stack and a growing ecosystem of open-source AI services for Power, IBM is positioning Power systems as a fully integrated platform for operational AI.

IBM hopes its approach can help customers be among the 5% of enterprises whose AI projects ultimately produce a return on their investment.

“That’s a real challenge now. If you want to do AI, it appears as not such a low-hanging fruit. It appears more or less like a moonshot project,” Lehrig said. “And for me, as a product manager, I would kind of turn it around and argue, well, if it’s 95% failure here, there’s a huge opportunity to do better than that and to do it right.”  

Lehrig described an architecture of several coordinated layers built toward that aim:

  • Accelerated infrastructure: IBM Power11 systems combined with the Spyre AI accelerator and support for IBM Power Virtual Server (PowerVS).
  • Integrated inference platform: Red Hat AI Inference Server and OpenShift AI provide the runtime environment for deploying and managing AI models.
  • AI services for IBM Power: A growing set of open-source services engineered to run efficiently on Power11 systems and fully optimized for the Spyre accelerator.

Creating a Turnkey AI Platform

A key theme of the user group session was simplifying AI adoption for Power customers. The IBM Open‑Source AI Foundation for Power provides a catalog of reusable solutions and deployment patterns designed to accelerate AI adoption.

These solutions include:

  • Workload solutions such as IT Service Desk assistants, private document processing and real estate advisory systems.
  • Adoption patterns that help organizations implement common AI architectures such as digital assistants and document intelligence systems.
  • AI services including knowledge management, question‑and‑answer capabilities, document processing and translation.

These solutions are designed to be one‑click deployable on Power11 systems with Spyre, with future support planned for Power Virtual Server in IBM Cloud, enabling consistent hybrid deployments.

Customer Examples

Lehrig also highlighted several real‑world implementations demonstrating how AI can be applied to enterprise workloads on the Power platform.

One example involved GEIS running on IBM i, where AI is used to extract order information from incoming emails and automatically populate order entry systems. The result was a reported five‑fold increase in business process throughput.

Another example came from Digiworks, an IBM partner using AI to automatically redact personally identifiable information (PII) from documents, helping organizations improve privacy protection and regulatory compliance.

IBM Power11 for Enterprise AI

IBM’s strategy with Power11 and Spyre reflects a broader assumption: The long‑term value of AI will come from embedding inference directly into enterprise systems where business transactions already occur.

Power11 introduces several hardware innovations designed to support this model:

  • AI‑optimized silicon: Enhancements such as Matrix Multiply Assist (MMA) and expanded SIMD vector acceleration improve the performance of AI and analytics workloads.
  • Dedicated AI acceleration: The Spyre accelerator provides purpose‑built AI inference capability that can scale within Power systems.
  • Enterprise economics for AI: While cloud LLM services are typically priced per token, deploying models on Power11 with Spyre shifts the economics toward a traditional enterprise infrastructure model based on hardware, software subscriptions and operational costs.
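The pricing shift described above can be made concrete with a simple break-even calculation. The figures below are illustrative assumptions, not IBM or cloud-provider pricing:

```python
# Hypothetical cost comparison: per-token cloud LLM pricing vs. fixed on-prem
# infrastructure. All figures are illustrative assumptions, not real pricing.

CLOUD_PRICE_PER_1K_TOKENS = 0.002   # assumed $ per 1,000 tokens for a hosted LLM
MONTHLY_INFRA_COST = 8_000.0        # assumed monthly hardware + software + ops cost


def cloud_cost(tokens: int) -> float:
    """Cost of serving `tokens` tokens through a per-token cloud API."""
    return tokens / 1_000 * CLOUD_PRICE_PER_1K_TOKENS


def breakeven_tokens() -> int:
    """Monthly token volume at which fixed infrastructure matches cloud spend."""
    return int(MONTHLY_INFRA_COST / CLOUD_PRICE_PER_1K_TOKENS * 1_000)


if __name__ == "__main__":
    print(f"Break-even: {breakeven_tokens():,} tokens/month")
```

Under these assumed numbers, fixed infrastructure breaks even at four billion tokens per month; past that point, per-token billing keeps scaling while the infrastructure cost stays flat.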

Combined with the Open‑Source AI Foundation for Power, these capabilities enable organizations to deploy AI solutions that run close to enterprise data and applications while maintaining the security, reliability and governance expected of enterprise platforms.

Understanding the Spyre Accelerator

At the center of this architecture is the optional IBM Spyre accelerator, working in concert with the enhanced AI capabilities of the Power11 processor. Together they enable what IBM describes as “ensemble AI,” an approach where the Power11 CPU and Spyre accelerator dynamically work together to determine where inference should run for optimal performance, latency and efficiency. In this model, the Power11 processor can execute smaller or latency‑sensitive models directly using its built‑in AI acceleration (including MMA and SIMD enhancements), while larger or more computationally demanding workloads can be offloaded to Spyre.

Unlike general‑purpose accelerators originally designed for graphics or large‑scale model training, Spyre was engineered from the ground up to deliver efficient, scalable inference directly within enterprise systems and to complement the AI capabilities built into the Power11 architecture.
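The ensemble routing idea can be sketched as a placement function. This is a minimal illustration of the decision logic described above; the thresholds, device names and model profiles are hypothetical, not an IBM API:

```python
# Sketch of an "ensemble AI" placement decision: small, latency-sensitive
# models run on the Power11 CPU's built-in acceleration (MMA/SIMD), while
# larger models are offloaded to a Spyre card. All names and the size cutoff
# are illustrative assumptions.

from dataclasses import dataclass


@dataclass
class ModelProfile:
    name: str
    params_billion: float      # model size in billions of parameters
    latency_sensitive: bool    # e.g., inline scoring vs. batch document jobs


CPU_MAX_PARAMS_B = 1.0  # assumed cutoff above which work goes to the accelerator


def place(model: ModelProfile) -> str:
    """Return the device this sketch would route inference to."""
    if model.latency_sensitive and model.params_billion <= CPU_MAX_PARAMS_B:
        return "power11-cpu"   # on-core MMA/SIMD path
    return "spyre"             # offload larger or batch work to the accelerator


if __name__ == "__main__":
    for m in (ModelProfile("fraud-scorer", 0.1, True),
              ModelProfile("doc-summarizer", 8.0, False)):
        print(m.name, "->", place(m))
```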

Each Spyre card contains a highly parallel system‑on‑chip architecture with 32 dedicated AI accelerator cores and 128 GB of high‑speed LPDDR5 memory, allowing large language models and AI workloads to execute close to the compute engines that process them. Built using advanced 5‑nanometer semiconductor technology and operating at roughly 75 watts, Spyre delivers substantial AI processing capability while maintaining the energy efficiency required for dense deployment in enterprise servers.

This architecture is particularly important for modern AI workloads. Large language models (LLMs) rely on repeatedly accessing model weights and tensors during inference. By placing substantial high‑bandwidth memory directly on the accelerator and tightly coupling it with the Power11 processor, Spyre reduces the data‑movement bottlenecks that often limit AI performance in traditional architectures.
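A back-of-envelope calculation shows why on-accelerator memory bandwidth matters for the weight-access pattern described above: single-stream LLM decoding is often bounded by how fast the weights can be streamed from memory. The bandwidth and model-size figures below are illustrative assumptions, not Spyre specifications:

```python
# Back-of-envelope upper bound on LLM decode speed when each generated token
# requires reading all model weights from memory once. Numbers are assumed
# for illustration, not measured Spyre figures.

def max_decode_tokens_per_sec(mem_bandwidth_gb_s: float,
                              model_size_gb: float) -> float:
    """Upper bound on tokens/s for a memory-bandwidth-bound decode loop."""
    return mem_bandwidth_gb_s / model_size_gb


if __name__ == "__main__":
    # e.g., an 8B-parameter model quantized to 8-bit weights occupies ~8 GB
    print(f"{max_decode_tokens_per_sec(200.0, 8.0):.0f} tokens/s upper bound")
```

The closer the weights sit to the compute engines, and the higher the effective bandwidth, the higher this ceiling, which is the bottleneck that co-locating 128 GB of memory on the accelerator is meant to relieve.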

Power11 systems can deploy multiple Spyre cards within a single system, allowing organizations to scale AI inference capacity while keeping workloads close to the applications and data that drive business processes. This combination of compute density, memory capacity and energy efficiency enables Power11 systems to run demanding AI workloads, including generative AI, retrieval‑augmented generation (RAG) and document intelligence, directly within enterprise infrastructure.

How Spyre Is Different From GPU Architectures

While GPUs remain the dominant architecture for training large AI models, Spyre is designed with a different goal in mind: infusing AI inference directly into enterprise infrastructure.

Compared with GPU clusters typically used for model training, the Power11 and Spyre approach focuses on:

  • Optimizing inference workloads rather than large‑scale model training
  • Reducing infrastructure complexity by integrating acceleration directly into enterprise servers
  • Lowering latency by keeping AI workloads close to enterprise applications and data
  • Simplifying deployment for organizations that want AI embedded within existing business systems

This design reflects IBM’s view that the future of enterprise AI will rely less on massive training clusters and more on integrating AI inference into operational business platforms.

Delivering on IBM’s Full‑Stack AI Vision

As Lehrig described the Power11 and Spyre architecture, he echoed a theme from an earlier interview with Mukesh Khare, General Manager of IBM Semiconductors. In that interview, Khare emphasized IBM’s approach to AI as a full‑stack strategy spanning silicon, systems, software and consulting:

“IBM is a full‑stack company from semiconductors to chip design, system, low‑level software, operating system, middleware, applications and consulting. We bring the entire stack together.”

The Power11 platform, combined with the Spyre accelerator and the growing ecosystem of open‑source AI tools and services for Power, represents an important step toward realizing that vision.

