Once upon a time, advances in computing speed were defined by Moore’s Law. That’s no longer the case: the law has hit a wall.
As a result, tech providers have had to become more creative about delivering compute performance, especially for new technological advances in artificial intelligence (AI), cloud computing and workload-intensive solutions such as SAP HANA.
“We used to run a relatively small number of processors and get increased frequency with each new generation, which meant everything simply and noticeably got faster. Now, we have emerging AI workloads but a lack of performance delivery from just scaling general-purpose cores,” notes Jeff Stuecheli, POWER* hardware architect, IBM. “Additionally, AI kernels have higher computational demands versus programmatic systems, so there’s a difference in how they’re optimized. This is true in the case of systems such as SAP HANA, which rely on both online transaction processing (OLTP) and analytics to speed up real-time data analytics.”
Specialized Accelerators for New Workloads
This has resulted in the creation of specialized accelerators meant to optimize these new workloads. Depending on the workload, they may require different types of configurations. For example, in the case of AI, this may involve accelerating data movement to and from storage, which would require building a storage infrastructure that performs beyond what’s needed for a transactional database.
“Fundamentally, all of this comes down to improving performance in systems without relying on faster, cheaper transistors. So, an optimized memory subsystem might be part of this,” Stuecheli says. “Some people might be working with containers and need to manage a lot of images. Somebody else might be solving a problem where they need lots of memory bandwidth but not a lot of capacity. We need to design for all of these different scenarios.”
Stuecheli calls infrastructures that support this “composable systems,” or system compositions that are optimized around a certain set of design goals that could include, for example, acceleration or different memory capacities. This essentially means that the same processor silicon is used but specific technologies are plugged in based on different use cases.
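To make the idea concrete, here is a minimal sketch in Python of what a composable system description might look like: the processor stays fixed while memory and accelerators are swapped per design goal. The class and field names are hypothetical illustrations, not an IBM interface.

```python
from dataclasses import dataclass, field

@dataclass
class SystemComposition:
    """Hypothetical description of a composable system: the processor
    silicon is the same in every composition, while the attached
    technologies vary with the design goals."""
    name: str
    processor: str = "POWER9"          # same silicon in every composition
    memory_capacity_gb: int = 512
    memory_bandwidth_optimized: bool = False
    accelerators: list = field(default_factory=list)

# Composition optimized for AI training: GPUs plus high memory bandwidth.
ai_system = SystemComposition(
    name="ai-training",
    memory_bandwidth_optimized=True,
    accelerators=["nvidia-gpu"],
)

# Composition optimized for in-memory analytics (an SAP HANA-style case):
# large memory capacity, no GPU required.
hana_system = SystemComposition(
    name="in-memory-db",
    memory_capacity_gb=4096,
)

print(ai_system)
print(hana_system)
```

Both objects share the same processor field; only the plugged-in pieces differ, which is the essence of composing a system around design goals rather than fabricating new silicon for each one.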
Opening POWER With AXON
This might include different types of memory that are better optimized for a given system and built upon existing CPUs. As Stuecheli explains, “It’s fundamentally about being able to aggregate things around the processor, be it acceleration, connectivity or memory, as in the case of the OpenCAPI Memory Interface.”
Another case in point is heterogeneous computing environments, which are pretty much all of them now—especially in the cloud space. After all, it’s unlikely an organization that needs to address 10 different workloads is going to purchase 10 different computers.
Rather, there have to be different interconnects to support specific workloads across platforms. For example, PCIe devices require PCIe Gen4 interfaces to speed data movement into IBM POWER9* systems. The same is true of ASIC/FPGA devices and Nvidia GPUs, which rely on OpenCAPI and NVLink interconnects, respectively. But all of these are cleverly composed into POWER9 systems.
“POWER8* was a custom chip just for NVLink systems. Now, POWER9 chips have the ability to run NVLink, OpenCAPI and symmetric multiprocessing (SMP). We call it AXON, with the ‘AX’ representing SMP, ‘O’ representing OpenCAPI and the ‘N’ representing NVLink. Think neuron-to-neuron axon connections in the brain,” Stuecheli says. “So, with this new interface, we can build different devices and plug them into the same chip, deploying technologies optimized for a particular solution without having to build a new, specialized version of the processor for it, as had been done in the past.”
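To illustrate the AXON idea, the sketch below maps device classes to the link each would use on that shared interface. The Python function and device labels are hypothetical; real link configuration happens in hardware and firmware, not application code.

```python
from enum import Enum

class Link(Enum):
    SMP = "AX"        # symmetric multiprocessing, processor to processor
    OPENCAPI = "O"    # coherent attach for ASIC/FPGA accelerators
    NVLINK = "N"      # high-bandwidth attach for Nvidia GPUs
    PCIE_GEN4 = "-"   # general-purpose I/O, outside the AXON wire

def select_link(device: str) -> Link:
    """Pick the interconnect a device class would ride on a POWER9 chip.
    Purely illustrative mapping based on the AXON naming."""
    if device == "power9-socket":
        return Link.SMP
    if device in ("fpga", "asic"):
        return Link.OPENCAPI
    if device == "nvidia-gpu":
        return Link.NVLINK
    return Link.PCIE_GEN4

for dev in ("power9-socket", "fpga", "nvidia-gpu", "nvme-ssd"):
    print(f"{dev:14s} -> {select_link(dev).name}")
```

The point of the exercise: one chip, one set of wires, three protocols, with the choice made per attached device rather than per processor variant.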
Flexibility for Enterprise Workloads
This is how POWER9 systems have been designed to handle, for example, SAP HANA- and AI-optimized workloads, as well as high-reliability enterprise computing environments. They use the same processor silicon but allow different container and cloud services to be built on top of it, creating composable systems for many different use cases.
“You want to have a standard that describes the service you’re providing, not the specific architecture you’re operating on,” Stuecheli notes. “We want to be able to move the abstraction layer of services so we can build different infrastructures and hide what exactly is being done from users. This allows them to deploy a variety of workloads without the complications of having to worry about job-specific chips.”
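One way to picture that abstraction layer: a workload declares the service it needs, and the infrastructure picks a matching composition without exposing which hardware backs it. The sketch below is a hypothetical matcher; the capability keys and composition names are illustrative assumptions, not a real scheduling API.

```python
# Hypothetical service matcher: workloads declare the capabilities they
# need, and the infrastructure selects a composition that satisfies them.
compositions = {
    "ai-training":  {"gpu": True,  "memory_gb": 512},
    "in-memory-db": {"gpu": False, "memory_gb": 4096},
}

def place(workload: dict) -> str:
    """Return a composition that satisfies the workload's declared
    requirements, hiding which hardware actually backs the service."""
    for name, caps in compositions.items():
        if workload.get("gpu", False) and not caps["gpu"]:
            continue
        if caps["memory_gb"] < workload.get("memory_gb", 0):
            continue
        return name
    raise RuntimeError("no composition satisfies the request")

# The user describes the service, not the architecture:
print(place({"gpu": True, "memory_gb": 256}))    # -> ai-training
print(place({"memory_gb": 2048}))                # -> in-memory-db
```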
The result of these composable systems is akin to managed heterogeneous cloud computing environments. Everything is invisibly flexible and compatible, allowing users to introduce a variety of workloads and, depending on their intensity, provision memory, storage and other resources as required, without having to load up on CPUs to do so.
As Stuecheli puts it, “With POWER9, we’re using one chip to do three things on the same wire. And that’s only the beginning. In five years, who knows how far we’ll advance beyond that. Well, I know, because I am currently working on those systems, and we have some very interesting plans.”