How Deep Learning Reveals Data Insights
The average human brain has more than 80 billion neurons. The common fruit fly’s brain has about 250,000. Early artificial neural networks (ANNs)—computer systems inspired by the human brain—had even fewer neurons than a fruit fly.
“If all you can cough up is the intelligence of a fruit fly, you’re not going very far in evolutionary terms,” says Michael Gschwind, chief engineer, machine and deep learning, IBM.
Fortunately, computers have evolved since the late 1980s, and Gschwind has played a significant role in that progression. In the early ’90s, he invented one of the first neural accelerators: a hardware circuit that could think like an animal neuron and was small enough for several circuits to fit on a single chip. “We made great progress, but the machine learning was still too slow and the mental capacity of those early ANNs was limited,” he says.
“Companies that use Power Systems servers today are in a great position because AI is all about infusing big data with intelligence at a scale that is impractical to accomplish with human experts.”
—Michael Gschwind, chief engineer, machine and deep learning, IBM
Two decades later, ANNs have been rebranded as deep learning—a key focus of IBM’s artificial intelligence (AI) and cognitive computing research—and are ready to become productive members of society.
IBM Systems Magazine sat down with Gschwind to get a better understanding of deep learning at IBM and what it means for IBM Power Systems* users.
IBM Systems Magazine (ISM): How do you describe deep learning to the layperson?
Michael Gschwind (MG): Deep learning is a branch of machine learning that encompasses algorithms based on ANNs. It’s inspired by the biological neural networks of the human brain, which are organized into multiple layers of neurons.
Think of deep learning as the untrained brain of an infant who has the potential to grow up to be very intelligent. The neurons can’t solve a problem by themselves. However, the network can learn to solve many problems if you repeatedly feed it questions and answers, much as you would explain the world to an infant.
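To make the questions-and-answers idea concrete, here is a minimal sketch in Python with NumPy (a toy illustration, not IBM software): a tiny two-layer network starts with random weights, the untrained brain, and learns the XOR function only by repeatedly seeing inputs and the answers it should have given.

    import numpy as np

    rng = np.random.default_rng(0)

    # "Questions" (inputs) and "answers" (expected outputs) for XOR
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    # Randomly initialized weights: the untrained network
    W1 = rng.normal(size=(2, 8))
    W2 = rng.normal(size=(8, 1))

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    for step in range(20000):
        # Forward pass: the network makes its current best guess
        h = sigmoid(X @ W1)
        out = sigmoid(h @ W2)

        # Backward pass: nudge the weights toward the expected answers
        err = y - out
        d_out = err * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 += 0.5 * (h.T @ d_out)
        W1 += 0.5 * (X.T @ d_h)

    print(np.round(out, 2))  # should approach [[0], [1], [1], [0]] as training progresses

No rule for XOR is ever programmed; repeated exposure to examples is what shapes the weights. That is the essence of training a deep network, only at a vastly larger scale.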
ISM: Why is IBM focusing so much research in this area right now?
MG: This is the future. Deep learning, machine learning and cognitive applications enable us to handle big data: millions of pictures, constant video streams, tons of information that humans can’t process or make sense of.
IBM Research has always pushed the boundaries of what’s possible in cognitive computing and AI. This includes milestones such as Deep Blue beating world chess champion Garry Kasparov, or IBM Watson* technology winning “Jeopardy!” Today’s programmable numerical accelerators that enable deep learning trace back to IBM’s cellular computing research from 20 years ago.
In 2017, we have faster computers and accelerators. Everything is lining up for us to finally model a brain that has enough cells or layers to produce an outcome that people care about, such as assisting drivers to improve road safety or recognizing cancer cells in images.
Our research will continue to serve as a guiding light for Power Systems technology by innovating new system structures, software released through PowerAI and Watson technology, and ways to apply those technologies to business processes.
ISM: What must happen before cognitive computing becomes the norm in the business world?
MG: Many cognitive computing advancements come straight out of university research labs, so they aren’t ready for enterprise or everyday IT users to deploy, and few people know how to use them. It might be amazing software, but most users are not in the mood to download source code from a university. Even if they were, they aren’t in a position to set someone aside for a month just to learn how to compile the application.
“Whereas the closed Intel ecosystem tries to force x86 users to use only their accelerators, IBM created an open ecosystem with OpenPOWER that’s ideal for deep learning.”
—Michael Gschwind
To take full advantage of deep learning, businesses need developers who know what to do. IBM must work with clients to figure out what problems they need to solve. Like the Linux* OS, it’s not just about developing the system—it’s about first creating all of the tools, knowledge, infrastructure and everything surrounding it before it reaches its full potential.
IBM is uniquely positioned to bring these things together because it’s the only company in the industry that has the breadth and integration of systems expertise across the stack: services, software and hardware. As always, IBM will continue to build exciting new systems and enable new applications that clients can easily deploy.
ISM: How is IBM applying deep learning research to Power Systems technology?
MG: IBM is building on-ramps to this superhighway of cognitive computing. PowerAI is one such on-ramp: it combines the most popular cognitive middleware for deep learning, optimized for Power Systems processors and accelerators. At that level, PowerAI is a deep learning software toolkit for OpenPOWER systems.
PowerAI is also a strategy for enabling cognitive application development. This means building an ecosystem and bringing together the best-of-breed systems. It also means training and deploying architects from IBM Lab Services to figure out solutions for clients.
Additionally, IBM has focused on systems that support field-programmable gate arrays (FPGAs) and numeric accelerators based on GPUs, which are a perfect match for big data workloads. Meanwhile, we’ve opened up our architecture with high-performance interfaces such as the Coherent Accelerator Processor Interface (CAPI) and NVLink, which enable accelerators to be easily integrated into the Power Systems platform.
ISM: How will this help Power Systems clients?
MG: These advancements are important on several levels. For performance, the biggest challenge is getting data to the accelerator and getting the results back. CAPI and NVLink make accelerators a full partner in the system. Accelerators can directly access the data in the Power Systems platform without relying on the CPU. This helps performance and uses CPUs effectively. It also makes programming easier because no complex interplay occurs between the CPU and the accelerator.
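From the programmer’s side, that data movement looks roughly like the sketch below (assuming PyTorch and a CUDA-capable GPU; this is a generic framework-level illustration of the copy-in, compute, copy-out pattern, not IBM interconnect code). The explicit copies are the step whose overhead coherent, high-bandwidth links such as CAPI and NVLink are designed to shrink.

    import torch

    # Fall back to the CPU if no accelerator is present
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    x_host = torch.randn(4096, 4096)   # data prepared in host memory by the CPU
    x_dev = x_host.to(device)          # explicit copy from host to accelerator
    y_dev = x_dev @ x_dev.T            # the heavy computation runs on the accelerator
    y_host = y_dev.to("cpu")           # explicit copy of the results back to the host

    print(y_host.shape)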
From a technical perspective, and looking across the entire IT landscape, CAPI and NVLink are also of great importance because they create the basis for a broad ecosystem in which different players can bring their accelerators to the Power Systems platform. This ensures that the best accelerators are used for Power* servers.
Whereas the closed Intel* ecosystem tries to force x86 users to use only their accelerators, IBM created an open ecosystem with OpenPOWER that’s ideal for deep learning. This open model resonated with many people in the industry, and they approached us to make CAPI available to other architectures, such as ARM and AMD. IBM created the OpenCAPI Consortium to make this even more widely available.
ISM: How do enhancements in cognitive computing affect enterprise workloads on Power Systems servers?
MG: The same system enhancements that enable cognitive computing also accelerate traditional computing workloads. For example, the numeric accelerators used in GPU computing can also be used for high-performance data analytics and many traditional compute-intensive applications for modeling supply chains, the weather or drug reactions.
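As a small, hypothetical illustration of that dual use (assuming PyTorch with a GPU available; the numbers are made up), the same accelerator that trains neural networks can run a Monte Carlo demand simulation of the kind a supply-chain model might need:

    import torch

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    n_scenarios, n_days = 200_000, 90

    # Simulate daily demand as lognormal noise around a seasonal baseline (illustrative values)
    baseline = 100 + 20 * torch.sin(torch.linspace(0, 6.28, n_days, device=device))
    noise = 0.2 * torch.randn(n_scenarios, n_days, device=device)
    demand = baseline * torch.exp(noise)

    # Descriptive statistics over the simulated scenarios
    totals = demand.sum(dim=1)
    print("mean total demand:", totals.mean().item())
    print("95th percentile:  ", totals.quantile(0.95).item())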
Similarly, FPGAs and CAPI have been used to accelerate database processing with Netezza* technology and to provide the fastest, most efficient key-value store for NoSQL applications.
ISM: What business benefits exist for companies that use Power Systems servers?
MG: Companies that use Power Systems servers today are in a great position because AI is all about infusing big data with intelligence at a scale that is impractical to accomplish with human experts. Having data that will form the basis of the cognitive processing on the systems that offer the best processing of that data is a match made in heaven.
One example is the new S822LC for high-performance computing. Used together with PowerAI, it’s the world’s fastest enterprise deep-learning server. Imagine how much better business processes will work when infused with this intelligence: Supply chains can learn to understand the impact of events and adjust to them; ordering systems can learn to understand product trends and adjust to them; and transactional systems can learn to understand the contents of databases. The key is that the system might recognize patterns a human might not think about.
ISM: What future do you envision for IBM Power Systems and cognitive computing?
MG: Because so much data exists today, the challenge is analyzing that data in a useful way. These technologies can help make sense of the data on three levels: descriptive, or what’s going on; predictive, or what might happen; and prescriptive, or what the business should do about it.
Many businesses have huge databases with a lot of untapped potential from which they can extract information. Recommendation systems, for example, can track when a customer orders a product and then orders something else shortly after, as the simple sketch below illustrates. This area will continue to evolve for all types of industries.
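A minimal sketch of that pattern in Python, using made-up order data rather than any IBM product: count which products customers tend to order shortly after another product, then recommend the most frequent follow-ups.

    from collections import defaultdict

    # (customer, product) order events, assumed already sorted by time per customer
    orders = [
        ("alice", "laptop"), ("alice", "laptop bag"),
        ("bob", "laptop"), ("bob", "mouse"),
        ("carol", "laptop"), ("carol", "laptop bag"),
    ]

    follow_ups = defaultdict(lambda: defaultdict(int))
    last_seen = {}
    for customer, product in orders:
        if customer in last_seen:
            # Record that this product was ordered right after the previous purchase
            follow_ups[last_seen[customer]][product] += 1
        last_seen[customer] = product

    def recommend(product, k=3):
        # Return the k products most often ordered right after the given product
        ranked = sorted(follow_ups[product].items(), key=lambda kv: -kv[1])
        return [p for p, _ in ranked[:k]]

    print(recommend("laptop"))  # ['laptop bag', 'mouse']

A production system would weight recency, popularity and many more signals, but the core idea is the same: the pattern is learned from the data rather than written by hand.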
Another focus area is image analysis. Think about security systems today: Cameras are all over the place—in shopping malls, airports, etc.—to monitor various activities. All of the video feeds into a control room where one or two people sit in front of a wall of monitors, trying to look at 20 or 30 video screens. That’s not efficient or effective.
Instead, you could feed those video streams into a system that looks for objects being left behind, or that recognizes a fight or a theft. You could use the same technology for a power company looking for service disruptions, or for an insurance company that wants to quickly assess the damage to a car by classifying a photo against similar photos.

IBM is also a leader in speech recognition, which uses ANNs to recognize speech and translate the sounds into text. Today, IBM research teams are working to improve the algorithms as well as to take advantage of new system improvements and new partnerships with various data centers.

The first wave of analytics was understanding basic properties. Deep learning is the next wave, extracting real information and insights in a more sophisticated way with the goal of delivering better business outcomes.