Dispelling Myths About Training AI Models
AI is already proving its worth by solving problems that once seemed intractable, such as on-the-spot fraud detection and automated customer service.
By Jim Utsler
06/01/2019
"If you’re a credit card company, you get millions and millions of transactions a minute, and, yes, your most talented person can be very accurate in spotting a problem,” says Scott Soutter, portfolio offering manager for PowerAI, IBM. But you really have to be able to find that needle in the haystack. The problem is, that needle looks a lot like hay. By introducing AI—which doesn’t blink—into your operations, however, you can find that needle at machine scale and machine speed."
That said, getting to the point of truly valuable AI may seem like a Herculean task that just isn’t worthwhile, given all of the perceived work that goes into creating models. In truth, however, AI training isn’t as cryptic or intimidating as some might think.
Iterative Training
When beginning AI model training, both rookies and veterans first need to define the problem they’re attempting to address with AI and identify good versus bad outcomes. This typically requires input from organizational lines regarding their business objectives. The technical team should also be involved in this process, explaining to the organizational lines, for example, which data and what infrastructure capabilities are available to meet their goals.
Once those topics have been refined, modelers should understand that the training itself is very iterative. Using already available data as well as, in some cases, data acquired from external sources, the goal is to build a machine learning framework that essentially trains itself.
As Soutter explains, “You set up the parameters and the objectives and then the system, through an iterative examination of the data set, actually refines the model that describes your inference function. So, as a data scientist or machine learning developer, you’re very focused on validating the data that goes into the system. You’re also focused on shaping the model throughout the process to ensure you’re using the correct parameters or hyper-parameters that extract and identify the features of the data to create the most accurate model you can."
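The loop Soutter describes can be sketched in a few lines of plain Python. This is a toy illustration, not anything taken from IBM's tooling: the model is a straight line fitted by gradient descent, the data points are invented, and the names train, mean_squared_error, learning_rate and epochs are assumptions made for the example. The learning rate plays the role of a hyper-parameter, and a held-out validation set decides which setting produces the most accurate model.

```python
# Toy sketch (assumed example): fit y = w*x + b by gradient descent.
def train(data, learning_rate=0.05, epochs=500):
    """Refine w and b a little on every pass over the data set."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mean_squared_error(model, data):
    """Average squared gap between predictions and known answers."""
    w, b = model
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


# Invented data that follows y = 2x + 1.
train_set = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
validation_set = [(4.0, 9.0), (5.0, 11.0)]

# Try several hyper-parameter values and keep the one that
# validates best on data the model has not seen.
best = None
for lr in (0.001, 0.01, 0.05):
    model = train(train_set, learning_rate=lr)
    error = mean_squared_error(model, validation_set)
    if best is None or error < best[0]:
        best = (error, lr, model)

best_error, best_lr, (w, b) = best
print(f"best learning rate: {best_lr}, w={w:.3f}, b={b:.3f}")
```

The data scientist's part is the outer loop: choosing which settings to try and checking each result against validation data. The inner loop is where the system refines the model on its own.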
Tools for Training at Scale
IBM Watson Studio can assist in this endeavor by accelerating the machine and deep learning workflows required to introduce AI into business operations. It offers a series of tools that allow data scientists, application developers and subject matter experts to work with data more collaboratively and efficiently to build and train models at scale.
That’s also the goal of tools such as IBM PowerAI Vision, as Soutter explains. “Manually labeling and annotating a million images—which isn’t an outlandish number anymore—to train a model is a fool’s errand,” he says. “You don’t want to have to go through each image to create a model for object detection or classification. You want to annotate 10 images and run that against a hundred images, which you can then audit and correct so you can retrain your model with those 100 images and then accurately label a thousand, tens of thousands or a million images. This saves who knows how much time and results in more accurate models."
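The annotate, audit, retrain cycle Soutter outlines can be illustrated with a deliberately simple stand-in. The sketch below is not how PowerAI Vision works internally; it assumes each "image" is a single number, the "model" is a cutoff value, and a hypothetical true_label function plays the human annotator. It hand-labels 10 items, lets the model propose labels for 100 more, corrects those proposals, retrains, and then labels the remaining 1,000 automatically.

```python
# Assumed toy setup: an item is a number in [0, 1); its correct
# label is 1 when the number is at least 0.5, otherwise 0.
def true_label(value):
    """Stand-in for a human annotator (hypothetical)."""
    return 1 if value >= 0.5 else 0


def train_threshold(labeled):
    """'Train' by placing a cutoff midway between the two classes."""
    zeros = [v for v, y in labeled if y == 0]
    ones = [v for v, y in labeled if y == 1]
    return (max(zeros) + min(ones)) / 2


def auto_label(threshold, values):
    """Let the current model propose a label for every item."""
    return [(v, 1 if v >= threshold else 0) for v in values]


def audit(proposed):
    """Human review: fix wrong proposals and count the fixes."""
    corrected = [(v, true_label(v)) for v, _ in proposed]
    fixes = sum(1 for (_, p), (_, c) in zip(proposed, corrected) if p != c)
    return corrected, fixes


# Deterministic, well-spread sample values (no randomness needed).
items = [(i * 0.6180339887) % 1.0 for i in range(1110)]
seed, review_batch, bulk = items[:10], items[10:110], items[110:]

labeled = [(v, true_label(v)) for v in seed]   # annotate 10 by hand
model = train_threshold(labeled)

proposed = auto_label(model, review_batch)     # run against 100
corrected, fixes = audit(proposed)             # audit and correct
labeled += corrected
model = train_threshold(labeled)               # retrain on 110 items

bulk_labels = auto_label(model, bulk)          # label the next 1,000
accuracy = sum(1 for v, y in bulk_labels if y == true_label(v)) / len(bulk_labels)
print(f"corrections during audit: {fixes}, bulk accuracy: {accuracy:.3f}")
```

The human effort is spent reviewing proposals rather than labeling from scratch, which is the time saving the quote points to.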
The training platform is also key to speeding up model development. The IBM AC922 is built as a compute node. As such, it has the same capability as a node within Summit (at the U.S. Department of Energy’s Oak Ridge National Laboratory in Tennessee) and Sierra (built for the Lawrence Livermore National Laboratory in California), the two largest supercomputers in the world at the time of writing. These large systems were developed to support AI at very large scales, with the hardware and software architectures allowing IBM to treat AI as a scale-out solution.
"You can combine additional servers and GPUs with near-ideal linear scalability to move a problem from four GPUs up to 4,000 GPUs,” Soutter says. “By doing so, you get additional compute capabilities to very quickly address larger and more complex problems."
Notably, training doesn’t end with initial model development—and in some sense, this is the beauty of AI. It can and should be constantly updated to reflect drifts that occur as data changes. Thankfully, many tools allow AI users to evaluate data-driven drift to make sure models are as accurate as when they were initially released. AI users can initiate this process by either revising or potentially retraining models to reflect the introduction of real-time data that almost necessarily enhances the model.
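One common way to evaluate data-driven drift is to compare the distribution of a model input at release time with the distribution seen recently. The sketch below uses the Population Stability Index (PSI) for that comparison. It is an illustration only, not a description of what any IBM tool does; the bin edges, the sample data and the 0.25 alert level (a widely quoted rule of thumb) are all assumptions made for the example.

```python
import math


def bin_fractions(values, edges, floor=1e-6):
    """Share of values falling in each bin; floor avoids log(0)."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        index = sum(1 for e in edges if v >= e)
        counts[index] += 1
    return [max(c / len(values), floor) for c in counts]


def population_stability_index(baseline, recent, edges):
    """PSI = sum over bins of (actual - expected) * ln(actual / expected)."""
    expected = bin_fractions(baseline, edges)
    actual = bin_fractions(recent, edges)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


edges = [0.25, 0.5, 0.75]                        # assumed bin boundaries
baseline = [i / 100 for i in range(100)]         # data at release time
unchanged = list(baseline)                       # same distribution later
shifted = [0.5 + i / 200 for i in range(100)]    # inputs have moved upward

psi_unchanged = population_stability_index(baseline, unchanged, edges)
psi_shifted = population_stability_index(baseline, shifted, edges)

ALERT_LEVEL = 0.25  # rule-of-thumb threshold, an assumption here
print(f"unchanged: {psi_unchanged:.3f}, shifted: {psi_shifted:.3f}")
print("retrain suggested" if psi_shifted > ALERT_LEVEL else "model still fits")
```

A score near zero suggests the incoming data still resembles what the model was trained on; a large score is a signal to revise or retrain.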
Addressing Drift and Bias
"Think about a natural-language model, which analyzes how people speak. New vocabulary determines how we interact with each other, and patterns of language change,” Soutter says. “As a result, a model that interprets natural language may no longer be able to accurately explain or accurately understand what a person on the other end of the line is saying or may miss the tone or intonation that was defined in the original model. So, you have to be able to reflect those linguistic changes in your model."
The same holds for model bias, with some AI installations perhaps giving different weight to different genders or cultural backgrounds. This can skew model results and perhaps cause AI-enabled departments to miss what otherwise may have been pertinent results, thereby lessening the true value and veracity of the model.
IBM Watson OpenScale, an IBM Cloud and IBM Cloud Private-based AI management tool, helps counter issues such as drift and bias. It both monitors the ongoing accuracy of models and examines where training may have inadvertently introduced bias into what is generally a fundamentally fair and accurate model.
"Watson OpenScale helps with the monitoring of ongoing model delivery by looking at model accuracy,” Soutter notes. “It looks at model drift. Are there data changes that have caused an outcome that wasn’t part of your initial goals? It looks for bias. Is one gender disproportionately impacted by the model?"
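The question of whether one group is disproportionately impacted can be made concrete with a simple ratio. The sketch below computes a disparate impact ratio: the rate of favorable outcomes for a monitored group divided by the rate for a reference group. It is a generic illustration, not Watson OpenScale's implementation; the outcome lists are invented, and the 0.8 cutoff follows the commonly cited "four-fifths" guideline, used here as an assumption.

```python
def favorable_rate(outcomes):
    """Fraction of decisions that were favorable (1 = favorable)."""
    return sum(outcomes) / len(outcomes)


def disparate_impact(monitored, reference):
    """Ratio of favorable rates; 1.0 means both groups fare equally."""
    return favorable_rate(monitored) / favorable_rate(reference)


# Invented model decisions for two groups (1 = approved, 0 = declined).
reference_group = [1, 1, 1, 1, 0, 1, 1, 1, 0, 1]   # 8 of 10 approved
monitored_group = [1, 0, 1, 0, 0, 1, 0, 1, 0, 0]   # 4 of 10 approved

FAIRNESS_FLOOR = 0.8  # "four-fifths" guideline, an assumption here

ratio = disparate_impact(monitored_group, reference_group)
flagged = ratio < FAIRNESS_FLOOR
print(f"disparate impact ratio: {ratio:.2f}, flagged: {flagged}")
```

A ratio well below 1.0 does not prove the model is unfair, but it marks a result worth investigating before the model's output is trusted.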
"We’ve spent a fair amount in the last few years trying to influence AI toward fairness, and Watson OpenScale is an example of that. And just recently, we released a massive data set of faces to better train models and reduce some of the bias involving ethnicity or gender."
Reducing AI Barriers
A good starting point for AI model training is often another organization, whether external or internal (such as another division within the company), that has already had success with AI. Learning from its experience can help demystify much of the training process and result in more accurate models.
IBM is one of those resources, having developed a bevy of AI models for its own internal purposes—as well as building the tools that enable more efficient model training and monitoring. If an organization needs initial advice to get things started or assistance during the training process, IBM is more than willing to provide hands-on guidance, with an eye toward why AI is or has been deployed in the first place.
"We try to reduce barriers so clients can develop accurate models as efficiently and in as short a window as possible, because we know they’re spending money during the development process,” Soutter remarks. “We want to get them to a place where they can use accurate models to effectively infer against real-time data to make money, which is the ultimate goal.”
AI Resources
If you’d like a helping hand when training AI models, check out the IBM Client Center (ibm.co/2UJiGsV) and IBM Systems Lab Services (ibm.co/2Zkvg0I). Their hands-on services are a great place to start. You can also engage with business partners who may have similar AI training skills.
Jim Utsler, senior writer, has been writing for IBM since the mid-1990s.