The Business of AI: From Foundation Models to Large Language Models
Sales and marketing leader Brian Silverman explains the difference between these models, and how organizations can select the right ones for their generative AI goals
Continuing our “The Business of AI” series, the focus of this article is on selecting the right foundation and large language models (LLMs) to support an organization’s generative AI (GenAI) requirements and use cases.
Quick Definitions:
- Foundation models (e.g., Google DeepMind’s Gemini and Meta’s multi-modal Llama models) are broader AI systems trained on diverse data types—text, images, audio and video—and can support a wide range of applications beyond language.
- Large language models (LLMs) (e.g., OpenAI’s GPT-4, Anthropic’s Claude and DeepSeek’s models) are designed specifically for language tasks such as text generation, summarization, coding and answering questions. The key distinction is that LLMs are a specialized subset of foundation models, focused on language-related tasks.
Organizations planning an AI strategy that includes GenAI must carefully evaluate LLMs and foundation models—including open source, other publicly available models and closed models—for considerations such as investment of time and money, along with the potential value of developing and training their own custom models.
It is also important to understand that with rapid improvements in models, AI technology and solutions, organizations will need to continuously assess available models for their fit and relevance to their AI requirements and desired business value.
We have included an addendum, below, with information on different resources and directories for researching available models.
Publicly Available and Accessible Foundation Models
Publicly available and accessible foundation models are ideal for general productivity tasks such as editing emails, conducting research, drafting content and more. They can be accessed via assistants such as Microsoft’s Copilot to draft documents, Google Gemini to aid in research and generate creative ideas, Anthropic’s Claude for coding, and OpenAI’s ChatGPT for additional reasoning use cases.
For example, organizations can use available foundation models to augment work that leverages publicly available information, such as product manuals and information found on the web, to make competitor comparisons. Because they incorporate the latest training and updates, these models can be eye-opening, cost-effective and the quickest to implement and realize value from, with limited concerns about confidentiality and trust.
For specific business requirements and greater flexibility, organizations can integrate these models more directly into their applications by accessing them via application programming interfaces (APIs). For non-proprietary and non-confidential usage, this can be a very good choice. API access is charged by token usage, and organizations will need to monitor usage, track API token consumption and evaluate licensing agreements to contain costs. (See the addendum below on the costs of AI.)
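As a rough illustration of token-based cost tracking, the sketch below estimates per-call charges from input and output token counts and keeps a running ledger. The model name and per-token prices are placeholders for illustration, not actual vendor rates:

```python
# Sketch of per-call API cost estimation and usage tracking.
# Prices below are illustrative placeholders, not real vendor rates.
PRICE_PER_1K = {
    "example-model": {"input": 0.0005, "output": 0.0015},  # USD per 1,000 tokens
}

def estimate_call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one API call from its token counts."""
    p = PRICE_PER_1K[model]
    return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]

class UsageLedger:
    """Accumulates token consumption and estimated spend across calls."""
    def __init__(self):
        self.input_tokens = 0
        self.output_tokens = 0
        self.cost = 0.0

    def record(self, model: str, input_tokens: int, output_tokens: int) -> None:
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens
        self.cost += estimate_call_cost(model, input_tokens, output_tokens)

ledger = UsageLedger()
ledger.record("example-model", input_tokens=1200, output_tokens=300)
```

A ledger like this, fed from the token counts most providers return with each API response, gives an early warning before monthly invoices arrive.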
As these models are publicly available, organizations need to put in place the right governance and controls to ensure access adheres to confidentiality and compliance standards. This means conducting regular audits, including verifying API providers’ adherence to regulatory requirements.
Privately Available LLMs and Foundation Models
Organizations can create “private” versions of the publicly available foundation models. This leverages the best capabilities of models from companies like OpenAI, Google and many others. By hosting or accessing these privately deployed models, organizations can fine-tune and train them for industry-specific terminology and proprietary data, ensuring that private data and intellectual property do not find their way into publicly available models.
Private models may add infrastructure and computing costs, depending on deployment, as well as higher costs per token. They also can add complexity, including maintenance and ongoing support of AI-powered applications. Adopting new or updated models will also require fresh training, fine-tuning and testing.
Not all vendors offer the option of deploying their models on-premises or in the organization’s private cloud. OpenAI, for example, offers private deployments but handles the hosting and management itself, potentially impacting an organization’s adherence to security and compliance requirements.
Organizations should evaluate and select models that best fit their expected business value, balancing capabilities and objectives with long-term costs. Privately deployed models can accelerate time-to-value for organizations that tailor them to their business and AI strategy.
Open-Source LLMs and Foundation Models
Open-source models—including the one that has dominated AI news of late, DeepSeek—bring a unique advantage in their flexibility and accessibility, enabling organizations to tailor solutions to their specific needs. For example, Hugging Face provides a platform that simplifies accessing and experimenting with open-source models, making it easier for organizations to compare, fine-tune and deploy these solutions.
Models such as Llama from Meta can be accessed via APIs in Meta’s systems or downloaded and deployed locally under an open-source license. (API access to Meta-hosted Llama models is charged by token for each API call, depending on the pricing plan.)
Organizations that choose to host models in their own environment, whether on-premises or in their own private cloud infrastructure, will need to size and allocate the required compute resources to run the specific model and ensure they have the skills to develop and operate these models reliably.
One consideration is the availability of AI tools that help with the skills, management and governance required for AI solutions. For example, IBM’s watsonx can help lessen the technical expertise required for deploying and managing models, including open source, making them more accessible for organizations with limited in-house expertise.
Custom Foundation Models
For organizations with highly specific needs, developing custom foundation models or LLMs might be the best option. While this approach offers unparalleled flexibility and control, it also comes with significant challenges that must be carefully weighed. Developing a custom model can take months or even years, delaying time-to-market and return on investment (ROI).
Training a custom model requires extensive computational resources, and, depending on the size and training data, these systems can require millions of dollars in GPU costs, data storage and energy. Hidden costs, such as maintaining infrastructure and retraining, further increase expenses.
Custom models are most appropriate for industries with stringent privacy requirements (e.g., healthcare, finance) or highly specialized domains (e.g., rare-language translation, proprietary scientific research). However, they require highly skilled AI engineers and data scientists, which can be difficult and expensive to hire.
Finally, managing custom models introduces risks such as model drift, regulatory compliance and lifecycle management complexities. Planning for ongoing updates, governance and performance monitoring is critical for success.
Below is a table summarizing considerations for each type. Use it as a quick reference to evaluate the trade-offs between cost, privacy, ease of use and scalability for different models. It can guide decision-making by highlighting which options align best with your organization’s specific needs and typical use cases:

| Model type | Cost | Privacy | Ease of use | Scalability |
|---|---|---|---|---|
| Publicly available | Low; usage-based API pricing | Limited; data leaves the organization | Easiest; fastest time-to-value | Scales with the provider |
| Privately deployed | Higher per-token and infrastructure costs | Strong; proprietary data stays private | Moderate; adds maintenance and support | Depends on deployment choices |
| Open-source | Infrastructure and skills investment | Strong when self-hosted | Moderate to complex | Self-managed |
| Custom | Highest; training, compute and staffing | Strongest; full control | Most complex; months to years to build | Self-managed |
What to Evaluate When Weighing Options
There are numerous models to choose from, with new ones added every day, so organizations should continuously evaluate their options and consider trade-offs:
Model Size and Performance
There are models of different sizes, such as OpenAI’s GPT-4o mini, that require fewer resources and cost less to implement. For example, smaller models can be ideal for edge devices with limited computing capabilities, while larger models may be prioritized for tasks requiring high accuracy. Smaller models have advanced rapidly, however; IBM’s Granite models, for example, now demonstrate high accuracy.
Multi-Model Deployment
Many organizations may look at multiple models to support their AI strategy and use cases. For example, an organization might use Google Gemini for public research but use their own internally deployed model to analyze and augment those results.
Centralized vs. Edge Deployments
The smaller models from IBM, Meta and others enable AI-based applications to run on smaller systems and devices, even smartphones.
Potential Use Cases Beyond Current Requirements
Consider that the broad training required for current applications can be used by future AI applications, increasing the ROI on training and development costs.
Research and Test Different Models
There are many quality models that may be more applicable for a particular use case and application. By testing various models, you can make the right choices for your current and future use cases.
Consider a Combination of Different Models
Select the right model for your AI application—whether publicly available assistants, private-access models, public models such as Google’s Gemini, or internally developed applications that may use available open-source LLMs.
Evaluate the Lifecycle Management of the Chosen Model
Consider frequency of training, management of model drift (the decline in model performance over time as the real-world environment diverges from the training data) and updates with new data. Tools such as performance dashboards and machine learning operations (MLOps) platforms can assist in monitoring and managing these updates effectively.
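To make drift monitoring concrete, here is a minimal sketch that flags when a model’s rolling accuracy falls below a fraction of its baseline. The window size and tolerance are illustrative assumptions; production MLOps platforms offer far richer checks:

```python
from collections import deque

class DriftMonitor:
    """Flags model drift when rolling accuracy drops below a share of baseline.

    Parameters here (window=100, tolerance=0.95) are illustrative defaults,
    not recommendations; tune them to the application's risk profile.
    """
    def __init__(self, baseline_accuracy: float, window: int = 100,
                 tolerance: float = 0.95):
        self.baseline = baseline_accuracy
        self.scores = deque(maxlen=window)  # rolling record of recent outcomes
        self.tolerance = tolerance

    def record(self, correct: bool) -> None:
        """Log whether the latest prediction was judged correct."""
        self.scores.append(1.0 if correct else 0.0)

    def drifted(self) -> bool:
        """True when rolling accuracy falls below baseline * tolerance."""
        if not self.scores:
            return False
        rolling = sum(self.scores) / len(self.scores)
        return rolling < self.baseline * self.tolerance
```

Wiring such a check into a performance dashboard turns “plan for drift” from a slogan into a measurable trigger for retraining.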
Don’t Underestimate Costs vs. Benefits
Many analysts emphasize that organizations often underestimate the costs of using LLMs and foundation models to develop and test AI applications. In fact, Gartner in a recent webcast said that organizations can underestimate the cost of implementing AI by anywhere from 500-1,000%. Strategies such as optimizing inference pipelines and selecting models with usage-based pricing can help control expenses.
The Importance of Data
To be successful, organizations will need to expand their data strategy to ensure they are using the right data to train, test and deploy their AI applications successfully. This includes using processes to update the training as new data and information become available.
Organizations should ensure that the data being used—whether within their own LLMs, a publicly available foundation model or LLM, or within a solutions provider’s infrastructure—complies with industry and government regulations. It is also important that the organization has visibility into the steps taken to train the models, ensuring the trust of their organization and customers.
Ensure Security, Trust and Transparency of Each Model
For example, the European Union has implemented the General Data Protection Regulation (GDPR) to protect personal information, restricting the transfer of citizens’ data outside the EU. In addition, the EU recently approved the Artificial Intelligence Act, a comprehensive regulation addressing AI-specific risks such as bias, transparency and accountability. These regulations significantly impact the use of publicly available LLMs, requiring organizations to ensure compliance with data residency and ethical AI requirements.
Going Forward With Confidence
By following these recommendations, organizations can confidently begin their journey with AI, leveraging generative AI and foundation models to achieve meaningful outcomes.
The next article in this series will explore strategies to leverage current applications and data to maximize the business value and ROI of AI initiatives.
Resources and Directories on Available Models
There are a large number of LLMs and foundation models that are publicly available, and more coming to market every day.
Below is a list of directories that can help you assess whether these models can be leveraged to minimize the broader training required for your own GenAI solutions.
Papers with Code: Foundation Models: A comprehensive directory featuring open-source models, benchmarks and performance comparisons
Hugging Face Model Hub: Hosts thousands of open-source LLMs and tools, categorized by task and architecture
Google Model Garden: Offers pre-trained models from Google, including PaLM and BERT, with deployment guidance
OpenAI Documentation: Documentation and resources for OpenAI‘s proprietary models like GPT-4, with API details
AWS AI & ML Models Directory: Pre-trained LLMs available for deployment via AWS services like SageMaker, including proprietary and open-source options
IBM Granite AI Models: AI language models that are enterprise-ready, open source and designed for exceptional performance in safety benchmarks
LLMs Tools & Projects: A list of LLM tools and projects, including both open-source and closed-source models
Google AI: Information on Google’s LLMs and AI research, including some open-source models and datasets
Meta AI: Information on Meta’s LLMs and AI research, with some open-source models and datasets
OpenAI: Home of some of the most well-known LLMs, including GPT-3 and DALL-E, though access is often restricted
Considerations for researching available LLMs and foundation models:
- Is the model open-source or proprietary, and what licensing options are available?
- Does the model allow for fine-tuning, and how complex or costly is the process?
- How well does the model fit the specific task your organization needs it to perform?
- Are there available benchmarks to compare the model’s performance, and do they align with your operational needs?
- Is the model compatible with your existing IT infrastructure and tools?
- How does the model handle data privacy, and what safeguards are in place for sensitive applications?
- What is the total cost of ownership, including licensing, API usage and potential fine-tuning expenses?
- Does the vendor provide robust support, documentation and community resources?
Understanding Cost-to-Value for AI
When developing your AI Strategy, beyond the requirements and fit, it is important to understand the considerations that impact the costs-to-value for different approaches.
Consider traditional AI, which can be lower-cost and more predictable than generative AI.
When GenAI is the right approach, the choices and applicability can greatly impact costs, predictability and the return of value for an AI implementation.
Finally, for specific use cases there may be publicly available AI solutions that can be leveraged, providing value for the organization at predictable costs.
Key Definitions
Tokens: Tokens are the building blocks for how a prompt is processed by the LLM. They can represent words or parts of words. To understand costs, it is important to know that charges cover both the tokens in the prompt and the tokens in the model’s output, which makes total cost difficult to predict.
Context window: The context window is the maximum text an LLM can process at once, sometimes up to 1 million tokens. While larger context windows increase costs, they can also reduce overall token usage by minimizing the need for follow-up prompts. A well-structured prompt with clear instructions helps optimize efficiency and control expenses.
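As an illustration, the sketch below checks whether text fits a model’s context window and splits it into chunks for follow-up prompts when it does not. The four-characters-per-token heuristic and the output reserve are rough assumptions; real tokenizers vary by model:

```python
def approx_tokens(text: str) -> int:
    """Rough token estimate using a ~4-characters-per-token heuristic."""
    return max(1, len(text) // 4)

def chunk_for_context(text: str, context_window: int,
                      reserve_for_output: int = 512) -> list[str]:
    """Split text into pieces that fit the window, leaving room for output.

    reserve_for_output is an illustrative budget for the model's reply,
    since input and output tokens share the same context window.
    """
    budget = context_window - reserve_for_output
    words = text.split()
    chunks, current = [], []
    for w in words:
        current.append(w)
        if approx_tokens(" ".join(current)) >= budget:
            chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return chunks
```

A prompt that fits in one chunk avoids the extra follow-up calls (and their repeated-context token charges) that chunking entails, which is why larger context windows can lower total usage.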
Assistants: These range from publicly available assistants to public LLMs and foundation models; examples are ChatGPT, Google’s Gemini and Anthropic’s Claude.
API access: Publicly available LLMs and foundation models can be accessed via APIs from an organization’s own or purchased applications. API access charges are token-based.
Private access: Private access means the LLM or foundation model is privately provisioned and accessible via APIs. This enables training, fine-tuning and increased control over confidential and private information. Some publicly available models can be deployed on-premises, while others, such as those from OpenAI, can be privately provisioned but are hosted and managed by the solution provider.
Open-source models: Open-source models are available for download, and can be found on directories such as Hugging Face. Open-source models, like Meta’s Llama and IBM’s Granite, have become more sophisticated and capable.
Build-your-own models: As the term implies, the organization develops, trains and maintains their own LLMs or foundation models.