Global AI and Data Science

Foundation Models in AI

By Moloy De posted Fri January 13, 2023 10:27 PM

We're witnessing a transition in AI. Systems that execute specific tasks in a single domain are giving way to broad AI that learns more generally and works across domains and problems. Foundation models, trained on large, unlabeled datasets and fine-tuned for an array of applications, are driving this shift.

The Stanford Institute for Human-Centered Artificial Intelligence's (HAI) Center for Research on Foundation Models (CRFM) coined the term foundation model to refer to "any model that is trained on broad data (generally using self-supervision at scale) that can be adapted (e.g., fine-tuned) to a wide range of downstream tasks". The technique is not new in itself, since it builds on deep neural networks and self-supervised learning, but the Stanford group argues that the scale it has reached in recent years, and the potential for one model to serve many different purposes, warrant a new term.

A foundation model is a "paradigm for building AI systems" in which a model trained on a large amount of unlabeled data can be adapted to many applications. Foundation models are "designed to be adapted (e.g., finetuned) to various downstream cognitive tasks by pre-training on broad data at scale".
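To make "trained on unlabeled data" a little more concrete, here is a minimal sketch of how self-supervision can derive training pairs from raw text: each word is hidden in turn and the hidden word becomes the label. The function name `make_masked_examples` and the `[MASK]` token are illustrative assumptions, not taken from any particular library, and real foundation models use far more elaborate tokenization and masking schemes.

```python
def make_masked_examples(sentence, mask_token="[MASK]"):
    """Turn one unlabeled sentence into (input, target) training pairs.

    Each word is hidden in turn; the hidden word itself is the label,
    so no human annotation is needed. (Hypothetical helper, for illustration.)
    """
    words = sentence.split()
    examples = []
    for i, target in enumerate(words):
        # Replace only position i with the mask token.
        masked = words[:i] + [mask_token] + words[i + 1:]
        examples.append((" ".join(masked), target))
    return examples


if __name__ == "__main__":
    for masked_input, target in make_masked_examples("models learn from data"):
        print(masked_input, "->", target)
```

A sentence of n words yields n training pairs, which is one reason a large unlabeled corpus can supply far more training signal than a hand-annotated one.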

Key characteristics of foundation models are emergence and homogenization. Because the training data is not labelled by humans, the model's behavior emerges from the data rather than being explicitly encoded, so properties that were not anticipated can appear. For example, a model trained on a large language dataset might learn to generate stories of its own, or to do arithmetic, without being explicitly programmed to do so. Homogenization means that the same method is used in many domains, which allows for powerful advances but also creates the possibility of "single points of failure".

A 2021 arXiv report listed foundation models' capabilities with regard to "language, vision, robotics, reasoning, and human interaction"; technical principles, such as "model architectures, training procedures, data, systems, security, evaluation, and theory"; their applications, for example in law, healthcare, and education; and their potential impact on society, including "inequity, misuse, economic and environmental impact, legal and ethical considerations".

An article about foundation models in The Economist notes that "some worry that the technology’s heedless spread will further concentrate economic and political power".

IBM has also seen the value of foundation models. We have already implemented foundation models across our Watson portfolio and have seen their accuracy surpass the previous generation of models by a large margin, while remaining cost-effective. With pre-trained foundation models, Watson NLP could train sentiment analysis on a new language using as few as a few thousand sentences, 100 times fewer annotations than previous models required. In its first seven years, Watson covered 12 languages. Using foundation models, it jumped to cover 25 languages in about a year.

What makes these new systems foundation models is that, as the name suggests, they can serve as the foundation for many applications. Using self-supervised learning and transfer learning, the model can apply information it has learnt about one situation to another. While the amount of data is considerably more than the average person needs to transfer understanding from one task to another, the end result is relatively similar: you learn to drive one car, for example, and without too much effort you can drive most other cars, or even a truck or a bus.
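The transfer idea above can be sketched in a few lines, under heavy simplifying assumptions. In this toy, "pre-training" just counts which words appear alongside each word in unlabeled sentences, and "adaptation" builds one centroid per label from a couple of labeled examples. All names (`pretrain_embeddings`, `adapt`, `classify`) and the tiny corpus are hypothetical. Real foundation models are deep neural networks, not count vectors, so treat this only as an illustration of reusing what was learnt without labels.

```python
import math
from collections import Counter, defaultdict


def pretrain_embeddings(unlabeled_sentences):
    """'Pre-training': build a context-count vector per word; no labels used."""
    vectors = defaultdict(Counter)
    for sentence in unlabeled_sentences:
        words = sentence.split()
        for i, word in enumerate(words):
            for j, context in enumerate(words):
                if i != j:
                    vectors[word][context] += 1
    return dict(vectors)


def embed(sentence, vectors):
    """Represent a sentence as the sum of its known word vectors."""
    total = Counter()
    for word in sentence.split():
        total.update(vectors.get(word, {}))
    return total


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def adapt(labeled_examples, vectors):
    """'Adaptation': one centroid per label from a handful of labeled examples."""
    centroids = defaultdict(Counter)
    for sentence, label in labeled_examples:
        centroids[label].update(embed(sentence, vectors))
    return dict(centroids)


def classify(sentence, centroids, vectors):
    """Pick the label whose centroid is most similar to the sentence."""
    query = embed(sentence, vectors)
    return max(centroids, key=lambda label: cosine(query, centroids[label]))


# Toy unlabeled corpus (an assumption for illustration only).
corpus = [
    "the film was great and fun",
    "the film was wonderful and fun",
    "the film was terrible and dull",
    "the film was awful and dull",
]
vectors = pretrain_embeddings(corpus)

# Only two labeled examples are used for the downstream task.
centroids = adapt([("great", "positive"), ("terrible", "negative")], vectors)

print(classify("wonderful", centroids, vectors))
print(classify("awful", centroids, vectors))
```

Neither "wonderful" nor "awful" appears in the labeled examples. The classifier can still place them because the unlabeled corpus already showed that they occur in the same contexts as "great" and "terrible", much like the driver who moves from one car to another.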

We believe that foundation models will dramatically accelerate AI adoption in the enterprise. Reducing labeling requirements will make it much easier for businesses to dive in, and the highly accurate, efficient AI-driven automation they enable will mean that far more companies will be able to deploy AI in a wider range of mission-critical situations. Our goal is to bring the power of foundation models to every enterprise in a frictionless hybrid-cloud environment.

QUESTION I : Could we treat foundation models as synonymous with data preprocessing?
QUESTION II : Could we think of applying this concept outside NLP?

REFERENCE : What are foundation models?, Foundation Models IBM, IBM Video on Foundational Models