Google unveils new family of open-source AI models called Gemma to take on Meta and others—deciding open-source AI ain’t so bad after all

Fortune · Alain Jocard—AFP/Getty Images

Google has unveiled a family of free open-source AI models that will compete with those from Meta and several well-funded AI startups such as Mistral and Hugging Face.

The company is calling the new models Gemma, a nod to its own more capable Gemini models, which customers must pay to use.

Over the past year, two sometimes opposing camps have developed between purveyors of proprietary AI models and those offering open-source ones. The open-source models are smaller and less capable, but that performance gap has been closing. Meanwhile, the smaller size and complete customizability of the open-source models have proved popular with many programmers and engineers looking to build AI applications, as well as with companies looking to keep tighter control over the skyrocketing costs associated with implementing generative AI.

Until now, Google had been firmly in the proprietary model camp along with its rivals at OpenAI and Microsoft. But with today’s announcement it is acknowledging that, to some degree, open-source—whose biggest boosters include Facebook parent Meta—may be winning. At the very least, Google is hedging its bets. Amazon's cloud computing division AWS has also straddled both camps, initially cozying up to open-source players such as Hugging Face and then later inking a partnership with Anthropic, which offers proprietary models.

Tris Warkentin, director of product management at Google DeepMind, which developed the new Gemma models, said that the company was responding to software programmers telling it that they often used a combination of proprietary and open-source models in creating AI applications. The developers, he said, tended to rely on the more expensive proprietary models only when they really needed to use the extra capabilities—such as composing text about what is happening in an image—at which those models excelled.

At the same time, it's advantageous for businesses creating applications in this way to run both sorts of models, plus the data the application is using, on the same cloud computing platform. Having to hand off data between multiple clouds as well as multiple models is more difficult and can introduce time delays and errors. Warkentin said that Google hoped that by offering both proprietary and open-source models it could attract more customers to build their AI applications completely within Google’s Cloud Platform.

While Google said that the Gemma models were based on some of the same principles as its Gemini models, the Gemini models are designed to input and output audio data, visual data, and text, while the Gemma ones are text-only. Gemini is also multilingual, while Gemma will be available only in English to start.

Google has tried to position itself as more responsible than OpenAI and other competitors when it comes to the safety of the AI models it releases. But open-source models are potentially more dangerous than proprietary ones. It can be extremely difficult to stop people from misusing such a model for a nefarious purpose, such as creating malware code or writing racist disinformation. It can also be easy to tune open-source models to imitate copyrighted material or to encourage self-harm. Most guardrails the original model developer puts around the model’s outputs can be overcome by a knowledgeable programmer.

Offering an open-source model that shares some commonalities with its proprietary Gemini models may open up new avenues for skilled attackers to overcome Gemini’s guardrails too. Researchers at Carnegie Mellon University showed last year that if they had access to the settings of the neural network of an open-source language model they could create automated software that would design prompts that could overcome the guardrails of not only that open-source model but also most proprietary models. This may be even easier if the models have some commonalities, as Google has said Gemma and Gemini do.

Google said it had built Gemma with the most robust safeguards it could for open-source technology. These included carefully curating the data on which Gemma was trained to filter out any personal information so that the model would not later be able to leak it. The company also carried out extensive safety testing and evaluations of the models, including red-teaming, in which people play the role of bad actors and try to overcome the model’s guardrails. Warkentin said the safety testing of Gemma was even more extensive than for Google’s Gemini models because the company knows that people can do more to overcome guardrails on open-source models.

Google said that it was releasing Gemma alongside guidelines for responsible use and deployment of the models, as well as a new technique for building safety filters that people using the models could apply to help block problematic outputs.

Jeanine Banks, vice president and general manager and head of developer relations at Google, said the company had extensive licensing terms and conditions that would prohibit programmers from using Gemma for any nefarious purposes. This is one of the reasons the company prefers to call Gemma an “open model” to distinguish it from traditional “open-source” software, which has licenses with no usage restrictions. And unlike Meta, which put license terms in place that would prevent its own Big Tech competitors from using its Llama 2 open-source models, Google said Gemma had no such commercial restrictions.

The Gemma models come in two sizes: one is a neural network with 2 billion adjustable variables, called parameters, and the other is a neural network with 7 billion parameters. Both are larger than Google’s smallest proprietary model, Gemini Nano, which is designed to run on smartphones and other “edge” devices and has 1.8 billion parameters. But they are likely significantly smaller than the Gemini Pro and Ultra models. The Pro likely has tens or hundreds of billions of parameters, while the Ultra may have 1 trillion or more.

This story was originally featured on Fortune.com
