Snowflake Releases Its Own Flagship Model for Generative AI

All-encompassing, highly generalizable generative AI models were once the be-all and end-all and arguably still are. However, as more cloud providers large and small join the generative AI fray, we are seeing a new generation of models focused on the most financially powerful potential customer: the enterprise.

Case in point: Snowflake, the cloud computing company, today introduced Arctic LLM, a generative AI model described as “enterprise-grade.” Available under an Apache 2.0 license, Arctic LLM is optimized for “enterprise workloads,” including database code generation, according to Snowflake, and is free for research and commercial use.

“I think this will be the foundation that will enable us – Snowflake – and our customers to build enterprise-grade products and actually begin to realize the promise and value of AI,” CEO Sridhar Ramaswamy said in a press briefing. “You should consider this our first, but major, step into the world of generative AI, with many more to come.”

An enterprise model

My colleague Devin Coldewey recently wrote about how there is no end in sight to the onslaught of generative AI models. I recommend you read his article, but the gist is: Models are an easy way for vendors to generate excitement for their R&D, and they also serve as a funnel into their product ecosystems (e.g. model hosting, fine-tuning and so on).

Arctic LLM is no different. The flagship of a family of generative AI models Snowflake calls Arctic, Arctic LLM – which took about three months, 1,000 GPUs and $2 million to train – follows Databricks’ DBRX, a generative AI model likewise marketed as optimized for the enterprise.

Snowflake makes a direct comparison between Arctic LLM and DBRX in its press materials, saying that Arctic LLM outperforms DBRX on the two tasks of coding (Snowflake did not specify which programming languages) and SQL generation. The company said Arctic LLM is also better at these tasks than Meta’s Llama 2 70B (but not the newer Llama 3 70B) and Mistral’s Mixtral-8x7B.

Snowflake also claims that Arctic LLM achieves “leading performance” on a popular general language understanding benchmark, MMLU. However, I would point out that while MMLU purports to evaluate the ability of generative models to reason through logic problems, it includes tests that can be solved through rote memorization, so take that claim with a grain of salt.

“Arctic LLM addresses specific needs within the enterprise sector,” Baris Gultekin, head of AI at Snowflake, told TechCrunch in an interview, “diverging from generic AI applications like composing poetry to focus on enterprise-oriented challenges, such as developing SQL co-pilots and high-quality chatbots.”

Arctic LLM, like DBRX and Google’s current top-performing generative model, Gemini 1.5 Pro, uses a mixture-of-experts (MoE) architecture. MoE architectures essentially break data processing tasks down into subtasks and then delegate them to smaller, specialized “expert” models. So while Arctic LLM contains 480 billion parameters, it activates only 17 billion at a time – enough to drive the 128 separate expert models. (Parameters essentially define an AI model’s skill on a problem, such as analyzing and generating text.)
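To make the routing idea concrete, here is a minimal, hypothetical sketch of top-k MoE routing in Python. The sizes, the eight-expert pool and the two-expert top-k are invented for illustration – not Arctic LLM’s actual configuration – but they show the core trick: a router activates only a small subset of experts per token, leaving most of the network’s parameters idle on any given input.

```python
# Toy sketch of mixture-of-experts routing (illustrative only; not Arctic
# LLM's real implementation). A router scores the experts for each token
# and only the top-k experts do any work, so most parameters stay inactive.
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS, TOP_K, DIM = 8, 2, 16            # toy sizes; Arctic reportedly uses 128 experts

# Each "expert" here is just a small dense layer.
experts = [rng.normal(size=(DIM, DIM)) for _ in range(NUM_EXPERTS)]
router = rng.normal(size=(DIM, NUM_EXPERTS))  # produces per-expert routing scores

def moe_layer(x: np.ndarray) -> np.ndarray:
    scores = x @ router                        # one routing logit per expert
    top = np.argsort(scores)[-TOP_K:]          # indices of the top-k experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                   # softmax over the chosen experts only
    # Only TOP_K of NUM_EXPERTS experts run for this token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=DIM)
print(moe_layer(token).shape)                  # (16,)
```

Scaled up, this is why a 480-billion-parameter model can run with only 17 billion parameters active: the rest belong to experts the router didn’t select.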

Snowflake claims that this efficient design allowed it to train Arctic LLM on open public web datasets (including RefinedWeb, C4, RedPajama, and StarCoder) at “approximately one-eighth the cost of similar models.”

Run everywhere

Snowflake provides resources such as coding templates and a list of training sources alongside Arctic LLM to guide users through getting the model running and fine-tuning it for particular use cases. However, recognizing that these are likely costly and complex undertakings for most developers (fine-tuning or running Arctic LLM requires around eight GPUs), Snowflake also promises to make Arctic LLM available on a range of hosts, including Hugging Face, Microsoft Azure, Together AI’s model-hosting service and the enterprise generative AI platform Lamini.
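For developers who do want to try the self-hosted route, loading the model from Hugging Face would presumably follow the standard transformers pattern sketched below. The model ID and loading flags are my assumptions rather than anything Snowflake has confirmed – check the actual model card – and a 480-billion-parameter MoE model will still demand serious multi-GPU hardware.

```python
# Hypothetical sketch of loading Arctic LLM via Hugging Face transformers.
# The model ID is assumed; consult the real model card before running this.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Snowflake/snowflake-arctic-instruct"  # assumed Hugging Face ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",       # shard the weights across available GPUs
    torch_dtype="auto",      # use the checkpoint's native precision
    trust_remote_code=True,  # custom MoE architectures often ship their own code
)

inputs = tokenizer("Write a SQL query that ...", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```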

But here’s the catch: Arctic LLM will first be available on Cortex, Snowflake’s platform for building AI- and machine learning-powered apps and services. Not surprisingly, the company touts it as the preferred way to run Arctic LLM with “security,” “governance” and scalability.

“Our dream here is to have, within a year, an API that our customers can use so that business users can talk directly to data,” Ramaswamy said. “It would have been easy for us to say, ‘Oh, we’ll just wait for some open source model and use it.’ Instead, we’re making a foundational investment because we think [it’s] going to unlock more value for our customers.”

So I’m wondering: who is Arctic LLM really suitable for, other than Snowflake customers?

In a landscape full of “open” generative models that can be fine-tuned for virtually any purpose, Arctic LLM does not stand out in any way. Its architecture could result in efficiencies compared to some other options on the market. However, I am not convinced that they will be dramatic enough to steer companies away from the countless other well-known and supported business-friendly generative models (e.g. GPT-4).

There is another point against Arctic LLM: its relatively small context window.

In generative AI, the context window refers to input data (e.g. text) that a model considers before generating output (e.g. more text). Models with small context windows tend to forget the content of even very recent conversations, while models with larger contexts usually avoid this danger.
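A rough way to picture the limitation: conversation history that doesn’t fit in the window simply never reaches the model. The sketch below approximates tokens by word count and uses the reported 8,000-word lower bound purely for illustration – real tokenizers and serving stacks handle this differently.

```python
# Illustrative sketch of why a small context window makes a model "forget":
# once a conversation exceeds the window, the oldest turns are dropped
# before the prompt ever reaches the model. Words stand in for tokens here.

WINDOW = 8_000  # roughly Arctic LLM's reported lower bound, in words

def fit_to_window(turns: list[str], window: int = WINDOW) -> list[str]:
    """Keep the most recent turns whose combined length fits the window."""
    kept, used = [], 0
    for turn in reversed(turns):      # walk backward from the newest turn
        n = len(turn.split())
        if used + n > window:
            break                     # everything older is silently forgotten
        kept.append(turn)
        used += n
    return list(reversed(kept))
```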

Arctic LLM’s context window ranges from about 8,000 to about 24,000 words, depending on the fine-tuning method – well below that of models like Anthropic’s Claude 3 Opus and Google’s Gemini 1.5 Pro.

Snowflake doesn’t mention it in its marketing, but Arctic LLM almost certainly suffers from the same limitations and shortcomings as other generative AI models – namely hallucinations (i.e. confidently answering requests incorrectly). That’s because Arctic LLM, like every other generative AI model in existence, is a statistical probability machine – one that, again, has a small context window. Drawing on a vast bank of examples, it guesses which data makes the most “sense” to place where (e.g. the word “go” before “the market” in the sentence “I go to the market”). It will inevitably guess wrong sometimes – and that is a “hallucination.”
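A toy version of that guessing process might look like the following, where a hand-written probability table stands in for the billions of learned parameters. Real models condition on far more than the previous word, but the mechanism – sampling a likely next token from a learned distribution – is the same in spirit.

```python
# Toy "statistical probability machine": pick the next word by sampling
# from a distribution conditioned on the previous word. The probabilities
# are invented for illustration; this is not Arctic LLM's actual method.
import random

# P(next word | previous word), made-up numbers
NEXT_WORD = {
    "I":   {"go": 0.7, "went": 0.3},
    "go":  {"to": 0.9, "home": 0.1},
    "to":  {"the": 0.8, "a": 0.2},
    "the": {"market": 0.6, "store": 0.4},
}

def generate(start: str, steps: int = 4) -> str:
    words = [start]
    for _ in range(steps):
        dist = NEXT_WORD.get(words[-1])
        if dist is None:
            break
        choices, weights = zip(*dist.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("I"))  # e.g. "I go to the market"
```

Run it a few times and the same prompt yields different continuations – a small-scale glimpse of why the same model can answer a question correctly one moment and hallucinate the next.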

As Devin writes in his article, we can only look forward to incremental improvements in the field of generative AI until the next major technical breakthrough. But that won’t stop vendors like Snowflake from glorifying them as great achievements and marketing them in every way possible.
