Elon Musk’s xAI released its Grok large language model as “open source” over the weekend. The billionaire clearly hopes to set his company at odds with rival OpenAI, which, despite its name, is not particularly open. But does releasing the code for something like Grok actually contribute to the AI development community? Yes and no.
Grok is a chatbot trained by xAI to fill the same vaguely defined role as something like ChatGPT or Claude: you ask it, it answers. This LLM, however, was given a sassy tone and extra access to Twitter data as a way of differentiating it from the rest.
As always, these systems are nearly impossible to evaluate, but the general consensus seems to be that it’s competitive with last-generation, medium-size models like GPT-3.5. (Whether you find that impressive given the short development time frame, or disappointing given the budget and bombast surrounding xAI, is entirely up to you.)
At any rate, Grok is a modern and functional LLM of significant size and capability, and the more access the dev community has to the guts of such things, the better. The problem is in defining “open” in a way that does more than let a company (or billionaire) claim the moral high ground.
This isn’t the first time the terms “open” and “open source” have been questioned or abused in the AI world. And we aren’t just talking about a technical quibble, such as picking a usage license that isn’t as open as another (Grok is Apache 2.0, if you’re wondering).
To begin with, AI models are unlike other software when it comes to making them “open source.”
If you’re making, say, a word processor, it’s relatively simple to make it open source: you publish all your code publicly and let the community propose improvements or make their own version. Part of what makes open source valuable as a concept is that every aspect of the application is original or credited to its original creator; this transparency and adherence to correct attribution is not just a byproduct, but core to the very concept of openness.
With AI, this is arguably not possible at all, because the way machine learning models are created involves a largely unknowable process in which an enormous amount of training data is distilled into a complex statistical representation whose structure no human really directed, or even understands. This process can’t be inspected, audited, and improved the way traditional code can, so while it still has immense value in one sense, it can never really be open. (The standards community hasn’t even defined what “open” will mean in this context, but is actively discussing it.)
That hasn’t stopped AI developers and companies from designing and declaring their models “open,” a term that has lost much of its meaning in this context. Some call their model “open” if there’s a public-facing interface or API. Some call it “open” if they release a paper describing the development process.
Arguably the closest an AI model can get to “open source” is when its developers release its weights, which is to say the precise attributes of the countless nodes of its neural networks, which perform vector math operations in a precise order to complete the pattern started by a user’s input. But even “open-weights” models like LLaMa-2 exclude other important data, like the training dataset and process, which would be needed to recreate the model from scratch. (Some projects do go further, of course.)
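To make the distinction concrete, here’s a minimal sketch of what a weights release actually hands you. It assumes PyTorch, and the file name is a hypothetical placeholder, not an actual artifact from xAI’s release:

```python
# Toy illustration of an "open-weights" artifact: named tensors of numbers.
# "grok-weights.pt" is a hypothetical placeholder, not xAI's real format.
import torch

state_dict = torch.load("grok-weights.pt", map_location="cpu")

# Peek at the first few tensors: you get names, shapes, and dtypes...
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape), tensor.dtype)

# ...but nothing here reveals the training data, the data-cleaning pipeline,
# or the training procedure that produced these values.
```

You can load, run, and fine-tune such a file, but you can’t reconstruct how it came to be, which is exactly the gap the “open source” label papers over.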
All this is before even mentioning the fact that it takes millions of dollars in computing and engineering resources to create or replicate these models, effectively restricting who can create and replicate them to companies with considerable resources.
So where does xAI’s Grok release fall on this spectrum?
As an open-weights model, it’s ready for anyone to download, use, modify, fine-tune, or distill. That’s good! It appears to be among the largest models anyone can freely access this way, in terms of parameter count (314 billion), which gives curious engineers a lot to work with if they want to test how it performs after various modifications.
The size of the model comes with serious drawbacks, though: you’ll need hundreds of gigabytes of high-speed RAM to use it in this raw form. If you’re not already in possession of, say, a dozen Nvidia H100s in a six-figure AI inference rig, don’t bother clicking that download link.
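Some back-of-envelope arithmetic shows why. The figures below (16-bit weights at two bytes per parameter, roughly 80 GB of memory per H100) are this sketch’s assumptions, not numbers from xAI:

```python
# Rough memory math for hosting a 314-billion-parameter model.
PARAMS = 314e9
BYTES_PER_PARAM = 2   # bf16/fp16 weights; 8-bit quantization would halve this
H100_MEMORY_GB = 80   # approximate usable memory per Nvidia H100

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
print(f"Weights alone: ~{weights_gb:.0f} GB")                          # ~628 GB
print(f"H100s just to hold them: ~{weights_gb / H100_MEMORY_GB:.1f}")  # ~7.9

# Activations, the KV cache, and serving overhead push the real requirement
# higher still, so "a dozen H100s" is not much of an exaggeration.
```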
And although Grok is arguably competitive with some other modern models, it’s also far, far larger than they are, meaning it requires more resources to accomplish the same thing. There’s always a hierarchy of size, efficiency, and other metrics, and it’s still valuable, but this is more raw material than final product. It’s also not clear whether this is the latest and best version of Grok, like the clearly tuned version some people have access to via X.
Overall, it’s a good thing to release this data, but it’s not the game-changer some hoped it might be.
It’s also hard not to wonder why Musk is doing this. Is his nascent AI company really dedicated to open source development? Or is this just dust in the eye of OpenAI, with which Musk is currently pursuing a billionaire-level beef?
If they are really dedicated to open source development, this will be the first of many releases, and they will hopefully take the community’s feedback into account, release other crucial information, characterize the training data process, and further explain their approach. If they aren’t, and this is only being done so Musk can point to it in online arguments, it’s still valuable, just not something anyone in the AI world will rely on or pay much attention to beyond the next few months as they play with the model.