X.ai, Elon Musk’s AI startup, has revealed its latest generative AI model, Grok-1.5. Set to power social network X’s Grok chatbot in the not-too-distant future (“in the coming days,” X.ai writes in a blog post), Grok-1.5 appears to be a measurable upgrade over its predecessor, Grok-1, at least judging by the published benchmark results and specs.
Grok-1.5 benefits from “improved reasoning,” according to X.ai, particularly where it concerns coding and math-related tasks. The model more than doubles Grok-1’s score on a popular mathematics benchmark, MATH, and scores over ten percentage points higher on the HumanEval test of programming language generation and problem-solving abilities.
Of course, it’s difficult to predict how those results will translate to actual usage. As we recently wrote, commonly used AI benchmarks, which measure things as esoteric as performance on graduate-level chemistry exam questions, do a poor job of capturing how the average person interacts with models today.
One improvement that should lead to observable gains is the amount of context Grok-1.5 can take in compared with Grok-1.
Grok-1.5 has a 128,000-token context, with “tokens” referring to bits of raw text (e.g., the word “fantastic” split into “fan,” “tas” and “tic”). Context, or context window, refers to input data (in this case, text) that a model considers before generating output (more text). Models with small context windows tend to forget the content of even very recent conversations, while models with larger contexts avoid this pitfall and, as an added benefit, better grasp the flow of the data they take in.
“[Grok-1.5 can] utilize information from substantially longer documents,” X.ai writes in the aforementioned blog post. “Moreover, the model can handle longer and more complex prompts while still maintaining its instruction-following capability as its context window expands.”
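To make the token and context-window idea concrete, here is a minimal Python sketch. It is an illustration only, not how Grok works: the three-character splitter is a stand-in for a real learned tokenizer, and the function names `toy_tokenize` and `fit_to_window` are hypothetical. The only figure taken from X.ai is the 128,000-token context length.

```python
from typing import List

# Grok-1.5's stated context length, in tokens.
CONTEXT_WINDOW = 128_000


def toy_tokenize(text: str, piece_len: int = 3) -> List[str]:
    """Split each whitespace-separated word into fixed-size pieces.

    Assumption for illustration: real tokenizers learn their pieces
    from data and do not simply cut every `piece_len` characters.
    """
    tokens: List[str] = []
    for word in text.split():
        tokens.extend(
            word[i:i + piece_len] for i in range(0, len(word), piece_len)
        )
    return tokens


def fit_to_window(tokens: List[str], window: int = CONTEXT_WINDOW) -> List[str]:
    """Keep only the most recent `window` tokens; older ones are dropped."""
    if window <= 0:
        return []
    return tokens[-window:]


print(toy_tokenize("fantastic"))  # ['fan', 'tas', 'tic']

# With a tiny window, the start of the conversation falls out of view.
history = toy_tokenize("my name is Ada")
print(fit_to_window(history, window=2))  # ['is', 'Ada']
```

In the second example the model would no longer “see” the words introducing the name, which is the forgetting behavior described above; a larger window simply pushes that cutoff further back.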
What’s historically set X.ai’s Grok models apart from other generative AI models is that they respond to questions about topics that are typically off-limits to other models, like conspiracies and more controversial political ideas. The models also answer questions with “a rebellious streak,” as Musk has described it, and outright rude language if asked to do so.
It’s unclear what changes, if any, Grok-1.5 brings in these areas. X.ai doesn’t address this in the blog post.
Grok-1.5 will soon be available to early testers on X, X.ai says, accompanied by “several new features.” Musk has previously hinted at summarizing threads and replies and suggesting content for posts; we’ll see if those arrive soon enough.
The announcement of Grok-1.5 comes after X.ai open sourced Grok-1, albeit without the code necessary to fine-tune or further train it. More recently, Musk said that more users on X, specifically those paying for X’s $8-per-month Premium plan, would gain access to Grok, the chatbot, which was previously only available to X Premium+ customers (who pay $16 per month).