Great news for xAI: Grok is now pretty good at answering questions about Baldur’s Gate

Different AI labs have different priorities. OpenAI has historically focused on consumer users, for example, whereas its rival Anthropic tends to focus on enterprises. Elon Musk’s xAI, we recently learned, has been placing particular emphasis on video-game walkthroughs.

On Friday, Business Insider’s Grace Kay published an in-depth and far-reaching report about xAI, the AI startup recently acquired by SpaceX, with particular emphasis on how Musk is making life difficult for employees. But this particular anecdote stood out:

In one instance last year, a model release was delayed for several days because Musk was dissatisfied with how the chatbot answered detailed questions about the video game “Baldur’s Gate,” according to people familiar with the matter. High-level engineers were pulled from other projects to improve the responses before launch, they said.

Of course, you can imagine the frustration of any respected and experienced engineer who shows up to work thinking he’ll be tackling fundamental problems of data and machine intelligence, only to be sidetracked into helping a 54-year-old man beat his video game. But the anecdote raises an even more pressing question: Did Musk end up getting the gaming experience he wanted?

To answer that question, our resident RPG fanatic Ram Iyer put together a set of five general questions about Baldur’s Gate, which we ran against xAI’s Grok and three other major models in a kind of quasi-benchmark that I’ve decided to call “BaldurBench.”

In the interest of journalistic transparency, I’ve made all the chat transcripts public, so you can see them here: Grok, ChatGPT, Claude, and Gemini.

First, the good news: Grok actually gives pretty good information. Its responses were a bit dense with gamer jargon (“save-scumming” instead of saving and “DPS” instead of damage), but the answers were both useful and well-informed, provided you knew what it was talking about. Grok also really loves tables and theorycraft, which is about what you’d expect.

There are plenty of Baldur’s Gate guides out there and the models were often drawing from the same ones, so the biggest differences were stylistic. ChatGPT prefers bulleted lists and sentence fragments, while Gemini likes to bold important phrases.

The biggest surprise was Claude, which was particularly concerned about giving me information that might spoil my experience of the game. When I asked about good party compositions, it closed its guidance by saying, “Don’t stress too much and just play what sounds fun to you.” Thanks, Claude!

It’s important to keep in mind that this is a subject area where we know (thanks to Business Insider’s reporting) xAI has specifically focused on reaching parity. So we shouldn’t read too much into the fact that, after the reported sprint, Grok’s advice turned out about the same as the other models’. Still, it’s good to know xAI can make it work if it tries.