It's been said before, but it's great news for consumers that there's so much competition in the LLM space. If it's hard for any one player to get daylight between themselves and the second-best alternative, hopefully that means no single monopolistic firm is going to suck up all the value these things create.
To put it this way: after seeing examples of how an LLM with capabilities similar to state-of-the-art ones can be built with 20 times less money, we now have proof that the same can be done with 20 times more money as well!
If what they say is true, then you have to give them credit for catching up incredibly fast, and even slightly pulling ahead - not only with the models, but also with the products.
I don't know, but I found the recording uninspiring. There was nothing new for me. We've all seen reasoning models by now—we know they work well for certain use cases. We've also seen "Deep Researchers," so nothing new there either.
No matter what people say, they're all just copying OpenAI. I'm not a huge fan of OpenAI, but I think they're still the ones showing what can be done. Yes, xAI might have taken less time because of their huge cluster, but it’s not inspiring to me. Also, the dark room setup was depressing.
Karpathy believes that this is at o1-pro level[1].
This again proves that OpenAI simply has no tech moat whatsoever. Elon's $97 billion offer for OpenAI last week was reasonable given that xAI already has something just a few months behind - it would probably be faster for xAI to catch up with o3 than to go through all the paperwork and lawyer talk required for such an acquisition.
Elon also has a huge upper hand here -
Elon and his mum are extremely popular in China, so it would be easier for him to recruit Chinese AI engineers. He can offer xAI/SpaceX/Neuralink shares to the best AI engineers, who'd prefer some kind of almost guaranteed 8-figure return in the long run.
Good luck to OpenAI investors who still believe that OpenAI is worth anything more than $100 billion.
A very impressive debut. No doubt they benefited from all the research and discoveries that have preceded it.
Maybe the best outcome of a competitive Grok is breaking the mindshare stranglehold that ChatGPT has on the public at large and on HN. There are many good frontier models that are all very close in capabilities.
Controversial opinion, but I think the AI game studio idea is a very good one. Not because I think they will make any money off the games, but because dogfooding will lead to much more improvement than relying on feedback from external customers.
Have you thought about a future where LLMs will be fine-tuned to target advertising at you? I mean, look at search: the first iterations of search were pretty simple in terms of ads. Then personalized ads came. I can't help but envision the dystopia where the LLM inserts personalized ads based on whatever you're asking for help with.
> Currently, Grok Web is not accessible in the United Kingdom or the countries of the European Union. We are diligently working to extend our services to these regions, prioritizing compliance with local data protection and privacy laws to ensure your information remains safely secure.
I suppose you can take that to mean that people who do have access to the service should not expect much in terms of data protection.
I think they put the new model behind a $40 paywall so fewer people use it. The model seems only marginally better than open-source models, based on xAI's own internal tests, and it costs serious money to run. Elon talked in the second half about building one of the largest GPU data centers just to get this running. I guess in the next iteration they'll be trying to reduce the costs.
Also, they will be open sourcing Grok 2, which is probably pretty behind at this point, but will still be interesting for people to check out.
I am excited for the voice mode promised in "a week" or so. ChatGPT Advanced Voice has been a big disappointment for me. It can't do some of the things they demoed at the announcement. It's a lot dumber than text mode. I find the voice recognition unreliable. I couldn't get it to act as a translator last time I tried. But most of all I find I don't have much to talk to it about. If Grok 3 voice mode can discuss current events from the X timeline then it should be much more interesting to talk to.
I'm a freeloader, and it appears that unfortunately Elon is not stupid enough to just give it to me for free.
There's no fair price either, since I see no pay-per-use pricing, so it's unavailable for me for now.
Billions spent, one of the most powerful AI models ever developed, and still no one competent enough to trim the 15 minutes of waiting-time filler at the beginning of the announcement video...
They will open-source Grok 2 when Grok 3 comes out. Also, it seems Grok 3 itself will be paywalled - disappointing considering DeepSeek-R1 is free and open source.
For some ouroboros fun, I attached this whole HN discussion and asked Grok 3 to summarize it (with a specific focus on members' attitudes towards Elon Musk). Here's what it came up with:
Off topic, but just in case: is there a good reference on how people actually use LLMs on a daily basis? All my attempts so far have been pretty underwhelming:
* when I use chatbots as search engines, I'm very quickly disappointed by obvious hallucinations
* I ended up disabling github copilot because it was just "auto-complete on steroids" at best, and "auto-complete on mushrooms" at worst
* I rarely have use cases where I have to "generate a plausible page of text that statistically looks like the internet" - usually, when I have to write about something, it's to put information that's in my head into other people's heads
* I'd love to have something that reads my whole codebase and draws graphs, explains how things work, etc. But I tried aider/ollama, etc., and nothing even starts making sense (is that an avenue worth persevering with, though? see the sketch below)
* At one point, I tried to describe in plain English a situation where a team has to do X tasks in Y weeks, and I needed a table of who should be working on what for each week. I was impressed that LLMs were able to produce a table - the slight problem was that, of course, the table was completely wrong. Again, is it just bad prompting?
It's an interesting problem when you don't know whether you just have a solution in search of a problem, or whether you're missing something obvious about how to use a tool.
Also, all introductory texts about LLMs go into great detail about how they're made (NNs and transformers and large corpora and lots of electricity, etc.), but "what you can do with it" looks like toy examples / simply not what I do.
So, what is the "start from here" for what it can really do?
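On the codebase-reading bullet above, here's a minimal sketch of the idea, assuming a local Ollama server at its default endpoint (http://localhost:11434) and some model you've already pulled - the model name and the .py filter are placeholders, not recommendations:

    import pathlib
    import requests  # pip install requests

    OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
    MODEL = "llama3"  # placeholder: whatever model you have pulled locally

    def collect_sources(root: str, suffix: str = ".py", limit: int = 20_000) -> str:
        # Concatenate source files under `root`, truncated so the prompt stays small.
        parts = []
        for path in sorted(pathlib.Path(root).rglob(f"*{suffix}")):
            parts.append(f"\n### {path}\n{path.read_text(errors='ignore')}")
        return "".join(parts)[:limit]

    def explain_codebase(root: str) -> str:
        prompt = (
            "You are reading a codebase. Describe the main modules, how they relate, "
            "and the overall data flow, as a short bulleted overview.\n"
            + collect_sources(root)
        )
        resp = requests.post(
            OLLAMA_URL,
            json={"model": MODEL, "prompt": prompt, "stream": False},
            timeout=300,
        )
        resp.raise_for_status()
        return resp.json()["response"]  # non-streaming responses return the text here

    if __name__ == "__main__":
        print(explain_codebase("."))

Even this naive dump-everything-into-the-prompt approach hits the context-window limit almost immediately, which is roughly why tools like aider build a map of the repo and feed in only the relevant chunks.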
Elon just said they are launching an AI game studio. Does this mean they will be building games that are mostly built with AI, or will they make AI tooling available for anyone to build games easily? Probably the former, but it would be nice if they would make it fully available to everyone.
I don't understand how and why Grok would be related to "understanding the nature of the universe", as Musk puts it. Please correct me if I'm wrong, but they basically just burned more cash than any human should have to buy Nvidia GPUs and make them predict natural language, right? So, they are somewhat on-par with all the other companies that did the same.
This is not innovation, this is baseless hype over a mediocre technology. I use AI every day, so it's not like I don't see its uses, it's just not that big of a deal.
Can't stand Elon but happy to see this. We badly need a frontier model that is not so obsessed with "safety". That nonsense has held things back significantly, and leads to really stupid fake constraints.
We know RLHF and alignment degrade model quality. Could it be that Grok, due to its less restrictive training guidelines (and the fact that its creators aren't afraid of getting sued), achieves higher performance partly because of this simple factor?
It blows my mind that Musk hasn't integrated Grok as an app inside Tesla's vehicles. A literal AI copilot is a completely novel, killer app that no other vehicle manufacturer could pull off.
The interesting thing about this is that, because of all the Musk-related overhyping that's gone on and because the launch is a video, the thread that marks another company's entry into the select group of serious AI companies will go off the front page with possibly only 200 points!
The pull quote is: "The impression overall I got here is that this is somewhere around (OpenAI) o1-pro capability."
https://x.com/lmarena_ai/status/1891706264800936307
[1] https://x.com/karpathy/status/1891720635363254772
This commit seems to indicate so, but neither HF nor GH has public data yet:
https://huggingface.co/xai-org/grok-1/commit/91d3a51143e7fc2...
Edit: Answer from Elon in video is that they plan to make Grok 2 weights open once Grok 3 is stable.
I'm also skeptical of lmarena as there is a large number of Elon Musk zealots trying to pass off Grok as a proxy for Tesla shares.
https://lngnmn2.github.io/articles/grok3/
https://x.com/i/grok/share/CTDC0WOi7RCbEDrm11AJ3PtLM
How long before this starts getting deployed in safety-critical applications or government decision-making processes?
With no oversight because Elon seems to have the power to dismiss the people responsible for investigating him.
Anyone not scared by this concentration of power needs to pick up a book.
This is the largest computer cluster the world has ever seen.
Can someone please post interesting comments about things I can learn?
Getting the largest computer cluster in the world up and running in a matter of months? Unbelievable.
I'm not sure if this was a very bad joke by Elon, or if Grok 3 is really biased like that.