Elon Musk Announces xAI's Grok 3 to be Released by Year-End, Trained with 100,000 NVIDIA H100 Accelerator Cards
xAI, an artificial intelligence company founded by Elon Musk, has been developing its flagship product, Grok, which is already integrated into X/Twitter for X Premium subscribers.
Today, Elon Musk announced that xAI will release Grok 2 in August, followed by the more powerful Grok 3 by the end of the year, which suggests that the bulk of Grok 3's training has likely already been completed.
Notably, Musk said Grok 3 was trained on 100,000 NVIDIA H100 AI accelerator cards, a scale that is staggering in both hardware cost and electricity consumption.
Why did Musk emphasize training on such a massive number of accelerator cards? The remark came in response to Cohere CEO Aidan Gomez, who pointed out that many AI models are trained on OpenAI's output rather than on data the companies collect themselves. Gomez argued that there is a significant difference between these two kinds of models, and that users can feel it in practice.
Musk agreed with Gomez's view. For AI companies that collect their own training data, an unavoidable step is cleaning it: much of the raw data is garbage that cannot be used for model training, and companies must spend considerable time and effort filtering it down to usable material. According to Musk, Grok 2's training data was entirely collected and cleaned by xAI itself, and the model has made significant progress as a result.
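To give a sense of what such cleaning involves, here is a minimal sketch in Python of the kind of filtering pass a text corpus might go through. The specific heuristics (length bounds, letter ratio, repeated-line checks, hash-based deduplication) are generic illustrations of the concept, not xAI's actual pipeline, which is not public.

```python
import hashlib


def clean_corpus(documents):
    """Filter a raw text corpus down to training-ready documents.

    Illustrative heuristics only; production pipelines use far more
    sophisticated quality classifiers and fuzzy deduplication.
    """
    seen_hashes = set()
    cleaned = []
    for doc in documents:
        text = doc.strip()
        # Drop documents too short or too long to be useful.
        if not (200 <= len(text) <= 100_000):
            continue
        # Drop symbol-heavy pages (low ratio of letters to total chars).
        letters = sum(ch.isalpha() for ch in text)
        if letters / len(text) < 0.6:
            continue
        # Drop documents dominated by repeated lines (menus, spam).
        lines = text.splitlines()
        if len(set(lines)) / len(lines) < 0.5:
            continue
        # Exact deduplication via content hashing.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue
        seen_hashes.add(digest)
        cleaned.append(text)
    return cleaned
```

Even simple filters like these can discard a large fraction of a web-scale crawl, which is why Musk describes cleaning as the most labor-intensive part of building a first-party dataset.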
Currently, training on OpenAI's output appears to be a popular shortcut: AI companies only need to pay API fees to obtain large amounts of model-generated data, which can be fed directly into training without complex cleaning. However, models trained this way may have limited capabilities, since a student model can at best approximate the teacher it is distilled from.
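For illustration, the data-collection side of this distillation-style approach can be as simple as the following Python sketch using OpenAI's official client library. The model name and prompt set are placeholders, and this is a sketch of the general approach the article describes, not any particular company's pipeline.

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def generate_synthetic_examples(prompts, model="gpt-4o"):
    """Collect (prompt, completion) pairs from a teacher model.

    Sketch of distillation-style data collection; the model name
    and prompts are illustrative placeholders.
    """
    examples = []
    for prompt in prompts:
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        completion = response.choices[0].message.content
        # Each pair becomes one supervised fine-tuning example
        # for the student model -- no web-scale cleaning required.
        examples.append({"prompt": prompt, "completion": completion})
    return examples
```

The convenience is obvious, but so is the ceiling: the student model inherits whatever the teacher can produce, which is the limitation Gomez and Musk are pointing at.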
The current version of Grok, Grok 1.5, was released in March; it improved reasoning capabilities and extended the context window to 128K tokens. In benchmark tests, Grok 1.5 trailed GPT-4 slightly, but the gap was not large, and on the HumanEval coding benchmark it even surpassed GPT-4.