Meta Launches Llama 4 AI Models; Beats GPT-4o and Grok 3 in LMArena

After a four-month hiatus, Meta has unveiled a new lineup of Llama 4 open-weight models. The latest additions include Llama 4 Scout, Llama 4 Maverick, and Llama 4 Behemoth. Departing from the dense architecture of earlier Llama releases, Meta has adopted a Mixture of Experts (MoE) architecture this time, similar to DeepSeek R1 and V3. All Llama 4 models are natively multimodal.

Starting with the smallest model, Llama 4 Scout has a total of 109B parameters spread across 16 experts, though only 17B parameters are active for any given token. It supports an impressive context length of 10 million tokens. According to Meta, Llama 4 Scout (17B active) outperforms Gemma 3, Mistral 3.1, and Gemini 2.0 Flash-Lite.
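To see why only a fraction of an MoE model's parameters are active at once, here is a minimal sketch of generic top-k expert routing in plain Python. It is an illustration of the general technique, not Meta's implementation: the names `moe_forward`, `router`, and `experts`, the toy dimension, and the random weights are all assumptions made for the example.

```python
import math
import random

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(matrix, vec):
    # Multiply a list-of-rows matrix by a vector.
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def moe_forward(token, router, experts, top_k=1):
    """Route one token to its top_k experts (hypothetical toy layer)."""
    probs = softmax(matvec(router, token))  # one probability per expert
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    output = [0.0] * len(token)
    for i in chosen:
        # Only the chosen experts' weights are touched for this token.
        expert_out = matvec(experts[i], token)
        weight = probs[i] / norm
        output = [o + weight * e for o, e in zip(output, expert_out)]
    return output, chosen

rng = random.Random(0)
DIM, NUM_EXPERTS = 4, 16  # 16 experts, matching the count quoted for Scout
router = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
experts = [
    [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(DIM)]
    for _ in range(NUM_EXPERTS)
]

token = [0.5, -0.2, 0.1, 0.9]
output, chosen = moe_forward(token, router, experts, top_k=1)
print(f"experts used for this token: {chosen} out of {NUM_EXPERTS}")
```

In this toy layer, every expert's weights exist in memory, but each token only passes through the experts the router selects. That is the general reason an MoE model's active parameter count can be far smaller than its total.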

Moving on, the Llama 4 Maverick model has 400B total parameters across 128 experts, again with only 17B parameters active during inference. With far more experts to draw on, Maverick is more capable than Scout and supports a context length of 1 million tokens. Meta asserts that Llama 4 Maverick surpasses OpenAI’s GPT-4o and Google’s Gemini 2.0 Flash.

What’s impressive about Llama 4 Maverick is that, despite having just 17B active parameters, it has achieved an Elo score of 1,417 on the LMArena leaderboard. This places Maverick in second position, right below Gemini 2.5 Pro, and ahead of Grok 3, GPT-4o, GPT-4.5, and others. It also delivers results on par with the latest DeepSeek V3 model in reasoning and coding tasks, and it does so with less than half the active parameters.

Meta has done an excellent job distilling the Llama 4 Scout and Maverick models from the largest in the series, Llama 4 Behemoth. The Behemoth model has a total of nearly 2 trillion parameters, with 288 billion active across 16 experts. Meta says Behemoth is still undergoing training, and more information about its release will be revealed later.
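For a sense of scale, the short sketch below works out what share of each model's parameters is active per token, using only the figures quoted in this article. The Behemoth total is approximated as 2,000B, and the helper name `active_fraction` is purely illustrative.

```python
# (total parameters, active parameters) in billions, as quoted in the article.
# Behemoth's total is an approximation of "2 trillion".
MODELS = {
    "Llama 4 Scout": (109, 17),
    "Llama 4 Maverick": (400, 17),
    "Llama 4 Behemoth": (2000, 288),
}

def active_fraction(total_b, active_b):
    """Return the share of parameters active for a single token."""
    return active_b / total_b

for name, (total_b, active_b) in MODELS.items():
    share = active_fraction(total_b, active_b)
    print(f"{name}: {active_b}B of {total_b}B active ({share:.1%})")
```

By these numbers, Maverick activates the smallest share of its weights per token, while Scout and Behemoth each activate roughly one-seventh of theirs.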

Meta claims that Llama 4 Behemoth outperforms top-tier AI models like GPT-4.5, Claude 3.7 Sonnet, and Gemini 2.0 Pro on STEM benchmarks. Note that all of these results come from non-reasoning models, which suggests that future reasoning models built on the Llama 4 base could perform even better.

In terms of availability, Meta has announced that Llama 4 is now rolling out on Meta AI across WhatsApp, Messenger, Instagram, and the Meta AI website, starting today in 40 countries. However, the multimodal capabilities are currently limited to users in the United States.
