The Prompt: Microsoft Bets Big On Its AI Future
Welcome back to The Prompt,Over the weekend, Meta released a cohort of new AI models called Llama 4, which it claimed perform better and more cost efficiently than OpenAI’s GPT4-o and Google’s Gemini 2.0 for tasks such as creative writing, coding and summarizing documents. One of these models, called Llama-4- Maverick, was benchmarked on the popular platform LM Arena, where people rate and compare AI models' answers. But what Meta benchmarked turned out to be different from the one it had publicly released to developers, TechCrunch reported. Instead, the social media giant benchmarked a more fine tuned version, several AI researchers noted on X and Meta mentioned in fine print in its own blog post. As a result, LM Arena said it is updating its policies for fair and "reproducible" model evaluations in the future.“Meta’s interpretation of our policy did not match what we expect from model providers,” it posted on X. “As a result of that we are updating our leaderboard policies to reinfor...

