Meta slammed over new model benchmarks

😱 Trump tariffs detonate AI chip market?


Monday’s AI Report

1. 😬 Meta slammed over new model benchmarks
2. 📽️ In partnership with Guidde
3. 🤖 Microsoft releases agentic copilot
4. 😱 Trump tariffs detonate AI chip market?
5. 💼 In partnership with Innovating with AI
6. ⚙️ Trending AI Tools
7. 🗝️ Practical AI Applications
8. 📑 Recommended Resources

Read Time: 5 minutes

ā€¼ļøWant to learn how to build a newsletter with AI? Join tomorrowā€™s live webinar with AI expert, Louis Shulman.

✅ Refer your friends and unlock rewards. Scroll to the bottom to find out more!

Meta slammed over new model benchmarks

🚨 Our Report

After the unexpected triumph of DeepSeek’s R1, Meta scrambled to develop rival AI models that perform better and run more efficiently. The result is a new family of multimodal AI models under the Llama 4 name. Llama 4 Scout (a small model that runs on just one NVIDIA chip) and Llama 4 Maverick (an advanced model comparable to GPT-4o and Gemini 2.0 Flash) are both available to download now. Llama 4 Behemoth (a teacher model designed to train other AI models) is still in training.

🔓 Key Points

  • Meta used the “Mixture of Experts” (MoE) architecture to develop Llama 4. MoE is more computationally efficient for both training and answering queries because it activates only the parts of the model (the “experts”) that are needed for a given task.

  • According to Meta’s internal testing, Maverick (which is best for “general assistance and chat”) performs better than OpenAI’s GPT-4o on coding, reasoning, multilingual, long-context, and image benchmarks.

  • Scout beats Google’s Gemma 3 and Gemini 2.0 Flash-Lite models, as well as Mistral 3.1, “across a broad range of benchmarks,” and Behemoth (reportedly) outperforms GPT-4.5 and Claude Sonnet 3.7 on “several STEM benchmarks.”
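For readers who want to see the MoE idea in code, here is a minimal sketch of top-k expert routing. It is a toy illustration and not Meta’s implementation: the names `make_expert` and `moe_forward`, the four toy experts, and the random router weights are all hypothetical, chosen only to show that a router scores every expert but runs just a few of them per input.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical toy expert: multiplies every input value by a fixed scale."""
    def expert(x):
        return [scale * v for v in x]
    return expert


def moe_forward(x, router_weights, experts, top_k=2):
    """Run only the top_k highest-scoring experts and blend their outputs."""
    # The router gives each expert a score (a dot product with the input).
    scores = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    # Keep the indices of the top_k experts; every other expert is skipped.
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Normalise the selected scores into mixing weights.
    gates = softmax([scores[i] for i in top])
    output = [0.0] * len(x)
    for gate, idx in zip(gates, top):
        expert_out = experts[idx](x)
        output = [o + gate * v for o, v in zip(output, expert_out)]
    return output, top


# Assumed toy setup: four experts, three input features, random router weights.
random.seed(0)
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
router_weights = [[random.uniform(-1, 1) for _ in range(3)] for _ in experts]
output, used = moe_forward([0.5, -1.0, 2.0], router_weights, experts, top_k=2)
print("experts used:", used)
print("output:", output)
```

With `top_k=2`, only two of the four experts do any work for this input, which is where the compute savings come from. Production models learn the router weights during training rather than picking them at random.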

🔐 Relevance

Despite the impressive benchmark results, developers have accused Meta of misleading the public with ‘skewed’ results: the version of Maverick that’s available to download now is different from the version that was uploaded to the popular benchmarking platform LM Arena, where humans compare models and choose which they prefer (Maverick came 2nd). LM Arena isn’t considered a reliable measure of an AI model’s true performance, as it’s based on subjective opinion rather than facts. Even so, it’s slightly concerning (and misleading) that Meta chose to submit a different version of the model to get a better score from LM Arena users.

🎥 Guidde - Create how-to video guides quickly and easily with AI

Tired of explaining the same thing over and over again to your colleagues?

It’s time to delegate that work to AI. Guidde is a GPT-powered tool that helps you explain the most complex tasks in seconds with AI-generated documentation.

1️⃣ Share or embed your guide anywhere

2️⃣ Turn boring documentation into stunning visual guides

3️⃣ Save valuable time by creating video documentation 11x quicker

Simply select ‘capture’ on the browser extension and the app will automatically generate step-by-step video guides complete with visuals, voiceover, and CTA.

The key part? The extension will cost you nothing.

Microsoft releases agentic copilot

🚨 Our Report

To celebrate its 50th birthday, Microsoft has upgraded Copilot so it can now complete agentic tasks like browsing “most websites,” booking tickets, making reservations on a user’s behalf, and remembering specific preferences like favorite foods, music, or films (similar to OpenAI’s agent, “Operator”).

🔓 Key Points

  • Copilot can now “see” and analyze videos and photos, answer questions about them, and track online deals. For example, users can ask it to look for items on sale, and it will notify them when the price drops.

  • To provide these agentic features, Microsoft has partnered with 1-800-Flowers.com, Booking.com, Expedia, Kayak, OpenTable, Priceline, Tripadvisor, Skyscanner, Viator, and Vrbo.

  • Although Microsoft has shared a few details about what the newly upgraded Copilot can do, it hasn’t been transparent about where the agent might struggle or need human intervention, as many of its rivals have been.

🔐 Relevance

These upgrades come as Microsoft is reportedly considering moving away from OpenAI’s technology to power its products, turning to in-house solutions instead, as it tries to keep up with the pace of innovation and new feature rollouts from the likes of OpenAI’s GPT and Google’s Gemini.

Trump tariffs detonate AI chip market?

🔓 Key Points

  • Trump's new tariffs have caused economic disruption across industries, and tech companies are now fearful because many ‘AI chip-related products’ weren’t on the published list of products exempt from the tariffs.

  • Nobody seems to know whether AI chips are exempt from the higher import costs, and the uncertainty has sent US tech stocks into decline, with major chip manufacturers NVIDIA and TSMC down 7.36% and 7.22%, respectively.

  • The confusion arises because, although Trump announced a tariff exemption for AI chips, electronic products that contain AI chips could still face the general 32% tariff on Taiwan, scheduled to take effect on April 9th.

💼 In partnership with Innovating with AI

Want to build a 6-figure business as an AI consultant?

The AI consulting market is projected to grow roughly 8x, from $6.9B to $54.7B by 2032.

But how does an AI enthusiast become an AI consultant?

How well you answer that question makes the difference between just “having AI ideas” and being handsomely compensated for your contribution to an organization’s AI transformation.

Thankfully, you don’t have to go it alone. Our friends at Innovating with AI have welcomed 300 new students into The AI Consultancy Project, their new program that trains you to build a business as an AI consultant.

Some of the highlights current students are excited about:

  • The tools and frameworks to find clients and deliver top-notch services

  • A 6-month plan to build a 6-figure AI consulting business

  • Students getting their first AI client in as little as 3 days

And as a reader of The AI Report, you get early access to the next enrollment cycle.

Prompt Inspiration

After typing this prompt, you will get an evaluation of the impact political and economic changes are likely to have on your supply chain.

Assist in evaluating the potential impact of political or economic changes on our supply chain.

P.S. Use the Prompt Engineer GPT by The AI Report to 10x your prompts.

STARTUPS

Name: BrieflyAI
Value: Unknown
Funding raised: Bootstrapped

BrieflyAI is an AI-powered writing tool used by over 4,000 users who integrate it with Google Meet to transcribe meetings, generate meeting summaries, create follow-up emails and pre-meeting briefs, build customer profiles, and organize meeting notes.

PODCASTS

How real businesses are crushing it with AI

In this podcast episode, Ajay Malik, former Google exec and CEO of StudioX, discusses how real businesses are crushing it with AI.

‼️ Plus: If you missed last week’s AI Report podcast episode, where Liam sat down with entrepreneur Alex Banks, founder of GrowHub, to discuss how he’s building a personal brand empire and gained 129k followers on LinkedIn, catch it here.

QUICK HITS

We read your emails, comments, and poll replies daily.

Hit reply and tell us what you want more of!

Got a friend who needs to learn more about AI?

Sign them up for The AI Report here.

Until next time, Martin, Liam, and Amanda.
