🏅 Did Mistral Beat GPT-3.5?

On Google Gemini under fire, Grok's rollout, AlphaCode 2, StableLM Zephyr 3B

AlphaSignal

Hey,

Welcome to this week's edition of AlphaSignal.

Whether you are a researcher, engineer, developer, or data scientist, our summaries ensure you're always up-to-date with the latest breakthroughs in AI.

Let's get into it!

Lior

In Today’s Email:

  • Top Releases and Announcements

  • Mistral Quietly Unveils Mixtral 8x7B

  • E.U. Agrees on Landmark AI Rules

Read Time: 4 min 12 sec

RELEASES & ANNOUNCEMENTS

Google is facing backlash after admitting its Gemini AI demo video, which went viral for its impressive capabilities, was not live but staged with heavy editing and off-camera prompts. Critics argue this misrepresents the AI's true abilities. Google's spokesperson told Bloomberg, "the video was cobbled together using still image frames and prompting via text."

xAI's Grok is designed to compete with major AI models like OpenAI's ChatGPT. It stands out for its integration with X (formerly Twitter), which gives it real-time access to information from the platform — an edge over models that generally rely on older internet data. Grok is available to X's US Premium Plus subscribers.

Google DeepMind unveiled AlphaCode 2, an advanced AI for coding contests that performs better than an estimated 85% of competitors on Codeforces. Powered by Gemini, it excels at problems requiring dynamic programming and complex math, and it collaborates effectively with human coders, hinting at future AI-assisted software development tools.

Databricks recently unveiled new retrieval augmented generation (RAG) tooling to help build high-quality large language model applications. Key features include vector search to integrate unstructured data, low latency feature serving for structured data, and monitoring systems to scan model responses. By combining relevant contextual data sources, these capabilities aim to simplify productionizing accurate and reliable RAG apps across various business use cases.

Stability AI released StableLM Zephyr 3B. The 3 billion parameter LLM is 60% smaller than typical 7B models, runs efficiently on edge devices, and is fine-tuned on datasets like UltraChat and MetaMathQA; it performs strongly on Q&A tasks, as measured on MT-Bench and AlpacaEval.

Hire a world-class AI team for 80% less

Building AI products is hard, finding talented engineers who understand it is even harder.

That's why companies around the world trust AE Studio. We help you craft and implement the optimal AI solution for your business with our team of world-class AI experts from Harvard, Stanford, and Princeton.

Customized Solutions: Tailor-made software that fits your unique business needs. We work hand-in-hand with your team for seamless integration.

Cost-effective: High-quality solutions at a fraction of the cost.

Proven Track Record: Join the ranks of successful startups and Fortune 500 companies that rely on us.

Start with a free consultation.

NEWS
Mistral Quietly Unveils Mixtral 8x7B and API Endpoints

What's New?
Mistral AI just released Mixtral 8x7B, a sparse mixture-of-experts (SMoE) model with open weights. The new model outperforms existing LLMs like Meta's Llama 2 70B, offering a balance of performance and efficiency.

Mixtral 8x7B has 46.7 billion total parameters but uses only about 12.9 billion per token, giving it the latency of a roughly 12 billion parameter model and making inference six times faster than Llama 2 70B. It matches or exceeds GPT-3.5 on most benchmarks.

Key features of Mixtral include handling a 32,000-token context and supporting five languages (English, French, German, Spanish, Italian). Notably, its code generation capability scores 40.2% on HumanEval. The model's instruction-following variant, refined with supervised fine-tuning and direct preference optimization (DPO), scores 8.3 on MT-Bench.

Mixtral uses a decoder-only architecture in which, at each layer, a router network selects two of eight distinct groups of feedforward parameters ("experts") to process each token. As a result, only about 12.9 billion of its 46.7 billion total parameters are active for any given token.
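The routing idea above can be sketched in a few lines. This is a toy illustration of top-2 expert routing, not Mixtral's actual implementation: the dimensions are shrunk, weights are random, and the router is a single linear layer.

```python
import numpy as np

rng = np.random.default_rng(0)

D_MODEL = 16    # hidden size (toy scale)
D_FF = 64       # expert feedforward width (toy scale)
N_EXPERTS = 8   # Mixtral uses 8 experts per layer
TOP_K = 2       # the router activates 2 experts per token

# Each "expert" is an independent two-layer feedforward block.
experts = [
    (rng.standard_normal((D_MODEL, D_FF)) * 0.02,
     rng.standard_normal((D_FF, D_MODEL)) * 0.02)
    for _ in range(N_EXPERTS)
]
# Router: a linear layer that scores every expert for a given token.
router_w = rng.standard_normal((D_MODEL, N_EXPERTS)) * 0.02

def moe_layer(x):
    """Route one token (shape [D_MODEL]) through its top-2 experts."""
    logits = x @ router_w
    top = np.argsort(logits)[-TOP_K:]            # indices of the 2 best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                     # softmax over the chosen experts
    out = np.zeros_like(x)
    for w, i in zip(weights, top):
        w1, w2 = experts[i]
        out += w * (np.maximum(x @ w1, 0) @ w2)  # ReLU feedforward block
    return out, top

token = rng.standard_normal(D_MODEL)
out, used = moe_layer(token)
print(f"experts used: {sorted(used)} of {N_EXPERTS}")
```

Because only 2 of the 8 expert blocks run per token, the compute per token scales with the active parameters, not the total — which is how Mixtral gets 12B-class latency from a 46.7B-parameter model.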

Mistral AI Platform Services

Mistral is opening beta access to their first platform services today: “la plateforme”.

  • Endpoints Offered: Three chat endpoints (mistral-tiny, mistral-small, mistral-medium) and an embedding endpoint, each with different performance and cost.

  • Integration: Uses open-source deployment stacks like vLLM and SkyPilot for cloud deployment.

  • Development Stage: Currently in beta, with plans for general availability.

  • Support and Collaboration: Supported by Nvidia, CoreWeave, and Scaleway teams.
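Calling one of these chat endpoints looks roughly like the sketch below. The endpoint names come from the announcement, but the URL and payload shape are assumptions based on Mistral's OpenAI-style API; check the official platform docs before relying on them. The sketch only constructs the request, it does not send it.

```python
import json

# Assumed endpoint URL -- verify against Mistral's "la plateforme" docs.
API_URL = "https://api.mistral.ai/v1/chat/completions"

def build_chat_request(model, prompt, api_key):
    """Build the URL, headers, and JSON body for one chat completion call."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "model": model,  # "mistral-tiny", "mistral-small", or "mistral-medium"
        "messages": [{"role": "user", "content": prompt}],
    }
    return API_URL, headers, json.dumps(body)

url, headers, body = build_chat_request("mistral-tiny", "Hello!", "YOUR_KEY")
```

To actually send it, POST `body` with those headers using any HTTP client (e.g. `requests.post(url, headers=headers, data=body)`).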

Get ready to scale your Machine Learning workloads to infinity and back with Latitude's Launchpad. The solution runs on the company's enterprise-grade bare metal infrastructure allocating fully dedicated GPUs to your containers, ensuring total scalability and amazing performance with hourly billing.

NEWS
E.U. Agrees on Landmark Artificial Intelligence Rules

The European Union has finalized the AI Act, a comprehensive legal framework governing artificial intelligence. This Act introduces a categorization system for AI based on risk levels: minimal, limited, high, and unacceptable. High-risk AI systems are subject to stringent requirements for risk management, transparency, and human oversight.

General Purpose AI systems (GPAIs), such as LLMs and multi-modal models, must provide technical documentation, training data summaries, and adhere to EU copyright laws.

For high-risk GPAIs, additional obligations include model evaluations, systemic risk assessment, adversarial testing, and reporting on cybersecurity and energy efficiency. Non-compliance can lead to penalties of up to €35 million or 7% of global turnover.

Open-source AI models receive broad exemptions, a potential advantage for companies like Meta and European startups. However, there are concerns about the Act's impact on innovation within Europe's AI sector.

How the Act is implemented will be key, especially for smaller companies, to avoid burdensome certification processes. Expected to take effect no earlier than 2025, the Act positions the EU as a pivotal player in AI regulation, influencing the global landscape of AI development and application.

How was today’s email?

Not Great      Good      Amazing

Thank You

Want to promote your company, product, job, or event to 150,000+ AI researchers and engineers? You can reach out here.