🐉 GPT-4's new rival? Baidu reveals Ernie-4

On Baidu's new LLM, ElevenLabs voice translation tool, Google Generative Image Search, and the State of AI report.

AlphaSignal

Hey,

Welcome to this week's edition of AlphaSignal.

Whether you are a researcher, engineer, developer, or data scientist, our summaries ensure you're always up-to-date with the latest breakthroughs in AI.

Let's get into it!

Lior

In Today’s Email:

  • Top Releases and Announcements

  • The New State of AI 2023

  • Baidu Unveils Ernie 4.0 AI model, Claims that it Rivals GPT-4

Read Time: 4 min 31 sec

RELEASES & ANNOUNCEMENTS

1. ElevenLabs Launches a Powerful AI-Voice Translation Tool 
This voice translation tool can convert spoken content to another language in minutes, while preserving the voice of the original speaker.

2. Google releases Generative AI Image Search
Google's Search Generative Experience (SGE) now lets users generate images and text drafts directly from the search bar. Powered by the Imagen family of AI models, it competes with Microsoft's Bing Chat, which uses OpenAI's DALL-E. It also features responsible-AI safeguards and metadata labeling.

3. Adobe releases its Firefly Image 2 Model
Adobe Firefly Image 2 offers enhanced architecture and training algorithms for high-quality, photorealistic image generation. It supports 4MP output, depth of field control, and high-frequency details like skin pores. Features like Generative Match enable style transfer from reference images.

4. PyTorch releases Flash-Decoding for long context LLMs
Flash-Decoding is a technique that speeds up attention in large language models during inference. It achieves up to 8x faster generation for long sequences. The method is especially useful for applications that require long context, making them more efficient and cost-effective.

5. NVIDIA launches New Text-to-3D AI Playground
NVIDIA and Masterpiece Studio have launched Masterpiece X – Generate, a text-to-3D AI tool aimed at simplifying 3D art creation. The browser-based tool uses generative AI built on NVIDIA Picasso for real-time 3D model generation from text prompts. It's compatible with Blender, Unity, and Unreal Engine but is not yet suitable for high-fidelity or AAA game assets.
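
For readers curious about item 4: the core idea behind Flash-Decoding, as we understand it, is to split the cached keys and values into chunks, compute attention over each chunk separately, and then merge the partial results using each chunk's running maximum and sum of exponentials. Below is a minimal pure-Python sketch of that merge for a single query vector. The function names are our own illustrations, not PyTorch's API, and the loop runs the chunks one after another, so it only shows that the split is mathematically exact. The real speedup comes from processing the chunks in parallel on the GPU.

```python
import math


def _scores(q, keys):
    # Scaled dot-product scores between one query and every key.
    scale = 1.0 / math.sqrt(len(q))
    return [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]


def attention_reference(q, keys, values):
    # Ordinary softmax attention for one query (the decoding case).
    scores = _scores(q, keys)
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(dim)]


def attention_chunk(q, keys, values):
    # Attention over one chunk. Returns the unnormalized output together
    # with the chunk's max score and sum of exponentials, which are
    # needed later to merge chunks correctly.
    scores = _scores(q, keys)
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return out, m, sum(weights)


def attention_split_kv(q, keys, values, chunk_size):
    # Step 1: split keys/values into chunks and attend to each chunk.
    partials = [
        attention_chunk(q, keys[i:i + chunk_size], values[i:i + chunk_size])
        for i in range(0, len(keys), chunk_size)
    ]
    # Step 2: rescale every partial result to a shared maximum and merge.
    global_max = max(m for _, m, _ in partials)
    dim = len(values[0])
    numer = [0.0] * dim
    denom = 0.0
    for out, m, s in partials:
        rescale = math.exp(m - global_max)
        denom += s * rescale
        for d in range(dim):
            numer[d] += out[d] * rescale
    return [n / denom for n in numer]
```

Calling attention_split_kv with any chunk size should match attention_reference up to floating-point rounding, which is why the split costs no accuracy.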

Retrieval Augmented Generation (RAG) Made Simple, Secure and Hallucination Free

GroundX APIs are the fastest way to build hallucination-free RAG apps. Ingest, store, search and complete with just a few lines of code and four layers of powerful hallucination suppression built in.

  • Garbage In/Garbage Out: Fine-tuned ingest pipeline converts complex docs into LLM-ready data to prevent downstream hallucination

  • Dynamic Chunker uses natural breaks in content to improve search accuracy

  • Automatic Context Generator provides the metadata LLMs need to understand your content

  • Proprietary Search responds to the intent of user questions and returns chunks, metadata and ranking scores to increase LLM understanding of your data and outperform pure vector search.

NEWS
The New State of AI 2023 Report is Out

What's New?
The State of AI Report 2023 dives into several key areas such as the dominance of Large Language Models (LLMs) like GPT-4, advances in safety and governance, and new developments in life sciences and generative AI applications.

Why Does It Matter?
The rapid advancements in LLMs and their applications are reshaping AI research and its commercial landscape. The focus on safety and governance is increasingly influencing policy decisions. Investments in generative AI are growing, signaling significant commercial interest.

Key Takeaways:

  1. GPT-4 is the master of all it surveys (for now), beating every other LLM on both classic benchmarks and exams designed to evaluate humans, validating the power of proprietary architectures and reinforcement learning from human feedback.

  2. Efforts are growing to try to clone or surpass proprietary performance, through smaller models, better datasets, and longer context. These could gain new urgency, amid concerns that human-generated data may only be able to sustain AI scaling trends for a few more years.

  3. LLMs and diffusion models continue to drive real-world breakthroughs, especially in the life sciences, with meaningful steps forward in both molecular biology and drug discovery.

  4. Compute is the new oil, with NVIDIA printing record earnings and startups wielding their GPUs as a competitive edge. As the US tightens its trade restrictions on China and mobilizes its allies in the chip wars, NVIDIA, Intel, and AMD have started to sell export-control-proof chips at scale.

  5. GenAI saves the VC world, as amid a slump in tech valuations, AI startups focused on generative AI applications (including video, text, and coding), raised over $18 billion from VC and corporate investors.

  6. The safety debate has exploded into the mainstream, prompting action from governments and regulators around the world. However, this flurry of activity conceals profound divisions within the AI community and a lack of concrete progress towards global governance, as governments around the world pursue conflicting approaches.

  7. Challenges mount in evaluating state-of-the-art models, as standard LLMs often struggle with robustness. Considering the stakes, a “vibes-based” approach isn’t good enough.

NEWS
Baidu Unveils Ernie 4.0 AI model, Claims that it Rivals GPT-4

What's New?
Baidu has introduced its new AI model, Ernie 4.0, claiming that it matches the performance of leading models like GPT-4. In live demonstrations, Ernie 4.0 followed complex verbal instructions, applied logic, and generated content. Baidu plans to integrate the model into various products, including search, maps, and cloud services.

Why Does It Matter?
This development is significant in the landscape of AI research and application, particularly in the escalating competition between U.S. and Chinese tech companies. The capabilities of Ernie 4.0, if verified, could signify a major step forward for AI models developed outside the U.S. Moreover, Ernie 4.0's planned integration into various services can serve as a real-world test case for advanced AI capabilities.

Main Takeaways

  • Ernie 4.0 is claimed to have capabilities similar to GPT-4's, including comprehension, reasoning, and memory.

  • Baidu intends to use Ernie 4.0 across a range of products, potentially transforming user experience.

  • Because Ernie 4.0 is not publicly available, its claimed capabilities are hard to verify.

  • The introduction of Ernie 4.0 adds to the competitive dynamics between U.S. and Chinese AI technologies.

NEWS
Watch: How ChatGPT Vision Works. Our New Technical Breakdown

What's New?
AlphaSignal is now on YouTube! We just released our first video, a technical breakdown of the new Multimodal ChatGPT. How can it see, hear, and speak? Our deep-dive will shed light on the inner workings of this beautiful monster.

To pull this off, we partnered up with the brilliant Ajay from CodeEmporium. Check him out!

Leave a comment and tell us what you liked, loved, or hated. We would love to hear from you.

Thank You

Igor Tica is a writer at AlphaSignal and a Research Engineer at SmartCat, with main expertise in Computer Vision. He is passionate about contributing to the field and is seeking research collaborations that span self-supervised and contrastive learning.

Want to promote your company, product, job, or event to 100,000+ AI researchers and engineers? You can reach out here.