⏱️ Real-Time Text-to-Image is Here

Plus Sam Altman's new interview, Perplexity replacing Google, Keras 3.0, ElevenLabs grants, and more


Hey,

Welcome to this week's edition of AlphaSignal.

Whether you are a researcher, engineer, developer, or data scientist, our summaries ensure you're always up-to-date with the latest breakthroughs in AI.

Let's get into it!


In Today’s Email:

  • Top Releases and Announcements in the Industry

  • Stability’s New Real-Time Text-to-Image

  • Amazon Reveals Titan Image Generator

  • W&B Releases Centralized LLM Monitoring Tool

Read Time: 4 min 28 sec


Perplexity AI introduces pplx-7b-online and pplx-70b-online, large language models that leverage web data for more accurate and current answers, outperforming GPT-3.5 and Llama2-70B in benchmarks. These models, based on Mistral-7B and Llama2-70B, are now available through Perplexity’s API.

“I totally get why people want an answer right now. But I also think it’s totally unreasonable to expect it.”

Hailed as an “idea-to-video platform that brings your creativity to motion,” the new AI model can transform text, images, or videos into an entirely new video. You can both “create and edit your videos with AI,” using text-to-video, image-to-video, and even video-to-video modalities.

Keras 3.0 completely overhauls the library so that all Keras workflows can be run on JAX, TensorFlow, or PyTorch. This multi-framework support allows seamless switching between backends and custom component development, significantly broadening machine learning capabilities.
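For readers who want to try the backend switching described above, here is a minimal sketch. It assumes Keras 3 and at least one backend are installed, and it relies on the `KERAS_BACKEND` environment variable, which has to be set before `keras` is first imported. The helper names `select_backend` and `build_model` are hypothetical and only for illustration.

```python
import os

# Backends named in the announcement; the strings are the values
# Keras 3 expects in the KERAS_BACKEND environment variable.
ALLOWED_BACKENDS = {"jax", "tensorflow", "torch"}


def select_backend(name: str) -> str:
    """Record the backend choice. Call this before importing keras."""
    if name not in ALLOWED_BACKENDS:
        raise ValueError(f"unknown backend: {name!r}")
    os.environ["KERAS_BACKEND"] = name
    return name


def build_model():
    """Build a tiny model; the same code runs on whichever backend is set."""
    import keras  # imported lazily so the backend choice above takes effect

    return keras.Sequential(
        [
            keras.Input(shape=(4,)),
            keras.layers.Dense(8, activation="relu"),
            keras.layers.Dense(1),
        ]
    )


if __name__ == "__main__":
    select_backend("jax")  # swap for "tensorflow" or "torch"
    build_model().summary()
```

The model definition itself never mentions a framework, which is the point of the multi-backend design: only the environment variable changes.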

ElevenLabs is launching Grants to help early-stage companies use voice AI in their products. Recipients get 11 million text characters monthly for three months for free, aimed at lowering entry barriers. Eligible startups must have fewer than 25 employees. Post-grant, they can opt for discounted enterprise plans or standard plans.

Hire a world-class AI team for 80% less

Building AI products is hard, and finding talented engineers who understand AI is even harder.

That's why companies around the world trust AE Studio. We help you craft and implement the optimal AI solution for your business with our team of world-class AI experts from Harvard, Stanford and Princeton.

Customized Solutions: Tailor-made software that fits your unique business needs. We work hand-in-hand with your team for seamless integration.

Cost-effective: High-quality solutions at a fraction of the cost.

Proven Track Record: Join the ranks of successful startups and Fortune 500 companies that rely on us.

Start with a free consultation.

Stability AI Releases SDXL Turbo, a Real-Time Text-to-Image Generation Model

What's New?
Stability AI introduces SDXL Turbo, a text-to-image model built on a novel Adversarial Diffusion Distillation (ADD) technique. The approach cuts image generation from 50 steps to a single step while maintaining high image quality, and it avoids the artifacts and blurring common to other distillation techniques. The model weights are available on Hugging Face.

Why Does It Matter?
SDXL Turbo's single-step generation process marks a substantial improvement in the efficiency and speed of AI-driven image creation. In blind tests against models like LCM-XL and SDXL, SDXL Turbo demonstrated superior performance in both speed and fidelity with minimal computational overhead.

Main Takeaways

  • High-Quality Images: Retains SDXL-like image quality without artifacts.

  • Efficiency and Speed: Reduces image generation time to 207ms on an A100 GPU.

  • Superior Performance: Outperforms multi-step models in blind tests for prompt adherence and image quality.

  • Availability: Model weights and code available for non-commercial use on Hugging Face.
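To make the single-step claim concrete, here is a rough sketch of how the model might be called through the Hugging Face `diffusers` library. It assumes `diffusers` and `torch` are installed, a CUDA GPU is available, and the weights live under the `stabilityai/sdxl-turbo` model ID. The helper names `turbo_call_args` and `generate` are hypothetical.

```python
def turbo_call_args(prompt: str, steps: int = 1) -> dict:
    """Arguments for a single-step generation call.

    guidance_scale is set to 0.0 on the assumption that the distilled
    model is run without classifier-free guidance.
    """
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        "guidance_scale": 0.0,
    }


def generate(prompt: str, device: str = "cuda"):
    """Load SDXL Turbo and return one image (downloads weights on first use)."""
    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/sdxl-turbo",  # assumed model ID on Hugging Face
        torch_dtype=torch.float16,
        variant="fp16",
    )
    pipe.to(device)
    return pipe(**turbo_call_args(prompt)).images[0]


if __name__ == "__main__":
    image = generate("a cinematic photo of a lighthouse at dusk")
    image.save("lighthouse.png")
```

Raising `steps` trades speed for additional refinement passes; timings will differ from the A100 figure quoted above depending on hardware.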

Co-Founder and CTO of Recital Software, Brendan Mulholland, is teaming up with Nylas to showcase the game-changing features he built with email integrations.

Recital's AI-enabled platform now offers the real-time workflow notifications and streamlined information access its users were looking for, saving them hours of time.

Amazon Reveals Titan Image Generator, Multimodal Embeddings, and Amazon Q

What's New?
At AWS re:Invent, Amazon introduced a bevy of updates and new product launches. Chief among these is Amazon Q, a new generative AI assistant built for businesses. Additionally, Amazon’s Bedrock platform added the Titan Image Generator and Titan Multimodal Embeddings.

The Titan Image Generator, now in preview, generates studio-quality images from text prompts and offers editing capabilities like inpainting. Titan Multimodal Embeddings enhances multimodal search, accepting both text and images as queries for contextual search in recommendation systems. Additionally, Amazon presents Titan Text Lite and Titan Text Express, expanding its text model offerings.

Why Does It Matter?
With its knowledge of the AWS ecosystem and focus on privacy, Amazon Q can help developers build, deploy, and operate workloads on AWS. The Titan Image Generator addresses the increasing demand for automated, high-quality image generation in industries such as advertising and e-commerce, and it incorporates responsible AI practices through invisible watermarking.

Titan Multimodal Embeddings improve the accuracy and relevance of multimodal search and recommendations.

Main Takeaways

  • Amazon Q: Generative AI assistant integrated with the AWS ecosystem.

  • Titan Image Generator: Realistic image generation from text, with advanced editing.

  • Multimodal Embeddings: Improves multimodal search and recommendation accuracy.

  • Broad Integration: Compatible with various AI and ML tools for extensive applications.
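As a rough illustration of how embedding-based multimodal search works in general, the sketch below ranks catalog items by cosine similarity to a query vector. It does not call Amazon Bedrock: the three-dimensional vectors are made-up stand-ins for real text or image embeddings, and the `cosine` and `search` helpers are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, catalog, top_k=2):
    """Return the names of the top_k catalog items closest to the query."""
    ranked = sorted(
        catalog.items(),
        key=lambda item: cosine(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Toy vectors (assumption): in practice each would come from an embedding
# model that maps both text and images into the same vector space.
catalog = {
    "red sneaker photo": [0.9, 0.1, 0.0],
    "blue sneaker listing": [0.7, 0.3, 0.1],
    "garden hose listing": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]  # stand-in for an embedded text or image query

print(search(query, catalog))
```

Because text and images share one vector space, the same ranking code serves a typed query or an uploaded photo.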

Weights & Biases Releases a New Centralized LLM Monitoring Tool

What's New?
The new tool allows real-time monitoring of Large Language Model (LLM) usage within an organization. It enables tracking of cost per prompt request, number of prompt tokens, and prompt latency.

Why Does It Matter?
Prompts: LLM Monitoring addresses the growing complexity in enterprise AI workflows, facilitating resource optimization and return on investment (ROI) tracking for Q&A chatbots, document extraction pipelines, and sales outreach workflows.

Main Takeaways

  • Real-time LLM Activity Monitoring: Track prompt usage, cost, and latency.

  • Optimized Resource Management: Offers insights into performance and financial aspects to maximize ROI.

  • Compatibility with Major AI Platforms: Integrates seamlessly with Azure, OpenAI and Hugging Face.

  • Accessible Via API: Supports secure and controlled access through a centralized API gateway.
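The three metrics named above (cost per request, prompt tokens, and latency) can be sketched with the standard library alone. This is not the Weights & Biases API: `PromptMonitor`, the whitespace token count, and the per-1,000-token price are all hypothetical stand-ins chosen to show the bookkeeping.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PromptRecord:
    prompt_tokens: int
    latency_s: float
    cost_usd: float


@dataclass
class PromptMonitor:
    usd_per_1k_tokens: float  # assumed flat price, for illustration only
    records: List[PromptRecord] = field(default_factory=list)

    def track(self, call: Callable[[str], str], prompt: str) -> str:
        """Run one LLM call and record its tokens, latency, and cost."""
        start = time.perf_counter()
        reply = call(prompt)
        latency = time.perf_counter() - start
        tokens = len(prompt.split())  # crude whitespace token count
        cost = tokens / 1000 * self.usd_per_1k_tokens
        self.records.append(PromptRecord(tokens, latency, cost))
        return reply

    def summary(self) -> dict:
        """Aggregate the recorded requests."""
        n = len(self.records)
        return {
            "requests": n,
            "prompt_tokens": sum(r.prompt_tokens for r in self.records),
            "total_cost_usd": sum(r.cost_usd for r in self.records),
            "mean_latency_s": (
                sum(r.latency_s for r in self.records) / n if n else 0.0
            ),
        }


if __name__ == "__main__":
    monitor = PromptMonitor(usd_per_1k_tokens=2.0)
    # A lambda stands in for a real model client here.
    monitor.track(lambda p: p.upper(), "summarize this support ticket")
    print(monitor.summary())
```

Routing every call through one wrapper like this mirrors the centralized-gateway idea: usage data accumulates in a single place regardless of which team or workflow made the request.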

Igor Tica is a contributing writer at AlphaSignal and a research engineer at SmartCat, focusing on computer vision. He's actively seeking partnerships in self-supervised and contrastive learning.

Jacob Marks is an editor at AlphaSignal and an ML engineer at Voxel51, recognized as a leading AI voice on Medium and LinkedIn. Formerly at Google X and Samsung, he holds a Ph.D. in Theoretical Physics from Stanford.

Want to promote your company, product, job, or event to 150,000+ AI researchers and engineers? You can reach out here.