
๐ŸŽ Apple's New Open-Source ML Framework

Your weekly technical digest of top projects, repos, tips and tricks to stay ahead of the curve.

AlphaSignal

Hey,

Welcome to this week's edition of AlphaSignal, the newsletter for AI professionals.

Whether you are a researcher, engineer, developer, or data scientist, our summaries ensure you're always up-to-date with the latest breakthroughs in AI.

Let's get into it!

Lior

In Today's Summary:

  • Repo Highlight: MLX

  • Trending Repos: marker, gpt-fast

  • PyTorch Tip: Softmax with Temperature

  • Trending Models: sdxl-turbo, Starling-LM

  • Python Tip: Reduce for Cumulative Operations

Reading time: 4 min 48 sec

HIGHLIGHT
MLX: An array framework for Apple silicon

Whatโ€™s New
Apple has launched MLX, a new machine learning framework specifically optimized for Apple Silicon. It makes model development and deployment straightforward within Apple's ecosystem.

MLX is designed to offer a familiar working environment for those accustomed to existing ML frameworks, drawing inspiration from established players like NumPy, PyTorch, Jax, and ArrayFire.

How it Works
The Python API mirrors NumPy, and higher-level APIs like mlx.nn and mlx.optimizers closely match PyTorch, so the framework feels familiar to PyTorch users. Computation graphs are built dynamically, so changing function arguments does not trigger recompilation. MLX uses lazy evaluation, materializing arrays only when they are needed.
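Here is a minimal sketch of what the NumPy-style API and lazy evaluation look like in practice (illustrative only; it assumes the mlx package is installed and uses names from the MLX documentation):

import mlx.core as mx

a = mx.array([1.0, 2.0, 3.0])
b = mx.array([4.0, 5.0, 6.0])

# The expression builds the graph lazily; nothing is computed yet
c = a * b + 1.0

# Evaluation materializes the result
mx.eval(c)
print(c)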

Thanks to the unified memory model, models can run on the CPU or the GPU without transferring data between devices: arrays live in shared memory, so operations can target either device type without data migration. Data loading utilities are provided in the companion mlx-data library.
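Because memory is shared, the same arrays can be handed to operations running on either device. A hedged sketch of the pattern (the stream argument follows MLX's documented unified-memory example; treat the exact names as assumptions):

import mlx.core as mx

a = mx.random.normal((256, 256))
b = mx.random.normal((256, 256))

# No copies are made; only the device the op runs on changes
c_cpu = mx.add(a, b, stream=mx.cpu)
c_gpu = mx.add(a, b, stream=mx.gpu)
mx.eval(c_cpu, c_gpu)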

Key features include lazy computation, where calculations are deferred until their results are needed, and dynamic graph construction, which lets function argument shapes change without the overhead of recompilation. This makes debugging more straightforward and intuitive.

The included model examples and data manipulation utilities enable rapid prototyping by building on recent advances like LLaMA, LoRA, and Stable Diffusion.

Features

  • Unified memory between CPU and GPU avoids data transfer

  • Familiar NumPy and PyTorch-like APIs (see the sketch after this list)

  • Lazy evaluation and dynamic graphs for faster iteration

  • Includes utilities for efficient data loading

  • Lower-level C++ API also available
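For PyTorch users, a training step with mlx.nn and mlx.optimizers looks roughly like this (a sketch based on the patterns in the MLX examples; treat exact signatures as assumptions rather than a verified implementation):

import mlx.core as mx
import mlx.nn as nn
import mlx.optimizers as optim

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)
        self.fc2 = nn.Linear(8, 2)

    def __call__(self, x):
        # ReLU written with mx.maximum to keep the dependency surface small
        return self.fc2(mx.maximum(self.fc1(x), 0.0))

def loss_fn(model, x, y):
    return nn.losses.cross_entropy(model(x), y).mean()

model = TinyNet()
optimizer = optim.SGD(learning_rate=0.1)
loss_and_grad = nn.value_and_grad(model, loss_fn)

# Dummy batch (hypothetical data, for illustration only)
x = mx.random.normal((16, 4))
y = mx.zeros((16,), dtype=mx.int32)

loss, grads = loss_and_grad(model, x, y)
optimizer.update(model, grads)                  # in-place parameter update
mx.eval(model.parameters(), optimizer.state)    # force evaluation of the step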

Introducing Deepgram Aura: A Text-to-speech API for Voice AI Agents

In an LLM-centric world, speech-to-text and text-to-speech technologies have become indispensable.

Introducing Aura, a powerful real-time text-to-speech (TTS) API designed for conversational voice applications. Compared to alternatives, Aura produces human-like speech more quickly and efficiently.

Learn more about Deepgram Aura, or be the first to try it out.

TRENDING REPOS

marker (☆ 3.6k)

Marker converts PDF, EPUB, and MOBI files to Markdown. It's 10x as fast as nougat and more accurate, and includes features like header/footer removal, equation conversion to LaTeX, and multi-language support.

gpt-fast (☆ 3.5k)

A PyTorch-native implementation of transformer models that achieves almost 200 tokens/second generation with Llama-2-7B on a single GPU. The repository illustrates versions with quantization, speculative decoding, and tensor parallelism, and is meant to be forked.

unsloth (☆ 1.7k)

Unsloth makes local LLM finetuning up to 5x faster without loss in accuracy, using optimized GPU kernels. The package is compatible with Nvidia GPUs from 2018 onward.

meditron (☆ 1k)

A family of open-source medical Large Language Models adapted from Llama2 with 7B and 70B parameters. Meditron-70B surpasses Llama-2-70B, GPT-3.5 and Flan-PaLM in medical reasoning, and sports a 4k token context length.

rags (☆ 4.7k)

A Streamlit app that lets you build a Retrieval-Augmented Generation (RAG) pipeline over your own data using just natural language. RAGs supports setting RAG parameters like top-k retrieval and chunk size via the UI. The app is compatible with LLMs from OpenAI, Anthropic, Hugging Face, and Replicate.

Speechmatics has launched Real-Time Translation as part of its all-in-one Speech API.

Their new self-supervised model can bring your product or service to the largest audience possible, without the hassle of multiple different language APIs and lengthy setup times.

PYTORCH TIP
Softmax with Temperature

Softmax with temperature scaling is a technique in deep learning to control the sharpness of the probability distribution output by the softmax function. By adjusting the temperature parameter, you can make the distribution softer (higher temperature) or sharper (lower temperature).

When To Use

  • Classification: For classification networks trained on limited or biased data, the default softmax scaling may be overconfident. Increase temperature to lower confidence.

  • Reinforcement Learning: Use temperature to adjust the balance between exploration and exploitation. Increase the softmax temperature to encourage exploration (see the sketch after this list).

  • Model Ensembles: Control the way confidence scores from constituent models are combined.
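In a reinforcement-learning setting, for example, the same trick controls how greedily actions are sampled. A hypothetical sketch (not tied to any specific RL library; the action values are made up for illustration):

import torch
import torch.nn.functional as F

q_values = torch.tensor([1.2, 0.8, 0.3])  # estimated action values

# High temperature flattens the distribution (more exploration)
explore = F.softmax(q_values / 2.0, dim=-1)

# Low temperature sharpens it (near-greedy exploitation)
exploit = F.softmax(q_values / 0.1, dim=-1)

# Sample an action from the exploratory distribution
action = torch.multinomial(explore, num_samples=1).item()
print(explore.tolist(), exploit.tolist(), action)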

Benefits

  • Flexible: Offers a simple yet effective way to adjust the behavior of the softmax function.

  • Interpretable: Physically inspired, temperature scaling offers an intuitive control knob.

  • Compatible with Gradient-Based Training: The temperature-scaled softmax is differentiable, so temperature can be incorporated directly into the model architecture and trained end-to-end.


import torch
import torch.nn.functional as F

def softmax_with_temp(logits, temp=1.0):
    return F.softmax(logits / temp, dim=-1)

logits = torch.tensor([[1.0, 2.0, 3.0]])

# Apply softmax with temperatures

# Default temperature (1.0)
default = softmax_with_temp(logits)

# Sharper
colder = softmax_with_temp(logits, 0.5)

# Softer
warmer = softmax_with_temp(logits, 2.0)

print('Default:', default.tolist())
print('Colder:', colder.tolist())
print('Warmer:', warmer.tolist())

Default: [[0.09003057330846786, 0.2447284758090973, 0.6652409434318542]]

Colder: [[0.01587624102830887, 0.11731042712926865, 0.8668133616447449]]

Warmer: [[0.18632373213768005, 0.30719590187072754, 0.5064803957939148]]

PYTHON TIP
Reduce for Cumulative Operations

The functools.reduce function is a powerful tool for performing cumulative operations on iterables. It successively applies an operation to the elements of an iterable, reducing it to a single cumulative value.

When To Use

  • Iterative Calculations: Ideal for summing, multiplying, concatenating strings, converting number bases, and other scenarios with successive application of a function.

  • Data Processing Pipelines: In conjunction with map and filter operations, reduce can streamline sequences of operations that transform data (see the sketch below).
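For instance, a small map/filter/reduce pipeline might look like this (a hypothetical example; the third argument to reduce is the initial accumulator value):

from functools import reduce

numbers = [1, 2, 3, 4, 5, 6]

# Keep the even numbers, square them, then sum the results
evens_squared = map(lambda n: n * n, filter(lambda n: n % 2 == 0, numbers))
total = reduce(lambda acc, n: acc + n, evens_squared, 0)

print(total)  # 4 + 16 + 36 = 56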

Benefits

  • Efficiency: Processes the iterable in a single pass and accumulates a single value, avoiding intermediate lists and keeping the memory footprint low.

  • Flexibility: Can be used with any function that takes two inputs and returns one output.

  • Readability: Simplifies complex operations into concise, readable code.


from functools import reduce

# Function to apply (e.g., to calculate product)
def multiply(x, y):
    return x * y

# Iterable (e.g., a list of numbers)
numbers = [1, 2, 3, 4, 5]

# Using reduce to calculate the product of numbers
product = reduce(multiply, numbers)

print("Product of numbers:", product)

# Output:
# Product of numbers: 120

How was today's email?

Not Great      Good      Amazing

Thank You

Igor Tica is a contributing writer at AlphaSignal and a research engineer at SmartCat, focusing on computer vision. He's actively seeking partnerships in self-supervised and contrastive learning.

Jacob Marks, an editor at AlphaSignal and ML engineer at Voxel51, is recognized as a leading AI voice on Medium and LinkedIn. Formerly at Google X and Samsung, he holds a Ph.D. in Theoretical Physics from Stanford.

Want to promote your company, product, job, or event to 150,000+ AI researchers and engineers? You can reach out here.