📝 How to Expand LLMs' Memory

Your weekly technical digest of top projects, repos, tips and tricks to stay ahead of the curve.


Hey,

Welcome to this week's edition of AlphaSignal, the newsletter for AI professionals.

Whether you are a researcher, engineer, developer, or data scientist, our summaries ensure you're always up-to-date with the latest breakthroughs in AI.

Let's get into it!

Lior

On Today’s Summary:

  • Repo Highlight: MemGPT

  • Trending Repos: litellm, 4DGaussians

  • PyTorch Tip: ONNX

  • Trending Models: MistralLite, SSD-1B

  • Python Tip: set()

Reading time: 3 min 29 sec

HIGHLIGHT
⚡️ MemGPT: Transforming LLMs into Memory Managers

What’s New
MemGPT expands the effective memory of language models. It uses a tiered memory system to let a model work with far more text than fits in its context window, improving performance in long conversations and large-document analysis.

Why It Matters
Current LLMs are constrained by a fixed context window, which caps how much they can "remember" at once and hinders tasks like document analysis and multi-session chat. MemGPT lets LLMs efficiently handle extended conversations and analyze larger documents without forgetting details.

How it Works
MemGPT borrows its design from computer operating systems. It creates a virtual memory space for LLMs, analogous to how an OS pages data between fast RAM and a larger hard drive: the most relevant data stays in the quick-access main context, while the rest is moved to external context and paged back in when needed.
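To make the analogy concrete, here is a minimal sketch of the tiered-memory idea; the class and method names are hypothetical, not MemGPT's actual code:

# Hypothetical sketch: a size-limited "main context" (RAM)
# backed by an unbounded archive (disk)
class TieredMemory:
    def __init__(self, max_context_tokens=2048):
        self.max_context_tokens = max_context_tokens
        self.main_context = []  # messages the LLM sees every turn
        self.archive = []       # evicted messages, retrieved on demand

    def _size(self, msg):
        # Crude token estimate; a real system would use the model's tokenizer
        return len(msg.split())

    def add(self, msg):
        self.main_context.append(msg)
        # Page the oldest messages out once the context "RAM" overflows
        while sum(self._size(m) for m in self.main_context) > self.max_context_tokens:
            self.archive.append(self.main_context.pop(0))

    def recall(self, query):
        # Naive keyword search over archived memory; MemGPT instead lets
        # the LLM issue retrieval calls itself via function calling
        return [m for m in self.archive if query.lower() in m.lower()]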

Features

  • Extended Memory: Mimics computer memory systems to give LLMs a larger "memory space".

  • Self-Regulating: The LLM can decide how to manage and transfer its data.

  • Broad Use Cases: Useful for longer conversations and larger documents, and compatible with a wide range of LLMs.

⚙️ TRENDING REPOS

BerriAI / litellm (☆ 2k)
Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)
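A minimal sketch based on litellm's documented completion() call (provider keys are read from environment variables such as OPENAI_API_KEY):

from litellm import completion

# One OpenAI-style call shape for every provider
response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(response.choices[0].message.content)

# Swapping providers only changes the model string,
# e.g. model="claude-2" or model="ollama/llama2"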

hustvl / 4DGaussians (☆ 800)
4D Gaussian Splatting (4D-GS) is a new method for rendering dynamic scenes in real time. It offers low storage requirements, fast training, and high-quality, high-resolution output.

spdustin / ChatGPT-AutoExpert (☆ 4k)
AutoExpert offers an effective set of custom instructions designed to improve the performance of GPT-4 and GPT-3.5-Turbo, optimizing responses for depth and context.

thuml / Time-Series-Library (☆ 2k)
TSlib is an open-source library for building and evaluating deep time-series models. It covers five key tasks: long- and short-term forecasting, imputation, anomaly detection, and classification.

dennybritz / reinforcement-learning (☆ 19k)
Implementations of reinforcement learning algorithms in Python with OpenAI Gym and TensorFlow. Exercises and solutions to accompany Sutton's book and David Silver's course.

PYTORCH TIP
ONNX

Open Neural Network Exchange (ONNX) provides an open-source format for deep learning models, allowing interchangeability between various deep learning frameworks. PyTorch's integration with ONNX enables developers to move models between different platforms with ease, optimizing for inference and deployment.

When To Use

  • Interoperability: When you need to use or deploy a PyTorch model in a different framework or platform.

  • Optimized Inference: To leverage platform-specific optimizations for faster inference.

Benefits

  • Flexibility: Transfer models between various deep learning frameworks without being locked into one.

  • Ease of Deployment: Facilitate deployment on cloud platforms and edge devices that support ONNX.


# PyTorch to ONNX
import torch
import torch.onnx
import torchvision.models as models

# Load a pretrained ResNet-18 and switch to inference mode
model = models.resnet18(pretrained=True)
model.eval()

# A dummy input fixes the exported graph's input shape
# (batch, channels, height, width)
x = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, x, "resnet18.onnx")

# ONNX Runtime for inference
import onnxruntime

session = onnxruntime.InferenceSession("resnet18.onnx")
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name

# ONNX Runtime consumes NumPy arrays rather than tensors
result = session.run([output_name], {input_name: x.numpy()})

# result[0] holds the inference output: a (1, 1000) array of class logits

🗳️ TRENDING MODELS/SPACES

amazon/MistralLite
MistralLite is an optimized version of Mistral-7B-v0.1 that is adept at processing extended contexts of up to 32K tokens. By using an adapted Rotary Embedding and sliding-window attention, it outperforms its predecessor on long-context tasks like summarization and question answering.
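A rough usage sketch via the standard transformers API; the device_map and generation settings below are assumptions, while the <|prompter|>/<|assistant|> template follows the model card:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "amazon/MistralLite"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Long documents can go straight into the prompt thanks to the 32K window
prompt = "<|prompter|>Summarize the main ideas of MemGPT.</s><|assistant|>"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))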

LP-Music-Caps-demo
A project designed to generate descriptive captions for music using two approaches: transforming music tags into captions with OpenAI's GPT-3.5 Turbo API, and directly translating music audio to captions using a trained cross-modal encoder-decoder model.

SimianLuo/LCM_Dreamshaper_v7
Latent Consistency Models (LCMs) offer rapid, high-resolution image synthesis by predicting solutions in the latent space, reducing the need for extensive iterative sampling. LCMs deliver top-tier text-to-image generation performance in fewer steps and lower latency than other diffusion models.
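A hedged sketch of the few-step generation this enables (assumes a diffusers release with native LCM support; prompt and settings are illustrative):

from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("SimianLuo/LCM_Dreamshaper_v7")
pipe.to("cuda")

# LCMs need only a handful of denoising steps instead of the usual 25-50
image = pipe(
    prompt="a photo of an astronaut riding a horse on mars",
    num_inference_steps=4,
    guidance_scale=8.0,
).images[0]
image.save("lcm_sample.png")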

PYTHON TIP
Set Collection

Python's ‘set’ data type stores unique elements and is optimized for membership checks. When you have a large collection and frequently need to verify whether an item exists in it, a ‘set’ can be much faster than a list.

When To Use

  • Frequent Membership Queries: When you need to repeatedly check for the existence of elements in the same collection.

  • Data Deduplication: When you need to eliminate duplicate entries from a collection.

Benefits

  • Speed: Membership tests on a set run in O(1) time on average, versus O(n) for a list.

  • Simplicity: Easily convert a list to a set and vice versa.

  • Uniqueness: By design, sets don't allow duplicate entries.


my_list = [1, 2, 2, 2, 2, 3, 5]

# Convert to set
my_set = set(my_list)

# Output (it removed duplicates)
{1, 2, 3, 5}

%time print(3 in my_list)
CPU times: user 1.03 ms,

%time print(3 in my_set)
CPU times: user 71 µs,

# set lookups are ~15x faster here, and the
# gap grows with the size of the collection


How was today’s email?

Not Great      Good      Amazing

Thank You

Want to promote your company, product, job, or event to 150,000+ AI researchers and engineers? You can reach out here.