
⚙️ Top AI repos you should know about + coding tips

Let's get technical, we just prepared you a toolbox of interesting repos and tools you can start implementing today. Enjoy!


Hey,

Welcome to AlphaSignal, your weekly guide to the latest and most promising developments in the world of AI.

In this week's edition, we've scoured GitHub to bring you the most trending and groundbreaking repositories, with LLaMA-Adapter as our standout highlight. We'll also explore the emergence of text-to-audio models and exciting projects building on top of Meta's recent work. But it's not just about the repositories: we also highlight practical tools and services to help you be even more productive. And for those looking to improve their coding skills, we've got you covered with tips for Python and PyTorch.

Let's get into it!

Lior

In today’s summary:

  • Repo highlight: LLaMA-Adapter

  • Top of GitHub: Bark, MiniGPT-4, ...

  • Python Tip: zip()

  • New AI tools: Cursor, HuggingChat, ...

  • PyTorch Tip: Gradient accumulation

HIGHLIGHT
LLaMA-Adapter

LLaMA-Adapter is a new lightweight adaptation method for fine-tuning LLaMA into an instruction-following model, using the 52K instruction-following data provided by Stanford Alpaca. The method inserts adapters into LLaMA's transformer, introducing only 1.2M learnable parameters, and turns LLaMA into an instruction-following model within one hour of training. To stabilize training at early stages, the authors propose a novel zero-initialized attention mechanism with zero gating, which adaptively incorporates the instructional signals.

After fine-tuning, LLaMA-Adapter generates high-quality instruction-following responses comparable to those produced by the fully fine-tuned Stanford Alpaca and Alpaca-LoRA. The approach also extends easily to multi-modal instructions: the reasoning framework of the image-conditioned LLaMA-Adapter for ScienceQA can be shared with other modalities such as audio and video. This makes it a promising way to turn LLaMA into a capable instruction follower at a fraction of the cost of full fine-tuning.
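To make the zero-init attention idea concrete, here is a minimal, hypothetical PyTorch sketch of a gated prompt-attention layer in that spirit (simplified to a single head, with the learnable adaption prompts reused as values; the class and parameter names are ours, not from the official repo). The key ingredient is the gate initialized to zero: at the start of training the prompt branch contributes nothing, so the frozen model's behavior is preserved.

import torch
import torch.nn as nn

class ZeroGatedPromptAttention(nn.Module):
    def __init__(self, dim, num_prompts):
        super().__init__()
        # Learnable adaption prompts (simplified: reused as keys and values)
        self.prompts = nn.Parameter(torch.randn(num_prompts, dim))
        # Gating factor initialized to zero ("zero gating")
        self.gate = nn.Parameter(torch.zeros(1))

    def forward(self, q, k, v):
        # q, k, v: (seq_len, dim); single attention head for brevity
        scale = q.size(-1) ** -0.5
        attn_tokens = ((q @ k.T) * scale).softmax(dim=-1)
        # Prompt scores are softmaxed separately, then gated
        attn_prompts = ((q @ self.prompts.T) * scale).softmax(dim=-1)
        attn_prompts = attn_prompts * torch.tanh(self.gate)
        # With gate == 0 this reduces to standard attention over tokens
        return attn_tokens @ v + attn_prompts @ self.prompts

At initialization the layer's output is identical to plain attention over the input tokens, and the gate learns how much instructional signal to mix in as training progresses.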

The Buyer’s Guide to Evaluating ML Feature Stores & Feature Platforms
If you’re looking to adopt a feature store or platform for machine learning and don’t know where or how to start your research, then this guide is for you.

Download the free guide to:

  • Access a comprehensive framework for understanding the capabilities of different feature stores and platforms

  • Get tips on how to use a data-driven approach to evaluate vendors

  • Learn how the right solution can improve ML model accuracy

Want to promote a product, job, or event to 100,000+ AI researchers and engineers? You can reach out to us here.

⚙️ TOP OF GITHUB

IDEA-Research / Grounded-Segment-Anything
Marrying Grounding DINO with Segment Anything & Stable Diffusion & BLIP & Whisper & ChatBot - Automatically Detect, Segment and Generate Anything with Image, Text, and Speech Inputs

suno-ai / bark
Text-to-audio model that generates multilingual speech and supports nonverbal communication. Pretrained checkpoints are available for inference.

Vision-CAIR / MiniGPT-4
MiniGPT-4: Enhancing Vision-language Understanding with Advanced Large Language Models

togethercomputer / RedPajama-Data
The repository contains code for preparing large datasets for training large language models.

microsoft / semantic-kernel
Integrate cutting-edge LLM technology quickly and easily into your apps

PYTHON TIP
zip()

When working with iterable objects such as lists, sets, and dictionaries in Python, the zip() function can be incredibly helpful. In the example below, it merges two lists, grouping the corresponding elements into tuples.

This makes it easy to iterate through both lists together. However, it's worth noting that if the two lists have different lengths, the resulting zip object is truncated to the length of the shorter list.

languages = ['Java','Python','Javascript']
versions = [14, 3, 6]

result = zip(languages, versions)

print(list(result))

# Output: [('Java', 14), ('Python', 3), ('Javascript', 6)]
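Note that zip() silently drops the extra elements of the longer input. If you need to keep them instead, itertools.zip_longest() pads the shorter input; here's a quick sketch (the fillvalue below is arbitrary):

from itertools import zip_longest

languages = ['Java', 'Python', 'Javascript', 'Rust']
versions = [14, 3, 6]  # one element shorter

print(list(zip_longest(languages, versions, fillvalue='n/a')))

# Output: [('Java', 14), ('Python', 3), ('Javascript', 6), ('Rust', 'n/a')]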

🛠 NEW TOOLS

Cursor
Cursor is an AI-powered programming editor that generates and edits code, and offers other useful features like chat (ChatGPT-style interface that understands your current file). Open-source alternative to GitHub Copilot.

HuggingChat
Open-source alternative to ChatGPT, powered by OpenAssistant.

Segment Anything
Segment Anything Model (SAM): a new AI model from Meta AI that can "cut out" any object, in any image, with a single click. SAM uses a variety of input prompts.

Runway Gen-2
A multi-modal AI system that can generate novel videos with text, images, or video clips.

Amazon Bedrock
A service for developing generative AI applications using foundation models through an API, without managing infrastructure.

PYTORCH TIP
Gradient accumulation

Training Large Language Models (LLMs) can benefit from larger batch sizes, but those batches may not fit into available GPU memory. "Gradient Accumulation" addresses this: you run several forward and backward passes, letting the gradients accumulate, and only then take an optimizer step. The effective batch size becomes the per-iteration batch size multiplied by the number of accumulation steps, so developers can train with larger effective batches without more powerful and expensive hardware.

How can you add gradient accumulation to your current training loop within PyTorch?

optimizer = ...
NUM_ACCUMULATION_STEPS = ...
for epoch in range(...):
    for idx, sample in enumerate(dataloader):
        inputs, labels = sample

        # Forward Pass
        outputs = model(inputs)

        # Compute Loss, scaled so the accumulated gradient
        # matches that of a single large batch
        loss = loss_fn(outputs, labels)
        loss = loss / NUM_ACCUMULATION_STEPS

        # Back-propagation: gradients accumulate in .grad across iterations
        loss.backward()

        # Update the weights only every NUM_ACCUMULATION_STEPS batches
        # (or on the last batch of the epoch)
        if (
            ((idx + 1) % NUM_ACCUMULATION_STEPS == 0)
            or (idx + 1 == len(dataloader))
        ):
            optimizer.step()
            optimizer.zero_grad()
