
How to Make Open-Source & Local LLMs Work in Practice

How to Get Open-Source LLMs Running Locally
I. Deploying Vector Databases with Local Models
II. Fine-Tuning LLMs for Multi-Turn Function Calling
III. Making LLM App Patterns Work With Open-Source LLMs
IV. Self-Hosting Llama 3.1 405B
V. You've Got Your Model Now What?
VI. Large-Scale Vector Search in E-Commerce and Email RAG
VII. Fine-Tuning in Practice
VIII. How to Own Your AI Code Assistant
Heavybit

How to Get Open-Source LLMs Running Locally

Heavybit has partnered with GenLab and the MLOps Community, a group of thousands of machine learning practitioners interested in deploying and managing models in production, to offer you a virtual ticket to an in-depth meetup on getting open-source LLMs running locally, recorded in the second half of 2024.

Below, we’ve captured lightning sessions from practitioners presenting practical methods for getting open-source LLMs working locally across a range of technical use cases, along with session summaries and speaker details.

I. Deploying Vector Databases with Local Models

In this live demo, KubeAI founder Sam Stoelinga walks through exactly how to deploy a vector database using an open model (Google Gemma) running locally on a laptop.

>> WATCH VIDEO: DEPLOYING VECTOR DATABASES WITH LOCAL MODELS
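The demo pairs a real vector database with a locally served embedding model (Gemma via KubeAI). As a rough sketch of the core idea only, and not the stack used in the talk, here is a toy in-memory vector store that ranks documents by cosine similarity; in practice the vectors would come from a locally hosted embedding model and the store would be a real vector database.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

class TinyVectorStore:
    """Toy in-memory stand-in for a vector database (illustration only)."""
    def __init__(self):
        self.rows = []

    def insert(self, doc_id, vector):
        self.rows.append((doc_id, vector))

    def query(self, vector, k=1):
        # Rank stored vectors by cosine similarity to the query, best first.
        ranked = sorted(self.rows, key=lambda r: cosine(r[1], vector), reverse=True)
        return [doc_id for doc_id, _ in ranked[:k]]

# Vectors here are hand-made placeholders for real embedding-model output.
store = TinyVectorStore()
store.insert("doc-a", [1.0, 0.0, 0.0])
store.insert("doc-b", [0.0, 1.0, 0.0])
print(store.query([0.9, 0.1, 0.0]))  # → ['doc-a']
```

A production setup swaps the hand-made vectors for embeddings from the local model and the toy store for a database with an approximate-nearest-neighbor index.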

II. Fine-Tuning LLMs for Multi-Turn Function Calling

In this video, Acorn Labs lead AI engineer Sanjay Nadhavajhala explains how to fine-tune LLMs to handle multi-turn function calls, which take place over multiple interactions with users.

>> WATCH VIDEO: FINE-TUNING LLMS FOR MULTI-TURN FUNCTION CALLING
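To make "multi-turn function calling" concrete: each training example is a conversation in which the model must call a tool, read its result, answer, and then handle a follow-up turn that requires calling the tool again with new arguments. The structure below is a hedged illustration loosely following the common chat-message format; the field names and the `get_weather` tool are assumptions, not the schema from the talk.

```python
# One hypothetical multi-turn function-calling training example.
# Field names mirror the widely used chat format (role/content/tool_call)
# but are illustrative, not the exact schema used in the session.
example = {
    "tools": [{
        "name": "get_weather",
        "parameters": {"city": {"type": "string"}},
    }],
    "messages": [
        {"role": "user", "content": "What's the weather in Paris?"},
        {"role": "assistant",
         "tool_call": {"name": "get_weather", "arguments": {"city": "Paris"}}},
        {"role": "tool", "name": "get_weather", "content": "18C, cloudy"},
        {"role": "assistant", "content": "It's 18C and cloudy in Paris."},
        # The second user turn is what makes this "multi-turn": the model
        # must keep context and issue a fresh call with new arguments.
        {"role": "user", "content": "And in Tokyo?"},
        {"role": "assistant",
         "tool_call": {"name": "get_weather", "arguments": {"city": "Tokyo"}}},
    ],
}

roles = [m["role"] for m in example["messages"]]
print(roles)
```

Fine-tuning on examples shaped like this teaches the model both when to emit a tool call and how to continue the conversation after the tool responds.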

III. Making LLM App Patterns Work With Open-Source LLMs

In this video, Helix.ml CEO and founder Luke Marsden shares lessons his startup team learned while building applications on top of LLMs.

>> WATCH VIDEO: MAKING LLM APP PATTERNS WORK WITH OPEN-SOURCE LLMS

IV. Self-Hosting Llama 3.1 405B

In this video, researcher Aditya Advani discusses the approach he took to deploying Meta's 405-billion-parameter model locally.

>> WATCH VIDEO: SELF-HOSTING LLAMA 3.1 405B

V. You've Got Your Model Now What?

In this video, Expanso CEO David Aronchick discusses the real-world challenges teams face when planning to deploy their models.

>> WATCH VIDEO: YOU'VE GOT YOUR MODEL NOW WHAT?

VI. Large-Scale Vector Search in E-Commerce and Email RAG

In this video, Weaviate co-founder and CTO Etienne Dilocker discusses large-scale semantic search use cases for e-commerce and RAG.

>> WATCH VIDEO: LARGE-SCALE VECTOR SEARCH IN E-COMMERCE AND EMAIL RAG

VII. Fine-Tuning in Practice

In this video, Sailplane CEO Sam Ramji discusses the key considerations for teams fine-tuning models, including model size and training costs.

>> WATCH VIDEO: FINE-TUNING IN PRACTICE

VIII. How to Own Your AI Code Assistant

In this video, Continue co-founder and CEO Ty Dunn explains how to use open-source, self-hosted models to create your own AI coding assistant.

>> WATCH VIDEO: HOW TO OWN YOUR AI CODE ASSISTANT