
How to Make Open-Source & Local LLMs Work in Practice

How to Get Open-Source LLMs Running Locally
I. Deploying Vector Databases with Local Models
II. Fine-Tuning LLMs for Multi-Turn Function Calling
III. Making LLM App Patterns Work With Open-Source LLMs
IV. Self-Hosting Llama 3.1 405B
V. You've Got Your Model Now What?
VI. Large-Scale Vector Search in E-Commerce and Email RAG
VII. Fine-Tuning in Practice
VIII. How to Own Your AI Code Assistant
Heavybit

How to Get Open-Source LLMs Running Locally

Heavybit has partnered with GenLab and the MLOps Community, a group of thousands of machine learning practitioners interested in deploying and managing models in production, to offer you a virtual ticket to an in-depth meetup on getting open-source LLMs running locally, recorded in the second half of 2024.

Below, we’ve captured lightning sessions from practitioners presenting practical methods for getting open-source LLMs working locally across a range of technical use cases, along with session summaries and speaker details.

I. Deploying Vector Databases with Local Models

In this live demo, KubeAI founder Sam Stoelinga walks through exactly how to deploy a vector database using an open model (Google Gemma) running locally on a laptop.

>> WATCH VIDEO: DEPLOYING VECTOR DATABASES WITH LOCAL MODELS
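The demo pairs a real vector database with a locally served embedding model (Gemma via KubeAI). As a rough sketch of the core idea only, and not the stack used in the talk, here is a toy in-memory vector store that ranks documents by cosine similarity; in practice the vectors would come from a locally hosted embedding model and the store would be a real vector database.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

class TinyVectorStore:
    """Toy in-memory stand-in for a vector database (illustration only)."""
    def __init__(self):
        self.rows = []

    def insert(self, doc_id, vector):
        self.rows.append((doc_id, vector))

    def query(self, vector, k=1):
        # Rank stored vectors by cosine similarity to the query, best first.
        ranked = sorted(self.rows, key=lambda r: cosine(r[1], vector), reverse=True)
        return [doc_id for doc_id, _ in ranked[:k]]

# Vectors here are hand-made placeholders for real embedding-model output.
store = TinyVectorStore()
store.insert("doc-a", [1.0, 0.0, 0.0])
store.insert("doc-b", [0.0, 1.0, 0.0])
print(store.query([0.9, 0.1, 0.0]))  # → ['doc-a']
```

A production setup swaps the hand-made vectors for embeddings from the local model and the toy store for a database with an approximate-nearest-neighbor index.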

II. Fine-Tuning LLMs for Multi-Turn Function Calling

In this video, Acorn Labs lead AI engineer Sanjay Nadhavajhala explains how to fine-tune LLMs to handle multi-turn function calls, which take place over multiple interactions with users.

>> WATCH VIDEO: FINE-TUNING LLMS FOR MULTI-TURN FUNCTION CALLING
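To make "multi-turn function calling" concrete: each training example is a conversation in which the model must call a tool, read its result, answer, and then handle a follow-up turn that requires calling the tool again with new arguments. The structure below is a hedged illustration loosely following the common chat-message format; the field names and the `get_weather` tool are assumptions, not the schema from the talk.

```python
# One hypothetical multi-turn function-calling training example.
# Field names mirror the widely used chat format (role/content/tool_call)
# but are illustrative, not the exact schema used in the session.
example = {
    "tools": [{
        "name": "get_weather",
        "parameters": {"city": {"type": "string"}},
    }],
    "messages": [
        {"role": "user", "content": "What's the weather in Paris?"},
        {"role": "assistant",
         "tool_call": {"name": "get_weather", "arguments": {"city": "Paris"}}},
        {"role": "tool", "name": "get_weather", "content": "18C, cloudy"},
        {"role": "assistant", "content": "It's 18C and cloudy in Paris."},
        # The second user turn is what makes this "multi-turn": the model
        # must keep context and issue a fresh call with new arguments.
        {"role": "user", "content": "And in Tokyo?"},
        {"role": "assistant",
         "tool_call": {"name": "get_weather", "arguments": {"city": "Tokyo"}}},
    ],
}

roles = [m["role"] for m in example["messages"]]
print(roles)
```

Fine-tuning on examples shaped like this teaches the model both when to emit a tool call and how to continue the conversation after the tool responds.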

III. Making LLM App Patterns Work With Open-Source LLMs

In this video, Helix.ml CEO and founder Luke Marsden shares lessons his startup team learned while building applications on top of LLMs.

>> WATCH VIDEO: MAKING LLM APP PATTERNS WORK WITH OPEN-SOURCE LLMS

IV. Self-Hosting Llama 3.1 405B

In this video, researcher Aditya Advani discusses the approach he took to deploying Meta's 405-billion-parameter model locally.

>> WATCH VIDEO: SELF-HOSTING LLAMA 3.1 405B

V. You've Got Your Model Now What?

In this video, Expanso CEO David Aronchick discusses the real-world challenges teams face when planning to deploy their models.

>> WATCH VIDEO: YOU'VE GOT YOUR MODEL NOW WHAT?

VI. Large-Scale Vector Search in E-Commerce and Email RAG

In this video, Weaviate co-founder and CTO Etienne Dilocker discusses large-scale semantic search use cases for e-commerce and RAG.

>> WATCH VIDEO: LARGE-SCALE VECTOR SEARCH IN E-COMMERCE AND EMAIL RAG

VII. Fine-Tuning in Practice

In this video, Sailplane CEO Sam Ramji discusses the key considerations for teams fine-tuning models, including model size and training costs.

>> WATCH VIDEO: FINE-TUNING IN PRACTICE

VIII. How to Own Your AI Code Assistant

In this video, Continue co-founder and CEO Ty Dunn explains how to use open-source, self-hosted models to create your own AI coding assistant.

>> WATCH VIDEO: HOW TO OWN YOUR AI CODE ASSISTANT