How to Make Open-Source & Local LLMs Work in Practice - Heavybit
How to Get Open-Source LLMs Running Locally
Heavybit has partnered with GenLab and the MLOps Community, a group of thousands of machine learning practitioners focused on deploying and managing models in production, to offer you a virtual ticket to an in-depth meetup on getting open-source LLMs running locally, recorded in the second half of 2024.
Below, we’ve captured lightning sessions in which practitioners present practical methods for getting open-source LLMs working locally across a range of technical use cases, along with session summaries and speaker details.
I. Deploying Vector Databases with Local Models
In this live demo, KubeAI founder Sam Stoelinga walks through exactly how to deploy a vector database using an open model (Google Gemma) running locally on a laptop.
>> WATCH VIDEO: DEPLOYING VECTOR DATABASES WITH LOCAL MODELS
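To illustrate the core idea behind the demo, here is a minimal, self-contained sketch of what a vector database does under the hood: store (text, embedding) pairs and rank them by cosine similarity against a query embedding. In the actual demo, embeddings come from a locally served model (Gemma via KubeAI); the tiny hand-written vectors below are stand-ins so the sketch runs on its own.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

class ToyVectorStore:
    """Brute-force in-memory vector store; real databases add indexing
    (e.g. HNSW) to avoid scanning every stored vector."""

    def __init__(self):
        self.items = []  # list of (text, embedding) pairs

    def add(self, text, embedding):
        self.items.append((text, embedding))

    def search(self, query_embedding, k=1):
        ranked = sorted(self.items,
                        key=lambda item: cosine(item[1], query_embedding),
                        reverse=True)
        return [text for text, _ in ranked[:k]]

store = ToyVectorStore()
store.add("how to deploy kubernetes", [0.9, 0.1, 0.0])
store.add("baking sourdough bread",   [0.0, 0.2, 0.9])
print(store.search([0.8, 0.2, 0.1], k=1))  # → ['how to deploy kubernetes']
```

Swapping the hand-written vectors for embeddings produced by a locally hosted model is the only change needed to turn this toy into a working local semantic-search loop.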
II. Fine-Tuning LLMs for Multi-Turn Function Calling
In this video, Acorn Labs lead AI engineer Sanjay Nadhavajhala explains how to fine-tune LLMs to handle multi-turn function calls, which take place over multiple interactions with users.
>> WATCH VIDEO: FINE-TUNING LLMs FOR MULTI-TURN FUNCTION CALLING
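For context on what a multi-turn function-calling training example looks like, here is a sketch in the widely used OpenAI-style tool-call message format. The exact schema and tool names in the talk aren't specified here, so treat the field names and the `get_weather` tool as illustrative assumptions; the key property is that the model must issue a tool call, read the tool result in a later turn, and then respond.

```python
import json

# One multi-turn training example: user asks, assistant calls a
# (hypothetical) tool, the tool result comes back as its own turn,
# and the assistant answers using that result.
example = [
    {"role": "user",
     "content": "What's the weather in Paris, and should I pack an umbrella?"},
    {"role": "assistant", "tool_calls": [
        {"id": "call_1", "type": "function", "function": {
            "name": "get_weather",  # hypothetical tool name
            "arguments": json.dumps({"city": "Paris"})}}]},
    {"role": "tool", "tool_call_id": "call_1",
     "content": json.dumps({"forecast": "rain"})},
    {"role": "assistant",
     "content": "It's forecast to rain in Paris, so pack an umbrella."},
]

def tool_call_turns(messages):
    """Count assistant turns that invoke a tool; fine-tuning data for
    multi-turn function calling spreads calls and results across turns
    rather than packing everything into one exchange."""
    return sum(1 for m in messages
               if m.get("role") == "assistant" and "tool_calls" in m)

print(tool_call_turns(example))  # → 1
```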
III. Making LLM App Patterns Work With Open-Source LLMs
In this video, Helix.ml CEO and founder Luke Marsden shares lessons his startup team learned while building applications on top of LLMs.
>> WATCH VIDEO: MAKING LLM APP PATTERNS WORK WITH OPEN-SOURCE LLMS
IV. Self-Hosting Llama 3.1 405B
In this video, researcher Aditya Advani discusses the approach he took to deploying Meta's 405-billion-parameter model locally.
>> WATCH VIDEO: SELF-HOSTING LLAMA 3.1 405B
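To give a sense of why self-hosting a model this size is hard, here is a back-of-the-envelope estimate of the memory needed just to hold the weights at different precisions. These figures ignore the KV cache, activations, and runtime overhead, so real deployments need meaningfully more memory than this.

```python
# Weight-memory estimate for a 405-billion-parameter model.
PARAMS = 405e9  # Llama 3.1 405B

def weight_gb(bytes_per_param):
    """Decimal gigabytes needed to store the weights alone."""
    return PARAMS * bytes_per_param / 1e9

for label, bpp in [("fp16/bf16", 2.0), ("int8", 1.0), ("4-bit", 0.5)]:
    print(f"{label:>9}: ~{weight_gb(bpp):,.0f} GB of weights")
# fp16/bf16 ≈ 810 GB, int8 ≈ 405 GB, 4-bit ≈ 203 GB
```

Even at 4-bit quantization, the weights alone exceed the memory of any single consumer GPU, which is why local deployments of this model involve multi-GPU nodes or aggressive offloading.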
V. You've Got Your Model, Now What?
In this video, Expanso CEO David Aronchick discusses the real-world challenges teams face as they plan to deploy their models.
>> WATCH VIDEO: YOU'VE GOT YOUR MODEL, NOW WHAT?
VI. Large-Scale Vector Search in E-Commerce and Email RAG
In this video, Weaviate co-founder and CTO Etienne Dilocker discusses large-scale semantic search use cases for e-commerce and RAG.
>> WATCH VIDEO: LARGE-SCALE VECTOR SEARCH IN E-COMMERCE AND EMAIL RAG
VII. Fine-Tuning in Practice
In this video, Sailplane CEO Sam Ramji discusses key considerations for teams fine-tuning models, including model size and training costs.
>> WATCH VIDEO: FINE-TUNING IN PRACTICE
VIII. How to Own Your AI Code Assistant
In this video, Continue co-founder and CEO Ty Dunn explains how to use open-source, self-hosted models to create your own AI coding assistant.
>> WATCH VIDEO: HOW TO OWN YOUR AI CODE ASSISTANT