
Local Inference with Meta’s Latest Llama 3.2 LLMs Using Ollama, LangChain, and Streamlit

Meta’s latest Llama 3.2 1B and 3B models are available from Ollama. Learn how to install and interact with these models locally using Streamlit and LangChain.

Gary A. Stafford
9 min read · Sep 27, 2024
Streamlit application featured in this post

Introduction

Meta just announced the release of Llama 3.2, a new collection of open, customizable edge AI and vision models, including “small and medium-sized vision LLMs (11B and 90B), and lightweight, text-only models (1B and 3B) that fit onto edge and mobile devices, including pre-trained and instruction-tuned versions.”

According to Meta, the lightweight, text-only Llama 3.2 1B and 3B models “support context length of 128K tokens and are state-of-the-art in their class for on-device use cases like summarization, instruction following, and rewriting tasks running locally at the edge.”

The same day, Ollama announced that the lightweight, text-only Llama 3.2 1B and 3B models were available on its platform. For those unfamiliar with it, Ollama lets you “Get up and running with large language models” quickly and easily in your local environment. According to Ollama, “The prompts and responses should feel instantaneous…”
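Before walking through the full application, here is a minimal sketch of the kind of local chat app this post builds toward: Streamlit for the UI, LangChain to call the model, and Ollama serving Llama 3.2 on your machine. It assumes Ollama is running locally, that you have pulled the model with `ollama pull llama3.2` (the 3B instruct tag; use `llama3.2:1b` for the 1B variant), and that the `streamlit`, `langchain-ollama`, and `langchain-core` packages are installed. The file name, prompt text, and temperature are illustrative, not the exact values used later in the post.

```python
# app.py — minimal local chat sketch: Streamlit + LangChain + Ollama (Llama 3.2)
import streamlit as st
from langchain_ollama import ChatOllama
from langchain_core.messages import AIMessage, HumanMessage

st.title("Chat with Llama 3.2 (local)")

# Chat model served by the local Ollama runtime.
# "llama3.2" resolves to the 3B instruct model; "llama3.2:1b" selects the 1B variant.
llm = ChatOllama(model="llama3.2", temperature=0.5)

# Keep the running conversation in Streamlit's session state across reruns.
if "messages" not in st.session_state:
    st.session_state.messages = []

# Replay prior turns so the chat history stays visible.
for msg in st.session_state.messages:
    role = "user" if isinstance(msg, HumanMessage) else "assistant"
    with st.chat_message(role):
        st.markdown(msg.content)

# Accept a new prompt and stream the model's reply token by token.
if prompt := st.chat_input("Ask the model something..."):
    st.session_state.messages.append(HumanMessage(content=prompt))
    with st.chat_message("user"):
        st.markdown(prompt)
    with st.chat_message("assistant"):
        chunks = (chunk.content for chunk in llm.stream(st.session_state.messages))
        reply = st.write_stream(chunks)
    st.session_state.messages.append(AIMessage(content=reply))
```

Launch it with `streamlit run app.py`; because everything runs against the local Ollama server, no API keys or network calls to a hosted model are involved.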
