
ollama

Run open-source LLMs locally with Ollama for private, offline AI inference.

Overview

The ollama node runs open-source LLMs locally using Ollama. It is well suited to edge deployments where privacy is critical, internet access is unavailable, or you want cost-free, unlimited inference. Supported models include Llama, Mistral, CodeLlama, and more.

Local: On-Device
Private: No Cloud
Free: Unlimited
Offline: Capable

Properties

| Property | Type | Default | Description |
|---|---|---|---|
| host | string | "http://localhost:11434" | Ollama server URL |
| model | string | "llama3.2" | Model name to use |
| systemPrompt | string | "" | System message |
| temperature | number | 0.7 | Creativity (0-2) |
| numCtx | number | 4096 | Context window size |
| stream | boolean | false | Stream response tokens |
| keepAlive | string | "5m" | Keep model loaded in memory |
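The examples later in this page override several of these properties per message (msg.model, msg.temperature, and so on). A minimal sketch of building such a message in an upstream Function node, assuming the node honors msg-level overrides with the same names as the property table (the buildMsg helper is illustrative):

```javascript
// Function node sketch: per-message overrides for the ollama node.
// Assumes msg.model, msg.temperature, and msg.keepAlive are read as
// overrides mirroring the property names above (an assumption here).
function buildMsg(payload) {
    var msg = { payload: payload };
    msg.model = "mistral";   // use a different model for this request only
    msg.temperature = 0.2;   // more deterministic output
    msg.keepAlive = "1h";    // keep the model resident longer
    return msg;
}
```

In a Function node this would end with `return buildMsg(msg.payload);`.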

Popular Models

Llama 3.2 (3B)

Great for edge devices, fast inference

2GB RAM Fast
ollama pull llama3.2

Llama 3.1 (8B)

Balanced performance and quality

5GB RAM Capable
ollama pull llama3.1

Mistral (7B)

Excellent reasoning capabilities

4GB RAM Reasoning
ollama pull mistral

CodeLlama (7B)

Specialized for code generation

4GB RAM Code
ollama pull codellama

Phi-3 Mini (3.8B)

Microsoft's efficient small model

2GB RAM Efficient
ollama pull phi3

LLaVA (7B)

Vision + language multimodal

5GB RAM Vision
ollama pull llava

Installing Ollama

Linux (Raspberry Pi, Ubuntu)

curl -fsSL https://ollama.com/install.sh | sh

macOS

brew install ollama

Docker

docker run -d -v ollama:/root/.ollama -p 11434:11434 ollama/ollama

Example: Local Sensor Classification

Classify sensor readings locally without sending data to the cloud.

// Function node: Classify sensor data locally
var reading = msg.payload;

msg.systemPrompt = `You are a sensor data classifier. Classify readings into:
- NORMAL: Within expected range
- WARNING: Approaching limits
- CRITICAL: Requires immediate attention

Respond with ONLY the classification and a brief reason.`;

msg.payload = `Classify this sensor reading:
Temperature: ${reading.temperature}°C
Humidity: ${reading.humidity}%
Pressure: ${reading.pressure} hPa

Normal ranges:
- Temperature: 18-26°C
- Humidity: 30-60%
- Pressure: 980-1020 hPa`;

msg.temperature = 0.1; // Very deterministic
msg.model = "llama3.2"; // Fast, small model

return msg;
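The model's reply arrives as free text in msg.payload, so a downstream Function node can normalize it before routing. A minimal sketch, assuming the model leads with one of the three labels requested in the system prompt (the parseClassification helper is illustrative, not part of the node):

```javascript
// Function node sketch: normalize the classifier's free-text reply.
// Assumes the reply contains NORMAL, WARNING, or CRITICAL as instructed;
// anything else is treated as UNKNOWN.
function parseClassification(reply) {
    var match = reply.match(/\b(NORMAL|WARNING|CRITICAL)\b/);
    return match ? match[1] : "UNKNOWN";
}
```

In the flow: `msg.status = parseClassification(msg.payload); return msg;`, then a Switch node can route on msg.status.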

Example: Local Image Analysis with LLaVA

Analyze camera images locally using the LLaVA vision model.

// Function node: Analyze image locally
var imageBuffer = msg.payload;
var base64Image = imageBuffer.toString('base64');

msg.model = "llava";
msg.payload = "Describe what you see in this image. Focus on any people or vehicles.";
msg.images = [base64Image];

return msg;

// Output processing
// Function node: Parse and act on results
if (msg.payload.toLowerCase().includes("person")) {
    msg.alert = true;
    msg.alertType = "motion_detected";
}
return msg;
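A single substring check can miss related terms in the description. A slightly more general sketch, using an illustrative watch list (the detectObjects helper and its list are assumptions, not part of the node):

```javascript
// Function node sketch: collect every watched object mentioned in the
// model's description. The watch list is an illustrative choice.
function detectObjects(description) {
    var watched = ["person", "vehicle", "car", "truck", "bicycle"];
    var text = description.toLowerCase();
    return watched.filter(function (word) {
        return text.indexOf(word) !== -1;
    });
}
```

In the flow: `msg.detected = detectObjects(msg.payload); msg.alert = msg.detected.length > 0; return msg;`.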

Performance Tips

Keep Model Loaded

Set keepAlive: "1h" to avoid reload delays

Use Quantized Models

q4_0 models use less RAM with minimal quality loss

GPU Acceleration

NVIDIA GPUs with CUDA dramatically speed up inference

Right-Size Context

Lower numCtx if you don't need long context
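For the context-sizing tip, a rough budget check before sending a prompt can catch oversized inputs early. A sketch assuming the common ~4-characters-per-token rule of thumb, which only approximates Ollama's actual tokenizers:

```javascript
// Sketch: rough token budgeting against numCtx before sending a prompt.
// The 4-characters-per-token ratio is a rule-of-thumb approximation,
// not a real tokenizer count.
function fitsContext(prompt, numCtx) {
    var approxTokens = Math.ceil(prompt.length / 4);
    return approxTokens <= numCtx;
}
```

A Function node could drop or truncate msg.payload when `fitsContext(msg.payload, 4096)` is false.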

Related Nodes