
AI-Powered Smart Doorbell

Build a smart doorbell that captures a visitor photo, uses a local LLM (LLaVA via Ollama) to describe who is at the door, and sends an intelligent notification to your phone via Telegram. All AI processing runs locally on your hardware -- no cloud APIs needed.

  • Nodes used: 5
  • Build time: ~45 min
  • AI processing: local
  • Response time: ~3 s

Flow Architecture

[GPIO In: Button] --> [HTTP Request: Capture Image] --+
                                                       |
[HTTP In: POST /doorbell] -----------------------------+
                                                       |
                                                       v
                                            [Function: Prepare Image]
                                                       |
                                                       v
                                            [Ollama: LLaVA Vision]
                                            "Describe this visitor"
                                                       |
                                                       v
                                            [Function: Format Message]
                                                       |
                                                       v
                                            [Telegram: Send with Photo]

What You'll Need

Hardware

  • Raspberry Pi 4/5 (4GB+ RAM recommended)
  • Pi Camera Module or IP camera with snapshot URL
  • Push button + 10kΩ pull-down resistor (for GPIO trigger)
  • Optional: separate GPU server for faster LLM inference

Software

  • EdgeFlow installed and running
  • Ollama installed with LLaVA model pulled
  • Telegram Bot (via BotFather)
  • Optional: motion (for Pi Camera HTTP snapshots)

Step-by-Step Setup

1. Install Ollama

Install Ollama on the machine that will run the LLM. This can be the Pi itself (slower) or a separate server with a GPU (recommended for faster inference).

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Verify installation
ollama --version

2. Pull the LLaVA Vision Model

LLaVA is a multimodal model that can understand images. Pull the model -- this may take several minutes depending on your internet speed (the model is approximately 4.7GB).

# Pull LLaVA model (~4.7GB)
ollama pull llava

# Test it works -- include the image path in the prompt;
# the CLI attaches local image files it finds there
ollama run llava "Describe this image: ./test.jpg"

# The default llava tag is the 7B variant; for higher-quality
# descriptions (at the cost of speed and memory), try:
ollama pull llava:13b

3. Set Up the Camera

Configure your camera to provide an HTTP snapshot URL. For a Pi Camera, use the motion package. For an IP camera, find the snapshot URL in the camera's documentation.

# Pi Camera with motion
sudo apt install motion
# Edit /etc/motion/motion.conf:
#   stream_port 8081
#   snapshot_interval 0
#   webcontrol_port 8080
sudo systemctl start motion

# Snapshot URL: http://localhost:8080/0/action/snapshot
# Stream URL: http://localhost:8081

# IP Camera examples:
# http://192.168.1.100/snapshot.jpg
# rtsp://user:pass@192.168.1.100:554/stream1
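Before wiring the camera into the flow, it is worth confirming the snapshot URL actually returns an image. The URL and scratch path below are examples -- substitute your own:

```shell
# Fetch one snapshot and confirm it is a real image file
curl -s -o /tmp/doorbell-test.jpg "http://localhost:8080/0/action/snapshot"
file /tmp/doorbell-test.jpg
```

If `file` reports HTML or empty data instead of JPEG image data, fix the camera configuration before continuing.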

4. Create a Telegram Bot

Open Telegram, search for @BotFather, and send /newbot. Follow the prompts to get your bot token. Then send a message to your bot and use the API to find your chat ID.

# After creating the bot, get your chat ID:
curl "https://api.telegram.org/botYOUR_BOT_TOKEN/getUpdates"

# Look for: "chat":{"id":123456789,...}
# That number is your chat ID
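To confirm the token and chat ID work before building the flow, send a test message through the Bot API's sendMessage method (replace the placeholders with your values):

```shell
# Should return {"ok":true,...} and deliver a message to your chat
curl -s -X POST "https://api.telegram.org/botYOUR_BOT_TOKEN/sendMessage" \
  -d "chat_id=YOUR_CHAT_ID" \
  -d "text=EdgeFlow doorbell test"
```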

5. Import the Flow

Copy the flow JSON below. In EdgeFlow, go to Menu → Import, paste the JSON, and click Import.

6. Configure and Deploy

Update the following in the flow: camera snapshot URL, Ollama host (default: http://localhost:11434), Telegram bot token and chat ID. Then click Deploy. Press the doorbell button or send a POST request to /doorbell to test.
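A quick end-to-end test without touching the button is a POST to the HTTP In endpoint. The port below is an assumption (1880 is a common default for flow editors) -- adjust it to match your EdgeFlow instance:

```shell
# Trigger the doorbell flow manually
curl -X POST http://localhost:1880/doorbell
```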

Configuration Details

Ollama Node Configuration

Property      Value                     Notes
host          http://localhost:11434    Change if Ollama runs on another machine
model         llava                     Multimodal vision model
temperature   0.3                       Low for consistent descriptions
prompt        (see below)               Custom prompt for doorbell context

LLaVA Prompt Engineering

The prompt is critical for getting useful descriptions. Here is the optimized prompt used in the flow:

You are a smart doorbell assistant. Analyze this doorbell camera image and provide
a brief, useful description of the visitor. Include:
1. Number of people visible
2. Apparent gender and approximate age
3. Notable clothing or accessories
4. Whether they are carrying packages or items
5. Any visible vehicles in the background
6. Overall assessment (delivery person, neighbor, stranger, etc.)

Keep the description to 2-3 sentences. Be factual and concise.
Do not speculate about identity or intentions.

Function Node Code

Prepare Image for LLM

This function takes the captured image (as a buffer) and prepares it for the Ollama vision API:

// Prepare image for Ollama LLaVA model
// Input: msg.payload = image buffer from HTTP request
// Output: msg with base64 image for Ollama node

var imageBuffer = msg.payload;
var base64Image = imageBuffer.toString('base64');

// Store original image for Telegram later
flow.set('doorbell_image', imageBuffer);
flow.set('doorbell_time', new Date().toLocaleString());

msg.payload = {
    model: "llava",
    prompt: "You are a smart doorbell assistant. Analyze this doorbell camera image and provide a brief description of the visitor. Include number of people, appearance, clothing, packages, and your assessment of who they might be (delivery, neighbor, stranger). Keep it to 2-3 sentences.",
    images: [base64Image],
    stream: false,
    options: {
        temperature: 0.3
    }
};

return msg;
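The function above assumes msg.payload arrives as a binary Buffer. If your HTTP request node is configured to return a string or an array instead, toString('base64') will produce garbage; a small normalization helper (a sketch, not part of the flow above) guards against that:

```javascript
// Normalize a payload to a Buffer before base64-encoding it.
function toBuffer(payload) {
    if (Buffer.isBuffer(payload)) return payload;
    if (typeof payload === 'string') return Buffer.from(payload, 'binary');
    return Buffer.from(payload); // arrays, TypedArrays, etc.
}

// JPEG files start with the bytes FF D8 FF, which base64-encode
// to the familiar "/9j/" prefix.
console.log(toBuffer([0xFF, 0xD8, 0xFF]).toString('base64')); // → "/9j/"
```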

Format Telegram Message

This function formats the LLM response into a nice Telegram notification:

// Format the Ollama response for Telegram
var description = msg.payload.response || msg.payload;
var timestamp = flow.get('doorbell_time') || new Date().toLocaleString();
var image = flow.get('doorbell_image');

// Build Telegram message
msg.payload = {
    type: "photo",
    content: image,
    caption: "🛎️ *Doorbell Ring*\n"
           + "🕒 " + timestamp + "\n\n"
           + "👤 *Visitor Description:*\n"
           + description + "\n\n"
           + "_AI-powered by LLaVA (local)_",
    options: {
        parse_mode: "Markdown"
    }
};

return msg;
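One caveat with parse_mode "Markdown": if the LLaVA description happens to contain characters such as _, *, ` or [, Telegram may reject the message with a parse error. A small escaping helper (hypothetical, not included in the flow above) avoids that for the description portion of the caption:

```javascript
// Escape Telegram legacy-Markdown special characters so a raw
// LLM description cannot break message parsing.
function escapeMarkdown(text) {
    return String(text).replace(/([_*`\[])/g, '\\$1');
}

console.log(escapeMarkdown("wearing a *bright* blue_jacket"));
// → wearing a \*bright\* blue\_jacket
```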

Complete Flow JSON

Copy and import this flow into EdgeFlow via Menu → Import.

{
  "name": "AI-Powered Smart Doorbell",
  "nodes": [
    {
      "id": "gpio_button",
      "type": "gpio-in",
      "name": "Doorbell Button",
      "pin": 17,
      "edge": "rising",
      "debounce": 500,
      "x": 120,
      "y": 120
    },
    {
      "id": "http_in_doorbell",
      "type": "http-in",
      "name": "POST /doorbell",
      "method": "post",
      "url": "/doorbell",
      "x": 120,
      "y": 240
    },
    {
      "id": "http_capture",
      "type": "http-request",
      "name": "Capture Image",
      "method": "GET",
      "url": "http://localhost:8080/0/action/snapshot",
      "returnType": "bin",
      "x": 360,
      "y": 180
    },
    {
      "id": "func_prepare",
      "type": "function",
      "name": "Prepare Image for LLM",
      "code": "var imageBuffer = msg.payload;\nvar base64Image = imageBuffer.toString('base64');\nflow.set('doorbell_image', imageBuffer);\nflow.set('doorbell_time', new Date().toLocaleString());\nmsg.payload = { model: 'llava', prompt: 'You are a smart doorbell assistant. Analyze this doorbell camera image and provide a brief description of the visitor. Include number of people, appearance, clothing, packages, and assessment. Keep to 2-3 sentences.', images: [base64Image], stream: false, options: { temperature: 0.3 } };\nreturn msg;",
      "x": 580,
      "y": 180
    },
    {
      "id": "ollama_llava",
      "type": "ollama",
      "name": "LLaVA Vision",
      "host": "http://localhost:11434",
      "model": "llava",
      "x": 800,
      "y": 180
    },
    {
      "id": "func_format",
      "type": "function",
      "name": "Format Message",
      "code": "var description = msg.payload.response || msg.payload;\nvar timestamp = flow.get('doorbell_time') || new Date().toLocaleString();\nvar image = flow.get('doorbell_image');\nmsg.payload = { type: 'photo', content: image, caption: 'Doorbell Ring\\n' + timestamp + '\\n\\nVisitor Description:\\n' + description, options: { parse_mode: 'Markdown' } };\nreturn msg;",
      "x": 1020,
      "y": 180
    },
    {
      "id": "telegram_send",
      "type": "telegram",
      "name": "Send Notification",
      "botToken": "YOUR_BOT_TOKEN",
      "chatId": "YOUR_CHAT_ID",
      "x": 1240,
      "y": 180
    },
    {
      "id": "http_response",
      "type": "http-response",
      "name": "OK Response",
      "statusCode": 200,
      "x": 1240,
      "y": 280
    },
    {
      "id": "debug_desc",
      "type": "debug",
      "name": "Log Description",
      "x": 1240,
      "y": 100
    }
  ],
  "connections": [
    { "from": "gpio_button", "to": "http_capture" },
    { "from": "http_in_doorbell", "to": "http_capture" },
    { "from": "http_capture", "to": "func_prepare" },
    { "from": "func_prepare", "to": "ollama_llava" },
    { "from": "ollama_llava", "to": "func_format" },
    { "from": "func_format", "to": "telegram_send" },
    { "from": "func_format", "to": "http_response" },
    { "from": "ollama_llava", "to": "debug_desc" }
  ]
}

Expected Output

When someone presses the doorbell, you receive a Telegram message like this:

🤖
EdgeFlow Doorbell

Doorbell Ring

2/12/2026, 2:34:15 PM

Visitor Description:
A middle-aged man wearing glasses and a blue jacket is standing at the front door, carrying a medium-sized cardboard package. He appears to be a delivery person. No vehicles visible in the driveway.

AI-powered by LLaVA (local)

The debug node also logs the raw LLM response for review:

{
  "payload": {
    "model": "llava",
    "response": "A middle-aged man wearing glasses and a blue jacket is standing at the front door, carrying a medium-sized cardboard package. He appears to be a delivery person. No vehicles visible in the driveway.",
    "done": true,
    "total_duration": 2847000000,
    "eval_count": 42
  }
}

Troubleshooting

Ollama connection refused

Verify Ollama is running with systemctl status ollama. If running on a different machine, ensure the host is set to http://REMOTE_IP:11434 and that the firewall allows port 11434. You may need to set OLLAMA_HOST=0.0.0.0 in the Ollama environment.
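A quick way to check reachability from the EdgeFlow machine is to hit Ollama's REST API directly; /api/tags lists the installed models:

```shell
# Should return JSON listing installed models, including llava
curl -s http://localhost:11434/api/tags
```

A connection error here means the problem is networking or the Ollama service, not the flow.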

LLaVA response is very slow

On a Raspberry Pi 4, LLaVA can take 15-30 seconds. For faster results, run Ollama on a separate machine with a GPU. Alternatively, try the smaller llava:7b model. Ensure no other heavy processes are consuming memory or CPU on the Pi.

Camera snapshot returns error

Test the snapshot URL directly in a browser. For Pi Camera via motion, ensure the service is running and the correct port is configured. For IP cameras, check authentication credentials in the URL. Some cameras require digest auth rather than basic auth.

Telegram photo not sending

Verify the bot token and chat ID are correct. The image buffer must be a valid JPEG or PNG. Check that the Telegram bot has not been blocked. Use the debug node to inspect the image buffer size -- it should be at least a few KB for a valid image.
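To inspect the buffer as suggested above, a magic-byte check (a sketch; imageType is not part of the flow) tells you whether the capture actually produced a JPEG or PNG:

```javascript
// Identify an image format from its leading magic bytes.
function imageType(buf) {
    if (!Buffer.isBuffer(buf) || buf.length < 4) return 'unknown';
    if (buf[0] === 0xFF && buf[1] === 0xD8) return 'jpeg';
    if (buf[0] === 0x89 && buf[1] === 0x50 && buf[2] === 0x4E && buf[3] === 0x47) return 'png';
    return 'unknown';
}

console.log(imageType(Buffer.from([0xFF, 0xD8, 0xFF, 0xE0]))); // → "jpeg"
```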

Next Steps