Overview

EdgeFlow collects telemetry from every device in your fleet. Metrics are gathered locally, broadcast in real time over WebSocket, and synced to the cloud for centralized dashboards and alerting.

System Metrics

The resource monitor collects system-level metrics continuously:

Metric	Source	Description
CPU Usage	`/proc/stat`	Per-core and total CPU utilization percentage
Memory	`/proc/meminfo`	Total, used, free, available, cached, and swap
Disk	`statfs`	Total, used, and free disk space per mount point
Temperature	`/sys/class/thermal`	CPU/GPU temperature in °C (Raspberry Pi)
Uptime	`/proc/uptime`	System uptime in seconds
Load Average	`/proc/loadavg`	1, 5, and 15-minute load averages
Network I/O	`/proc/net/dev`	Bytes received/transmitted per interface
Goroutines	`runtime`	Number of active goroutines
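
To illustrate how a utilization percentage can be derived from `/proc/stat`, here is a minimal Python sketch. It is not EdgeFlow's implementation. It assumes the usual approach of sampling the aggregate `cpu` line twice and comparing tick deltas, and the function names are hypothetical.

```python
import time


def parse_cpu_line(line: str) -> tuple[int, int]:
    """Return (idle_ticks, total_ticks) from an aggregate 'cpu' line of /proc/stat."""
    fields = [int(v) for v in line.split()[1:]]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice
    idle = fields[3] + (fields[4] if len(fields) > 4 else 0)  # idle + iowait
    total = sum(fields[:8])  # guest time is already counted inside user/nice
    return idle, total


def cpu_usage_percent(before: str, after: str) -> float:
    """Utilization between two samples of the same 'cpu' line."""
    idle_a, total_a = parse_cpu_line(before)
    idle_b, total_b = parse_cpu_line(after)
    total_delta = total_b - total_a
    if total_delta <= 0:
        return 0.0  # no ticks elapsed between samples
    return 100.0 * (1.0 - (idle_b - idle_a) / total_delta)


def read_cpu_line(path: str = "/proc/stat") -> str:
    """Read the first (aggregate) line of /proc/stat. Linux only."""
    with open(path) as f:
        return f.readline()


def sample_cpu_usage(interval: float = 1.0) -> float:
    """Take two samples `interval` seconds apart and return the usage percentage."""
    first = read_cpu_line()
    time.sleep(interval)
    return cpu_usage_percent(first, read_cpu_line())
```

Per-core figures follow the same arithmetic applied to the `cpu0`, `cpu1`, ... lines instead of the aggregate line.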

REST API

# Get resource statistics
curl http://localhost:8080/api/v1/resources/stats

{
  "cpu": {
    "usage_percent": 23.5,
    "cores": 4
  },
  "memory": {
    "total_bytes": 4294967296,
    "used_bytes": 1073741824,
    "available_bytes": 3221225472,
    "percent": 25.0
  },
  "disk": {
    "total_bytes": 32000000000,
    "used_bytes": 8000000000,
    "free_bytes": 24000000000,
    "percent": 25.0
  },
  "temperature": 45.5,
  "uptime": 86400,
  "load_avg": {
    "1min": 0.5,
    "5min": 0.3,
    "15min": 0.2
  }
}

# Get detailed resource report
curl http://localhost:8080/api/v1/resources/report
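
The same statistics request can be made from a script. The following Python sketch uses only the standard library and relies on the host, port, and field names shown in the sample response above. `fetch_stats` and `summarize` are hypothetical helper names, not part of EdgeFlow.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://localhost:8080/api/v1"  # host and port from the curl examples


def fetch_stats(base_url: str = BASE_URL, timeout: float = 5.0) -> dict:
    """GET /resources/stats and decode the JSON body."""
    with urlopen(f"{base_url}/resources/stats", timeout=timeout) as resp:
        return json.load(resp)


def summarize(stats: dict) -> str:
    """Build a one-line summary from the fields in the sample response."""
    mem = stats["memory"]
    disk = stats["disk"]
    gib = 1024 ** 3
    return (
        f"cpu {stats['cpu']['usage_percent']:.1f}% | "
        f"mem {mem['used_bytes'] / gib:.1f}/{mem['total_bytes'] / gib:.1f} GiB | "
        f"disk {disk['percent']:.0f}% | "
        f"load {stats['load_avg']['1min']:.2f}"
    )
```

With a device running locally, `print(summarize(fetch_stats()))` produces a compact status line suitable for a cron job or a shell prompt.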

Flow Metrics

The metrics service tracks flow-level statistics:

Metric	Description
`total_flows`	Total number of flows configured
`running_flows`	Currently executing flows
`stopped_flows`	Flows in stopped state
`total_executions`	Cumulative execution count
`failed_executions`	Executions that resulted in errors
`success_rate`	Percentage of successful executions
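
As a worked example of how `success_rate` relates to the two execution counters, the Python sketch below assumes the rate is the share of non-failed executions. Both the exact formula and the value reported when there are no executions are assumptions, not confirmed behavior.

```python
def success_rate(total_executions: int, failed_executions: int) -> float:
    """Percentage of executions that did not fail."""
    if total_executions <= 0:
        return 0.0  # assumption: report 0 when nothing has run yet
    succeeded = total_executions - failed_executions
    return 100.0 * succeeded / total_executions


# 200 executions with 50 failures leaves 150 successes
print(success_rate(200, 50))  # prints 75.0
```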

Execution History

Every flow execution is recorded with node-level event tracking. The 100 most recent records are kept in memory; once that limit is reached, each new record evicts the oldest one (sliding window).

# Get execution history
curl http://localhost:8080/api/v1/executions

{
  "executions": [
    {
      "id": "exec_xyz789",
      "flow_id": "flow_abc123",
      "flow_name": "Temperature Monitor",
      "status": "completed",
      "start_time": "2026-02-21T12:00:00Z",
      "end_time": "2026-02-21T12:00:01Z",
      "duration": 1234,
      "node_count": 5,
      "completed_nodes": 5,
      "error_nodes": 0,
      "node_events": [
        {
          "node_id": "node_001",
          "node_name": "DHT22 Sensor",
          "node_type": "dht",
          "status": "success",
          "execution_time": 45,
          "timestamp": 1771675200
        },
        {
          "node_id": "node_002",
          "node_name": "MQTT Publish",
          "node_type": "mqtt_out",
          "status": "success",
          "execution_time": 12,
          "timestamp": 1771675201
        }
      ]
    }
  ]
}
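
The bounded in-memory history described at the start of this section behaves like a fixed-size queue. The Python sketch below models it with `collections.deque`. The class and method names are hypothetical, and EdgeFlow's actual data structure may differ.

```python
from collections import deque
from typing import Optional

MAX_RECORDS = 100  # retention limit stated in this section


class ExecutionHistory:
    """Keeps only the most recent execution records."""

    def __init__(self, max_records: int = MAX_RECORDS) -> None:
        # A deque with maxlen drops the oldest item when a new one is appended
        self._records = deque(maxlen=max_records)

    def record(self, execution: dict) -> None:
        self._records.append(execution)

    def recent(self) -> list:
        """Return stored records, newest first."""
        return list(reversed(self._records))

    def __len__(self) -> int:
        return len(self._records)


def slowest_node(execution: dict) -> Optional[dict]:
    """Return the node event with the largest execution_time, if any."""
    events = execution.get("node_events", [])
    return max(events, key=lambda e: e["execution_time"]) if events else None
```

Applied to the sample record above, `slowest_node` returns the `node_001` event, since its `execution_time` of 45 exceeds the 12 recorded for `node_002`.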

Real-time WebSocket Events

Connect to ws://localhost:8080/ws to receive live telemetry updates. The WebSocket hub broadcasts the following event types:

Event Type	Trigger	Data
`flow_status`	Flow start/stop/create/delete	Flow ID, name, status
`node_status`	Node add/remove/update	Node ID, type, status
`execution`	Each node execution step	Node ID, input/output, timing, status
`log`	Application log events	Level, message, source, fields
`notification`	System alerts	Title, message, severity
`gpio_state`	GPIO pin change (polled every 200 ms)	Pin number, mode, value

WebSocket Message Format

{
  "type": "execution",
  "timestamp": "2026-02-21T12:00:00Z",
  "data": {
    "flow_id": "flow_abc123",
    "node_id": "node_001",
    "node_name": "DHT22 Sensor",
    "node_type": "dht",
    "input": {"payload": "trigger"},
    "output": {"temperature": 22.5, "humidity": 65.0},
    "status": "success",
    "execution_time": 45,
    "timestamp": 1771675200
  }
}
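
A client typically routes each incoming message on its `type` field. The Python sketch below shows one way to do that for the envelope above. The transport is left out on purpose (any WebSocket client library can feed `dispatch` with the raw text frames), and the `on` decorator and handler names are hypothetical.

```python
import json
from typing import Callable, Dict

Handler = Callable[[dict], None]
handlers: Dict[str, Handler] = {}


def on(event_type: str) -> Callable[[Handler], Handler]:
    """Register a handler for one event type, e.g. 'execution' or 'log'."""
    def register(fn: Handler) -> Handler:
        handlers[event_type] = fn
        return fn
    return register


def dispatch(raw: str) -> bool:
    """Decode one message and call the matching handler.

    Returns False when no handler is registered for the message type.
    """
    message = json.loads(raw)
    handler = handlers.get(message.get("type"))
    if handler is None:
        return False
    handler(message.get("data", {}))
    return True


slow_nodes = []


@on("execution")
def track_slow_nodes(data: dict) -> None:
    # Collect nodes whose execution_time exceeds an arbitrary example threshold
    if data["execution_time"] > 40:
        slow_nodes.append(data["node_id"])
```

Unknown event types are ignored rather than treated as errors, so a client written this way keeps working if new event types are added later.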

Prometheus Export

EdgeFlow exports metrics in Prometheus text format for integration with Prometheus, Grafana, and other monitoring stacks:

# HELP edgeflow_flows_total Total number of flows
# TYPE edgeflow_flows_total gauge
edgeflow_flows_total 12

# HELP edgeflow_flows_running Currently running flows
# TYPE edgeflow_flows_running gauge
edgeflow_flows_running 5

# HELP edgeflow_executions_total Total executions
# TYPE edgeflow_executions_total counter
edgeflow_executions_total 1542

# HELP edgeflow_executions_failed Failed executions
# TYPE edgeflow_executions_failed counter
edgeflow_executions_failed 23

# HELP edgeflow_nodes_total Total configured nodes
# TYPE edgeflow_nodes_total gauge
edgeflow_nodes_total 87

# HELP edgeflow_uptime_seconds System uptime
# TYPE edgeflow_uptime_seconds gauge
edgeflow_uptime_seconds 86400

# HELP edgeflow_cpu_usage_percent CPU usage
# TYPE edgeflow_cpu_usage_percent gauge
edgeflow_cpu_usage_percent 23.5

# HELP edgeflow_memory_used_bytes Memory usage
# TYPE edgeflow_memory_used_bytes gauge
edgeflow_memory_used_bytes 1073741824

# HELP edgeflow_api_requests_total Total API requests
# TYPE edgeflow_api_requests_total counter
edgeflow_api_requests_total 15234

# HELP edgeflow_api_response_time_avg Average response time (ms)
# TYPE edgeflow_api_response_time_avg gauge
edgeflow_api_response_time_avg 12.5
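
For quick scripting without a full Prometheus setup, the exported text can be parsed directly. The Python sketch below handles only the simple form shown above (one `name value` pair per line, no labels), which is an assumption about the output. `parse_metrics` and `failure_ratio` are hypothetical helpers.

```python
def parse_metrics(text: str) -> dict:
    """Parse label-free Prometheus text lines into {name: value}."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, value = line.split()[:2]  # ignore an optional trailing timestamp
        metrics[name] = float(value)
    return metrics


def failure_ratio(metrics: dict) -> float:
    """Failed executions as a fraction of all executions."""
    total = metrics.get("edgeflow_executions_total", 0.0)
    failed = metrics.get("edgeflow_executions_failed", 0.0)
    return failed / total if total else 0.0


sample = """\
# HELP edgeflow_executions_total Total executions
# TYPE edgeflow_executions_total counter
edgeflow_executions_total 1542
edgeflow_executions_failed 23
"""
print(f"{failure_ratio(parse_metrics(sample)):.2%}")  # prints 1.49%
```

In a real deployment the same ratio is usually computed in PromQL from the two counters rather than in client code.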

Cloud Telemetry Sync

When connected to the SaaS platform, device metrics are automatically synced to the cloud:

On Connect: A full system metrics report is sent immediately.
Periodic Sync: Shadow updates carrying the current system state are sent every 5 minutes.
On Demand: The cloud can query metrics at any time with the `get_system_metrics` command.
Execution Events: Flow execution records are available through `get_executions`.

Resource Alerts

The resource monitor raises alerts when thresholds are exceeded:

Alert	Threshold	Severity
High CPU	> 90% for 5 minutes	Warning
Low Memory	< 100 MB free	Critical
Memory Soft Limit	> 4 GB used	Warning
Memory Hard Limit	> 8 GB used	Critical (auto-disable modules)
High Temperature	> 80°C	Warning
Disk Full	> 90% used	Critical
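
The thresholds in this table can be expressed as a small rule check. The Python sketch below evaluates them against a statistics document shaped like the `/api/v1/resources/stats` response. It is not the resource monitor's actual code. It assumes binary units (1 MB = 1024² bytes), takes "free" memory to mean `available_bytes`, and expects the caller to track how long CPU usage has stayed above 90%.

```python
from dataclasses import dataclass

MB = 1024 ** 2  # assumption: binary units
GB = 1024 ** 3


@dataclass
class Alert:
    name: str
    severity: str


def evaluate(stats: dict, cpu_high_seconds: float = 0.0) -> list[Alert]:
    """Return the alerts from the table that apply to `stats`.

    cpu_high_seconds: how long CPU usage has been above 90% (tracked by the caller).
    """
    alerts = []
    mem = stats["memory"]
    if stats["cpu"]["usage_percent"] > 90 and cpu_high_seconds >= 5 * 60:
        alerts.append(Alert("High CPU", "warning"))
    if mem["available_bytes"] < 100 * MB:
        alerts.append(Alert("Low Memory", "critical"))
    if mem["used_bytes"] > 8 * GB:
        alerts.append(Alert("Memory Hard Limit", "critical"))
    elif mem["used_bytes"] > 4 * GB:
        alerts.append(Alert("Memory Soft Limit", "warning"))
    if stats["temperature"] > 80:
        alerts.append(Alert("High Temperature", "warning"))
    if stats["disk"]["percent"] > 90:
        alerts.append(Alert("Disk Full", "critical"))
    return alerts
```

The hard and soft memory limits are checked with `elif` so that a device over the hard limit reports only the more severe alert.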