EdgeFlow includes a comprehensive monitoring system that tracks hardware resources, flow execution metrics, and system health in real-time. Monitor CPU, memory, disk, temperature, and network usage from the built-in dashboard.
Monitor Overview
The System Monitor dashboard provides an at-a-glance view of your device's health and performance. Circular gauges show real-time utilization with color-coded thresholds that shift from green to yellow to red as resources approach their limits.
System Metrics
EdgeFlow collects system metrics from the Linux /proc and /sys
filesystems, providing detailed insight into hardware utilization without external dependencies.
CPU Monitoring
CPU metrics are gathered from Go's runtime package and /proc/loadavg,
polled every 3 seconds by default.
| Metric | Source | Update Interval |
|---|---|---|
| CPU Usage % | Runtime | 3s |
| Load Average | /proc/loadavg | 3s |
| Core Count | Runtime | Static |
| Goroutines | Runtime | 3s |
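As an illustration of the sources in the table above, the following minimal sketch parses the text of /proc/loadavg and reads the core and goroutine counts from Go's runtime package. The `LoadAvg` type and `parseLoadAvg` function are hypothetical names chosen for this example; they are not taken from the EdgeFlow codebase.

```go
package main

import (
	"fmt"
	"runtime"
	"strconv"
	"strings"
)

// LoadAvg holds the 1-, 5-, and 15-minute load averages.
// Hypothetical type for illustration only.
type LoadAvg struct {
	One, Five, Fifteen float64
}

// parseLoadAvg parses the contents of /proc/loadavg,
// e.g. "0.45 0.38 0.32 1/123 4567". Only the first three fields are used.
func parseLoadAvg(s string) (LoadAvg, error) {
	fields := strings.Fields(s)
	if len(fields) < 3 {
		return LoadAvg{}, fmt.Errorf("unexpected loadavg format: %q", s)
	}
	var vals [3]float64
	for i := 0; i < 3; i++ {
		v, err := strconv.ParseFloat(fields[i], 64)
		if err != nil {
			return LoadAvg{}, fmt.Errorf("parse field %d: %w", i, err)
		}
		vals[i] = v
	}
	return LoadAvg{One: vals[0], Five: vals[1], Fifteen: vals[2]}, nil
}

func main() {
	// On Linux the input would come from os.ReadFile("/proc/loadavg");
	// a fixed sample keeps this sketch runnable on any platform.
	la, err := parseLoadAvg("0.45 0.38 0.32 1/123 4567\n")
	if err != nil {
		panic(err)
	}
	fmt.Printf("load: %.2f %.2f %.2f\n", la.One, la.Five, la.Fifteen)
	fmt.Println("cores:", runtime.NumCPU())
	fmt.Println("goroutines:", runtime.NumGoroutine())
}
```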
Memory Usage
Memory information is read from /proc/meminfo and displayed as a segmented
breakdown bar. The monitor tracks total, used, free, and available memory, as well as swap usage.
| Field | Example Value |
|---|---|
| Total | 1024 MB |
| Used | 358 MB |
| Free | 154 MB |
| Available | 410 MB |
| Swap Total | 512 MB |
| Swap Used | 24 MB |
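Reading /proc/meminfo can be sketched as follows. This is a minimal example, not EdgeFlow's implementation: `parseMemInfo` is a hypothetical helper, and the sample input is constructed to match the example values in the table. Values in /proc/meminfo are reported in kB. Swap usage is derived as SwapTotal minus SwapFree; how EdgeFlow computes the "Used" figure is not stated here, so the sketch does not attempt it.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseMemInfo parses /proc/meminfo-style text into a map of
// field name -> value. Lines without a numeric value are skipped.
func parseMemInfo(text string) map[string]uint64 {
	out := make(map[string]uint64)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		key, rest, ok := strings.Cut(sc.Text(), ":")
		if !ok {
			continue
		}
		fields := strings.Fields(rest)
		if len(fields) == 0 {
			continue
		}
		v, err := strconv.ParseUint(fields[0], 10, 64)
		if err != nil {
			continue
		}
		out[strings.TrimSpace(key)] = v
	}
	return out
}

func main() {
	// Sample text; on Linux this would come from os.ReadFile("/proc/meminfo").
	sample := `MemTotal:        1048576 kB
MemFree:          157696 kB
MemAvailable:     419840 kB
SwapTotal:        524288 kB
SwapFree:         499712 kB
`
	m := parseMemInfo(sample)
	swapUsedKB := m["SwapTotal"] - m["SwapFree"]
	fmt.Printf("total: %d MB\n", m["MemTotal"]/1024)
	fmt.Printf("available: %d MB\n", m["MemAvailable"]/1024)
	fmt.Printf("swap used: %d MB\n", swapUsedKB/1024)
}
```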
Disk Usage
Disk utilization is reported per-partition. EdgeFlow monitors the root filesystem and any mounted data partitions.
| Partition | Total | Used | Available | Use% |
|---|---|---|---|---|
| / | 29.7 GB | 13.4 GB | 14.8 GB | 45% |
| /boot | 256 MB | 72 MB | 184 MB | 28% |
Temperature Monitoring
On Raspberry Pi and other supported boards, EdgeFlow reads the CPU temperature from
/sys/class/thermal/thermal_zone0/temp. The value is displayed as a gauge
with color-coded zones indicating thermal state.
When the CPU temperature reaches 80°C or above, the Raspberry Pi firmware automatically throttles the CPU frequency to prevent damage. This can significantly reduce EdgeFlow performance. Consider adding a heatsink or fan if temperatures regularly exceed 70°C.
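The thermal zone file reports the temperature in millidegrees Celsius (for example, `52100` for 52.1°C). The sketch below converts that raw value and maps it to a thermal state. The 70°C and 80°C boundaries come from the note above, but the zone names and the exact comparison operators are assumptions made for illustration; they are not EdgeFlow's documented gauge zones.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMilliCelsius converts the raw content of
// /sys/class/thermal/thermal_zone0/temp (millidegrees) to degrees Celsius.
func parseMilliCelsius(raw string) (float64, error) {
	n, err := strconv.ParseInt(strings.TrimSpace(raw), 10, 64)
	if err != nil {
		return 0, fmt.Errorf("parse temperature: %w", err)
	}
	return float64(n) / 1000.0, nil
}

// thermalState maps a temperature to a coarse state.
// Zone names are hypothetical; thresholds follow the note above.
func thermalState(celsius float64) string {
	switch {
	case celsius >= 80:
		return "throttling"
	case celsius >= 70:
		return "warm"
	default:
		return "normal"
	}
}

func main() {
	// On a supported board the input would come from
	// os.ReadFile("/sys/class/thermal/thermal_zone0/temp").
	for _, raw := range []string{"52100\n", "73500\n", "81000\n"} {
		c, err := parseMilliCelsius(raw)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%.1f°C -> %s\n", c, thermalState(c))
	}
}
```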
Network
Network statistics are read from /proc/net/dev and polled at the same
interval as other system metrics. Speed is calculated as the byte delta between consecutive polls, divided by the polling interval.
| Interface | RX Bytes | TX Bytes | Speed |
|---|---|---|---|
| eth0 | 1.24 GB | 856 MB | 2.4 MB/s |
| wlan0 | 342 MB | 128 MB | 450 KB/s |
| lo | 56 MB | 56 MB | 1.2 MB/s |
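The speed calculation can be sketched as follows. In /proc/net/dev, each interface line carries 16 counters after the colon; the first is received bytes and the ninth is transmitted bytes. The `Counters` type and both function names are hypothetical, and treating a decreasing counter as a reset that yields 0 is an assumption of this example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Counters holds the cumulative byte counters for one interface.
type Counters struct {
	RxBytes, TxBytes uint64
}

// parseNetDevLine parses one interface line from /proc/net/dev.
func parseNetDevLine(line string) (string, Counters, error) {
	name, rest, ok := strings.Cut(line, ":")
	if !ok {
		return "", Counters{}, fmt.Errorf("no interface separator in %q", line)
	}
	f := strings.Fields(rest)
	if len(f) < 16 {
		return "", Counters{}, fmt.Errorf("expected 16 counters, got %d", len(f))
	}
	rx, err := strconv.ParseUint(f[0], 10, 64) // receive bytes
	if err != nil {
		return "", Counters{}, err
	}
	tx, err := strconv.ParseUint(f[8], 10, 64) // transmit bytes
	if err != nil {
		return "", Counters{}, err
	}
	return strings.TrimSpace(name), Counters{RxBytes: rx, TxBytes: tx}, nil
}

// bytesPerSecond returns the rate between two polls taken `seconds` apart.
// A counter that went backwards (reset or wrap) yields 0.
func bytesPerSecond(prev, curr uint64, seconds float64) float64 {
	if seconds <= 0 || curr < prev {
		return 0
	}
	return float64(curr-prev) / seconds
}

func main() {
	// Two samples of the same interface, taken 3 seconds apart.
	_, first, err := parseNetDevLine("  eth0: 1000000 10 0 0 0 0 0 0 500000 5 0 0 0 0 0 0")
	if err != nil {
		panic(err)
	}
	name, second, err := parseNetDevLine("  eth0: 8200000 90 0 0 0 0 0 0 650000 9 0 0 0 0 0 0")
	if err != nil {
		panic(err)
	}
	rx := bytesPerSecond(first.RxBytes, second.RxBytes, 3)
	tx := bytesPerSecond(first.TxBytes, second.TxBytes, 3)
	fmt.Printf("%s rx=%.0f B/s tx=%.0f B/s\n", name, rx, tx)
}
```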
Monitoring API Endpoints
All monitoring data is accessible through REST API endpoints. These are the same endpoints used by the built-in dashboard and can be consumed by external tools.
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/resources/stats | CPU, memory, disk, goroutines, system info |
| GET | /api/v1/resources/report | Module manager format (MB units) |
| GET | /api/v1/system/info | Hostname, OS, arch, board model, uptime, temperature, load avgs, swap |
| GET | /api/v1/system/network | Network interfaces with IPv4/IPv6, MAC, MTU |
| GET | /api/v1/system/wifi/scan | WiFi networks (SSID, signal, security, channel) |
| POST | /api/v1/system/reboot | Reboot system (Linux only, requires sudo) |
| POST | /api/v1/system/restart-service | Restart EdgeFlow service |
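External tools can consume these endpoints with any HTTP client. The sketch below fetches and decodes `GET /api/v1/system/info` in Go. The `SystemInfo` struct is a hypothetical type whose fields mirror the documented example response for that endpoint; it assumes the endpoint needs no authentication and returns JSON. A local test server stands in for a real EdgeFlow instance so that the example runs on its own.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// SystemInfo mirrors the fields of the documented example response.
// Hypothetical type; the real response may contain more fields.
type SystemInfo struct {
	Hostname    string    `json:"hostname"`
	OS          string    `json:"os"`
	Arch        string    `json:"arch"`
	Board       string    `json:"board"`
	Uptime      string    `json:"uptime"`
	Temperature float64   `json:"temperature"`
	LoadAvg     []float64 `json:"load_avg"`
	Memory      struct {
		Total uint64 `json:"total"`
		Used  uint64 `json:"used"`
	} `json:"memory"`
}

// decodeSystemInfo decodes a JSON response body.
func decodeSystemInfo(data []byte) (SystemInfo, error) {
	var info SystemInfo
	err := json.Unmarshal(data, &info)
	return info, err
}

// fetchSystemInfo requests /api/v1/system/info from baseURL.
func fetchSystemInfo(baseURL string) (SystemInfo, error) {
	resp, err := http.Get(baseURL + "/api/v1/system/info")
	if err != nil {
		return SystemInfo{}, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return SystemInfo{}, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return SystemInfo{}, err
	}
	return decodeSystemInfo(body)
}

const sample = `{"hostname":"edgeflow","os":"linux","arch":"arm64",
"board":"Raspberry Pi 4 Model B","uptime":"12d 5h 23m","temperature":52.1,
"load_avg":[0.45,0.38,0.32],"memory":{"total":1073741824,"used":163577856}}`

func main() {
	// Stand-in server; replace srv.URL with your device's base URL.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		io.WriteString(w, sample)
	}))
	defer srv.Close()

	info, err := fetchSystemInfo(srv.URL)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s (%s): %.1f°C, %d MB used\n",
		info.Hostname, info.Board, info.Temperature, info.Memory.Used/(1024*1024))
}
```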
Example Response
GET /api/v1/system/info
Response:
{
  "hostname": "edgeflow",
  "os": "linux",
  "arch": "arm64",
  "board": "Raspberry Pi 4 Model B",
  "uptime": "12d 5h 23m",
  "temperature": 52.1,
  "load_avg": [0.45, 0.38, 0.32],
  "memory": {
    "total": 1073741824,
    "used": 163577856
  }
}
Resource Limits & Auto-Management
EdgeFlow defines soft and hard resource thresholds in its ResourceLimits
configuration. When a threshold is crossed, the system escalates its response, from
logging warnings to automatically disabling non-essential modules.
| Threshold | Action | Default |
|---|---|---|
| Memory Soft Limit | Warning, log message | 80% |
| Memory Hard Limit | Auto-disable non-essential modules | 90% |
| Disk Warning | Log warning | 85% |
| Disk Critical | Alert notification | 95% |
| Low Memory Threshold | Prevent module loading | 50 MB available |
When memory usage exceeds the hard limit, EdgeFlow automatically disables non-essential
modules to reclaim resources. The CanLoadModule() function checks available
memory before allowing new modules to be loaded, preventing out-of-memory situations
on constrained devices.
EdgeFlow proactively manages resources to prevent system instability on constrained devices. Modules are gracefully stopped and can be re-enabled once resources are available again.
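The default thresholds can be expressed as a small decision function. The following is a sketch only: the `exampleLimits` struct, its field names, the action strings, and the lower-case `canLoadModule` helper are all hypothetical and do not reproduce the real ResourceLimits type or the CanLoadModule() signature. It also assumes that a limit is crossed when usage is strictly greater than the threshold, and that loading is blocked when available memory falls below 50 MB.

```go
package main

import "fmt"

// exampleLimits holds the default values from the table above.
// Hypothetical type; not EdgeFlow's actual ResourceLimits definition.
type exampleLimits struct {
	MemorySoftPercent float64
	MemoryHardPercent float64
	DiskWarnPercent   float64
	DiskCritPercent   float64
	LowMemoryMB       uint64
}

var defaults = exampleLimits{
	MemorySoftPercent: 80,
	MemoryHardPercent: 90,
	DiskWarnPercent:   85,
	DiskCritPercent:   95,
	LowMemoryMB:       50,
}

// memoryAction returns the action for a given memory utilization.
func (l exampleLimits) memoryAction(usedPercent float64) string {
	switch {
	case usedPercent > l.MemoryHardPercent:
		return "disable-non-essential-modules"
	case usedPercent > l.MemorySoftPercent:
		return "warn"
	default:
		return "none"
	}
}

// diskAction returns the action for a given disk utilization.
func (l exampleLimits) diskAction(usedPercent float64) string {
	switch {
	case usedPercent > l.DiskCritPercent:
		return "alert"
	case usedPercent > l.DiskWarnPercent:
		return "warn"
	default:
		return "none"
	}
}

// canLoadModule reports whether enough memory is available to load a module.
func (l exampleLimits) canLoadModule(availableMB uint64) bool {
	return availableMB >= l.LowMemoryMB
}

func main() {
	fmt.Println(defaults.memoryAction(85))   // between soft and hard limit
	fmt.Println(defaults.memoryAction(93))   // above hard limit
	fmt.Println(defaults.diskAction(45))     // below warning level
	fmt.Println(defaults.canLoadModule(410)) // plenty of memory available
	fmt.Println(defaults.canLoadModule(32))  // below the low-memory threshold
}
```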
Health Check System
The health check system runs periodic checks against critical subsystems and reports an aggregate status. Each check returns one of three severity levels: healthy, degraded, or unhealthy.
Built-in Health Checks
| Check | What It Tests | Healthy | Degraded | Unhealthy |
|---|---|---|---|---|
| Database | DB connection ping | OK | - | Fail |
| Disk Space | Filesystem usage | < 85% | 85-95% | > 95% |
| Memory | RAM utilization | < 90% | > 90% | - |
| Goroutines | Active goroutine count | Normal | Excessive | - |
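The disk and memory rows of the table above translate into simple threshold functions. The sketch below also shows one plausible way to derive an aggregate status, by taking the worst individual result; that aggregation rule is an assumption, as are all type and function names.

```go
package main

import "fmt"

// Status is ordered from best to worst so that statuses can be compared.
type Status int

const (
	Healthy Status = iota
	Degraded
	Unhealthy
)

func (s Status) String() string {
	switch s {
	case Healthy:
		return "healthy"
	case Degraded:
		return "degraded"
	default:
		return "unhealthy"
	}
}

// diskStatus applies the Disk Space thresholds: < 85%, 85-95%, > 95%.
func diskStatus(usagePercent float64) Status {
	switch {
	case usagePercent > 95:
		return Unhealthy
	case usagePercent >= 85:
		return Degraded
	default:
		return Healthy
	}
}

// memoryStatus applies the Memory thresholds; the table defines no
// unhealthy state for memory, so the worst result is Degraded.
func memoryStatus(usagePercent float64) Status {
	if usagePercent > 90 {
		return Degraded
	}
	return Healthy
}

// aggregate returns the worst status among all checks (assumed rule).
func aggregate(checks map[string]Status) Status {
	worst := Healthy
	for _, s := range checks {
		if s > worst {
			worst = s
		}
	}
	return worst
}

func main() {
	checks := map[string]Status{
		"disk":   diskStatus(45),
		"memory": memoryStatus(92),
	}
	fmt.Println("disk:", checks["disk"])
	fmt.Println("memory:", checks["memory"])
	fmt.Println("aggregate:", aggregate(checks))
}
```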
The health endpoint is available at GET /api/v1/health and returns the
aggregate status along with individual check results. Periodic checks run automatically
on a configurable interval (default: 30 seconds).
GET /api/v1/health
{
  "status": "healthy",
  "checks": {
    "database": { "status": "healthy", "latency_ms": 2 },
    "disk": { "status": "healthy", "usage_percent": 45 },
    "memory": { "status": "healthy", "usage_percent": 35 },
    "goroutines": { "status": "healthy", "count": 42 }
  }
}
Flow & Execution Metrics
EdgeFlow exposes Prometheus-format metrics for flow execution, node activity, and system performance. These metrics can be scraped by Prometheus and visualized in Grafana.
| Metric | Type | Description |
|---|---|---|
| edgeflow_flows_total | counter | Total flows created |
| edgeflow_flows_running | gauge | Currently running flows |
| edgeflow_executions_total | counter | Total flow executions |
| edgeflow_executions_failed | counter | Failed executions |
| edgeflow_nodes_total | gauge | Total registered nodes |
| edgeflow_nodes_active | gauge | Currently active nodes |
| edgeflow_uptime_seconds | gauge | Server uptime |
| edgeflow_memory_used_bytes | gauge | Memory usage |
| edgeflow_goroutines | gauge | Active goroutines |
| edgeflow_api_requests_total | counter | Total API requests |
| edgeflow_api_errors_total | counter | API errors |
| edgeflow_api_response_time_ms | gauge | Average response time |
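To show what these metrics look like on the wire, the sketch below renders a few of them in the Prometheus text exposition format (`# HELP`, `# TYPE`, then the sample line). It is a hand-rolled illustration using only the standard library; the `metric` type, `renderMetrics` function, and sample values are hypothetical, and EdgeFlow's exporter may be implemented differently, for example with the official Prometheus client library.

```go
package main

import (
	"fmt"
	"strings"
)

// metric describes one unlabeled sample. Hypothetical type for illustration.
type metric struct {
	name  string
	kind  string // "counter" or "gauge"
	help  string
	value float64
}

// renderMetrics formats samples in the Prometheus text exposition format.
func renderMetrics(ms []metric) string {
	var b strings.Builder
	for _, m := range ms {
		fmt.Fprintf(&b, "# HELP %s %s\n", m.name, m.help)
		fmt.Fprintf(&b, "# TYPE %s %s\n", m.name, m.kind)
		fmt.Fprintf(&b, "%s %g\n", m.name, m.value)
	}
	return b.String()
}

func main() {
	// Sample values are made up for the example.
	fmt.Print(renderMetrics([]metric{
		{"edgeflow_flows_running", "gauge", "Currently running flows", 3},
		{"edgeflow_executions_total", "counter", "Total flow executions", 1280},
		{"edgeflow_executions_failed", "counter", "Failed executions", 4},
	}))
}
```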
Prometheus & Grafana Integration
Enabling Prometheus Metrics
Enable Prometheus metrics export in Settings > Metrics. Once enabled,
EdgeFlow exposes a /metrics endpoint on the configured Prometheus port
(default: 9090) that returns all metrics in Prometheus text format.
Add EdgeFlow as a scrape target in your prometheus.yml configuration:
scrape_configs:
  - job_name: 'edgeflow'
    static_configs:
      - targets: ['raspberrypi:9090']
    scrape_interval: 15s
Grafana Dashboard
To visualize EdgeFlow metrics in Grafana:
- Open Grafana and navigate to Configuration > Data Sources
- Click Add data source and select Prometheus
- Set the URL to your Prometheus server (e.g., http://localhost:9090)
- Click Save & Test to verify the connection
- Import the EdgeFlow dashboard template from Dashboards > Import
- Select the Prometheus data source and click Import
Check grafana.com/dashboards for community-contributed EdgeFlow dashboard templates that include pre-built panels for CPU, memory, flow execution rates, and error tracking.
Log Viewer
The built-in Log Viewer streams real-time log entries via WebSocket, supporting
log, flow_status, and node_status event types.
It retains the last 500 entries in the browser for scrollback.
Log Viewer Features
- Level Filter - Filter by ALL, DEBUG, INFO, WARN, or ERROR
- Search - Full-text search across log messages
- Pause / Resume - Freeze the stream to inspect entries
- Auto-scroll - Automatically scrolls to newest entries when active
- Retention - Keeps the last 500 entries in the browser
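The retention, level filter, and search features can be modelled with a bounded buffer. The actual Log Viewer runs in the browser, so the Go sketch below is only an illustration of the logic under stated assumptions: the `Entry` and `LogBuffer` types are hypothetical, the level filter is assumed to match a single level exactly, and search is assumed to be a case-insensitive substring match.

```go
package main

import (
	"fmt"
	"strings"
)

// Entry is a single log line. Hypothetical shape for illustration.
type Entry struct {
	Level   string // DEBUG, INFO, WARN, or ERROR
	Message string
}

// LogBuffer keeps only the most recent max entries.
type LogBuffer struct {
	entries []Entry
	max     int
}

func NewLogBuffer(max int) *LogBuffer {
	return &LogBuffer{max: max}
}

// Add appends an entry and drops the oldest ones beyond the limit.
func (b *LogBuffer) Add(e Entry) {
	b.entries = append(b.entries, e)
	if len(b.entries) > b.max {
		b.entries = b.entries[len(b.entries)-b.max:]
	}
}

// Len returns the number of retained entries.
func (b *LogBuffer) Len() int { return len(b.entries) }

// Filter returns entries matching level ("ALL" matches everything)
// whose message contains query (an empty query matches everything).
func (b *LogBuffer) Filter(level, query string) []Entry {
	var out []Entry
	q := strings.ToLower(query)
	for _, e := range b.entries {
		if level != "ALL" && e.Level != level {
			continue
		}
		if q != "" && !strings.Contains(strings.ToLower(e.Message), q) {
			continue
		}
		out = append(out, e)
	}
	return out
}

func main() {
	buf := NewLogBuffer(500)
	for i := 0; i < 505; i++ {
		buf.Add(Entry{Level: "INFO", Message: fmt.Sprintf("flow tick %d", i)})
	}
	buf.Add(Entry{Level: "ERROR", Message: "Node timeout"})
	fmt.Println("retained:", buf.Len())
	fmt.Println("errors:", len(buf.Filter("ERROR", "")))
	fmt.Println("matches for 'timeout':", len(buf.Filter("ALL", "timeout")))
}
```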
Log Sources
| Source | Description |
|---|---|
| Backend Logs | Server-side log output (Go runtime, services, handlers) |
| Flow Events | Flow start, stop, deploy, error events |
| Node Events | Individual node status changes and output |
| Frontend Actions | User actions in the web UI (deploy, save, etc.) |