Overview
The html node parses HTML content and extracts elements using CSS selectors. It is ideal for web scraping, processing HTTP responses, extracting structured data from web pages, and transforming HTML content within your EdgeFlow pipelines.
Properties
| Property | Type | Required | Default | Description |
|---|---|---|---|---|
| selector | string | Yes | - | CSS selector to match elements (tag, .class, #id, or combined) |
| output | select | No | "text" | Output type: "text" (inner text), "html" (inner HTML), or "attr" (attribute value) |
| attr | string | No | "" | Attribute name to extract (required when output is "attr") |
| multiple | boolean | No | false | Return all matches as an array instead of the first match only |
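The rules in the table above (one required property, three defaults, and attr becoming mandatory when output is "attr") can be sketched as a small validation routine. This is an illustration in Python, not EdgeFlow's own validation code; the function name validate_html_node_config and the error messages are hypothetical.

```python
VALID_OUTPUTS = {"text", "html", "attr"}


def validate_html_node_config(config: dict) -> list[str]:
    """Return a list of problems found in an html node config (empty if none).

    Hypothetical helper: it only encodes the rules stated in the
    Properties table, not EdgeFlow's real validation logic.
    """
    errors = []
    # selector is the only required property.
    if not str(config.get("selector", "")).strip():
        errors.append("selector is required")
    # output defaults to "text" when omitted.
    output = config.get("output", "text")
    if output not in VALID_OUTPUTS:
        errors.append(f"output must be one of {sorted(VALID_OUTPUTS)}")
    # attr defaults to "" and must be set when output is "attr".
    if output == "attr" and not config.get("attr", ""):
        errors.append('attr is required when output is "attr"')
    # multiple defaults to false.
    if not isinstance(config.get("multiple", False), bool):
        errors.append("multiple must be a boolean")
    return errors
```

An empty list means the configuration is consistent with the table; for example, a config with output set to "attr" but no attr value yields a single error.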
Inputs
An HTML string to parse. Typically the response body from an HTTP request node.
{
  "payload": "<html><head><title>My Page</title></head><body><h1>Hello</h1></body></html>"
}
Outputs
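With multiple set to false (the default), payload holds the first match:
{
  "payload": "Hello"
}
With multiple set to true, payload holds an array of all matches:
{
  "payload": [
    "https://example.com/page1",
    "https://example.com/page2",
    "https://example.com/page3"
  ]
}
Output Modes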
text
Returns the inner text content of matched elements, with HTML tags stripped.
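To make "tags stripped" concrete, here is a minimal sketch using Python's standard html.parser module. It is illustrative only: it strips tags from a whole fragment and does no selector matching, and the names TextCollector and inner_text are hypothetical, not part of EdgeFlow.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects text nodes and ignores every tag."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for each run of text between tags.
        self.parts.append(data)


def inner_text(fragment: str) -> str:
    """Return the text content of an HTML fragment with all tags removed."""
    parser = TextCollector()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.parts)
```

Nested markup such as an em element inside an h1 contributes its text but not its tags, which is the behavior the "text" mode describes.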
selector: "h1"
input: "<h1>Hello <em>World</em></h1>"
output: "Hello World"
html
Returns the inner HTML of matched elements, preserving nested markup.
selector: "h1"
input: "<h1>Hello <em>World</em></h1>"
output: "Hello <em>World</em>"
attr
Returns the value of a specific attribute from matched elements.
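A rough sketch of attribute extraction, again with Python's standard html.parser, also shows how the multiple flag changes the result shape. It matches on a bare tag name only, and returning None when nothing matches is an assumption of this sketch, not documented node behavior. AttrCollector and extract_attr are hypothetical names.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Collects the values of one attribute from every occurrence of one tag."""

    def __init__(self, tag, attr):
        super().__init__()
        self.tag = tag
        self.attr = attr
        self.values = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        if tag == self.tag:
            for name, value in attrs:
                if name == self.attr and value is not None:
                    self.values.append(value)


def extract_attr(html, tag, attr, multiple=False):
    """Return the first matching attribute value, or all of them if multiple is True."""
    parser = AttrCollector(tag, attr)
    parser.feed(html)
    parser.close()
    if multiple:
        return parser.values
    # Assumption: no match yields None; the node's real behavior is not stated.
    return parser.values[0] if parser.values else None
```

With multiple left at False the caller receives a single string; with multiple set to True the caller always receives a list, even when only one element matched.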
selector: "a"
attr: "href"
input: '<a href="/about">About</a>'
output: "/about"
Example Flows
Extract All Links from a Page
Fetch a web page and extract every hyperlink URL.
[
  {
    "id": "fetch-page",
    "type": "http-request",
    "method": "GET",
    "url": "https://example.com",
    "ret": "txt"
  },
  {
    "id": "extract-links",
    "type": "html",
    "selector": "a",
    "output": "attr",
    "attr": "href",
    "multiple": true
  },
  {
    "id": "show-links",
    "type": "debug",
    "name": "All Links"
  }
]
// Output:
// {
//   "payload": [
//     "https://example.com/about",
//     "https://example.com/contact",
//     "https://example.com/blog"
//   ]
// }
Get Page Title
Extract the title from an HTML page for monitoring or logging.
[
  {
    "id": "fetch-page",
    "type": "http-request",
    "method": "GET",
    "url": "https://example.com",
    "ret": "txt"
  },
  {
    "id": "get-title",
    "type": "html",
    "selector": "title",
    "output": "text",
    "multiple": false
  },
  {
    "id": "log-title",
    "type": "debug",
    "name": "Page Title"
  }
]
// Output:
// { "payload": "Example Domain" }
Extract Table Data
Scrape tabular data from a web page and display it in a dashboard table.
[
  {
    "id": "fetch-data",
    "type": "http-request",
    "method": "GET",
    "url": "https://example.com/data",
    "ret": "txt"
  },
  {
    "id": "extract-cells",
    "type": "html",
    "selector": "table.data tr td",
    "output": "text",
    "multiple": true
  },
  {
    "id": "format-table",
    "type": "function",
    "name": "Reshape to rows"
  },
  {
    "id": "dashboard-table",
    "type": "ui-table",
    "name": "Scraped Data"
  }
]
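The "Reshape to rows" function node in this flow is shown without a body. Because a td selector with multiple enabled yields one flat list of cell texts, some step has to regroup the cells into rows before a table can display them. A possible sketch of that regrouping follows, written in Python purely for illustration (the function node's actual scripting language is not stated here). The column names and the helper name reshape_to_rows are assumptions.

```python
def reshape_to_rows(cells, column_names):
    """Group a flat list of cell texts into one dict per table row.

    Hypothetical helper: column_names is supplied by the caller because
    the flat list carries no header information.
    """
    width = len(column_names)
    if width == 0:
        raise ValueError("column_names must not be empty")
    if len(cells) % width != 0:
        raise ValueError("cell count is not a multiple of the column count")
    # Walk the list in steps of `width`, pairing each slice with the names.
    return [
        dict(zip(column_names, cells[i:i + width]))
        for i in range(0, len(cells), width)
    ]
```

Raising on a cell count that does not divide evenly is a deliberate choice in this sketch: a ragged result usually means the selector matched cells from a header row or a second table.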
// Output from html node:
// {
//   "payload": [
//     "Sensor A", "22.5", "Online",
//     "Sensor B", "18.3", "Offline",
//     "Sensor C", "25.1", "Online"
//   ]
// }
CSS Selector Reference
| Selector | Example | Matches |
|---|---|---|
| tag | h1 | All h1 elements |
| .class | .price | Elements with class "price" |
| #id | #main-content | Element with id "main-content" |
| tag.class | div.card | div elements with class "card" |
| ancestor descendant | ul li | li elements nested at any depth inside a ul |
| [attr] | a[target] | Links with a target attribute |
| [attr=val] | input[type="text"] | Text input elements |
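As a rough illustration of how the simpler selector forms in the table combine, the sketch below parses a compound selector (tag, .class, #id, or a mix such as div.card) and tests it against an element. It is not the node's selector engine: descendant and attribute selectors are deliberately rejected, and the dict-based element representation and the names parse_compound and matches are hypothetical.

```python
import re

# An optional tag name followed by any number of .class / #id parts.
_COMPOUND = re.compile(r"([A-Za-z][\w-]*)?((?:[.#][A-Za-z_][\w-]*)*)")
_PART = re.compile(r"([.#])([A-Za-z_][\w-]*)")


def parse_compound(selector: str) -> dict:
    """Split a selector such as 'div.card' into its tag, id and classes."""
    text = selector.strip()
    match = _COMPOUND.fullmatch(text)
    if not text or not match:
        raise ValueError(f"unsupported selector: {selector!r}")
    parsed = {"tag": match.group(1), "id": None, "classes": []}
    for kind, name in _PART.findall(match.group(2)):
        if kind == "#":
            parsed["id"] = name
        else:
            parsed["classes"].append(name)
    return parsed


def matches(element: dict, selector: str) -> bool:
    """Check an element like {'tag': 'div', 'class': 'card wide'} against a selector."""
    want = parse_compound(selector)
    if want["tag"] and element.get("tag") != want["tag"]:
        return False
    if want["id"] and element.get("id") != want["id"]:
        return False
    have = set(element.get("class", "").split())
    return all(name in have for name in want["classes"])
```

Every part of a compound selector must hold at once, so div.card matches a div whose class list includes "card" but not a span with the same class.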
Common Use Cases
Web Scraping
Extract product prices, news headlines, or weather data from websites for IoT dashboards.
API Response Parsing
Parse HTML fragments returned by APIs or legacy services that don't provide JSON.
Link Monitoring
Monitor web pages for broken links, new content, or changes to specific elements.
Content Transformation
Strip HTML to plain text, extract specific sections, or reformat content for notifications.
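For the link monitoring use case, href values extracted in "attr" mode may be relative (the attr example earlier yields "/about"), so a checker usually needs absolute, de-duplicated URLs first. A minimal sketch using Python's standard urllib.parse follows; the helper name normalize_links and the choice to keep only http and https links are assumptions of this example, not EdgeFlow behavior.

```python
from urllib.parse import urljoin


def normalize_links(base_url, hrefs):
    """Resolve hrefs against base_url, drop non-HTTP links and duplicates.

    Hypothetical helper: order of first appearance is preserved.
    """
    seen = set()
    result = []
    for href in hrefs:
        # urljoin leaves absolute URLs unchanged and resolves relative ones.
        absolute = urljoin(base_url, href)
        if not absolute.startswith(("http://", "https://")):
            continue  # skip mailto:, javascript:, tel: and similar schemes
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result
```

The resulting list can then be passed, one URL at a time, to whatever step performs the actual availability check.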