Overview

The html node parses HTML content and extracts elements using CSS selectors. It is ideal for web scraping, processing HTTP responses, extracting structured data from web pages, and transforming HTML content within your EdgeFlow pipelines.

CSS Selector Queries

Output Modes

Multi-Match Support

Attr Extraction

Properties

Property	Type	Required	Default	Description
selector	string	Yes	-	CSS selector to match elements (tag, .class, #id, or combined)
output	select	No	"text"	Output type: "text" (inner text), "html" (inner HTML), or "attr" (attribute value)
attr	string	No	""	Attribute name to extract (required when output is "attr")
multiple	boolean	No	false	Return all matches as an array instead of the first match only
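
The constraints in the table above can be checked before a flow is deployed. The following is a minimal sketch in Python, not part of EdgeFlow itself: the function name `validate_html_node_config` and the error messages are hypothetical, and the rules are taken only from the property table (selector is required, output is one of three values, attr is needed when output is "attr").

```python
VALID_OUTPUTS = {"text", "html", "attr"}


def validate_html_node_config(config):
    """Return a list of problems found in an html node config dict.

    Hypothetical helper: it only mirrors the rules in the property table.
    """
    errors = []

    # selector is the only property marked as required.
    if not config.get("selector"):
        errors.append("selector is required")

    # output defaults to "text" when omitted.
    output = config.get("output", "text")
    if output not in VALID_OUTPUTS:
        errors.append('output must be "text", "html", or "attr"')

    # attr defaults to "" but must be set when output is "attr".
    if output == "attr" and not config.get("attr", ""):
        errors.append('attr is required when output is "attr"')

    # multiple defaults to false and must be a boolean.
    if not isinstance(config.get("multiple", False), bool):
        errors.append("multiple must be a boolean")

    return errors
```

An empty list means the config satisfies the table; for example, `{"selector": "a", "output": "attr"}` would be reported as missing `attr`.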

Inputs

msg.payload

An HTML string to parse. Typically the response body from an HTTP request node.

{
  "payload": "<html><head><title>My Page</title></head><body><h1>Hello</h1></body></html>"
}

Outputs

Single Match (multiple: false)

{
  "payload": "Hello"
}

Multiple Matches (multiple: true)

{
  "payload": [
    "https://example.com/page1",
    "https://example.com/page2",
    "https://example.com/page3"
  ]
}

Output Modes

text

Returns the inner text content of matched elements, with HTML tags stripped.

selector: "h1"
input: "<h1>Hello <em>World</em></h1>"
output: "Hello World"

html

Returns the inner HTML of matched elements, preserving nested markup.

selector: "h1"
input: "<h1>Hello <em>World</em></h1>"
output: "Hello <em>World</em>"

attr

Returns the value of a specific attribute from matched elements.

selector: "a"
attr: "href"
input: '<a href="/about">About</a>'
output: "/about"
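
To make the difference between the three modes concrete, here is a rough Python sketch that reproduces the examples above using only the standard library. It is an illustration, not how the html node is implemented: it supports only a bare tag name as the selector, returns only the first match, and the names `FirstTagExtractor` and `extract` are hypothetical.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class FirstTagExtractor(HTMLParser):
    """Capture the first element with the given tag name."""

    def __init__(self, tag):
        super().__init__(convert_charrefs=True)
        self.tag = tag
        self.depth = 0         # 0 = outside the match, >0 = inside it
        self.done = False      # True once the matched element has closed
        self.attrs = {}
        self.text_parts = []   # text only, tags stripped
        self.html_parts = []   # text plus nested markup

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth == 0:
            if tag == self.tag:
                self.depth = 1
                self.attrs = dict(attrs)
            return
        # Nested tag inside the match: keep its markup for "html" mode.
        self.html_parts.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if self.done or self.depth == 0 or tag in VOID_TAGS:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True
        else:
            self.html_parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth > 0 and not self.done:
            self.text_parts.append(data)
            self.html_parts.append(data)


def extract(html, tag, output="text", attr=""):
    """Return the text, inner HTML, or an attribute of the first <tag> element."""
    parser = FirstTagExtractor(tag)
    parser.feed(html)
    parser.close()
    if output == "text":
        return "".join(parser.text_parts)
    if output == "html":
        return "".join(parser.html_parts)
    if output == "attr":
        return parser.attrs.get(attr)
    raise ValueError(f"unknown output mode: {output}")
```

Calling `extract("<h1>Hello <em>World</em></h1>", "h1", "html")` keeps the nested `<em>` markup, while the default "text" mode drops it. Character references are decoded in both modes in this sketch, which a real inner-HTML extraction would likely preserve.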

Example Flows

Extract All Links from a Page

Fetch a web page and extract every hyperlink URL.

[
  {
    "id": "fetch-page",
    "type": "http-request",
    "method": "GET",
    "url": "https://example.com",
    "ret": "txt"
  },
  {
    "id": "extract-links",
    "type": "html",
    "selector": "a",
    "output": "attr",
    "attr": "href",
    "multiple": true
  },
  {
    "id": "show-links",
    "type": "debug",
    "name": "All Links"
  }
]

// Output:
// {
//   "payload": [
//     "https://example.com/about",
//     "https://example.com/contact",
//     "https://example.com/blog"
//   ]
// }
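
An href attribute can hold a relative path, as in the "/about" example under Output Modes, so extracted links may need to be resolved against the page URL before they are fetched or compared. The sketch below shows one way to do that in Python with the standard library's `urljoin`; the function name `resolve_links` is hypothetical, and whether EdgeFlow resolves links itself is not stated here.

```python
from urllib.parse import urljoin


def resolve_links(base_url, hrefs):
    """Turn relative href values into absolute URLs.

    base_url is the URL the page was fetched from; hrefs is the
    array produced by the html node with multiple set to true.
    Absolute URLs pass through unchanged.
    """
    return [urljoin(base_url, href) for href in hrefs]
```

For instance, `resolve_links("https://example.com", ["/about", "https://example.com/blog"])` yields two absolute URLs on example.com.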

Get Page Title

Extract the title from an HTML page for monitoring or logging.

[
  {
    "id": "fetch-page",
    "type": "http-request",
    "method": "GET",
    "url": "https://example.com",
    "ret": "txt"
  },
  {
    "id": "get-title",
    "type": "html",
    "selector": "title",
    "output": "text",
    "multiple": false
  },
  {
    "id": "log-title",
    "type": "debug",
    "name": "Page Title"
  }
]

// Output:
// { "payload": "Example Domain" }

Extract Table Data

Scrape tabular data from a web page and display it in a dashboard table.

[
  {
    "id": "fetch-data",
    "type": "http-request",
    "method": "GET",
    "url": "https://example.com/data",
    "ret": "txt"
  },
  {
    "id": "extract-cells",
    "type": "html",
    "selector": "table.data tr td",
    "output": "text",
    "multiple": true
  },
  {
    "id": "format-table",
    "type": "function",
    "name": "Reshape to rows"
  },
  {
    "id": "dashboard-table",
    "type": "ui-table",
    "name": "Scraped Data"
  }
]

// Output from html node:
// {
//   "payload": [
//     "Sensor A", "22.5", "Online",
//     "Sensor B", "18.3", "Offline",
//     "Sensor C", "25.1", "Online"
//   ]
// }
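
The html node returns the cells as one flat array, so the "Reshape to rows" step has to group them by column count before the table can display them. The following Python sketch shows that grouping logic only; the language of the EdgeFlow function node is not specified here, and the name `reshape_rows` and the fixed column count are assumptions.

```python
def reshape_rows(cells, columns):
    """Group a flat list of cell values into rows of `columns` items."""
    if columns <= 0:
        raise ValueError("columns must be a positive integer")
    if len(cells) % columns != 0:
        # A ragged table (missing or merged cells) cannot be split evenly.
        raise ValueError("cell count is not a multiple of the column count")
    return [cells[i:i + columns] for i in range(0, len(cells), columns)]


cells = [
    "Sensor A", "22.5", "Online",
    "Sensor B", "18.3", "Offline",
    "Sensor C", "25.1", "Online",
]
rows = reshape_rows(cells, 3)
# rows[0] == ["Sensor A", "22.5", "Online"]
```

Raising on an uneven cell count is a deliberate choice in this sketch: silently padding or truncating would shift values into the wrong columns.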

CSS Selector Reference

Selector	Example	Matches
tag	h1	All h1 elements
.class	.price	Elements with class "price"
#id	#main-content	Element with id "main-content"
tag.class	div.card	div elements with class "card"
parent child	ul li	li elements nested at any depth inside a ul
[attr]	a[target]	Links with a target attribute
[attr=val]	input[type="text"]	Text input elements

Common Use Cases

Web Scraping

Extract product prices, news headlines, or weather data from websites for IoT dashboards.

API Response Parsing

Parse HTML fragments returned by APIs or legacy services that don't provide JSON.

Link Monitoring

Monitor web pages for broken links, new content, or changes to specific elements.

Content Transformation

Strip HTML to plain text, extract specific sections, or reformat content for notifications.

Related Nodes

http-request

Fetch HTML content from web pages

json-parser

Parse JSON data from API responses

template

Generate HTML output from extracted data