HTML Extract
Extract specific HTML tag contents or attributes from complex HTML or XHTML content.
This is a flexible API which allows you to parse and extract any data from HTML documents.
You can search for data using a CSS/jQuery style selector.
Tag selector examples:
- ".super": find all elements with the class "super"
- "img.avatar": find all "img" tags with the class "avatar"
- "img[width=32]": find all "img" tags with the attribute "width" equaling "32"
- "img[src*=cool]": find all "img" tags with the attribute "src" containing the string "cool"
- "a[href]": find all "a" tags which have the "href" attribute set
- "#special-id": find all elements with the id "special-id"
You can also combine selectors, for example:
- "div a": find all "a" tags which are contained within "div" tags at any depth (descendants, not only direct children)
- "div > a": find all "a" tags which are direct children of "div" tags
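
The difference between the two combinators can be illustrated locally. The following is a minimal sketch using only Python's standard `html.parser`; it is not how the API is implemented, and the class name `DivLinkFinder` and the sample markup are made up for this illustration. It tracks the stack of open tags and records which "a" tags would match "div a" and which would match "div > a".

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class DivLinkFinder(HTMLParser):
    """Collects href values of <a> tags matching "div a" and "div > a"."""

    def __init__(self):
        super().__init__()
        self.stack = []        # currently open tags, outermost first
        self.descendants = []  # matches for "div a"
        self.children = []     # matches for "div > a"

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if "div" in self.stack:                      # any ancestor is a div
                self.descendants.append(href)
            if self.stack and self.stack[-1] == "div":   # parent is a div
                self.children.append(href)
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


# Hypothetical sample markup for the illustration.
finder = DivLinkFinder()
finder.feed(
    '<div><a href="/one">1</a><p><a href="/two">2</a></p></div>'
    '<a href="/three">3</a>'
)
print(finder.descendants)  # links matched by "div a"
print(finder.children)     # links matched by "div > a"
```

Here "/two" sits inside a "p" inside the "div", so it counts as a descendant but not as a direct child, and "/three" is outside any "div" and matches neither selector.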
End Point
https://neutrinoapi.com/html-extract-tags
Parameter | Required | Type | Default | Description |
---|---|---|---|---|
content | Yes | string | | The HTML content. This can be either a URL to load HTML from or an actual HTML content string |
tag | Yes | string | | The HTML tag(s) to extract data from. This can just be a simple tag name like 'img' OR a CSS/jQuery style selector |
attribute | No | string | | If set, then extract data from the specified tag attribute. If not set, then data will be extracted from the tag's inner content |
base-url | No | string | | The base URL used to resolve relative links |
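
As a rough sketch of how these parameters might be sent, the snippet below builds a form-encoded POST request with Python's standard library. Several things here are assumptions rather than facts from this page: that the endpoint accepts form-encoded POST bodies, the helper name `build_request`, and the `extra_params` hook (a hypothetical place for credentials or account settings, which this section does not describe). The request is only constructed, not sent.

```python
import urllib.parse
import urllib.request

ENDPOINT = "https://neutrinoapi.com/html-extract-tags"


def build_request(content, tag, attribute=None, base_url=None, extra_params=None):
    """Build (but do not send) a POST request for the HTML Extract endpoint."""
    # Required parameters.
    params = {"content": content, "tag": tag}
    # Optional parameters are only included when set.
    if attribute is not None:
        params["attribute"] = attribute
    if base_url is not None:
        params["base-url"] = base_url
    # Hypothetical hook for anything not covered in this section (e.g. credentials).
    if extra_params:
        params.update(extra_params)
    data = urllib.parse.urlencode(params).encode("utf-8")
    return urllib.request.Request(ENDPOINT, data=data, method="POST")


# Example: extract the "src" attribute of every "img" tag with class "avatar".
request = build_request(
    content="https://example.com/",
    tag="img.avatar",
    attribute="src",
    base_url="https://example.com/",
)
# To actually send it: urllib.request.urlopen(request).read()
```

Keeping request construction separate from sending makes it easy to inspect the encoded parameters before any network call is made.
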
Parameter | Type | Description |
---|---|---|
total | integer | The total number of values extracted |
values | array | Array of extracted values |
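
A response with these two fields could be read as follows. This is a minimal sketch that assumes the response body is JSON; the function name `parse_extract_response` and the sample body are illustrative only and were not captured from the live API.

```python
import json


def parse_extract_response(body):
    """Return (total, values) from a JSON response body."""
    payload = json.loads(body)
    total = int(payload["total"])    # total number of values extracted
    values = list(payload["values"])  # the extracted values themselves
    return total, values


# Illustrative body only, not real API output.
sample_body = '{"total": 2, "values": ["/img/a.png", "/img/b.png"]}'
total, values = parse_extract_response(sample_body)
for value in values:
    print(value)
```
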
Free Tier | Tier 1 | Tier 2 | Tier 3 |
---|---|---|---|
10 | 1,000 | 10,000 | 100,000 |