HTML Extract

Legacy API Notice
This API is now deprecated; use the new and improved Browser Bot API instead. We will continue to operate this API for pre-existing users, but it will receive no further feature updates.

Extract specific HTML tag contents or attributes from complex HTML or XHTML content.
This is a flexible API which allows you to parse and extract any data from HTML documents.
You can search for data using a CSS/jQuery style selector.

Tag selector examples:

You can also combine selectors, for example:
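As an illustration, assuming the `tag` parameter accepts standard CSS selector syntax, the sketch below lists a few selector strings and two ways of combining them. The specific selectors and the helper names `descendant` and `any_of` are hypothetical and chosen only for this example; whether every combined form is supported by the API is not confirmed here.

```python
# Illustrative selector strings in standard CSS syntax (assumed for this
# sketch, not taken from the API's own example list).
TAG_SELECTORS = {
    "img": "every <img> element",
    "div.article": "every <div> with class 'article'",
    "#main": "the element with id 'main'",
    "a[href]": "every <a> that has an href attribute",
}


def descendant(*parts: str) -> str:
    """Combine selectors with the CSS descendant combinator (a space)."""
    return " ".join(p.strip() for p in parts if p.strip())


def any_of(*selectors: str) -> str:
    """Combine selectors into a CSS selector list (comma separated)."""
    return ", ".join(s.strip() for s in selectors if s.strip())


combined = descendant("div.article", "a[href]")  # links inside article divs
either = any_of("h1", "h2")                      # h1 or h2 headings
```

Either resulting string could then be passed as the `tag` value of a request.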

End Point

API Request
content (required, string): The HTML content. This can be a URL to load from, a file upload (multipart/form-data), or an HTML content string.
tag (required, string): The HTML tag(s) to extract data from. This can be a simple tag name like 'img' or a CSS/jQuery style selector.
attribute (optional, string): If set, data is extracted from the specified tag attribute. If not set, data is extracted from the tag's inner content.
base-url (optional, string): The base URL to apply to relative links.
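The request parameters above can be assembled as in the following minimal sketch. It uses only the parameter names listed in this section; the endpoint URL is a hypothetical placeholder (the real one is not given here), form encoding is an assumption, and authentication and the actual HTTP call are left out.

```python
from typing import Dict, Optional
from urllib.parse import urlencode

# Hypothetical placeholder: the real endpoint URL is not given in this section.
ENDPOINT = "https://api.example.com/html-extract"


def build_params(content: str, tag: str,
                 attribute: Optional[str] = None,
                 base_url: Optional[str] = None) -> Dict[str, str]:
    """Assemble request parameters using the documented parameter names."""
    if not content:
        raise ValueError("'content' is required")
    if not tag:
        raise ValueError("'tag' is required")
    params = {"content": content, "tag": tag}
    # Optional parameters are only included when they are set.
    if attribute is not None:
        params["attribute"] = attribute
    if base_url is not None:
        params["base-url"] = base_url
    return params


def encode_body(params: Dict[str, str]) -> bytes:
    """Form-encode the parameters (assumes form-encoded POSTs are accepted)."""
    return urlencode(params).encode("utf-8")


params = build_params("<a href='/docs'>Docs</a>", "a",
                      attribute="href", base_url="https://example.com")
body = encode_body(params)
```

Leaving `attribute` unset in this sketch corresponds to requesting the tag's inner content rather than an attribute value.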
API Response
total (integer): The total number of values extracted
values (array): Array of extracted values
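A response with these two fields could be read as in the sketch below. It assumes the response body is JSON; the `ExtractResult` type and the sample payload are invented for illustration and are not a recorded API response.

```python
import json
from dataclasses import dataclass
from typing import Any, List


@dataclass
class ExtractResult:
    total: int
    values: List[Any]


def parse_response(raw: str) -> ExtractResult:
    """Parse a JSON body containing the documented 'total' and 'values' fields."""
    data = json.loads(raw)
    total = data["total"]
    values = data["values"]
    if not isinstance(total, int) or not isinstance(values, list):
        raise ValueError("unexpected response shape")
    return ExtractResult(total=total, values=values)


# Made-up sample payload for illustration only.
sample = '{"total": 2, "values": ["/docs", "/about"]}'
result = parse_response(sample)
```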
API Performance
Avg Latency: 20ms. Average RTT for requests within the same data center/region.
Max Rate: 2/second. Maximum inbound request rate. Exceeding this will result in request throttling.
Max Concurrency: 250. Maximum concurrent/simultaneous requests. Exceeding this will result in error code 06 [TOO MANY CONNECTIONS].
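One way a client might stay under these limits is to space its own requests and cap the number in flight. The following is a minimal client-side sketch, not part of the API: the `Throttle` class and its parameters are hypothetical, and the clock and sleep functions are injectable so the behaviour can be checked without real delays.

```python
import threading
import time
from typing import Callable


class Throttle:
    """Space calls at least 1 / max_per_second seconds apart."""

    def __init__(self, max_per_second: float = 2.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next request may be sent; return seconds waited."""
        with self._lock:
            now = self._clock()
            delay = max(0.0, self._next_allowed - now)
            if delay > 0:
                self._sleep(delay)
            # Schedule the following slot one interval after this one.
            self._next_allowed = max(now, self._next_allowed) + self._interval
            return delay


# Cap in-flight requests at the documented concurrency limit.
MAX_CONCURRENCY = 250
in_flight = threading.BoundedSemaphore(MAX_CONCURRENCY)
```

A caller would typically acquire `in_flight` (for example with a `with in_flight:` block), call `wait()` on a shared `Throttle`, and only then send the request.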