Parse, analyze and retrieve content from the supplied URL.
Determine if a URL is well-formed and actually hosting real content.
Determine many of the URL's properties, such as its current HTTP status, content size, type, encoding and load time.
You can also use this API to fetch the actual URL response data for further processing or storage.
Note on using real web browsers:
Although you can use this API to fetch HTML content from a website URL, it's not a fully featured web browser.
Instead, you may want to try the more advanced Browser Bot API.
The 'fetch-content' option: if this URL responds with HTML, text, JSON or XML then return the response. This option is useful if you want to perform further processing on the URL content (e.g. with the HTML Extract or HTML Clean APIs).

| Parameter | Required | Type | Default | Description |
| --- | --- | --- | --- | --- |
| ignore-certificate-errors | no | boolean | false | Ignore any TLS/SSL certificate errors and load the URL anyway |
| timeout | no | integer | 60 | Timeout in seconds. Give up if still trying to load the URL after this number of seconds |
| retry | no | integer | 0 | If the request fails for any reason try again this many times |
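
As a rough illustration of how the optional request parameters listed above might be assembled, the following Python sketch builds a form-encoded parameter string. The names ignore-certificate-errors, timeout and retry and their defaults come from this section, and fetch-content is the option name referenced in the response description. The `url` parameter name, the lowercase true/false encoding of booleans and the helper name `build_url_info_params` are assumptions made for this example. No endpoint or authentication scheme is shown because this section does not specify one.

```python
from urllib.parse import urlencode


def build_url_info_params(url, fetch_content=False,
                          ignore_certificate_errors=False,
                          timeout=60, retry=0):
    """Build the request parameters as a dict of strings.

    Defaults mirror the documented ones (false, 60 seconds, 0 retries).
    """
    if timeout <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    if retry < 0:
        raise ValueError("retry must be zero or greater")

    def as_flag(value):
        # Assumption: booleans are sent as lowercase "true"/"false".
        return "true" if value else "false"

    return {
        "url": url,  # assumed name of the parameter carrying the target URL
        "fetch-content": as_flag(fetch_content),
        "ignore-certificate-errors": as_flag(ignore_certificate_errors),
        "timeout": str(int(timeout)),
        "retry": str(int(retry)),
    }


# Example: fetch the page content, give up after 20 seconds, retry once.
params = build_url_info_params("https://example.com/",
                               fetch_content=True, timeout=20, retry=1)
body = urlencode(params)  # percent-encodes the target URL for transport
```

The resulting string could be sent as a form-encoded request body or appended as a query string with any HTTP client, assuming the endpoint accepts input in that form.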
API Response

| Parameter | Type | Description |
| --- | --- | --- |
| valid | boolean | Is this a valid well-formed URL |
| real | boolean | Is this URL actually serving real content |
| title | string | The document title |
| language-code | string | The ISO 2-letter language code of the page. Extracted from either the HTML document or via HTTP headers |
| http-ok | boolean | True if this URL responded with an HTTP OK (200) status |
| http-status | integer | The HTTP status code this URL responded with. An HTTP status of 0 indicates a network-level issue |
| http-status-message | string | The HTTP status message associated with the status code |
| is-error | boolean | True if an error occurred while loading the URL. This includes network errors, TLS errors and timeouts |
| is-timeout | boolean | True if a timeout occurred while loading the URL. You can set the timeout with the request parameter 'timeout' |
| http-redirect | boolean | True if this URL responded with an HTTP redirect |
| url | string | The fully qualified URL. This may be different to the URL requested if http-redirect is true |
| url-protocol | string | The URL protocol, usually http or https |
| url-port | integer | The URL port |
| url-path | string | The URL path |
| query | map | A key-value map of the URL query parameters |
| content | string | The actual content this URL responded with. Only set if the 'fetch-content' option was used |
| content-size | integer | The size of the URL content in bytes |
| content-type | string | The content-type this URL serves |
| content-encoding | string | The encoding format the URL uses |
| load-time | float | The time taken to load the URL content in seconds |
| server-ip | string | The IP address of the server hosting this URL |
| server-name | string | The name of the server software hosting this URL |
| server-country | string | The server's IP geo-location: full country name |
| server-country-code | string | The server's IP geo-location: ISO 2-letter country code |
| server-city | string | The server's IP geo-location: full city name (if detectable) |
| server-region | string | The server's IP geo-location: full region name (if detectable) |
| server-hostname | string | The server's hostname (PTR record) |
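
Several of the response fields above overlap (for example, is-error also covers timeouts, and an http-status of 0 signals a network-level issue), so the order in which they are checked matters. The sketch below shows one possible way to reduce a decoded response to a single label. It assumes the response has already been parsed from JSON into a Python dict; the function name `classify_url_info`, the label strings and the sample payload are invented for illustration and are not part of the API.

```python
import json


def classify_url_info(result):
    """Return a short label describing the outcome of a URL Info lookup."""
    if not result.get("valid", False):
        return "malformed-url"
    # Check is-timeout first: is-error is documented to include timeouts.
    if result.get("is-timeout", False):
        return "timeout"
    # An http-status of 0 is documented as a network-level issue.
    if result.get("is-error", False) or result.get("http-status", 0) == 0:
        return "load-error"
    if not result.get("http-ok", False):
        return "http-" + str(result["http-status"])
    if not result.get("real", False):
        return "no-real-content"
    return "ok"


# Illustrative payload only; field names follow the response fields above.
sample = json.loads("""
{
  "valid": true, "real": true,
  "http-ok": true, "http-status": 200,
  "is-error": false, "is-timeout": false,
  "http-redirect": true, "url": "https://www.example.com/"
}
""")
label = classify_url_info(sample)
```

Because the field names contain hyphens, they are read with dictionary lookups rather than attribute access. When http-redirect is true, the `url` field rather than the originally requested URL is the one to store or process further.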
API Performance

| Characteristic | Value | Description |
| --- | --- | --- |
| Avg Latency | 50-500ms (variable) | This API has a non-deterministic latency based on outside factors |
| Max Rate | 2/second | Maximum inbound request rate. Exceeding this will result in request throttling |
| Max Concurrency | 250 | Maximum concurrent/simultaneous requests. Exceeding this will result in error code 06 [TOO MANY CONNECTIONS] |
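
To stay under the 2/second inbound rate, a client can pace its own requests instead of relying on server-side throttling. The following is a minimal single-threaded sketch that spaces calls at least half a second apart. The class name `MinIntervalLimiter` and its injectable `clock` and `sleep` arguments are choices made for this example, not part of the API, and the sketch does not address the concurrency limit.

```python
import time


class MinIntervalLimiter:
    """Space calls at least 1/rate seconds apart (single-threaded use)."""

    def __init__(self, rate_per_second=2.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._interval = 1.0 / rate_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no request has been sent yet

    def wait(self):
        """Block until the next request may be sent; return seconds waited."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        # Reserve the next slot one interval after this request.
        self._next_allowed = now + self._interval
        return delay
```

Calling `limiter.wait()` immediately before each request keeps a sequential client at or below the configured rate. A program that issues requests from several threads or processes would need a shared, lock-protected limiter instead.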