# A Simple Example
Let's say that your website, `www.example.com`, is hosted on servers in Europe and North America. You're interested in availability and response time, so you create a RIPE Atlas ping measurement from five locations around the globe and begin seeing results coming back that look something like this:
- Rotterdam, Netherlands:
id: 123, rtt: 9ms
- Athens, Greece:
id: 234, rtt: 12ms
- Vancouver, Canada:
id: 345, rtt: 13ms
- São Paulo, Brazil:
id: 456, rtt: 55ms
- Brisbane, Australia:
id: 567, rtt: 312ms
The ID for your new measurement is `123456789`, so you can get basic information about your measurement by querying this URL:
https://atlas.ripe.net/api/v2/measurements/123456789/
The new status checks system can be found at a similar URL:
https://atlas.ripe.net/api/v2/measurements/123456789/status-check/
Querying this URL alone should give you basic dashboard values for your server, which is enough for you to plug into a monitoring engine like Nagios. The output should look something like this:
# Request
GET https://atlas.ripe.net/api/v2/measurements/123456789/status-check/
# Response
HTTP/1.1 200 OK
Date: Tue, 29 Oct 2013 14:37:37 GMT
X-RIPE-Atlas-Global-Alert: 0
Content-Type: text/plain
Cache-Control: no-cache
{
"global_alert": false,
"probes": {
"123": {
"alert": false,
"last": 107.296,
"last_packet_loss": 0.0,
"source": "Country: NL"
},
"234": {
"alert": false,
"last": 14.152,
"last_packet_loss": 0.0,
"source": "Country: GR"
},
"345": {
"alert": false,
"last": 9.328,
"last_packet_loss": 0.0,
"source": "Country: CA"
},
"456": {
"alert": false,
"last": 21.761,
"last_packet_loss": 0.0,
"source": "Country: BR"
},
"567": {
"alert": false,
"last": 28.281,
"last_packet_loss": 0.0,
"source": "Country: AU"
}
}
}
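A response like the one above could be retrieved with a few lines of Python. This is only a minimal sketch using the standard library: the helper names `status_check_url` and `fetch_status_check` are made up for illustration, and `123456789` is the placeholder measurement ID from this example, not a real measurement.

```python
import json
import urllib.request

BASE_URL = "https://atlas.ripe.net/api/v2/measurements"


def status_check_url(measurement_id: int) -> str:
    """Build the status-check URL for a measurement ID (hypothetical helper)."""
    return f"{BASE_URL}/{measurement_id}/status-check/"


def fetch_status_check(measurement_id: int, timeout: float = 10.0) -> dict:
    """Fetch the status check and decode its JSON body.

    Network failures and non-2xx responses raise an exception,
    which the caller is expected to handle.
    """
    url = status_check_url(measurement_id)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_status_check(123456789)` would return a dictionary with the `global_alert` and `probes` keys shown in the sample output, assuming the measurement exists.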
Note that in the case of every probe above, `alert` is set to `false`. This is because your network is presently healthy. If, however, connectivity between your server and Brisbane, Australia were to degrade suddenly, for example, the output might look something like this:
# Request
GET https://atlas.ripe.net/api/v2/measurements/123456789/status-check/
# Response
HTTP/1.1 200 OK
Date: Tue, 29 Oct 2013 14:37:37 GMT
X-RIPE-Atlas-Global-Alert: 1
Content-Type: text/plain
Cache-Control: no-cache
{
"global_alert": true,
"probes": {
"123": {
"alert": false,
"last": 107.296,
"last_packet_loss": 0.0,
"source": "Country: NL"
},
"234": {
"alert": false,
"last": 14.152,
"last_packet_loss": 0.0,
"source": "Country: GR"
},
"345": {
"alert": false,
"last": 9.328,
"last_packet_loss": 0.0,
"source": "Country: CA"
},
"456": {
"alert": false,
"last": 21.761,
"last_packet_loss": 0.0,
"source": "Country: BR"
},
"567": {
"alert": true,
"alert_reasons": [
"loss"
],
"all": [
null,
null,
null
],
"last": null,
"last_packet_loss": 100.0,
"source": "Country: AU"
}
}
}
Note that probe 567 (the ID for the probe that you're using in Brisbane) has somehow lost the ability to ping your server. This has resulted in the following changes to the output of your status check:
- The `last` property (the last attempt to ping your server) has a `null` value
- The `last_packet_loss` value is set to `100`%
- As the last attempt could not get even one packet through, the `alert` property was set to `true`
- As one of the probes has now triggered an alert, the `global_alert` property is set to `true`
- The `X-RIPE-Atlas-Global-Alert` header is set to `1`
- Two additional values were added to the probe definition in question: `all` and `alert_reasons`:
  - `all` is a list of all packet results used to calculate `last`. There will be more explanation about this later.
  - `alert_reasons` is a list of reasons why this alert was triggered. Typically this will only have one value: `loss`, but as we'll see later on, it may also contain `latency`.
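The changes listed above can be picked out of the parsed response programmatically. The sketch below assumes the JSON has already been decoded into a Python dictionary; `alerting_probes` is a hypothetical helper name, and the sample data is a trimmed copy of the degraded response shown earlier.

```python
def alerting_probes(status: dict) -> dict:
    """Map probe ID -> list of alert reasons for every probe whose alert is set."""
    return {
        probe_id: probe.get("alert_reasons", [])
        for probe_id, probe in status.get("probes", {}).items()
        if probe.get("alert")
    }


# Trimmed copy of the degraded response from this example.
degraded = {
    "global_alert": True,
    "probes": {
        "456": {"alert": False, "last": 21.761,
                "last_packet_loss": 0.0, "source": "Country: BR"},
        "567": {"alert": True, "alert_reasons": ["loss"],
                "all": [None, None, None], "last": None,
                "last_packet_loss": 100.0, "source": "Country: AU"},
    },
}

print(alerting_probes(degraded))  # {'567': ['loss']}
```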
The idea is to have your monitoring software parse this output and act accordingly. How you parse it is up to you. A simple approach would be to grep the output for `global_alert":true` and trigger your alerts based on that, while a more nuanced approach might parse the JSON and look for values relevant to different users to page the appropriate contact.
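As a rough illustration of the more nuanced approach, the following sketch parses the response body and decides whom to page based on each alerting probe's `source` value. The `CONTACTS` table, the e-mail addresses, and the `contacts_to_page` function are all hypothetical; only the field names come from the sample output in this example.

```python
import json

# Hypothetical on-call routing table, keyed by the probe's "source" string.
CONTACTS = {
    "Country: AU": "apac-oncall@example.com",
    "Country: NL": "emea-oncall@example.com",
}
DEFAULT_CONTACT = "noc@example.com"  # hypothetical fallback address


def contacts_to_page(body: str) -> set:
    """Return the contacts responsible for every probe that is alerting."""
    status = json.loads(body)
    if not status.get("global_alert"):
        return set()  # nothing is alerting, so nobody is paged
    return {
        CONTACTS.get(probe.get("source"), DEFAULT_CONTACT)
        for probe in status.get("probes", {}).values()
        if probe.get("alert")
    }
```

Parsing the JSON rather than matching a literal string also means the check does not depend on the exact whitespace of the serialized output.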
If you're not keen on parsing the output, or want to save bandwidth by using a simpler test, we also allow you to abuse the HTTP response code system by setting the flag `change_http_status=1`. In this case, the above response would change to the following:
# Request
HEAD https://atlas.ripe.net/api/v2/measurements/123456789/status-check/?change_http_status=1
# Response
HTTP/1.1 418 UNKNOWN STATUS CODE
Date: Tue, 29 Oct 2013 14:37:37 GMT
X-RIPE-Atlas-Global-Alert: 1
Content-Type: text/plain
Cache-Control: no-cache
Note that the only HTTP codes currently in use are 200 and 418. There are no plans to expand the abuse of the HTTP status code system at present, as this would make it difficult to indicate whether there is a problem with the measurement in question, or the status check system itself.
With these sorts of changes, you can write server-side scripts to capture and parse the JSON output, or just note the HTTP response code and take whatever action you see fit. To use Nagios as an example, you could use the `check_http` script to alert if the HTTP response is anything other than 200. There's no need to write any custom code if you don't want to. Please make sure that your system sends a properly set HTTP Host header, i.e. a `Host: atlas.ripe.net` line with the HTTP request. In Nagios this is achieved by using the option `-H atlas.ripe.net`.
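If you do prefer a small custom script over `check_http`, the status-code approach might look like the sketch below. It assumes the conventional Nagios plugin exit codes (0 for OK, 2 for CRITICAL, 3 for UNKNOWN); the function names `exit_code_for` and `head_status` are illustrative and not part of any RIPE Atlas tooling.

```python
import urllib.error
import urllib.request

OK, CRITICAL, UNKNOWN = 0, 2, 3  # conventional Nagios plugin exit codes


def exit_code_for(http_status: int) -> int:
    """Translate the status check's HTTP code into a plugin exit code."""
    if http_status == 200:
        return OK        # no probe is alerting
    if http_status == 418:
        return CRITICAL  # at least one probe triggered an alert
    return UNKNOWN       # anything else suggests a problem with the check itself


def head_status(url: str, timeout: float = 10.0) -> int:
    """Issue a HEAD request and return the HTTP status code."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses such as 418; the code is still available.
        return error.code
```

A wrapper script could then call `sys.exit(exit_code_for(head_status(url)))` with the `?change_http_status=1` URL shown above. Python's `urllib` derives the `Host` header from the URL, so no extra header handling should be needed here.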