# A Complex Example
The simple example above should be good enough for most people, but if you're
dealing with a large subset of probes (we support up to 1024), or if you're
interested in comparing the current RTT
value to past values, then this
section is for you.
You can control how the alerts are triggered based on a few arguments in the URL:
| Argument | Default | Description |
|---|---|---|
| `max_packet_loss` | 75 | The acceptable percentage of packet loss per probe |
| `show_all` | false | Show all RTT responses. By default, the full list is shown only for alerting probes |
| `permitted_total_alerts` | 0 | The number of probes permitted to respond with an alert before a global alert is issued |
| `lookback` | 1 | The number of past measurement results used to generate a median RTT value |
| `median_rtt_threshold` | N/A | The threshold at which an alert is issued when the latest RTT value is compared to the median (based on the lookback) |
These arguments can be combined to give interesting results, so we'll break them down one-by-one and then give you some examples of combinations and the resulting output.
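If you're scripting against the endpoint, these are ordinary query-string parameters. Here's a minimal sketch in Python using the `requests` library (the measurement ID is the placeholder used throughout this page):

```python
import requests

URL = "https://atlas.ripe.net/api/v2/measurements/123456789/status-check/"

# Every argument from the table above is passed as a query-string parameter.
response = requests.get(URL, params={
    "max_packet_loss": 50,        # alert a probe above 50% packet loss
    "show_all": 1,                # always include the per-probe RTT lists
    "permitted_total_alerts": 3,  # tolerate 3 alerting probes before a global alert
})
print(response.json())
```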
# max_packet_loss
By default, we don't set `alert: true` unless the packet loss percentage exceeds 75%. If you'd like to adjust this threshold, you can pass `max_packet_loss` in the URL. Expanding on our simple example above, this request would only set an alert on a probe once more than 95% of its packets are lost:
https://atlas.ripe.net/api/v2/measurements/123456789/status-check/?max_packet_loss=95
Note, however, that if you set `max_packet_loss` to 100, no alert will ever be set for lost packets.

Similarly, you can make the check more sensitive by tweaking the `max_packet_loss` value downward:
https://atlas.ripe.net/api/v2/measurements/123456789/status-check/?max_packet_loss=0
This would set an alert if even one packet was lost.
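A hedged sketch of consuming the result of a strict loss check like this one. The JSON excerpts on this page elide the response envelope with `...`, so the top-level `probes` key below is an assumption; adjust it to match the real response shape:

```python
import requests

URL = "https://atlas.ripe.net/api/v2/measurements/123456789/status-check/"

data = requests.get(URL, params={"max_packet_loss": 0}).json()

# "probes" (assumed key) maps probe IDs to per-probe status entries.
for probe_id, status in data["probes"].items():
    if status["alert"]:
        print(f"probe {probe_id} lost {status['last_packet_loss']}% of packets")
```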
# show_all
In the simple example, the sample output listed only basic probe information:
# Request
GET https://atlas.ripe.net/api/v2/measurements/123456789/status-check/
# Response
```json
...
"234": {
    "alert": false,
    "last": 14.152,
    "last_packet_loss": 0.0,
    "source": "Country: GR"
},
...
```
If an alert is ever triggered, though, the `all` attribute is included so that you can see further details:
# Request
GET https://atlas.ripe.net/api/v2/measurements/123456789/status-check/
# Response
```json
...
"234": {
    "alert": true,
    "alert_reasons": [
        "loss"
    ],
    "all": [
        null,
        null,
        null
    ],
    "last": null,
    "last_packet_loss": 100.0,
    "source": "Country: GR"
},
...
```
By setting `show_all`, you're asking the server to always include the `all` attribute in the output, regardless of whether or not there's an alert issued, so you'd change the output of an error-free result to:
# Request
GET https://atlas.ripe.net/api/v2/measurements/123456789/status-check/?show_all=1
# Response
```json
...
"234": {
    "alert": false,
    "all": [
        12.123,
        14.152,
        17.321
    ],
    "last": 14.152,
    "last_packet_loss": 0.0,
    "source": "Country: GR"
},
...
```
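With `show_all=1` you can post-process the RTT series yourself. A small sketch, under the same assumption about the response envelope as before:

```python
import requests

URL = "https://atlas.ripe.net/api/v2/measurements/123456789/status-check/"

data = requests.get(URL, params={"show_all": 1}).json()

for probe_id, status in data["probes"].items():  # "probes" key assumed
    # null entries in "all" are lost packets, so drop them before summarising
    rtts = [rtt for rtt in status["all"] if rtt is not None]
    if rtts:
        print(probe_id, min(rtts), max(rtts))
```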
# permitted_total_alerts
By default, we assume that one probe failing to meet expected thresholds is cause for alarm. If you feel this is too sensitive, you can increase this value. This won't change the `alert` value for each probe, but it will determine whether or not `global_alert` will be set to `true` and, if `change_http_status` is set to `1`, whether the HTTP status will be changed to `418`.
The following will allow for a maximum of 3 probes to alert before the global alert is set:
https://atlas.ripe.net/api/v2/measurements/123456789/status-check/?permitted_total_alerts=3
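Since `change_http_status=1` converts a global alert into an HTTP 418 response, a monitor can key off the status code alone, without parsing the body. A sketch:

```python
import requests

URL = "https://atlas.ripe.net/api/v2/measurements/123456789/status-check/"

response = requests.get(URL, params={
    "permitted_total_alerts": 3,
    "change_http_status": 1,  # respond with 418 once more than 3 probes alert
})

if response.status_code == 418:
    print("global alert: too many probes are failing")
```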
# lookback and median_rtt_threshold
Sometimes the current RTT value alone isn't enough information with which to make an alert decision; you need a little history to determine whether an alert is warranted. This is where `lookback` and `median_rtt_threshold` come in.
Let's use our example again. Say that you've been running this measurement for a few hours now and each of our 5 probes has collected at least 10 results. The latest RTT value from each probe looks like this:
| Probe | Latest RTT (ms) |
|---|---|
| Rotterdam | 5 |
| Athens | 12 |
| Vancouver | 13 |
| São Paulo | 32 |
| Brisbane | 312 |
From each probe's stored history of results, we can also calculate a median value:

| Probe | Median RTT (ms) |
|---|---|
| Rotterdam | 5 |
| Athens | 15 |
| Vancouver | 14 |
| São Paulo | 37 |
| Brisbane | 310 |
The `lookback` value mentioned above determines the total number of past measurement results we take into account to generate these median values. Values can range from 1 to 10, and the default is 1.
Once we have a median value, the next part of the equation, your specified `median_rtt_threshold`, comes into play. We compare our calculated median value to the current value, and if the difference exceeds your threshold value, we post an alert.
To continue with our example, say that you've decided that you want to be alerted if any probe exceeds its median RTT by 10. Your query would look like this:
# Request
GET https://atlas.ripe.net/api/v2/measurements/123456789/status-check/?lookback=10&median_rtt_threshold=10
# Response
```json
...
"234": {
    "alert": true,
    "alert_reasons": [
        "latency"
    ],
    "all": [
        43.103,
        43.363,
        43.517,
        45.254,
        45.303,
        45.714,
        45.72,
        46.045,
        46.907,
        46.92,
        47.338,
        48.843,
        49.831,
        50.598,
        50.834,
        55.644,
        65.612,
        73.656,
        78.739,
        81.618,
        101.793,
        105.107,
        111.606,
        138.973,
        144.736,
        154.633,
        159.825,
        199.248,
        206.075,
        314.524
    ],
    "last": 111.606,
    "last_packet_loss": 0.0,
    "median": 55.644,
    "source": "Country: GR"
},
...
```
You'll note not only that an alert has been triggered due to the disparity between `median` and `last`, but also that `alert_reasons` now contains `latency` instead of what you may have seen until now: `loss`. In some cases you could have both enough dropped packets and enough added latency to trigger an alert, so this property helps you figure out which is which.
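If your tooling reacts differently to loss than to latency, `alert_reasons` is the field to branch on. A brief sketch (same assumption about the `probes` envelope as above):

```python
import requests

URL = "https://atlas.ripe.net/api/v2/measurements/123456789/status-check/"

data = requests.get(URL, params={"lookback": 10, "median_rtt_threshold": 10}).json()

for probe_id, status in data["probes"].items():  # "probes" key assumed
    if not status["alert"]:
        continue
    reasons = status.get("alert_reasons", [])
    if "loss" in reasons:
        print(f"probe {probe_id} is dropping packets")
    if "latency" in reasons:
        print(f"probe {probe_id} is well above its median RTT")
```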
You can vary the `lookback` value if you like, and this will adjust the number of samples used to establish a median.
# A note about the `lookback` value

Median calculations are based only on the non-null values available. This means that if `lookback=10` and only 2 of those 10 results are non-null, only those two results will be used to calculate the median.
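To make that behaviour concrete, here is a tiny illustration of a median taken over only the non-null values in the lookback window. This mirrors the documented rule; it is not the server's actual code:

```python
from statistics import median

def lookback_median(results):
    """Median of a lookback window, ignoring lost packets (nulls)."""
    valid = [rtt for rtt in results if rtt is not None]
    return median(valid) if valid else None

# lookback=10, but only two results are non-null: only those two are used.
window = [None, 12.5, None, None, 13.5, None, None, None, None, None]
print(lookback_median(window))  # 13.0
```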
Supported `median_rtt_threshold` values include both percentages and integers, positive and negative. Some examples:
https://atlas.ripe.net/api/v2/measurements/123456789/status-check/?lookback=10&median_rtt_threshold=10
https://atlas.ripe.net/api/v2/measurements/123456789/status-check/?lookback=10&median_rtt_threshold=10%
https://atlas.ripe.net/api/v2/measurements/123456789/status-check/?lookback=10&median_rtt_threshold=-10
https://atlas.ripe.net/api/v2/measurements/123456789/status-check/?lookback=10&median_rtt_threshold=-10%
Note, however, that you should be careful with integer thresholds, as there is likely to be strong variance in RTT for probes located a long distance from their target.
# Sanity Filter
In the case of very low median values, a sanity check is applied to prevent alerts from being issued for insignificant changes. An example might be a probe with a median RTT of 2.3ms and a latest RTT of 4.6ms: the value has doubled, but the change is not worthy of note, so the sanity filter will not consider it grounds for an alert.
At present, the sanity filter ignores any delta of ±5ms.
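Pulling the pieces together, here is a hedged sketch of the latency decision as this page describes it: percentage or absolute thresholds, plus the ±5ms sanity filter. How negative thresholds behave (watching for drops below the median) is my reading, and the whole function is an illustration rather than the server's implementation:

```python
def latency_alert(last, med, threshold):
    """True if `last` deviates from the median `med` beyond `threshold`.

    `threshold` is an absolute delta (10, -10) or a percentage
    string ("10%", "-10%"), matching the forms shown above.
    """
    if last is None or med is None:
        return False
    delta = last - med
    if abs(delta) <= 5:  # sanity filter: ignore deltas within +/-5ms
        return False
    if isinstance(threshold, str) and threshold.endswith("%"):
        allowed = med * float(threshold.rstrip("%")) / 100
    else:
        allowed = float(threshold)
    # positive thresholds look for rises, negative ones for drops (assumed)
    return delta >= allowed if allowed >= 0 else delta <= allowed

print(latency_alert(4.6, 2.3, "10%"))         # False: caught by the sanity filter
print(latency_alert(111.606, 55.644, "10%"))  # True: well past median + 10%
```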
# Combinations
So now that we've covered all of the different options, you can try combining them to see what kind of results you might get.
This will only alert on probes that exceed a packet loss of 50%, and will only post a global alert if more than 3 probes are alerting:
https://atlas.ripe.net/api/v2/measurements/123456789/status-check/?permitted_total_alerts=3&max_packet_loss=50
Same thing, but this will always show the RTT values:
https://atlas.ripe.net/api/v2/measurements/123456789/status-check/?show_all=1&permitted_total_alerts=3&max_packet_loss=50
Looking back over the last 7 results, show alerts for probes exceeding their median RTT by 30%:
https://atlas.ripe.net/api/v2/measurements/123456789/status-check/?lookback=7&median_rtt_threshold=30%
The same thing, and again we include all RTT values:
https://atlas.ripe.net/api/v2/measurements/123456789/status-check/?lookback=7&median_rtt_threshold=30%&show_all=1
The same thing again, but this time only sound a global alert if more than 5 probes are alerting:
https://atlas.ripe.net/api/v2/measurements/123456789/status-check/?lookback=7&median_rtt_threshold=30%&show_all=1&permitted_total_alerts=5
And finally, a great big one that will:

- establish a median for each probe based on the past 10 results
- alert on any probe whose latest RTT exceeds its median by 20%
- show all RTTs, regardless of alert status
- only set a global alert if more than 7 probes are alerting
- mark a probe as alerting if its packet loss exceeds 50%
https://atlas.ripe.net/api/v2/measurements/123456789/status-check/?lookback=10&median_rtt_threshold=20%&show_all=1&permitted_total_alerts=7&max_packet_loss=50
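The same query expressed with a `params` dict, which spares you hand-building the URL (`requests` will percent-encode the `%` in `20%` for you):

```python
import requests

URL = "https://atlas.ripe.net/api/v2/measurements/123456789/status-check/"

response = requests.get(URL, params={
    "lookback": 10,                 # median over the past 10 results
    "median_rtt_threshold": "20%",  # alert at 20% above the median
    "show_all": 1,                  # always include per-probe RTT lists
    "permitted_total_alerts": 7,    # global alert only past 7 alerting probes
    "max_packet_loss": 50,          # per-probe loss alert above 50%
})
print(response.json())
```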