# Guidelines For Best Practices

# Asking for probe status information

This is just a tiny sample of a client asking for statuses of specific probes (queries from the same IP):

*Sending more than a million probe status queries every midnight is not optimal.*

This client was asking such questions more than a million times a day, in this case within a small window around midnight. Since we have around 13,000 active probes, this amounts to querying each probe multiple times per minute during that window, or even more often, since the questions concerned only a specific subset of probes.

Furthermore, the status or metadata of a probe is - for the most part - very stable. Most probes are connected for days or even months with no change, and very few of them ever change other attributes such as their geolocation or source address. Therefore asking the same question again and again only provides new information in very rare cases.

Zooming in on queries about one particular probe:

The above picture also shows the same information being queried multiple times (around 50 queries in each batch), which we assume to be the result of independent processes asking without any knowledge of each other.

Another pattern is a user asking for status information for a lot of probes in quick succession. In almost all of these cases, the real-time aspect (i.e. whether a probe is currently connected) is not the main reason for asking these questions - therefore retrieving this data in one bulk query makes more sense.
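As an illustration, here is a minimal Python sketch of such a bulk lookup. It assumes the probes endpoint accepts a comma-separated `id__in` filter (check the API reference for the exact filter names); the probe IDs are placeholders:

```python
import requests

PROBE_IDS = [6001, 6002, 6003, 6004]  # placeholder probe IDs

# One bulk query instead of one request per probe; id__in is assumed to be
# the comma-separated ID filter accepted by the probes endpoint.
resp = requests.get(
    "https://atlas.ripe.net/api/v2/probes/",
    params={"id__in": ",".join(str(i) for i in PROBE_IDS)},
    timeout=30,
)
resp.raise_for_status()
for probe in resp.json().get("results", []):
    print(probe["id"], probe.get("status", {}).get("name"))
```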

Users are free to ask for the status of any probe at any time. However, given the above example, we'd like to encourage those who ask the API for probe information to:

  • Use the probe status archive if they are interested in information about multiple probes. We publish this daily (so fetching it once a day is enough), and it contains the relevant probe information for almost all use cases.

  • Share the results of such queries between multiple tasks if possible; local caching can go a long way toward avoiding hammering the service.
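For example, a small Python sketch along these lines fetches one day's archive once and reuses a local copy afterwards; the archive URL, file name and JSON layout below are assumptions that should be checked against the actual archive listing:

```python
import bz2
import json
import pathlib
import requests

# Assumed location and naming of one day's archive file; verify the real
# path and file name on the archive site before relying on this.
ARCHIVE_URL = "https://ftp.ripe.net/ripe/atlas/probes/archive/2025/02/20250204.json.bz2"
CACHE_FILE = pathlib.Path("probes-20250204.json")

def load_probe_metadata():
    """Download the daily probe archive at most once and reuse the local copy."""
    if not CACHE_FILE.exists():
        resp = requests.get(ARCHIVE_URL, timeout=120)
        resp.raise_for_status()
        CACHE_FILE.write_bytes(bz2.decompress(resp.content))
    return json.loads(CACHE_FILE.read_text())

# Assumed structure: a JSON document with an "objects" list of probe records.
probes = {p["id"]: p for p in load_probe_metadata().get("objects", [])}
print(len(probes), "probes loaded from the daily archive")
```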

# Asking for measurement results

Take a look at this query log (queries from the same IP):

This client is asking for the latest results of a particular measurement every minute, essentially polling for results. It also seems to be doing so from two threads, three seconds apart every time. This particular measurement runs with a frequency of 900 seconds (15 minutes), so asking for results every minute is very expensive for both sides.

A much better and more scalable solution is to either:

  • use result streaming to have new results delivered as soon as they become available, or
  • download new results only after a new round of measurements is expected to have finished (in this case, every 15 minutes).

In either case, do this only once - doing it twice is double the work for the service as well as for the client.

When downloading results in daily batches, for example for a built-in measurement, it's better to use midnight (UTC) timestamps for start and stop query parameters, avoiding additional time-filtering load on the server side.
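A minimal Python sketch of such a daily batch download, using Unix timestamps for midnight UTC as the start and stop parameters (the measurement ID here is just an example):

```python
import datetime
import requests

def fetch_day(msm_id, day):
    """Fetch one full UTC day of results in a single call, bounded by midnights."""
    start = datetime.datetime.combine(day, datetime.time.min, tzinfo=datetime.timezone.utc)
    stop = start + datetime.timedelta(days=1)
    url = f"https://atlas.ripe.net/api/v2/measurements/{msm_id}/results/"
    params = {"start": int(start.timestamp()), "stop": int(stop.timestamp()), "format": "json"}
    resp = requests.get(url, params=params, timeout=300)
    resp.raise_for_status()
    return resp.json()

yesterday = datetime.date.today() - datetime.timedelta(days=1)
results = fetch_day(5051, yesterday)  # 5051 is one of the built-in measurements mentioned below
print(len(results), "results for", yesterday.isoformat())
```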

Different measurements produce different amounts of results. The size of such a call can vary from a few hundred bytes to multiple megabytes per query. We had cases where a client was asking for hundreds of results in parallel - a very expensive operation. Since rate limiting based purely on the number of queries sent is a very coarse approach, we are therefore considering introducing limits based on bandwidth usage per client.

# Asking for results from a status-check

Similar to asking for measurement results, we see a pattern where users keep asking for measurement status checks multiple times per minute. Status checks are relatively expensive to calculate, since they involve looking up multiple recent results from each probe, then combining and comparing these to come up with an ultimate answer (with all the gory details included, if needed). Therefore the results of status-check queries are cached for 5 minutes - asking more often should not yield different results.
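A small client-side cache that respects this five-minute window could look like the following Python sketch (the endpoint path reflects the status-check URL pattern, and measurement IDs are supplied by the caller):

```python
import time
import requests

CACHE_TTL = 300  # the service caches status-check results for 5 minutes
_cache = {}      # msm_id -> (fetched_at, payload)

def get_status_check(msm_id):
    """Return the status-check result, querying the API at most once per 5 minutes."""
    now = time.monotonic()
    hit = _cache.get(msm_id)
    if hit and now - hit[0] < CACHE_TTL:
        return hit[1]
    url = f"https://atlas.ripe.net/api/v2/measurements/{msm_id}/status-check/"
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    _cache[msm_id] = (now, resp.json())
    return _cache[msm_id][1]
```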

# Using ongoing measurements instead of many one-offs

Sometimes we see users scheduling one-off measurements involving the same probes, towards the same targets. In such cases it is simpler (and cheaper in terms of credits) to set up an ongoing measurement with a particular frequency instead, and access its results when needed.
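For illustration, here is a minimal Python sketch of creating one ongoing ping measurement via the measurement creation API instead of repeated one-offs; the API key, target, probe IDs and interval are placeholders, and the payload fields should be checked against the API reference:

```python
import requests

API_KEY = "YOUR-ATLAS-API-KEY"  # placeholder; needs measurement-creation permission

payload = {
    "definitions": [{
        "type": "ping",
        "af": 4,
        "target": "example.com",  # placeholder target
        "description": "ongoing ping instead of repeated one-offs",
        "interval": 900,          # one result per probe every 15 minutes
    }],
    # Placeholder probe selection: reuse the same three probes for every round.
    "probes": [{"type": "probes", "value": "6001,6002,6003", "requested": 3}],
    "is_oneoff": False,
}

resp = requests.post(
    "https://atlas.ripe.net/api/v2/measurements/",
    params={"key": API_KEY},
    json=payload,
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # contains the ID(s) of the newly created measurement(s)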

# Measuring many different targets

Users of RIPE Atlas sometimes want to run measurements from a stable set of probes, but with varying destinations - e.g. traceroute to A, then B, then C from the same vantage points. Since the system defines each measurement as having one specific target, this leads to users scheduling multiple measurements to achieve the desired result. This is fine if the number of targets, and therefore the number of measurements, is relatively low.

It is worth pointing out that measurement targets can be IP addresses or DNS names - in the latter case one can ask the system to resolve that name each time a probe executes the measurement ("resolve on probe"). Depending on how that name is actually resolved, this provides a means to resolve to different IPs - and therefore different measurement targets - on different invocations. It is possible to run a special DNS server for this: one that can provide a different answer from some pre-compiled list when asked.
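As a sketch of what such a DNS server could look like, the following uses the third-party dnslib package to hand out successive A records from a pre-compiled list; the addresses are documentation placeholders, and the TTL of 0 discourages caching between lookups:

```python
from itertools import cycle

from dnslib import RR, QTYPE, A
from dnslib.server import DNSServer, BaseResolver

class RotatingResolver(BaseResolver):
    """Answer each A query with the next address from a pre-compiled list."""

    def __init__(self, addresses):
        self.pool = cycle(addresses)

    def resolve(self, request, handler):
        reply = request.reply()
        if QTYPE[request.q.qtype] == "A":
            # TTL 0 so resolvers do not cache the answer between invocations.
            reply.add_answer(RR(request.q.qname, QTYPE.A, rdata=A(next(self.pool)), ttl=0))
        return reply

# Placeholder target list (documentation addresses).
resolver = RotatingResolver(["192.0.2.1", "198.51.100.7", "203.0.113.9"])
server = DNSServer(resolver, port=5353, address="0.0.0.0")
server.start()  # use start_thread() instead to run in the background
```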

This is similar to a built-in measurement all probes already perform: measurements 5051, 5151, 6052 and 6152 are traceroute measurements targeting most currently routed prefixes in BGP. This is done by looking up a DNS name that is mapped to the list of such prefixes. Each probe only reaches a relatively small number of such targets each day, yet the aggregate result is meaningful.

Note that such a DNS solution only gives a certain probability that each probe will measure each target. But in many cases that is enough, or perhaps even better than fixed mappings between probes and targets.

Last Updated: Wednesday 5 February 2025