Description
Welcome!
- I have searched open and closed feature requests to make sure this or similar feature request does not already exist.
- I have reviewed the Milestones to ensure that this feature request, or a similar one, has not already been proposed.
- This is a feature request, not a bug report or support question.
- I agree to follow this project's Code of Conduct.
Which part of the application does your feature belong to?
Other
Description
We are exploring replacing our home-rolled network-link quantitative testing framework with speedtest-tracker. With the exception of the relatively rigid testing-schedule options, it would be perfect, especially given its existing capabilities for exporting statistics, its unified notification framework, etc. It is definitely a project to be proud of, which is why it came to my attention while I was investigating a more robust solution for our needs.
Like most speedtest tracker solutions I've found, the scheduling appears to be exclusively cron-based (i.e. tests fire at the start of a minute at its most granular, and at specific times of day, identical every day).
The trouble with time-of-day testing schedules is:
- Produces results only for the specific chosen times, and never tests the parts of the day outside that schedule
- Requires manually tuning the parameters of each instance, when multiple testing nodes are present, to prevent their tests from "stacking" on top of each other
- Wastes bandwidth if more fine-grained test periods are desired (e.g. every 10 minutes, to ensure that every second of the clock day is no more than 10 minutes from a test)
The proposal is to allow a non-default option for a pseudorandom speedtest schedule.
The ways to accomplish this are manifold, but the way I implemented it was to:
- Allow an environment-variable specified Test Interval value in minutes
- Allow an environment-variable specified Start Time value of 0-23 hrs (default 0 if not specified)
- Upon startup, a node performs its first test at Start Time plus a random number of minutes between 0 and Test Interval, minus 35 seconds, plus a random lookup (in seconds) from the array 1, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59
- Each subsequent test is then performed Test Interval minus 35 seconds, plus a random lookup (in seconds) from the same array, after the previous test
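The steps above can be sketched in a few lines of Python. This is a minimal illustration of the interval-plus-prime-jitter scheme, not speedtest-tracker code; the function and variable names (and the `run_speedtest` placeholder) are my own:

```python
import random
import time

# Prime seconds between 0 and 60, used as per-test jitter.
PRIME_SECONDS = [1, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59]

def first_delay(test_interval_min: int, start_time_hr: int = 0) -> int:
    """Seconds until the first test: Start Time plus a random offset
    within the interval, minus 35 s, plus a random prime second."""
    offset_min = random.randrange(test_interval_min)
    delay = start_time_hr * 3600 + offset_min * 60 - 35 + random.choice(PRIME_SECONDS)
    return max(delay, 0)  # guard against a small negative value when offset_min == 0

def next_delay(test_interval_min: int) -> int:
    """Seconds between consecutive tests: the interval minus 35 s,
    plus a random prime second, so the gap drifts around the clock."""
    return test_interval_min * 60 - 35 + random.choice(PRIME_SECONDS)

# Example scheduler loop (run_speedtest is a placeholder):
# time.sleep(first_delay(111))
# while True:
#     run_speedtest()
#     time.sleep(next_delay(111))
```

With a prime Test Interval such as 111 minutes, each gap lands between interval − 34 s and interval + 24 s, so identically configured nodes drift apart rather than locking to the same clock positions.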
This is the general algorithm we settled on after much testing and monitoring. Choosing a Test Interval that is a prime number of minutes, combined with the random lookup over all prime seconds between 0 and 60, provides the following critical benefits:
- Eliminates the need to manually "tune" each testing node's schedule (they can all be configured identically)
- Eliminates the possibility of human error causing multiple tests to be performed simultaneously by different speedtest nodes (aside from the statistically rare case where the randomness happens to coincide)
- Eliminates the possibility of an effective "DDoS" from all nodes testing at the same time
- Provides ultra-fine-grained quantitative link performance (amortized) without the bandwidth that fine-grained direct testing would require (e.g. we can test every 111 minutes instead of every 11, yet over the course of a year still get granular detail for nearly every minute of the day, and nearly every second, as if we had)
- Provides invaluable quality data for all times of day when viewed historically (in our case, each node exports ongoing results via a simple curl command to a Victoria server collecting the metrics, so current historical performance is always available in real time, but exporting to an InfluxDB works just as well)
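The export side can be sketched similarly. Assuming a VictoriaMetrics-style Prometheus import endpoint, a node could push one exposition-format line per result; the metric name, labels, and URL below are placeholders, not speedtest-tracker's actual export format:

```python
import urllib.request

def format_metric(name: str, labels: dict, value: float) -> str:
    """Render one Prometheus-style exposition line for a test result."""
    label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return f"{name}{{{label_str}}} {value}"

def push_metric(endpoint: str, line: str) -> None:
    """POST the line to a VictoriaMetrics-style import endpoint.
    Fire-and-forget here; real use would add error handling/retries."""
    req = urllib.request.Request(endpoint, data=(line + "\n").encode(), method="POST")
    urllib.request.urlopen(req)

# Hypothetical example (endpoint and metric name are assumptions):
line = format_metric("speedtest_download_mbps", {"node": "pop1", "link": "isp"}, 943.2)
# push_metric("http://metrics.example:8428/api/v1/import/prometheus", line)
```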
By testing on this pseudorandom schedule across 3 different classes (tests within each POP's LAN, tests between POP-to-POP contract WAN links, and tests of each POP's ISP uplinks), it becomes a powerful lever in contract negotiations with our network providers, as well as a way to see empirically when any link has become oversubscribed and requires additional infrastructure.
I realize much of this sounds like an enterprise requirement, but I'd wager that even someone with a single internet upstream connection would benefit greatly from ultra-fine-grained testing at all times of day, taken out over long periods (e.g. annually), without having to run unduly wasteful frequent tests.
Thank you for your consideration, and for such a robust, quality project!
-=dave