6 months of IDNet data

Started by esh, Dec 18, 2008, 17:11:16



esh

Six months ago, shortly after moving to IDNet, I wrote a small application to keep track of things, report back to me on important stuff, and quietly log information, including external network round trip times (pings). I thought I'd present a couple of crude plots here. They probably say more about the general state of the internet as a whole than they allow one to draw definite conclusions about IDNet, but often for the average user this doesn't matter. There may be a subtle overlay of my own line usage patterns, but I pride myself on only having 6 hours of sleep a day and a very variable bedtime. Ahem. Firstly, the program is executed every ten minutes. This means I have 6 data points per hour and 144 data points for any given day.

This first plot shows the mean ping for each day over the past 180 days, i.e. each data point is the sum of that day's pings divided by 144.
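A minimal sketch of that daily averaging in Python, for anyone who wants to try the same thing. The log format (a list of timestamp and ping pairs) and the name `daily_means` are assumptions made for illustration, not the actual application, and the sample data is synthetic.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def daily_means(samples):
    """Average ping per calendar day from (datetime, ping_ms) pairs."""
    by_day = defaultdict(list)
    for stamp, ping_ms in samples:
        by_day[stamp.date()].append(ping_ms)
    # Sum of the day's pings divided by the number of samples
    # (144 for a full day at one sample every ten minutes).
    return {day: sum(p) / len(p) for day, p in sorted(by_day.items())}


if __name__ == "__main__":
    # Synthetic data (hypothetical): one sample every ten minutes for two days.
    start = datetime(2008, 6, 20)
    samples = [(start + timedelta(minutes=10 * i), 40.0 + (i % 3))
               for i in range(288)]
    for day, mean_ping in daily_means(samples).items():
        print(day, round(mean_ping, 3))
```

Dividing by the number of samples actually logged, rather than a fixed 144, keeps the average sensible on days where a run was missed.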

You can see the general trend between early July and late September where the ping times averaged about 10ms higher than before and after. Who knows what this was. I certainly wasn't using my line 24/7 during those periods to skew the data. It might just have been a subtle routing change or even a change in the links between my site and the target ping sites.

EDIT: stevethegas pointed out that BT mysteriously interleaved me at some point, which I then got removed. This explains the "high" patch, and shows that on average, interleaving increases your mean ping by something on the order of 10ms.

For stats fans, the minimum mean ping was 30.765ms, the maximum mean ping 59.03ms, and on any day on IDNet at my site during the past 6 months you could expect a round trip time of 41.078ms +/- 6.6115ms.

Now time of day.


Here we now have 1,080 data points per graph point (6 per hour over 180 days), so I believe that makes the results statistically significant. It's pretty much as one might expect -- lower pings in the wee hours (though note the range is a mere 7.7051ms). The maximum is 45.28ms at 8pm and the minimum 37.575ms at 6am. At any particular hour of the day on IDNet at my site you can expect a round trip time of 41.331ms +/- 2.5379ms. This shows that the day-to-day variation dominates the hour-to-hour variation, while the two means agree closely (this is good).
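The hour-of-day figures can be sketched the same way. This is only an illustration under assumptions: the input is taken to be (hour, ping) pairs, the names `hourly_profile` and `spread` are made up here, and the population standard deviation is used because the post does not say which form was applied.

```python
from collections import defaultdict
from statistics import mean, pstdev


def hourly_profile(samples):
    """Mean ping for each hour of the day from (hour, ping_ms) pairs."""
    by_hour = defaultdict(list)
    for hour, ping_ms in samples:
        by_hour[hour].append(ping_ms)
    return {h: mean(p) for h, p in sorted(by_hour.items())}


def spread(values):
    """Return (mean, population standard deviation) of a set of means."""
    values = list(values)
    return mean(values), pstdev(values)


if __name__ == "__main__":
    # Synthetic data (hypothetical): six samples per hour, alternating 40/41 ms.
    samples = [(h, 40 + h % 2) for h in range(24) for _ in range(6)]
    profile = hourly_profile(samples)
    centre, sigma = spread(profile.values())
    print("range:", max(profile.values()) - min(profile.values()))
    print("mean +/- sigma:", centre, sigma)
```

Running `spread` over the daily means and over the hourly means gives the two "+/-" figures that are being compared.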

Again, the variation is probably more indicative of the traffic levels and nodes of the sites I'm pinging than IDNet itself, but this is only because the variation from IDNet is clearly insignificant compared to the background variation of the internet in general. At least, that's what I conclude. The +10ms average over three months may have been something, but I think I need to poll more sites (I'm upgrading the app this weekend) to get a more conclusive answer on long-term effects. Either way, it was only a 10ms average. Feel free to browbeat me with statistics books now.
CompuServe 28.8k/33.6k 1994-1998, BT 56k 1998-2001, NTL Cable 512k 2001-2004, 2x F2S 1M 2004-2008, IDNet 8M 2008 - LLU 11M 2011

Rik

That's impressive work, Esh. :karma:

Have you let IDNet see the results?
Rik
--------------------

This post reflects my own views, opinions and experience, not those of IDNet.

Steve

Nice collection of data. Looking back, I believe you had interleaving removed towards the end of September, which presumably correlates with your improved ping times. ;)
Steve
------------
This post reflects my own views, opinions and experience, not those of IDNet.

esh

You're absolutely right, stevethegas. I had forgotten about that. Mystery solved! Cheers!  :)

Steve

I think the variation in ping times, for me anyway, correlates inversely with D/L speeds.

Tacitus

Quote from: Rik on Dec 18, 2008, 17:15:33
That's impressive work, Esh. :karma:
Have you let IDNet see the results?

Very impressive.  It would be interesting if Simon could comment on the results as he would know what was happening at the time.  Server work, router problems etc.

esh

Download speed being inversely proportional to ping time is logical. Even at the most basic level, there's going to be a point when the TCP ACK packets aren't being received fast enough!
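As a rough illustration of that point: a single TCP connection can have at most one window of unacknowledged data in flight per round trip, so throughput is bounded by window size divided by round trip time. The sketch below assumes a 65,535-byte window (the largest possible without window scaling); the function name is made up for the example.

```python
def max_tcp_throughput_mbit(window_bytes, rtt_ms):
    """Upper bound on single-connection TCP throughput in Mbit/s.

    At most one full window can be in flight per round trip.
    """
    return window_bytes * 8 / (rtt_ms / 1000) / 1_000_000


if __name__ == "__main__":
    # Assumed window of 65,535 bytes; RTT values are illustrative only.
    for rtt in (30, 40, 50):
        print(rtt, "ms ->", round(max_tcp_throughput_mbit(65535, rtt), 2), "Mbit/s")
```

With these assumed numbers the ceiling sits well above an 8M line, so the bound only starts to bite with smaller windows or much higher round trip times.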

As for comments from IDNet... is that really necessary? I don't think there are any major blips, especially since we've figured out the 10ms jump now. I'll run it for another 6 months with some better stats and more of a distributed ping model to find out more. One could argue the ping should be lower, since this app doesn't run on a physical machine but in one of the virtual 'jails' on a physical server, which from my brief testing adds slightly to the ping latency. Any blips that last a mere day or two could be site variation. As I previously said, I need to ping a wider spread of sites to rule out the target being the cause.

So using the data I can precisely determine when interleaving was on and off by looking at the deviations. I just did up a slightly fancier plot with sigma limits plotted. That usually only makes sense when the data contains statistical noise, but when so many things affect your internet connection, from the number of Christmas lights on in your street to the number of mugs of coffee the BT technician had that day, the variations might as well effectively *be* statistical noise. Or should be. What should be notable, then, is what falls outside of that, and that is where people should take notice.

Let's see --



So I have now done a (rough) correction for the interleaving. This was simply averaging the interleaved pings, averaging the un-interleaved pings, and subtracting the difference from the interleaved data. The standard deviation appears fairly similar in both scenarios, so I didn't do any further fancy corrections which might add bias, but do consider that a chunk of it *was* interleaved.
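For the curious, that correction amounts to something like the following sketch. The per-day interleaving flags and the name `correct_interleaving` are assumptions for the example, and the numbers are synthetic.

```python
from statistics import mean


def correct_interleaving(daily_pings, interleaved):
    """Shift interleaved days down by the difference of the two group means.

    daily_pings: daily mean pings in ms.
    interleaved: booleans, True where interleaving was on that day.
    """
    on = [p for p, flag in zip(daily_pings, interleaved) if flag]
    off = [p for p, flag in zip(daily_pings, interleaved) if not flag]
    if not on or not off:
        return list(daily_pings)  # nothing to compare against
    offset = mean(on) - mean(off)
    return [p - offset if flag else p
            for p, flag in zip(daily_pings, interleaved)]


if __name__ == "__main__":
    # Synthetic example (hypothetical values): two clean days, two interleaved.
    print(correct_interleaving([35, 36, 45, 46], [False, False, True, True]))
```

Only the mean is shifted; the scatter within the interleaved stretch is left alone, which matches the decision not to apply any further correction.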

The central black dashed line is the mean ping of all the data, which you'll notice is now at 35.127ms. The dark red and dark green dashed lines either side of it represent the one sigma limits. This means if your ping is within those bars on any particular day, it's a *good* day, as far as your internet is concerned. Between the upper one sigma bar and the upper three sigma bar (bright red) it's an *okay* day. Anything beyond the three sigma upper bar is probably considered under-par. Five sigma (not on the plot) would be a bad day. If you get under the lower one sigma bar (the green one) then obviously things are pretty damn good ;)

You should realise that, treating everything as statistical noise, the occasional day beyond three sigma is to be expected. If you did the same experiment as me, about 68% of the data points you took (consequently 68% of your days) should be within one sigma, and about 99.7% of your days should be within three sigma. In this case one day is outside three sigma, which is 1 in 180 or roughly 0.6% of the data.
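The sigma bands can be checked with a few lines of Python. This is a sketch that assumes the noise is roughly normal, as the reasoning above does; the function names are invented for the example, and `rate_day` simply encodes the good/okay/under-par/bad labels described for the plot.

```python
from statistics import NormalDist, mean, pstdev


def expected_within(k):
    """Fraction of a normal distribution within k sigma of the mean."""
    return 2 * NormalDist().cdf(k) - 1


def observed_within(daily_pings, k):
    """Fraction of the measured days within k sigma of their own mean."""
    mu, sigma = mean(daily_pings), pstdev(daily_pings)
    inside = sum(1 for p in daily_pings if abs(p - mu) <= k * sigma)
    return inside / len(daily_pings)


def rate_day(ping, mu, sigma):
    """Label a day's mean ping using the bands from the plot."""
    if ping > mu + 5 * sigma:
        return "bad"
    if ping > mu + 3 * sigma:
        return "under-par"
    if ping > mu + sigma:
        return "okay"
    if ping >= mu - sigma:
        return "good"
    return "very good"


if __name__ == "__main__":
    print(round(expected_within(1), 4), round(expected_within(3), 4))
    # Mean and sigma taken from the figures quoted in this thread.
    print(rate_day(40.0, 35.127, 3.9511))
```

Comparing `observed_within` against `expected_within` is a quick way to see whether the daily means really behave like noise.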

Finally, the dotted line is the mean of the past ten days.

One should note that now I have corrected for interleaving, the standard deviation has dropped to 3.9511 ms. This is only about 1.5x the hour-to-hour variation calculated earlier! When you consider an "under-par" day still only means 10ms more on your round trip, I think it's pretty decent all round.


Simon_idnet

What really strikes me is how consistent the results are, with a remarkably low variation between upper and lower readings. Your Exchange must be fairly clear of contention. I can't really comment on the routing without knowing the end point address and preferably a traceroute from the start of the test (and a few in between, if possible).

Cheers
Simon

Steve

Certainly I get a much greater spread of ping times, which I have always assumed (apart from this year's late summer blip) was due to exchange congestion in the early evening.

esh

I am somewhat "out in the sticks", and my exchange isn't due to be upgraded for at least a couple years to the so-called 21CN, so yes, a good exchange probably has a large role to play -- though BT still claimed 4Mbit for where I am.

Any number of things can cause a larger spread of pings. I've seen some home routers increase pings by 20-30ms! I have, for the record, enabled minimise delay QoS on packets from that server. All that should do though is avoid significant bias from my own line usage patterns.

Starting early January next year I'll deploy a similar test on the remote ADSL line in the Warwick area (not IDNet). Yes, the same one which on one day gave me an upload rate of 8Kbit to speedtest.net. That should prove interesting.

I'm not really trying to prove anything here, but I thought it was nice to have some hard figures on ADSL line variation by the hour and day. If anyone has any more questions I will try to answer them at least.

Rik

I'm sure Simon would be interested in more detail, Esh, it's the kind of thing which can help them optimise the service. :)

esh

Thread resurrection! Now over the 1 year mark for the record, and the server hasn't been down in this time either. You'll probably remember we had a bit of a flaky time Jan/Feb period with the average on some days peaking at 90 odd. After that it calmed down again, or even got better according to my graph, and was stable for some months. Unfortunately I'm going through another (very) rough patch now.



Not quite sure what to make of that at the moment, since I've been working in South America recently and haven't been keeping an eye on performance, so to speak. I don't see any other immediate cries of dodgy ping rates, so maybe my hardware is playing up? Who knows, I will investigate this weekend at least. The good news is the net line has not gone down in a very long while, so it remains solidly reliable.

Rik

We have seen a handful of people with ping problems, but they seem to have been mainly resolved now. Good trip?

esh

I've quietly reset all the routers and the gateway remotely now it's 8pm so I'll see how that works out.

As for the trip, it was extraordinarily busy as they always tend to be, but no major hassles (apart from CDG...). The flight route out is LHR->CDG->SCL->LSC with a 130km drive at the end of it (18-20 hrs total travel), followed by 2pm-8am shifts for 3 weeks. The route back was via EZE, which was quite nice as I know someone there and got to stay over a while. I'd post photos but I won't hijack the thread :)

Simon

You could always start a new thread.  :)
Simon.
--
This post reflects my own views, opinions and experience, not those of IDNet.


cavillas

I think all your pinging has caused a degradation in performance for everyone. ;D :whistle: :evil:
------
Alf :)

Technical Ben

If you had done this for O2 Home Access Broadband you would be a saviour to them. The service is so messed up and variable (ping times swinging from 20 to 300 between hours/minutes are not uncommon).
Great stats. I just dislike graphs that are "zoomed in" to the upper limit. At first I thought your ping was doubling, then I saw the ranges to the left.  :whistle:
I use to have a signature, then it all changed to chip and pin.

esh

Yes, I'm afraid graphs are a bit of art vs science... bit of personal preference in there. It's only a crude hundred or two lines of code I wrote to give me a general idea of things. Rebooting all the gateways seems to be improving things so far. I crawled through the announcements and found something a while back about having troubles with the configuration on one of the central gateways and telling people to reset their modems if there were issues. So maybe that was it. The line is just up for so long I hardly ever have to reconnect!

The amount of stats is getting so large I'll have to work out a better way of doing it though. It is megabytes of raw data (but that includes pinging all internal servers too). You may be interested to know that the Linux servers uniformly respond quicker than the Windows 2000 boxes, by a factor of 3-5. Still, Windows 7 is going up on them this December so we'll see how that goes.

esh

Yes, I just can't leave this thread alone.

But I have about 2 years of data now! In fact, it's so much data and with so many strange outliers affecting a normal plot I came up with a rather arbitrary measure of line quality, but it makes changes much more apparent as you can see here:



Median pings (ms): 30.8, 20.4
90% confidence limits (ms): 48.22, 43.38
95% confidence limits (ms): 59.15, 51.29
99% confidence limits (ms): 99.816, 90.516

The green, blue, and red dashed lines are the confidence levels on the plot. This is to say I can be "90% confident" on any given day that my ping to host 1 is less than 48.22 and to host 2 is less than 43.38.
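Read this way, the limits are empirical percentiles of the measured pings. A minimal sketch using the nearest-rank method follows; the post does not say exactly how the figures were computed, so both the method and the name `percentile` are assumptions, and the data is synthetic.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value such that at least
    pct percent of the data is at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Synthetic pings (hypothetical): 1..100 ms, one of each.
    pings = list(range(1, 101))
    for level in (50, 90, 95, 99):
        print(level, "% limit:", percentile(pings, level), "ms")
```

The 50% level gives the median, and the 90/95/99% levels correspond to the three dashed lines.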

You may remember me pondering exactly how much of the variation in my ping measurement is caused by the remote site and how much is actually inherent to the BT network. Extending the data to multiple sites gives a fairly clear answer that all of the variation is on the DSL end, not the remote site. The correlation is very high.
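One way to put a number on "the correlation is very high" is the Pearson coefficient between the two hosts' ping series. The sketch below is an illustration only; the series are synthetic, and the post does not say which measure was actually used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Synthetic daily pings (hypothetical) to two remote hosts.
    host1 = [30.5, 31.0, 45.2, 30.8, 60.1]
    host2 = [20.1, 20.9, 34.8, 20.5, 50.3]
    print(round(pearson(host1, host2), 3))
```

A value near 1 means both hosts slow down on the same days, which points at the shared part of the path (the DSL end) rather than at either remote site.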

The line quality measurement takes each remote site's ping divided by the mean ping for that site, in log base 10, and is the (negative) average of those values across the sites. This means that small deviations show up more clearly and also that 'zero' indicates an "average day" (positive is better and negative is worse).
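One reading of that definition, sketched in Python: take the base-10 logarithm of each site's ping relative to its long-term mean, average over the sites, and flip the sign. Whether the logarithm is applied before or after averaging is not stated, so this ordering is an assumption, as is the name `line_quality`.

```python
import math


def line_quality(pings, long_term_means):
    """Negative mean of log10(ping / long-term mean) over the remote sites.

    0 is an average day, positive is better, negative is worse.
    """
    logs = [math.log10(p / m) for p, m in zip(pings, long_term_means)]
    return -sum(logs) / len(logs)


if __name__ == "__main__":
    # Hypothetical long-term means for two hosts, in ms.
    means = [30.0, 20.0]
    print(line_quality([30.0, 20.0], means))    # average day
    print(line_quality([60.0, 40.0], means))    # pings doubled, so negative
    print(line_quality([15.0, 10.0], means))    # pings halved, so positive
```

On this reading a day with every ping doubled scores about -0.3 and a day with every ping halved about +0.3, so good and bad days are symmetric around zero.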

You can very clearly see the interleaving turn off around 2008-10-05, and the ugly mess of high pings mid-2009 (BT congestion; a router reset got me on a good line again, as mentioned in previous posts). I think we all remember the troubles around January 2009, and the plot very clearly shows the marked improvement afterwards.

You'll probably notice that most recently the line is more stable (fewer LQ spikes) but the line quality in general is slightly worse. I think this is actually down to the installation of new routers, which happened around December. Slightly higher latency is a little upsetting but it seems to be fairly minor. I could blame BT congestion of course but I don't think I'll get away with that this time :)

Niall

I declare this thread, nerd heaven  :comp:

;D
Art is not a handicraft, it is the transmission of feeling the artist has experienced.
Leo Tolstoy


esh

I.... is that a compliment?  :o

I'll stop now I promise  :red: