Tracert's to ntp.plus.net
Re: Tracert's to ntp.plus.net
28-05-2013 1:25 PM
I'm now on pcl-ag04, with identical routing to pcl-ag01. If I'm interpreting the tracerts correctly the load balancer is at hop 6 except for 212.159.13.50 where it's at hop 7 because of the extra hop.
So in the example I quoted about xe-10-0-0.ptw-cr01.plus.net falling over, what you seem to be saying is that it has sufficient resiliency built in to make it less likely to fall over completely, but should it do so, somehow (beyond your knowledge) traffic would go to another physical unit, cr02, if I've understood that correctly.
Which is fine, but I suppose the underlying query, which none of this answers, is why many of us see random periodic failures in DNS lookups. It's not errors on my connection to the exchange, before someone leaps on that one!
Re: Tracert's to ntp.plus.net
28-05-2013 1:37 PM
Quote from: Bob I think BGP or something similar is used to route you to the closest cluster which is the basis for the question jelv asked
Thanks for the explanation Bob, it all sounds very good, and from this you would expect the Plusnet name servers to be the fastest around for PN users. Unfortunately my tests using DNS Benchmark frequently show PN's name servers to be the slowest for cached lookups.
Is there a reason for this, and are there any plans to improve?
Re: Tracert's to ntp.plus.net
28-05-2013 1:53 PM
Quote from: Bob The IPs you see in traceroutes are not those of the physical caching DNS servers themselves. They are virtual IPs. The physical caching servers themselves are clustered across two sites for resiliency.
Bob,
Thank you for the clear explanation which has not been so well given in previous discussions on this topic.
Quote from: Bob I'm going to bow out now before I get out of my depth, however if this thread is a question of whether or not we've sufficient network redundancy then I don't think people have anything to worry about
...and take the rest of us with you. I suggest that the questions have arisen out of numerous recent issues where DNS name resolution has repeatedly failed completely or in part or (as noted by NPR) not been particularly fast.
On that subject I'm sat at home (rather than on my business site) and am finding some issues with DNS resolution... but that might be related to the connectivity issues I'm waiting for BTOR to come and investigate.
Kevin
In another browser tab, log in to the Plusnet user portal BEFORE clicking the fault & ticket links
Superusers are not staff, but they do have a direct line of communication into the business in order to raise issues, concerns and feedback from the community.
If this post helped, please click the Thumbs Up and if it fixed your issue, please click the This fixed my problem green button below.
Re: Tracert's to ntp.plus.net
28-05-2013 2:13 PM
Quote from: townman
On that subject I'm sat at home (rather than on my business site) and am finding some issues with DNS resolution... but that might be related to the connectivity issues I'm waiting for BTOR to come and investigate.
Kevin
You could run your own caching DNS resolver; I gave up on all ISP DNS resolvers years ago.
I can recommend Unbound DNS: it's very easy to install and has been 100% reliable during the years I've run it.
Details in the usual place
Re: Tracert's to ntp.plus.net
28-05-2013 2:38 PM
Quote from: Anotherone So in the example I quoted about xe-10-0-0.ptw-cr01.plus.net falling over, what you seem to be saying is it has sufficient resiliency built in making it less likely to fall over completely, but should it do so, somehow (beyond your knowledge), it would go to another physical unit cr02, if I've understood that correctly.
Yes, you've understood correctly.
Quote from: Anotherone Which is fine, but I suppose the underlying query, which none of this answers is why many of us see random periodic fails in DNS lookups.
Do you?
I certainly wasn't aware of any widespread complaints and I've not personally noticed any problems myself. Whilst other servers might be quicker responding, ours should rarely fail unless there's a service wide problem of some description.
Quote from: npr
Quote from: Bob I think BGP or something similar is used to route you to the closest cluster which is the basis for the question jelv asked
Thanks for the explanation Bob, it all sounds very good, and from this you would expect the Plusnet name servers to be the fastest around for PN users. Unfortunately my tests using DNS Benchmark frequently show PN's name servers to be the slowest for cached lookups.
Is there a reason for this, and are there any plans to improve?
That's the slowest of the fastest percentile though I'm guessing? I very much doubt our servers are slower to respond than the majority of publicly accessible resolvers. Using a default namebench install I get the following which shows our resolvers to be up there with the quickest of them.
I've just downloaded and optimised DNS Benchmark too. Out of thousands of resolvers it claims to have selected the fastest 50 or so. Granted, if I specify our server address by comparison then it doesn't look good, but Google doesn't seem to fare much better either. Both are in the bottom five:
Location isn't everything and there are clearly other factors that need considering, such as the architecture of the physical boxes themselves and the software being used. Whilst the round trip time to our caches should definitely be quicker than to other DNS servers, that doesn't guarantee the quickest average DNS response time.
In summary, our caching DNS servers should be perfectly fine to use. There are reasons you may want to specify others though, e.g. for resiliency and if you roam (I think our servers are locked to our IP ranges). You might even want to replace your primary server with a non-Plusnet server. Personally though, I don't think the milliseconds it purportedly saves make much of a difference.
To answer your other question, I don't think there are any immediate plans to overhaul the DNS platform.
Bob Pullen
Plusnet Product Team
If I've been helpful then please give thanks ⤵
Re: Tracert's to ntp.plus.net
28-05-2013 2:50 PM
Quote from: Bob
Quote from: Anotherone Which is fine, but I suppose the underlying query, which none of this answers is why many of us see random periodic fails in DNS lookups.
Do you?
I certainly wasn't aware of any widespread complaints and I've not personally noticed any problems myself. Whilst other servers might be quicker responding, ours should rarely fail unless there's a service wide problem of some description.
I'm not referring specifically to anything that's just cropped up. There have been many threads over lengthy periods where we've experienced these things, and you have responded to some of them IIRC. Using an alternative DNS server always remedies the problem, so it's difficult to draw any other conclusion; however, I am (as always) prepared to be educated on these things.
Re: Tracert's to ntp.plus.net
28-05-2013 3:03 PM
I have noticed a few issues in the past with Plusnet's DNS servers. Past issues were usually that Plusnet's DNS servers appeared to handle certain misconfigured domains differently from other DNS servers, which managed to resolve the domain.
The most recent DNS issue was again an inability for Plusnet's DNS to resolve certain domains, but that time it was very unlikely for the ripe.net and isc.org domains to be misconfigured somehow. I noticed that Plusnet's DNS servers now don't respond to type ANY queries, e.g. "dig @212.159.6.9 plus.net ANY", but I can't remember if that was always the case or not.
Re: Tracert's to ntp.plus.net
28-05-2013 3:37 PM
Quote from: Bob That's the slowest of the fastest percentile though I'm guessing?
You could put it that way.
IMO PN's servers should easily be the fastest for cached lookups; they have the home advantage.
I can ping ntp.plus.net in under 16 ms, yet PN's cached name lookups take on average 30 to 40 ms, so there's a lot of time being lost somewhere.
Quote "In summary, our caching DNS servers should be perfectly fine to use."
I don't doubt that, I'm just questioning whether they should / could be faster.
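For anyone wanting to repeat the ping-versus-lookup comparison without a benchmark tool, here is a minimal Python sketch that times a plain UDP DNS query. It is only an illustration under stated assumptions: the function names (build_query, time_query_ms, median_query_ms) are made up for this example, it assumes the resolver answers over UDP port 53, and it does not parse the reply at all.

```python
import socket
import struct
import time


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query packet (qtype 1 = A record, class IN)."""
    # Header: ID, flags (only "recursion desired" set), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed with its length, terminated by a zero byte.
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def time_query_ms(server, name, timeout=2.0):
    """Send one query over UDP and return the elapsed time in milliseconds."""
    packet = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.perf_counter()
        sock.sendto(packet, (server, 53))
        sock.recvfrom(512)  # raises a timeout error if no reply arrives
        return (time.perf_counter() - start) * 1000.0


def median_query_ms(server, name, runs=10):
    """Median of several timings; the first lookup warms the cache and is discarded."""
    time_query_ms(server, name)
    samples = sorted(time_query_ms(server, name) for _ in range(runs))
    return samples[len(samples) // 2]
```

Calling median_query_ms("212.159.6.9", "plus.net") from a Plusnet line and comparing the figure with the ping time to the same address would give a rough idea of how much of the lookup time is network round trip and how much is spent at the resolver.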
Re: Tracert's to ntp.plus.net
28-05-2013 3:49 PM
>tracert 212.159.13.49
Tracing route to cdns01.plus.net [212.159.13.49]
over a maximum of 30 hops:
1 <1 ms <1 ms <1 ms dsldevice.lan [192.168.1.254]
2 39 ms 38 ms 37 ms lo0-central10.pcl-ag06.plus.net [195.166.128.187]
3 38 ms 37 ms 38 ms link5-central10.pcl-gw01.plus.net [84.93.249.168]
4 37 ms 38 ms 37 ms 176.core.access.plus.net [212.159.0.176]
5 37 ms 37 ms 36 ms po2.pcl-gw01.plus.net [195.166.129.41]
6 38 ms 38 ms 38 ms vl63.pcl-lb02.plus.net [212.159.2.253]
7 39 ms 37 ms 37 ms cdns01.plus.net [212.159.13.49]
Trace complete.
>tracert 212.159.13.50
Tracing route to cdns02.plus.net [212.159.13.50]
over a maximum of 30 hops:
1 <1 ms <1 ms <1 ms dsldevice.lan [192.168.1.254]
2 38 ms 37 ms 38 ms lo0-central10.pcl-ag06.plus.net [195.166.128.187]
3 37 ms 39 ms 37 ms link11-central10.pcl-gw01.plus.net [84.93.249.180]
4 37 ms 36 ms 37 ms 176.core.access.plus.net [212.159.0.176]
5 38 ms 37 ms 37 ms ae1.ptw-cr01.plus.net [195.166.129.0]
6 38 ms 38 ms 37 ms te9-4.ptn-gw01.plus.net [195.166.129.33]
7 37 ms 37 ms 37 ms vl55.ptn-lb02.plus.net [212.159.2.125]
8 37 ms 37 ms 37 ms cdns02.plus.net [212.159.13.50]
Trace complete.
Tracerts to 212.159.6.9 & 10: hops 4, 5 & 6 are identical to those for 212.159.13.49.
So what is it about the pcl gateways so far that gives 8 hops to 212.159.13.50, whereas on ptw-ag04 there were 7 hops and hops 4, 5 & 6 were identical to the other cdns, whereas the pcl ones are different?
Edit: ptn-ag02, same story as ptw-ag04 and all are 7 hops, also ptw-ag01 & ptn-ag03
pcl-ag05, same story as the other pcl's 8 hops to 212.159.13.50 also pcl-ag07 & pcl-ag04
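Comparing traces by eye gets tedious across several gateways, so here is a rough Python sketch that parses saved Windows tracert output and lists the hop numbers where two paths differ. The function names are invented for this example, and it assumes every hop line ends in "hostname [ip]" as in the traces above; hops that time out or show only a bare IP are skipped.

```python
import re

# Matches lines such as:
#   6 38 ms 38 ms 38 ms vl63.pcl-lb02.plus.net [212.159.2.253]
HOP_RE = re.compile(r"^\s*(\d+)\s+.*?\s(\S+)\s+\[(\d{1,3}(?:\.\d{1,3}){3})\]\s*$")


def parse_tracert(text):
    """Return a list of (hop_number, hostname, ip) tuples from tracert output."""
    hops = []
    for line in text.splitlines():
        match = HOP_RE.match(line)
        if match:
            hops.append((int(match.group(1)), match.group(2), match.group(3)))
    return hops


def differing_hops(trace_a, trace_b):
    """Hop numbers at which the two traces pass through different IPs.

    Only the hops both traces have are compared; any extra hops at the
    end of the longer trace show up as a difference in length instead.
    """
    return [a[0] for a, b in zip(trace_a, trace_b) if a[2] != b[2]]
```

Fed the two traces above, parse_tracert gives 7 and 8 hops respectively and differing_hops reports hops 3, 5, 6 and 7, which is where the route to 212.159.13.50 goes via ptw-cr01 and the ptn gateway.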
Re: Tracert's to ntp.plus.net
29-05-2013 10:17 AM
Re: Tracert's to ntp.plus.net
29-05-2013 11:00 AM
You may be on to something there. I've just repeatedly used "dig kernel.org @212.159.6.9" (Plusnet's DNS resolver).
The TTL starts at 600 but doesn't count down properly with repeat tests.
E.g. 3 tests gave a TTL of 600, then:
546
557
546
539
536
417
573
600
Also the query time suggests some were not coming from the cache.
Repeating the test using OpenDNS and Google DNS, the TTL counted down with each repeat test as expected.
IMO there's something strange about the cache on PN's DNS resolver; it's as though each lookup is to a different resolver.
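The reasoning behind that can be put as a small check, sketched here in Python. The function name is made up for this example, and it assumes the record's full TTL is 600 and that a single cache only ever counts down or restarts at the full value; a resolver that refreshes entries before they expire would blur the picture.

```python
def suspicious_ttl_rises(ttls, full_ttl):
    """Return indices where the TTL rose to a value below the record's full TTL.

    A single cache can only count down, or jump back to full_ttl after a
    fresh fetch. A rise to some other value suggests the answer came from
    a different cache.
    """
    return [
        i
        for i in range(1, len(ttls))
        if ttls[i - 1] < ttls[i] < full_ttl
    ]


# The TTLs reported above: three at 600, then the eight listed values.
observed = [600, 600, 600, 546, 557, 546, 539, 536, 417, 573, 600]
print(suspicious_ttl_rises(observed, full_ttl=600))  # prints [4, 9]
```

The two flagged positions are the jumps from 546 to 557 and from 417 to 573, neither of which one countdown could produce on its own.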
Re: Tracert's to ntp.plus.net
29-05-2013 11:21 AM
Quote from: npr … it's as though each look up is to a different resolver.
I wonder if it is - the effect of load balancing, and each resolver maintains its own cache?
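That idea is easy to model. The Python sketch below is a toy simulation, not a description of how Plusnet's platform actually works: the Resolver class, the two-backend setup and the pick order are all assumptions chosen for illustration. Each backend keeps its own cache, and the client only ever sees the remaining TTL from whichever backend answered.

```python
class Resolver:
    """One caching resolver with its own private cache (hypothetical model)."""

    def __init__(self, record_ttl=600):
        self.record_ttl = record_ttl
        self.expires_at = {}  # name -> time at which the cached answer expires

    def query(self, name, now):
        expiry = self.expires_at.get(name)
        if expiry is None or expiry <= now:
            # Cache miss: fetch from upstream and start a fresh countdown.
            expiry = now + self.record_ttl
            self.expires_at[name] = expiry
        return expiry - now  # remaining TTL reported to the client


def query_via_balancer(backends, picks, times, name):
    """Send one query per (pick, time) pair to the backend the balancer chose."""
    return [backends[p].query(name, t) for p, t in zip(picks, times)]


backends = [Resolver(), Resolver()]
ttls = query_via_balancer(
    backends, picks=[0, 1, 0, 1], times=[0, 50, 60, 70], name="kernel.org"
)
print(ttls)  # prints [600, 600, 540, 580]
```

Even with only two backends the client sees the TTL go up from 540 to 580, the same kind of pattern as in the dig results.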
Re: Tracert's to ntp.plus.net
29-05-2013 4:07 PM
Quote from: npr You may be on to something there. I've just repeatedly used "dig kernel.org @212.159.6.9" (Plusnet's DNS resolver).
If I correctly understood Bob's reply (reply #14) then the IP addresses we see are virtual IP addresses which front any number of similarly functioning servers. Consequently we have no means of knowing that the same server serviced all of the requests apparently sent to the same IP address. For the test illustrated to be meaningful, one would need to know the addresses of the raw servers, not that of their load balancer.
In summary I now believe that PN have a multitude (number not known) of DNS servers hosted behind 4 virtual IP addresses. We have no means of knowing which specific server serviced any particular enquiry. As such, the configuration might at times deliver varied results and performance, but sounds highly resilient to single points of failure.
Kevin
Re: Tracert's to ntp.plus.net
29-05-2013 5:01 PM
Quote from: ejs The most recent DNS issue was again an inability for Plusnet's DNS to resolve certain domains, but that time it was very unlikely for the ripe.net and isc.org domains to be misconfigured somehow.
Missed that thread entirely. There were a few out of hours callouts concerning DNS around that time and the load balancer work that Matt referenced towards the end of the thread. I'm not overly familiar with what happened though as I was on leave 17th-20th.
Quote I noticed that Plusnet's DNS servers now don't respond to type ANY queries, e.g. "dig @212.159.6.9 plus.net ANY", but I can't remember if that was always the case or not.
I'm not sure whether or not they ever have but I'll agree it's odd. I'll query the situation with our engineers. If I had to force a guess I'd say it was intended as a security measure. I'll let you know what I manage to find out.
Quote from: townman In summary I now believe that PN have a multitude (number not known) of DNS servers hosted behind 4 virtual IP addresses.
I believe there are 8 servers in two clusters across two sites and yes, I expect it's this and the load balancing that are behind the odd TTL behaviour.
I still need to look into why certain traces to the resolvers cross data centres. Again, I'll report back once I've anything to share ...
Bob Pullen
Plusnet Product Team
Re: Tracert's to ntp.plus.net
30-05-2013 1:09 PM
Quote from: townman If I correctly understood Bob's reply (reply #14) then the IP addresses we see are virtual IP addresses which front any number of similarly functioning servers.
I think this is confusing; see my reply #12 in the first instance and then reply #15.
Quote from: Anotherone If I'm interpreting the tracerts correctly the load balancer is at hop 6 except for 212.159.13.50 where it's at hop 7 because of the extra hop.
The real virtual IP addresses are the ones we don't see in the last hop, as indeed you quoted from Bob's reply #14 - nothing like mixed metaphors to confuse the picture more!
I do think ejs has hit on something and npr has picked up on a possible cause of the oddity.
But I also wonder if this has something to do with oddities I see when I ping ntp.plus.net which I'll cover in detail in another thread so as not to clog this with ping results.
Whilst this is something I've seen before, I'd not spotted the patterns that I've just seen by testing whilst writing this.
I'm currently on pcl-ag01, and as noted earlier in the thread, tracerts to 212.159.13.50 have an extra hop on pcl gateways.
First I did a ping -t to ntp.plus.net and because of DNS caching on my machine this went to 212.159.6.10, and every 3rd result in a group of 4 gave a longer ping time. Repeating the test a bit later, it was every 4th result in a group of 4. The TTL for all these was 250.
It should be noted that at off-peak times I can get results where all ping values are the same.
A ping -t to 212.159.13.50 gave a TTL of 249 for several tests.
It will certainly be interesting to discover why tracerts to 212.159.13.50 have an extra hop from pcl gateways.
Edit: ping thread is here
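For what it's worth, the two ping TTLs can be turned into a rough hop estimate. The Python sketch below assumes the replying device started from one of the usual initial TTL values (64, 128 or 255) and that each router on the return path decrements it by one; the function name is made up for this example.

```python
COMMON_INITIAL_TTLS = (64, 128, 255)


def decrements_from_reply_ttl(reply_ttl):
    """Guess the sender's initial TTL and how many times it was decremented."""
    # Pick the smallest common starting value that is not below what arrived.
    initial = min(t for t in COMMON_INITIAL_TTLS if t >= reply_ttl)
    return initial, initial - reply_ttl


for ttl in (250, 249):
    print(ttl, decrements_from_reply_ttl(ttl))
# prints:
# 250 (255, 5)
# 249 (255, 6)
```

On that assumption the reply from 212.159.13.50 crossed one more router than the reply from 212.159.6.10, which is at least consistent with the extra hop seen in the tracerts.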