Smarter Network Balancing
23-05-2013 5:40 PM
We’ve successfully automated some of the session management that keeps our network in balance. That means the service we provide to you is now much more consistent, and better able to cope with both scheduled maintenance and major service outages (MSOs), than it was just a couple of weeks ago.
Up until now, keeping the broadband network in balance has been a manual process, with our Network Operations team tasked with keeping an eye on the number of sessions (each session being a single connected customer) on each of our broadband network ‘endpoints’. Simply put, the total amount of bandwidth we need to provide Internet connections to all our customers is split across 108 endpoints on our network; at the time of writing, 92 of these are on WBC (Wholesale Broadband Connect) and 16 on IPSC (IP Stream Connect). These are the wholesale products we buy from BT Wholesale.
We allocate a percentage of our overall bandwidth to each endpoint and then distribute our customers evenly across them, so that no single endpoint becomes over- or under-subscribed. Customers connected to an over-subscribed endpoint would be more likely to see slow-downs on certain traffic-managed protocols at peak periods, as demand forces the total volume of data transfer up to a threshold limit we set. Bandwidth assigned to an under-subscribed endpoint, on the other hand, is effectively going to waste.
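To make the idea of over- and under-subscription concrete, here is a minimal Python sketch. It is not our actual tooling: the endpoint names, bandwidth shares, session counts and the 5% tolerance are all made up for illustration. It works out how many sessions each endpoint "should" hold given its share of bandwidth, then flags endpoints that sit outside the tolerance.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str               # hypothetical endpoint label
    bandwidth_share: float  # fraction of total bandwidth allocated (assumed)
    sessions: int           # currently connected customers


def subscription_report(endpoints, tolerance=0.05):
    """Return {name: (target_sessions, status)} for each endpoint.

    An endpoint's target is its bandwidth share of all current sessions.
    `tolerance` is an assumed dead band around that target.
    """
    total_sessions = sum(e.sessions for e in endpoints)
    total_share = sum(e.bandwidth_share for e in endpoints)
    report = {}
    for e in endpoints:
        target = total_sessions * e.bandwidth_share / total_share
        if e.sessions > target * (1 + tolerance):
            status = "over-subscribed"
        elif e.sessions < target * (1 - tolerance):
            status = "under-subscribed"
        else:
            status = "balanced"
        report[e.name] = (round(target), status)
    return report


endpoints = [
    Endpoint("ep-a", 0.25, 12_000),
    Endpoint("ep-b", 0.25, 8_500),
    Endpoint("ep-c", 0.50, 19_500),
]
report = subscription_report(endpoints)
for name, (target, status) in report.items():
    print(f"{name}: target {target} sessions, {status}")
```

With these made-up figures, ep-a comes out over-subscribed, ep-b under-subscribed and ep-c within tolerance.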
Keeping the networks in balance isn’t so easy for a number of reasons. WBC is managed ‘in-house’ here at Plusnet, but IPSC is managed by BT (we’re in the process of moving this over so we can automate that too), so for now I’ll just talk about WBC.
Imagine you turn your router off and on again. When you reconnect, which endpoint are you likely to connect to? You’d have an equal chance of connecting to any one (either WBC or IPSC depending on availability in your area) unless we ‘steer’ the session to a particular endpoint. We’d want to do that if it was under-subscribed, for example. Almost by definition though, customers who turn their router on to do some ‘Internetting’ are more likely to turn it off again afterwards. This makes rebalancing by hand quite difficult.
The two graphs below show how manual ‘rebalancing’ can be quite ‘bumpy’. Each coloured line represents an endpoint and ideally each of these would hold a similar number of customers; so the tighter together the lines, the better. What tends to happen though is that after the balancing is done, the network drifts out of balance once more as the more active customers disconnect and reconnect to different endpoints.
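A rough way to picture ‘steering’ is as a weighted lottery over endpoints. The Python sketch below is an illustration only, with invented endpoint names and weights; it does not describe how sessions are actually assigned on our network. With equal weights every endpoint is equally likely; raising one endpoint's weight pulls more reconnecting sessions towards it.

```python
import random


def pick_endpoint(weights, rng=random):
    """Pick an endpoint name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Unsteered: every endpoint is equally likely.
equal = {"ep-a": 1.0, "ep-b": 1.0, "ep-c": 1.0}
# Steered: ep-c is under-subscribed, so it gets a larger weight (made-up value).
steered = {"ep-a": 1.0, "ep-b": 1.0, "ep-c": 4.0}

rng = random.Random(42)  # fixed seed so the demo is repeatable
counts = {name: 0 for name in steered}
for _ in range(6_000):
    counts[pick_endpoint(steered, rng)] += 1
print(counts)  # ep-c should receive roughly two thirds of the reconnects
```

Note that this only influences where new sessions land. It does nothing about which customers those sessions belong to, which is why steered endpoints can end up holding a lot of customers who disconnect again soon afterwards.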
This graph shows quite well how lots of attention to manual balancing can keep on top of things but how quickly it can slip out of balance, as it does from Tuesday midnight:
Manual rebalancing after maintenance is even trickier, as the graph below shows. The endpoints labelled 1 and 2 drop all of their sessions in the early hours of the morning (the vertical lines dropping to zero). You can see an immediate corresponding lift on the other endpoints as everyone’s routers reconnect. Bringing the dropped endpoints back online requires new connections to be ‘steered’ to them. The problem with that is you’ll likely end up with a disproportionate number of customers on them who tend to disconnect and reconnect, meaning they’re going to go out of balance quite quickly afterwards. You can actually see that pattern in the graph from Tuesday afternoon to midnight (circled) as some sessions drop away again.
What our Network Operations guys would try to do was ‘lift’ the sessions on the dropped endpoints above the rest, so that they would drop back into balance as customers disconnected. No wonder we had such bumpy graphs!
So, how are things now we’ve automated the balancing? Well, the graphs speak for themselves. All the bumpiness has been removed and balance is consistently tight. So far though we’ve only automated WBC. As I mentioned earlier, IPSC is managed by BT but we’re looking at bringing that in-house and automating that too.
The graph below is great. Circled is an endpoint of pcl-ag08 which had suffered a partial ‘drop’ the previous evening (about 3,000 sessions had been disconnected). You can see though how smoothly it’s been brought back into balance with the rest.
What next? I’ve mentioned IPSC; after this is brought under our direct management we can automate that too. Further down the line we want to balance our endpoints not on the number of sessions but on the amount of data being transferred. Right now, even with perfect balance, it’s entirely possible that a particular endpoint could have a higher percentage of customers transferring a lot of data and placing a lot of demand on it. Our traffic management will protect customer experience as much as possible in such circumstances, but on the very heaviest of days it would be much better to spread that localised demand across the whole network.
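The difference between balancing on sessions and balancing on data can be shown with a few lines of Python. This is a sketch with invented numbers, not real measurements: session counts are perfectly even, yet one endpoint carries noticeably more traffic than the others.

```python
def imbalance(values):
    """Return each endpoint's deviation from the mean, as a fraction of the mean."""
    mean = sum(values.values()) / len(values)
    return {name: (v - mean) / mean for name, v in values.items()}


# Made-up figures: session counts are perfectly even...
sessions = {"ep-a": 10_000, "ep-b": 10_000, "ep-c": 10_000}
# ...but ep-b happens to hold more heavy users (throughput in Mbit/s).
throughput = {"ep-a": 4_000, "ep-b": 7_000, "ep-c": 4_000}

by_sessions = imbalance(sessions)
by_traffic = imbalance(throughput)
for name in sessions:
    print(f"{name}: sessions {by_sessions[name]:+.0%}, traffic {by_traffic[name]:+.0%}")
```

Judged by sessions, every endpoint here is exactly on the mean. Judged by traffic, ep-b is 40% above it, which is the kind of localised demand that balancing on data transfer would aim to spread out.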
I hope you’ve found this interesting. We’ve got a chap called Richard to thank in our Network Operations team for creating the automation script. It worked first time, which I think is highly commendable. But as ever there’s no time to rest on our laurels and the work on IPSC is already in progress.
Re: Smarter Network Balancing
23-05-2013 7:26 PM
Well done to the team that designed and developed this system and brought it into operation.
David
Re: Smarter Network Balancing
23-05-2013 7:37 PM
Re: Smarter Network Balancing
23-05-2013 7:52 PM
It's similar to what the networks guys used to do manually, just a lot easier and more efficient.
Re: Smarter Network Balancing
23-05-2013 7:53 PM
Re: Smarter Network Balancing
23-05-2013 8:17 PM
Quote from: Chris I'm not sure why you'd want to move off an under-subscribed end-point though?
For example, a few weeks ago there was some technical problem which appeared to happen only on "pcl" gateways and not on "ptw" gateways, and therefore the advice was to hop gateways until you hit a "ptw".
So what happens now if there is one under-subscribed "pcl" gateway? Rather than hopping randomly until getting to a desired gateway, are we going to get stuck on the most under-subscribed gateway until the network comes into balance, at which point the randomness will return due to the natural churn of other sessions connecting and disconnecting?
Does the new script have any means of landing new sessions on a different gateway, rather than re-landing on the gateway of the previous session?
Another example: in the thread "New traffic management hardware", you asked us to hop to "PCL-AG01" to see if we could spot any problems with the new configuration. If "PCL-AG01" was not the most under-subscribed gateway, would we be able to hop to it as requested?
Re: Smarter Network Balancing
23-05-2013 8:18 PM
Ex-Broadband Service Manager
Re: Smarter Network Balancing
23-05-2013 9:47 PM
We used to manually work out which endpoints were under-utilised and steer people towards them. We have automated that process and also upped the frequency of checking.
So even before, when people gateway hopped, you were still being steered towards certain gateways based on what had been manually decided.
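As a rough illustration of what "working out which endpoints are under-utilised" could look like once automated, here is a small Python sketch. It is an assumption-laden example, not the actual script: the function name, the `floor` parameter and the snapshot figures are all invented. It turns a snapshot of session counts into steering weights, and would be re-run on whatever schedule the checking uses.

```python
def steering_weights(sessions, floor=0.05):
    """Turn per-endpoint session counts into steering weights.

    Endpoints below the mean get weight proportional to their shortfall;
    endpoints at or above the mean keep a small `floor` weight so they are
    not starved of new sessions entirely. Weights are normalised to sum to 1.
    """
    mean = sum(sessions.values()) / len(sessions)
    if mean == 0:
        # Nothing connected anywhere: treat all endpoints equally.
        return {name: 1 / len(sessions) for name in sessions}
    raw = {name: max(mean - count, 0.0) / mean + floor
           for name, count in sessions.items()}
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}


# Made-up snapshot: ep-c has just come back from maintenance and is empty.
snapshot = {"ep-a": 15_000, "ep-b": 15_000, "ep-c": 0}
weights = steering_weights(snapshot)
for name, w in weights.items():
    print(f"{name}: {w:.2f}")
```

Because the weights are recalculated from each fresh snapshot, the pull towards ep-c shrinks as it fills up, rather than staying fixed until someone adjusts it by hand.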
Re: Smarter Network Balancing
23-05-2013 11:44 PM
Ex-Broadband Service Manager
Re: Smarter Network Balancing
24-05-2013 11:58 AM
What will you peeps do with yourselves at the NOC now?!
Re: Smarter Network Balancing
24-05-2013 12:12 PM
To argue with someone who has renounced the use of reason is like administering medicine to the dead - Thomas Paine
Re: Smarter Network Balancing
24-05-2013 12:44 PM
Quote from: jimbof Must admit to being slightly amazed this had been being done in such a manual fashion for so long - I mean, it has all the characteristics of a task which lends itself well to automation.
What will you peeps do with yourselves at the NOC now?!
It wasn't manually intensive. It was basically about adjusting a bunch of weightings. It's just far more efficient automated, and means we don't need someone looking at it overnight or at weekends.
Ex-Broadband Service Manager
Re: Smarter Network Balancing
24-05-2013 1:02 PM
Quote from: Kelly and means we don't need someone looking at it over night or at weekends.
That's when it will go wrong -- Murphy's law.