Akamai and streaming data - traffic management problem(s) are NOT fixed :-(

Kelly
Hero
Posts: 5,497
Thanks: 380
Fixes: 9
Registered: ‎04-04-2007

Re: Akamai and streaming data - traffic management problem(s) are NOT fixed :-(

I totally agreed with that, Gus, but I got a load of our staff members to run 4oD last night and Wireshark it, and they came out OK!    Undecided
Your pcap is definitely demonstrating a problem, though.  Dave is bashing his head against it now!
Kelly Dorset
Ex-Broadband Service Manager
mikeb
Rising Star
Posts: 463
Thanks: 15
Registered: ‎10-06-2007

Re: Akamai and streaming data - traffic management problem(s) are NOT fixed :-(

For fairly obvious reasons, I do not have complete captures.  I only have a few random packet samples that I had kept for evidence and some partial unprocessed captures that I haven't really bothered to look at.  Here's a few examples:
4oD during peak time sort-of working RTMP format data:

Internet Protocol, Src Addr: 84.53.138.85 (84.53.138.85), Dst Addr: 192.168.1.111 (192.168.1.111)
Version: 4
Header length: 20 bytes
Differentiated Services Field: 0x20 (DSCP 0x08: Class Selector 1; ECN: 0x00)
0010 00.. = Differentiated Services Codepoint: Class Selector 1 (0x08)
.... ..0. = ECN-Capable Transport (ECT): 0
.... ...0 = ECN-CE: 0
Total Length: 1458
Identification: 0x2ea9
Flags: 0x04
.1.. = Don't fragment: Set
..0. = More fragments: Not set
Fragment offset: 0
Time to live: 122
Protocol: TCP (0x06)
Header checksum: 0x2bdb (correct)
Source: 84.53.138.85 (84.53.138.85)
Destination: 192.168.1.111 (192.168.1.111)

4oD during peak time not working RTMP format data (heavily rate limited as previously demonstrated):

Internet Protocol, Src Addr: 84.53.138.85 (84.53.138.85), Dst Addr: 192.168.1.111 (192.168.1.111)
Version: 4
Header length: 20 bytes
Differentiated Services Field: 0x20 (DSCP 0x08: Class Selector 1; ECN: 0x00)
0010 00.. = Differentiated Services Codepoint: Class Selector 1 (0x08)
.... ..0. = ECN-Capable Transport (ECT): 0
.... ...0 = ECN-CE: 0
Total Length: 1458
Identification: 0x74ad
Flags: 0x04
.1.. = Don't fragment: Set
..0. = More fragments: Not set
Fragment offset: 0
Time to live: 122
Protocol: TCP (0x06)
Header checksum: 0xe5d6 (correct)
Source: 84.53.138.85 (84.53.138.85)
Destination: 192.168.1.111 (192.168.1.111)

4oD during off peak incorrect classification RTMP format data:

Internet Protocol, Src Addr: 84.53.138.85 (84.53.138.85), Dst Addr: 192.168.1.111 (192.168.1.111)
Version: 4
Header length: 20 bytes
Differentiated Services Field: 0x20 (DSCP 0x08: Class Selector 1; ECN: 0x00)
0010 00.. = Differentiated Services Codepoint: Class Selector 1 (0x08)
.... ..0. = ECN-Capable Transport (ECT): 0
.... ...0 = ECN-CE: 0
Total Length: 1458
Identification: 0x1056
Flags: 0x04
.1.. = Don't fragment: Set
..0. = More fragments: Not set
Fragment offset: 0
Time to live: 122
Protocol: TCP (0x06)
Header checksum: 0x4a2e (correct)
Source: 84.53.138.85 (84.53.138.85)
Destination: 192.168.1.111 (192.168.1.111)

4oD off peak correct classification RTMP format data:

Internet Protocol, Src Addr: 84.53.138.85 (84.53.138.85), Dst Addr: 192.168.1.111 (192.168.1.111)
Version: 4
Header length: 20 bytes
Differentiated Services Field: 0x80 (DSCP 0x20: Class Selector 4; ECN: 0x00)
1000 00.. = Differentiated Services Codepoint: Class Selector 4 (0x20)
.... ..0. = ECN-Capable Transport (ECT): 0
.... ...0 = ECN-CE: 0
Total Length: 1458
Identification: 0x77f6
Flags: 0x04
.1.. = Don't fragment: Set
..0. = More fragments: Not set
Fragment offset: 0
Time to live: 122
Protocol: TCP (0x06)
Header checksum: 0xe22d (correct)
Source: 84.53.138.85 (84.53.138.85)
Destination: 192.168.1.111 (192.168.1.111)

4oD off peak incorrect classification HTTPS format data:

Internet Protocol, Src Addr: 84.53.138.85 (84.53.138.85), Dst Addr: 192.168.1.111 (192.168.1.111)
Version: 4
Header length: 20 bytes
Differentiated Services Field: 0x20 (DSCP 0x08: Class Selector 1; ECN: 0x00)
0010 00.. = Differentiated Services Codepoint: Class Selector 1 (0x08)
.... ..0. = ECN-Capable Transport (ECT): 0
.... ...0 = ECN-CE: 0
Total Length: 1458
Identification: 0x10f4
Flags: 0x04
.1.. = Don't fragment: Set
..0. = More fragments: Not set
Fragment offset: 0
Time to live: 122
Protocol: TCP (0x06)
Header checksum: 0x4990 (correct)
Source: 84.53.138.85 (84.53.138.85)
Destination: 192.168.1.111 (192.168.1.111)
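
For anyone comparing the dumps above: the only field that differs in a meaningful way is the Differentiated Services byte, 0x20 in the "incorrect" samples and 0x80 in the "correct" one. A minimal Python sketch of how that byte splits into DSCP (top six bits) and ECN (bottom two bits) is below. The function name is made up for illustration, and the class selector names are the standard RFC 2474 ones; which Plusnet queue each value maps to is not stated in the dumps and is not assumed here.

```python
# Standard class selector codepoints from RFC 2474 (DSCP value -> name).
CLASS_SELECTORS = {
    0: "CS0", 8: "CS1", 16: "CS2", 24: "CS3",
    32: "CS4", 40: "CS5", 48: "CS6", 56: "CS7",
}


def decode_ds_field(ds_byte):
    """Split the 8-bit DS field into (dscp, ecn, name).

    decode_ds_field is a hypothetical helper, not part of any tool
    mentioned in the thread.
    """
    dscp = ds_byte >> 2      # top six bits
    ecn = ds_byte & 0x03     # bottom two bits
    name = CLASS_SELECTORS.get(dscp, "DSCP %d" % dscp)
    return dscp, ecn, name


# The two DS field values seen in the captures above.
for value in (0x20, 0x80):
    dscp, ecn, name = decode_ds_field(value)
    print("0x%02x -> DSCP 0x%02x (%s), ECN %d" % (value, dscp, name, ecn))
```

This agrees with what ethereal printed: 0x20 decodes to DSCP 0x08 (Class Selector 1) and 0x80 to DSCP 0x20 (Class Selector 4).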

Please note that whilst the above examples show data from one specific Akamai IP, the problem is most definitely not just affecting this one specific IP.  Several other Akamai server IPs have been noted at various times along the way, as have IPs assigned to Limelight.  Data when supplied via Akamai is inconsistently classified to say the very least.  Data when supplied via Limelight was exclusively classified as 0x20 on the very few occasions that I saw it.  However, I have not yet personally seen BBC iPlayer classified as anything other than 0x80 but I have noted plenty of hiccups that are not usually present.  My usage is still accumulating loads of P2P so nothing has changed even though the heavy restriction on traffic between 1900 and 2200 didn't occur yesterday.
I must repeat once again that PN *REALLY* do need to do their own full investigation on this problem as it is quite clear that something is *VERY* broken in so far as 4oD is concerned and is probably also pretty broken for all streaming data in general.  It isn't really sensible or appropriate for customers various to try and do this for you - the system is fundamentally broken and there are way too many variables.  The fact that something is wrong and 1000s of customers aren't complaining about it is nothing new to me.  I generally have to bash my head against the customer support wall for days/weeks/months on almost every single occasion when I have no choice but to report some issue or another, even when the evidence is almost irrefutable, never mind when it's difficult and ridiculously time consuming to actually get anything remotely close to hard evidence.
I will see if I can obtain better and more complete data in due course *IF* I can find the time for more b*ggering about.

EDIT:
In absence of anything more recent and obtained under well controlled conditions, see if THIS helps at all.  I can't remember the specific details of the capture, it was simply a random vid that I started playing just prior to midnight to see if any obvious changes in performance or classification occurred when moving from peak to off-peak time.
PLEASE NOTE that the time stamps are 'interesting' (AKA often screwed up !) and I have no idea why ethereal doesn't always show the right info but it's always been like this whenever I've used it over the years.  The data was sampled remotely using ethereal running on a second system tee'd into the connection to my main PC via a hub.  I'm not going to be installing and running ethereal or indeed anything else on my main PC anytime soon Wink
I should also have similar capture for around 0100 hours somewhere and if I can find it then it will appear HERE within an hour or so.


B T Plusnet, a bit kinda like P T Barnum ...

... but quite often appears to feature more clowns Tongue
jelv
Seasoned Hero
Posts: 26,785
Thanks: 971
Fixes: 10
Registered: ‎10-04-2007

Re: Akamai and streaming data - traffic management problem(s) are NOT fixed :-(

Mike,
You have a choice:
1) Leave Plusnet to work from the information you have given. As they have not been able to reproduce all of what you are seeing, it will take them some time to get to the bottom of what is happening - you will have to be very patient.
2) Co-operate and provide the full Wireshark traces started from before the final click to start watching and stopped when the wrongly classified traffic appears. Plusnet will be able to see what is happening and fix it pretty quickly.
jelv (a.k.a Spoon Whittler)
   Why I have left Plusnet (warning: long post!)   
Broadband: Andrews & Arnold Home::1 (FTTC 80/20)
Line rental: Pulse 8 Home Line Rental (£14.40/month)
Mobile: iD mobile (£4/month)
Kelly
Hero
Posts: 5,497
Thanks: 380
Fixes: 9
Registered: ‎04-04-2007

Re: Akamai and streaming data - traffic management problem(s) are NOT fixed :-(

I think the captures that Gus and Roswellgrey have supplied us have given us enough info to go on now.  (providing they are seeing the same problem..)
Kelly Dorset
Ex-Broadband Service Manager
mikeb
Rising Star
Posts: 463
Thanks: 15
Registered: ‎10-06-2007

Re: Akamai and streaming data - traffic management problem(s) are NOT fixed :-(

Mr.Jelv ... it's not a question of cooperation it's a question of time, money, and practicality etc.   I have been more than patient and more than cooperative since May last year.  I am reasonably happy to waste vast amounts of time providing useful info as and when I can do (as I have many times over the years) but do not expect me to do the entire job of resolving PN problems.  TBH, I'm not entirely sure quite what looking at the captures is going to prove in any case other than the well known and well proven fact that the data is being classified incorrectly and is consequently being logged by PN as P2P.  
I am almost 100% certain that PN are quite capable of cloning my A/C settings into a test account and seeing exactly what happens for themselves ... bearing VERY much in mind that the whole process for streaming media is dynamic of course.  It is 100% irrelevant to suggest that other users are not experiencing this problem unless said other users have EXACTLY the same A/C type and EXACTLY the same PN configuration as I do for starters.  
I fully appreciate that this is far from an easy problem to understand let alone try to resolve ... but there is only so much I can realistically do to help because I do not know what PN are trying to do or how they are trying to do it.  All I know is that it doesn't always work as intended and PN's very own figures for usage clearly show GBs of P2P when there has been no P2P activity in the last 15 years !!!!!
PS: I have already added links at the bottom of my last post to some data that might possibly be useful until such time as I can look at doing something else.


B T Plusnet, a bit kinda like P T Barnum ...

... but quite often appears to feature more clowns Tongue
Kelly
Hero
Posts: 5,497
Thanks: 380
Fixes: 9
Registered: ‎04-04-2007

Re: Akamai and streaming data - traffic management problem(s) are NOT fixed :-(

Right,
We've found that on a small % of customers their streaming is being *reclassified* as a specific p2p signature.   Based on the relative differences between the amount that's dropped out of the official 4OD/ITV player stats and what's appeared in the p2p stats, it's probably less than 5% of the customers using them and only causes problems if the gateway the customer is on is busy at peak hours, hence why we aren't getting calls.   It explains why all the people we got to test it weren't seeing the changes in their wiresharks but you guys are.
It's likely related to some of the upgrade done in the traffic management project release the other week.   We are going to try to restart some of the switches that are showing the problem in hours today to see if it clears it up (i.e. a stale config on those devices).  If not, we may need to do some changes which we'll have to do overnight.
Quote from: mikeb
I am almost 100% certain that PN are quite capable of cloning my A/C settings into a test account and seeing exactly what happens for themselves ... bearing VERY much in mind that the whole process for streaming media is dynamic of course.  It is 100% irrelevant to suggest that other users are not experiencing this problem unless said other users have EXACTLY the same A/C type and EXACTLY the same PN configuration as I do for starters.  

To address this point, turning a connection on our side into *you* doesn't actually replicate your situation.  There is a whole load of BT network between you and us, plus your own network device and network that we can't replicate.  
Either way, I got guys trying to work this out from the point your post appeared.  I tend to trust your gut feel on this sort of thing, because you always have a really hard look at it.  It's appreciated.
What I should point out though is that it was the wireshark captures from Gus and Roswellgrey and a call to one of them so we could do a bit of real time debug which helped us pin down the problem.  
Kelly Dorset
Ex-Broadband Service Manager
Kelly
Hero
Posts: 5,497
Thanks: 380
Fixes: 9
Registered: ‎04-04-2007

Re: Akamai and streaming data - traffic management problem(s) are NOT fixed :-(

You know how we take the mickey out of the "have you restarted your computer" support request scenario?
The restarts do appear to be working, and we've started seeing 4OD traffic being prioritised properly.   Have a look at this graph to see it being prioritised properly.   We won't be able to cover the entire estate of switches tonight, so performance may be a little ropey again tonight, but we'll pick up the reloads from tomorrow morning and will hopefully finish the job then.

Kelly Dorset
Ex-Broadband Service Manager
mikeb
Rising Star
Posts: 463
Thanks: 15
Registered: ‎10-06-2007

Re: Akamai and streaming data - traffic management problem(s) are NOT fixed :-(

As someone who has been seeing the totally obvious problem of very heavily capped data rates between 1900 and 2200 every single night for around 2 weeks up until yesterday as well as noting intermittent poor performance at almost all except the quietest of times for many months despite various gateway changes ... I'll leave it to you to imagine just how difficult it is for me to believe what you say, no matter how true it might actually be in reality !!!  This problem first became apparent to me in May last year and only recurred bigtime around a couple of weeks ago but between these dates, performance of various streaming media has often been suspect and always significantly worse than iPlayer, but not consistently bad enough for long enough to warrant spending loads of time trying to find out why.  However, as a fully paid-up life-time member of the mythical "very small number of customers" club, I'm in no way surprised that practically everyone else is apparently enjoying a lower cost service free from these kinda weird and often persistent but frustratingly intermittent problems  Tongue
It's still mostly the same for me tonight BTW but as you say, there's some more work to do at your end tomorrow.  Always funny when the old "turn the power off and back on again" trick is so very effective  Grin but let's just hope that it's a permanent fix and not a temporary one eh  Wink
Mucho thanks to those users who helped by confirming that the problem does exist and provided evidence to prove it Cool
... wanders off waiting to see if it finally does get sorted completely with a few more reboots of equipment various in due course.


B T Plusnet, a bit kinda like P T Barnum ...

... but quite often appears to feature more clowns Tongue
bobpullen
Community Gaffer
Posts: 16,887
Thanks: 4,979
Fixes: 316
Registered: ‎04-04-2007

Re: Akamai and streaming data - traffic management problem(s) are NOT fixed :-(

Quote from: Kelly
It's likely related to some of the upgrade done in the traffic management project release the other week.   We are going to try to restart some of the switches that are showing the problem in hours today to see if it clears it up (i.e. a stale config on those devices).  If not, we may need to do some changes which we'll have to do overnight.

Quick update ...
Most of the switches have been reloaded, however there are still a few to go. We don't envisage the situation causing serious problems before 7pm though, and by that time they'll all have been restarted. In a nutshell, all should be well this evening, so do shout up if you find that your experience is not in keeping with this. In doing so, it would help if you could create and keep hold of short Wireshark captures taken during the affected periods.

Bob Pullen
Plusnet Product Team

dave
Plusnet Help Team
Posts: 12,257
Thanks: 306
Fixes: 4
Registered: ‎04-04-2007

Re: Akamai and streaming data - traffic management problem(s) are NOT fixed :-(

Just finished doing the last one so hopefully all will be back to normal tonight, please let us know if you still see any problems.
Dave Tomlinson
Enterprise Architect - Network & OSS
Plusnet Technology
Kelly
Hero
Posts: 5,497
Thanks: 380
Fixes: 9
Registered: ‎04-04-2007

Re: Akamai and streaming data - traffic management problem(s) are NOT fixed :-(

Well, we certainly had a lot more traffic identified as 4OD last night.  Work ok for everyone last night?  Any streaming issues at all?
Kelly Dorset
Ex-Broadband Service Manager
Roswellgrey
Grafter
Posts: 91
Registered: ‎05-08-2008

Re: Akamai and streaming data - traffic management problem(s) are NOT fixed :-(

Quote from: Kelly
Any streaming issues at all?

Whilst this wasn't a prioritisation issue, tonight I had a major loss of Netflix streaming quality - the worst seen so far.
It usually is quite happy with the maximum possible 3600 kbps HD stream, with the odd blip (i.e. short term quality reduction ) here and there.
Tonight, however, there was a sustained period from about 2010 (duration of at least 20 mins)  where it couldn't sustain a streaming rate of > 235kbps.
A quick run up of wireshark showed the prioritisation as 0x80 in the dsf, so it wasn't being wrongly prioritised.
Something, somewhere, just appeared to run out of capacity/bandwidth ... (as always, it could have been in many places ...)
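
For anyone who wants to check a whole capture rather than eyeball individual packets, here is a rough Python sketch that tallies the DS field per source IP. It uses only the standard library and rests on several assumptions: the capture is saved in the classic pcap format (not pcapng), it is a plain Ethernet capture without VLAN tags, and only IPv4 is of interest. The function name and the file name in the usage comment are placeholders.

```python
import struct
from collections import Counter


def ds_field_counts(pcap_bytes):
    """Count (source IP, DS field) pairs for IPv4 packets in a classic pcap."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"          # file written little-endian
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"          # file written big-endian
    else:
        raise ValueError("unsupported capture format (pcapng is not handled)")

    # Link type lives in the last 4 bytes of the 24-byte global header.
    linktype = struct.unpack(endian + "I", pcap_bytes[20:24])[0]
    if linktype != 1:
        raise ValueError("only Ethernet captures are handled")

    counts = Counter()
    offset = 24
    while offset + 16 <= len(pcap_bytes):
        # Record header: ts_sec, ts_usec, incl_len, orig_len (4 bytes each).
        incl_len = struct.unpack(endian + "I", pcap_bytes[offset + 8:offset + 12])[0]
        frame = pcap_bytes[offset + 16:offset + 16 + incl_len]
        offset += 16 + incl_len
        if len(frame) < 34:   # 14-byte Ethernet header + 20-byte IPv4 header
            continue
        ethertype = struct.unpack("!H", frame[12:14])[0]
        if ethertype != 0x0800 or frame[14] >> 4 != 4:
            continue          # not IPv4
        src = ".".join(str(b) for b in frame[26:30])
        counts[(src, frame[15])] += 1   # frame[15] is the DS field byte
    return counts


# Hypothetical usage; "capture.pcap" is a placeholder file name:
# with open("capture.pcap", "rb") as fh:
#     for (src, ds), n in sorted(ds_field_counts(fh.read()).items()):
#         print("%-15s DS 0x%02x  %d packets" % (src, ds, n))
```

A CDN address showing up with both 0x20 and 0x80 in the same session would be the inconsistent classification discussed earlier in the thread; a capture showing only 0x80 points at a cause other than prioritisation.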
Kelly
Hero
Posts: 5,497
Thanks: 380
Fixes: 9
Registered: ‎04-04-2007

Re: Akamai and streaming data - traffic management problem(s) are NOT fixed :-(

Interesting.  Do you know what gateway you were connected to?
Kelly Dorset
Ex-Broadband Service Manager
Roswellgrey
Grafter
Posts: 91
Registered: ‎05-08-2008

Re: Akamai and streaming data - traffic management problem(s) are NOT fixed :-(

pcl-ag06
( this is what I am on now, but as the router has not reconnected session-wise since a manual forced reconnect a few days ago, it must have been what I was on last night)
At the time it was playing up, I ran a speedtest and got my normal result, i.e.

FYI: as an ADSL/20CN luddite syncing at 8128/448 with a d/l noise margin of 13dB, this is my normal result.  Don't mock the non-upgraded!

So it wasn't a general capacity drop. It seemed very specific to streaming.
And yet the prioritisation was correct, so it easily could have been something upstream of Plusnet causing this ...
Out of interest ......
Theoretically, if every man and his dog suddenly started streaming at the same time, would:
(A) lower classes of traffic be allocated less capacity, and streaming still work to the detriment of lower classes of traffic, up to the worst-case point where it all starts to collapse due to not enough fundamental Plusnet capacity to support the streaming clients, or
(B) streaming get a specified percentage of available PN capacity (due to its traffic class, like a traditional QoS system), so that it's effectively sandboxed to protect the Plusnet network from scenario (A) above?
As such, one could see a prioritisation of 0x80 at the client end, and see good available bandwidth from the likes of speedtest, but in reality there is another level of rate reduction going on within the sandbox, due to overall excessive streaming usage by many other clients.
Hence, a client could observe a reduced streaming rate with a high prioritisation......
Just curious.

dave
Plusnet Help Team
Posts: 12,257
Thanks: 306
Fixes: 4
Registered: ‎04-04-2007

Re: Akamai and streaming data - traffic management problem(s) are NOT fixed :-(

Don't suppose you kept the wireshark? Would be useful to see which CDN was serving the traffic. Can't see anything obvious wrong but at that time we were seeing 2Gbps of traffic for the football so could be some congestion further upstream somewhere so would be useful to try and work that out.
We have a finite amount of bandwidth of which certain proportions are reserved for each queue. Titanium will get everything that it wants so if everyone games at the same time then every queue gets less bandwidth but the lowest priority traffic gets squeezed down first. With streaming it's the same except titanium will never get squeezed. Potentially you can get to a point where the demand for streaming is so high that there's no more lower priority traffic that can be squeezed down. At that point gold starts contending against itself but there's no real priority between the different types of traffic within gold, so email could be slowed down just as much as browsing and just as much as streaming.
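
To make the squeeze-down order concrete, here is a toy Python model of it. It is only a sketch under loud assumptions: the function name, the "lower" queue name, and all the demand figures are invented for illustration, and the reserved proportions per queue are ignored entirely, so this is plain strict priority rather than the real configuration. What it does show is the last point above: once a queue has to contend against itself, every traffic type inside it is scaled back by the same factor.

```python
def allocate(capacity, queues):
    """Share out capacity across queues in strict priority order.

    queues is a list of (name, {traffic_type: demand}) pairs, highest
    priority first. Returns {name: {traffic_type: allocated}}.
    Simplification: per-queue reservations are not modelled.
    """
    remaining = capacity
    result = {}
    for name, demands in queues:
        wanted = sum(demands.values())
        granted = min(wanted, remaining)
        # No priority between traffic types inside one queue, so each
        # type gets the same fraction of what it asked for.
        share = granted / wanted if wanted else 0.0
        result[name] = {kind: demand * share for kind, demand in demands.items()}
        remaining -= granted
    return result


# Invented figures: 100 units of capacity, heavy streaming demand in gold.
example = allocate(100, [
    ("titanium", {"gaming": 10}),
    ("gold", {"streaming": 80, "browsing": 20, "email": 20}),
    ("lower", {"p2p": 50}),   # "lower" is a placeholder queue name
])
for queue_name, allocation in example.items():
    print(queue_name, allocation)
```

With these made-up numbers titanium gets everything it wants, the lower queue is squeezed to nothing, and gold is left with 90 units against 120 of demand, so streaming, browsing and email are each cut to 75% of what they asked for.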
Fortunately what we can do in those situations is deliberately slow down some traffic that doesn't need to have full speed. For example, a progressive video stream like YouTube will download if it can at line speed rather than at the bit rate of the video. It doesn't make any real difference for a YouTube video to come down at 4Mbps rather than 8Mbps in a situation like that so doing that protects the experience of those doing a stream like iPlayer which streams at the bitrate of the video. Same with file downloads such as MS Updates. Again in a high demand situation would you rather have your MS Updates come down a bit slower to ensure that you can watch the football on Sky?
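
The arithmetic behind the YouTube example can be sketched in a few lines of Python. The 2Mbps video bitrate and 10 minute duration are assumptions picked for illustration, and the function name is made up; only the 4Mbps and 8Mbps download rates come from the post above.

```python
def progressive_download(video_bitrate_mbps, duration_s, download_rate_mbps):
    """Return (download_time_s, stalls) for a progressive download.

    stalls is True when the download rate is below the video bitrate,
    i.e. playback would eventually catch up with the download.
    """
    size_mbit = video_bitrate_mbps * duration_s
    download_time_s = size_mbit / download_rate_mbps
    stalls = download_rate_mbps < video_bitrate_mbps
    return download_time_s, stalls


# Assumed 2Mbps video lasting 600 seconds, fetched at 8Mbps and at 4Mbps.
for rate in (8, 4):
    seconds, stalls = progressive_download(2, 600, rate)
    print("%dMbps: downloaded in %.0fs, stalls=%s" % (rate, seconds, stalls))
```

Under those assumptions the file finishes in 150 seconds at 8Mbps and 300 seconds at 4Mbps, and in both cases well before the 600 seconds of playback are up, so the viewer sees no difference while the capacity freed goes to streams that need their full bitrate.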
Dave Tomlinson
Enterprise Architect - Network & OSS
Plusnet Technology