
Long timeouts for stale PPPoE connections

bobpullen
Community Gaffer
Posts: 16,869
Thanks: 4,950
Fixes: 315
Registered: ‎04-04-2007

Re: Long timeouts for stale PPPoE connections

Quote from: jimbof
I just (around 11.15am) cycled my router twice; first time as per a standard openWRT reboot (not taking down the connection).  The second time I put back the utility which sniffs the connection and sends you the faked ethernet frame with the sniffed session ID to get you to kill the PPPoE session at your end.  You can probably see the difference in my sig graph in terms of reconnection time...

OK, this is what I see.
It doesn't look like the session is torn down for some time after the disconnect/reconnect, either on our side or BT's. Neither set of RADIUS logs shows any activity until about 5 minutes after the reboot. You said you rebooted at 11.15am:
Ours:
Session Started     Session Ended         Session Duration
11:25 23/May/2013   N/A                   1:43:8 (on going)
11:21 23/May/2013   11:24 23/May/2013     0:3:42 <<-- WRT Reboot with fix
14:39 20/May/2013   11:20 23/May/2013 *   2 Days, 20:41:26 <<-- WRT Reboot

BT's:
Type     Logon Identity         Number   Latest                  Earliest
START    XXXXXXXX@PLUSDSL.NET   1        23/05/2013 11:25:02     23/05/2013 11:25:02
LOGON    XXXXXXXX@PLUSDSL.NET   1        23/05/2013 11:24:56     23/05/2013 11:21:10
WORKED   XXXXXXXX@PLUSDSL.NET   1        23/05/2013 11:20:58 *   21/05/2013 00:39:30

Quote from: knowdice
Power off at 11:33:30
Power on at 11:34:00
PPP session starts at 11:40:24
Second PPP session starts at 11:40:58 - must be a Vigor thing...
CHAP completes at 11:41:00

Which again mirrors what both sets of RADIUS logs show:
Ours:
Session Started     Session Ended         Session Duration
11:40 23/May/2013   N/A                   1:5:30 (on going)
07:00 23/May/2013   11:40 23/May/2013 *   4:40:14

BT's:
Type     Logon Identity         Number   Latest                  Earliest
START    XXXXXXXX@PLUSDSL.NET   1        23/05/2013 11:40:59     23/05/2013 11:40:59
WORKED   XXXXXXXX@PLUSDSL.NET   1        23/05/2013 11:40:19 *   23/05/2013 07:00:03

* Going to try and grab the exact time of the PPP disconnect from our RADIUS servers (down to the second). It will help ascertain where it originates: our side or BT's.
Edit: FWIW, the PPP keep alive on our edge routers/gateways is set to 120 seconds. I'm informed that our RADIUS servers only have timers for dial-up customers.

Bob Pullen
Plusnet Product Team
If I've been helpful then please give thanks ⤵

jimbof
Grafter
Posts: 348
Thanks: 2
Registered: ‎02-05-2013

Re: Long timeouts for stale PPPoE connections

Hi Bob,
For PPP connections there are, I believe, two important numbers: the LCP echo interval (I believe this is what you are referring to as the "keepalive"), which is the number of seconds between echoes to which the other side should reply, and the number of such echoes that may go unanswered before the connection is killed.  If the LCP echo interval were set to 120 seconds, the observed timeouts would suggest that 3 or 4 failures are required before you tear down the session at your end.
Under Linux, the pppd manual describes the two parameters as follows:
lcp-echo-failure n
If this option is given, pppd will presume the peer to be dead if n LCP echo-requests are sent without receiving a valid LCP echo-reply. If this happens, pppd will terminate the connection. Use of this option requires a non-zero value for the lcp-echo-interval parameter. This option can be used to enable pppd to terminate after the physical connection has been broken (e.g., the modem has hung up) in situations where no hardware modem control lines are available.
lcp-echo-interval n
If this option is given, pppd will send an LCP echo-request frame to the peer every n seconds. Normally the peer should respond to the echo-request by sending an echo-reply. This option can be used with the lcp-echo-failure option to detect that the peer is no longer connected.
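To make the interaction between the two options concrete, here is a minimal Python sketch of a keepalive of this kind. It is a simplified model based only on the manual text above, not pppd's code and not the behaviour of any ISP's equipment: it assumes a timer fires every `echo_interval` seconds, that the link is declared dead when `echo_failure` requests are already outstanding, and that a request is answered only if the link is still up when it is sent. The function name and parameters are made up for illustration.

```python
def detection_delay(link_dies_at, echo_interval, echo_failure):
    """Seconds between the link dying and the keepalive giving up.

    Assumed model (hypothetical, for illustration only):
      - a timer fires every `echo_interval` seconds, starting from t = 0;
      - on each tick, if `echo_failure` echo-requests are already
        unanswered, the link is declared dead;
      - otherwise another echo-request is sent, and it is answered only
        if the link is still up at that moment.
    """
    if echo_interval <= 0 or echo_failure <= 0:
        raise ValueError("both values must be positive")
    pending = 0  # echo-requests sent without a reply
    now = 0
    while True:
        now += echo_interval
        if pending >= echo_failure:
            return now - link_dies_at
        if now < link_dies_at:
            pending = 0   # link still up: the peer replies
        else:
            pending += 1  # link already down: request goes unanswered


# Link dies right after a successful echo versus just before the next one.
print(detection_delay(0, 120, 3))
print(detection_delay(119, 120, 3))
```

Under these assumptions the delay depends on where in the echo cycle the link dies, which is why the same settings can give noticeably different waiting times from one reboot to the next.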
In any case, even 120 seconds must be too long in the face of the advice in the BT SINs, let alone a multiple of 120 seconds.
bobpullen
Community Gaffer
Posts: 16,869
Thanks: 4,950
Fixes: 315
Registered: ‎04-04-2007

Re: Long timeouts for stale PPPoE connections

Thanks. We're continuing to investigate and I'll let you know the outcome. Might not be today though.

Bob Pullen
Plusnet Product Team

jimbof
Grafter
Posts: 348
Thanks: 2
Registered: ‎02-05-2013

Re: Long timeouts for stale PPPoE connections

I would bet on 3 retries being required before a fail, which would be consistent with what has been seen: sometimes it can take up to 8 minutes, but I have seen around 6.5 to 7 minutes.
If you had just had a successful echo/reply, you then have 2 minutes before the next one, plus 3 failures each of 120 seconds.  So the range of failure times would be from 6 to 8 minutes, depending on exactly when in the 120 second cycle your connection died.
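The arithmetic above can be written out as a short Python sketch. It assumes the simple model described in this post (up to one full interval of waiting for the next echo, then a fixed number of failed echoes one interval apart); the function name is hypothetical and the real edge router behaviour may differ.

```python
def detection_window(echo_interval_s, failures_needed):
    """Return (shortest, longest) time in seconds to detect a dead link.

    Assumed model: the link can die anywhere within an echo cycle, so the
    wait for the next echo is between 0 and one full interval, followed by
    `failures_needed` failed echoes spaced one interval apart.
    """
    shortest = failures_needed * echo_interval_s
    longest = (failures_needed + 1) * echo_interval_s
    return shortest, longest


shortest, longest = detection_window(120, 3)
print(f"{shortest / 60:.0f} to {longest / 60:.0f} minutes")  # prints "6 to 8 minutes"
```

With a 120 second interval and 3 failures this gives the 6 to 8 minute range, which brackets the 6.5 to 7 minutes observed.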
PAPED
Dabbler
Posts: 21
Thanks: 1
Registered: ‎09-12-2010

Re: Long timeouts for stale PPPoE connections

Not sure if this is helpful, as I had thought this was down to my set-up. The thread here refers to FTTC, which uses PPPoE; however, if you use a router with a bridged ADSL (not FTTC) modem, the router also uses PPPoE to set up the connection via the bridged modem.
I have had this set-up for a number of years with Plusnet and other ISPs, and I have the same issue with PPPoE timeouts. On previous ISPs, and on Plusnet up to around 12 to 18 months ago, it was not an issue: it used to connect straight away, and certainly within 1 to 2 attempts. Now it always takes multiple attempts and normally 5 to 10 minutes before the connection comes up (I often have time to make a coffee while waiting). Wasn't the FTTC testing/rollout started around the same time, and could changes made for it within Plusnet be causing this issue? (Cynical IT techie: there is always a reason for something breaking, and it's often a change somewhere with unintended effects. I thought the change was on my router, but.....?)
As above, I had always assumed it was my end (newer firmware etc.), but I found this post because I have had a few dropouts while working from home over the last couple of days. It's a bit of a pain waiting, so I was looking to see if there was any reconfiguration of the router I could do to fix it....
Anyway, I thought it was worth posting as the issue is at least similar in nature, i.e. the same error on PPPoE. Obviously my set-up differs from true FTTC in that the last mile from the exchange is an ADSL2 (copper) connection, so it may have a different root cause; if it does not, then any "fix" could also help a number of non-FTTC users....
In case it's helpful, my connection path is:
Linksys WRT54GL (running Tomato firmware) --> PPPoE --> Netgear DG632 (in bridge mode) --> ADSL2 --> BT/Plusnet infrastructure.
The PPP session/authentication, ISP connection is from my end controlled by the Linksys.
jimbof
Grafter
Posts: 348
Thanks: 2
Registered: ‎02-05-2013

Re: Long timeouts for stale PPPoE connections

Do you build your own Tomato distribution or use a prepackaged build?  If you build your own I can give you the source code for a utility which (at least on my FTTC PPPoE connection) kills the stale session pretty much instantly.  It might need mods to work on MIPS though.
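For readers wondering what such a frame looks like, below is a minimal Python sketch that builds a PPPoE Active Discovery Terminate (PADT) frame as laid out in RFC 2516. This is not the utility mentioned above, whose source is not shown in this thread; it only assembles the bytes and does not sniff or send anything. The MAC addresses and session ID are placeholders, and the function names are hypothetical.

```python
import struct

ETHERTYPE_PPPOE_DISCOVERY = 0x8863  # PPPoE discovery stage (RFC 2516)
PPPOE_VER_TYPE = 0x11               # version 1, type 1
PPPOE_CODE_PADT = 0xA7              # PPPoE Active Discovery Terminate


def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    raw = bytes(int(part, 16) for part in mac.split(":"))
    if len(raw) != 6:
        raise ValueError(f"not a MAC address: {mac!r}")
    return raw


def build_padt_frame(dst_mac, src_mac, session_id):
    """Build an Ethernet frame carrying a PADT for `session_id`."""
    # 0x0000 is used before a session exists and 0xffff is reserved.
    if not 0 < session_id < 0xFFFF:
        raise ValueError("session_id must be between 0x0001 and 0xfffe")
    ethernet_header = (
        mac_to_bytes(dst_mac)
        + mac_to_bytes(src_mac)
        + struct.pack("!H", ETHERTYPE_PPPOE_DISCOVERY)
    )
    # ver/type, code, session ID, payload length (0: no tags included)
    pppoe_header = struct.pack(
        "!BBHH", PPPOE_VER_TYPE, PPPOE_CODE_PADT, session_id, 0
    )
    return ethernet_header + pppoe_header


# Placeholder addresses and session ID, for illustration only.
frame = build_padt_frame("00:11:22:33:44:55", "66:77:88:99:aa:bb", 0x1234)
print(frame.hex())
```

In practice the session ID and peer MAC would have to be the ones from the stale session, which is presumably why the utility sniffs the connection first.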
PAPED
Dabbler
Posts: 21
Thanks: 1
Registered: ‎09-12-2010

Re: Long timeouts for stale PPPoE connections

@ jimbof - Thank you for the offer but I use pre-built "shibby" builds and would not know where to start building and compiling my own....
jimbof
Grafter
Posts: 348
Thanks: 2
Registered: ‎02-05-2013

Re: Long timeouts for stale PPPoE connections

Someone just sent me a private email message asking me for my code but didn't include their email address - so I have no way to get back in touch!
Anyway, bumping this to the top in the hope they might see it...
jimbof
Grafter
Posts: 348
Thanks: 2
Registered: ‎02-05-2013

Re: Long timeouts for stale PPPoE connections

PS plusnet - anyone looked at fixing this yet?
MisterW
Superuser
Posts: 14,574
Thanks: 5,408
Fixes: 385
Registered: ‎30-07-2007

Re: Long timeouts for stale PPPoE connections

Quote
Someone just sent me a private email message asking me for my code but didn't include their email address - so I have no way to get back in touch!
I suspect it was from this thread http://community.plus.net/forum/index.php/topic,116828.0.html

Superusers are not staff, but they do have a direct line of communication into the business in order to raise issues, concerns and feedback from the community.

Kelly
Hero
Posts: 5,497
Thanks: 380
Fixes: 9
Registered: ‎04-04-2007

Re: Long timeouts for stale PPPoE connections

We've got someone looking at it.  I'll see how they are doing.
Kelly Dorset
Ex-Broadband Service Manager
bobpullen
Community Gaffer
Posts: 16,869
Thanks: 4,950
Fixes: 315
Registered: ‎04-04-2007

Re: Long timeouts for stale PPPoE connections

Quote from: jimbof
PS plusnet - anyone looked at fixing this yet?

Right, we've some work penned in for the 22nd that we're hoping will make the situation a little less painful.
The 'ppp keepalive' setting is currently set to 120s on the edge routers/gateways. According to the manufacturer's documentation a failed link is detected after 3-4 failed ECHO responses. That explains the 8 minute thing some customers have observed.
We're going to try lowering the 'ppp keepalive' setting to 30s. This should allow a reconnection within two minutes, possibly less. Not as low as some might have hoped for, however it's worth bearing in mind that we don't have separate VLANs for FTTx traffic and we want to be careful not to disrupt the rest of the platform.
The 'ppp keepalive' setting is based on the edge router/gateway getting an ECHO reply from the customer's equipment. There's a risk that lowering the timer could cause unwanted disconnections every 90-120s if something doesn't go as planned.
We'll be asking the guys in the support centre to keep a very close eye on things after the changes have been made.
Keep your eyes peeled for the planned maintenance notice on the 21st ...

Bob Pullen
Plusnet Product Team

jimbof
Grafter
Posts: 348
Thanks: 2
Registered: ‎02-05-2013

Re: Long timeouts for stale PPPoE connections

Sounds like a reasonable compromise, I'm sure you'll see a reduction in related connection support issues.  For me that probably means I'll take the hack out of my router as I imagine after a reboot the connection will only have a further 20-30 seconds to timeout, quite acceptable.
Lorian
Grafter
Posts: 704
Thanks: 5
Registered: ‎31-07-2007

Re: Long timeouts for stale PPPoE connections

My router reboots in maybe 20-25 seconds, so quicker would be better, but at least 2 minutes is better than the 7 I so often sit and watch in annoyance.
krs360
Grafter
Posts: 94
Thanks: 2
Registered: ‎27-04-2013

Re: Long timeouts for stale PPPoE connections

I run the Thomson router in bridged mode and make the PPPoE connection via my RPi, which is the router/gateway for my local LAN.
If the connection isn't dropped through 'poff' (e.g. I remove the power/ethernet cable by accident, or the power goes off), the modem seems to take forever to drop the connection. I haven't timed it in minutes, but it can be highly frustrating.