
faults.plus.net down - need odd disconnects and log messages interpreted

tstaddon
Rising Star
Posts: 175
Thanks: 27
Registered: ‎01-08-2007

Hi

For the last 2 weeks I've had intermittent issues with both a BT Hub and the PN hub dropping the connection, sometimes for a minute or two, other times for hours. PN status indicates no problem on its side when this happens.

The fault reporting site isn't up so I can't log it.

Looking at the Hub One logs, I'm sensing something similar to this:

https://community.bt.com/t5/Archive-Staging/Re-Daily-disconnections-due-to-quot-https-pbthdm-bt-mo-q...

And this:

https://community.plus.net/t5/My-Router/One-Hub-Watchdog-Resets-cause-0x3/td-p/1613381

However, the CWMP session failing to resolve the host is a bit weird...

11:36:14, 21 Nov.	OUT: BLOCK [9] Packet invalid in connection (UDP [0.0.0.0]:68->[255.255.255.255]:67 on ath00)
11:36:10, 21 Nov.	ath10: STA XX:XX:XX:XX:40:4f IEEE 802.11: Client associated
11:36:09, 21 Nov.	( 45.640000) CWMP: session closed due to error: Could not resolve host
11:36:05, 21 Nov.	( 41.840000) WiFi auto selected channel 44
11:36:05, 21 Nov.	( 41.840000) 36-93::40-93::44-92::48-92::52-92::56-92::60-92::64-92::100-93::104-93::108-94::112-94::116-94::120-95::124-96::128-95
11:36:05, 21 Nov.	( 41.840000) 5 GHz Wireless: Rescan, Reason: 'Power-up'
11:36:03, 21 Nov.	( 40.110000) CWMP: Server URL: https://pbthdm.bt.mo; Connecting as user: ACS username
11:36:03, 21 Nov.	( 40.100000) CWMP: Session start now. Event code(s): '1 BOOT,4 VALUE CHANGE'
11:36:03, 21 Nov.	( 39.650000) CWMP: Initializing transaction for event code 1 BOOT
11:36:01, 21 Nov.	( 38.340000) WiFi auto selected channel 6
11:36:01, 21 Nov.	( 38.340000) 1-82::2-91::3-90::4-93::5-100::6-102::7-99::8-95::9-102::10-96::11-95::12-96::13-95
11:36:01, 21 Nov.	( 38.340000) 2.4 GHz Wireless: Rescan, Reason: 'Power-up'
11:36:01, 21 Nov.	( 38.340000) Wire Lan Port 2 up
11:36:01, 21 Nov.	( 38.340000) Wire Lan Port 1 up
11:36:01, 21 Nov.	( 37.890000) WAN Auto-sensing running.
11:35:57, 21 Nov.	( 33.890000) System up, firmware version: 4.7.5.1.83.8.263
11:35:54, 21 Nov.	( 30.820000) WPA2 mode selected
11:35:54, 21 Nov.	( 30.820000) WPS enabled
11:35:51, 21 Nov.	( 28.350000) WPA2 mode selected
11:35:51, 21 Nov.	( 28.350000) WPS enabled
11:35:41, 21 Nov.	( 18.330000) System start
11:35:41, 21 Nov.	( 18.330000) Boot reason: watchdog reset (cause: 0x3)
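
For anyone wanting to pull the reboot entries out of a saved copy of the event log rather than scrolling through it, here is a rough Python sketch. The line layout in the regular expression is only inferred from the excerpt above (time, date, optional bracketed number, message); `parse_line` and `boot_reasons` are made-up helper names, not anything the hub provides, and other firmware versions may format lines differently.

```python
import re

# Assumed line shape, inferred from the excerpt above:
#   "HH:MM:SS, DD Mon.<whitespace>( number) message"
# The bracketed number is optional because some lines do not carry it.
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}), (?P<day>\d{1,2}) (?P<month>[A-Za-z]{3})\.\s+"
    r"(?:\(\s*(?P<uptime>\d+\.\d+)\)\s+)?(?P<message>.*)$"
)


def parse_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    entry = m.groupdict()
    if entry["uptime"] is not None:
        entry["uptime"] = float(entry["uptime"])
    return entry


def boot_reasons(lines):
    """Return (day, month, time, reason) for every 'Boot reason:' entry."""
    prefix = "Boot reason:"
    found = []
    for line in lines:
        entry = parse_line(line)
        if entry and entry["message"].startswith(prefix):
            reason = entry["message"][len(prefix):].strip()
            found.append((entry["day"], entry["month"], entry["time"], reason))
    return found


# Three lines copied from the excerpt above.
sample = [
    "11:36:09, 21 Nov.\t( 45.640000) CWMP: session closed due to error: Could not resolve host",
    "11:35:41, 21 Nov.\t( 18.330000) System start",
    "11:35:41, 21 Nov.\t( 18.330000) Boot reason: watchdog reset (cause: 0x3)",
]
print(boot_reasons(sample))
```

Run over a full saved log, this would give a list of every reboot and its stated reason, which is easier to compare against the times the connection dropped.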

 

We have six people in the house on an almost round-the-clock basis, so the connection does get hammered, but I've rigged things up so that nothing on the secure internal LAN touches the VDSL router directly. The traffic management on the TP-Link is a bit better. In effect, the PN router sees two tablets, a smartphone, a TV and a PS4 as client devices, and everything else sits on a totally different LAN behind the TP-Link, whose external IP address is assigned by the PN router. (See diagram.)

EDIT: The only other thing that seems to happen from time to time is that I get ICMP type 3 code 1 (destination unreachable, host unreachable) messages in the log, but these appear to be related to the router trying to contact the TP-Link router's management interface via its external address and (rightly) not getting any response.

9 REPLIES
Baldrick1
Seasoned Hero
Posts: 6,486
Thanks: 2,924
Fixes: 190
Registered: ‎30-06-2016

Re: faults.plus.net down - need odd disconnects and log messages interpreted

@tstaddon 

Are you reporting on https://faults.plus.net? I'm not sure that the old http address still works.

tstaddon
Rising Star
Posts: 175
Thanks: 27
Registered: ‎01-08-2007

Re: faults.plus.net down - need odd disconnects and log messages interpreted

Yes I am. The faults site now seems to be back up, so perhaps the question should be reworked!

tstaddon
Rising Star
Posts: 175
Thanks: 27
Registered: ‎01-08-2007

Re: faults.plus.net down - need odd disconnects and log messages interpreted

Not sure if it's relevant but every time the router disconnects the logs show a CWMP connection failure.

It is almost as if this is the same issue as...

https://community.bt.com/t5/Archive-Staging/Re-Daily-disconnections-due-to-quot-https-pbthdm-bt-mo-q...

See post 5 in that thread.

I did a bit of research and the two do seem to go hand in hand, although it isn't clear whether the disconnect causes the CWMP connection failure or the CWMP failure causes the disconnect. Some people are saying that with BT hubs, two CWMP disconnects one after the other precede a disconnection.

Given this observation's been made with multiple versions of managed DSL routers and multiple ISPs over at least five years if not more, someone somewhere in the tech team of any one of the larger ISPs must have come across it on many occasions. Presumably they don't just run a line test, say they can't see anything wrong on the line, and call BT out every time a customer's router is disconnecting for a couple of minutes once or twice a day?

If there's a repeatable interval between the disconnects, doesn't that rather suggest it's not a line fault but a problem with the kit somewhere in the chain? For example, on my logs it does appear as if one disconnect happened X half-hours after the last one, the next disconnect was X half-hours after that, and a third one was also X half-hours ahead of the second one. Any variation at the minute/second level could be explained by a randomization for a regular polling interval.
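
As a rough way of checking that hunch, the gaps between drops can be worked out and compared. This is only a sketch: the timestamps below are invented for illustration (they are not from my logs), the helper names are my own, and the 5-minute tolerance is an arbitrary allowance for polling jitter.

```python
from datetime import datetime, timedelta


def intervals(timestamps):
    """Gaps between consecutive events, after sorting them into time order."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def looks_periodic(timestamps, tolerance=timedelta(minutes=5)):
    """True if every gap is within `tolerance` of every other gap.

    Needs at least three events (two gaps) to say anything at all.
    """
    gaps = intervals(timestamps)
    if len(gaps) < 2:
        return False
    return max(gaps) - min(gaps) <= tolerance


# Invented example: three drops roughly seven half-hours apart.
drops = [
    datetime(2000, 1, 1, 8, 0, 10),
    datetime(2000, 1, 1, 11, 30, 40),
    datetime(2000, 1, 1, 15, 2, 5),
]
print(intervals(drops))
print(looks_periodic(drops))
```

Three or four evenly spaced drops is weak evidence on its own, so this is more useful once a week or so of disconnect times has been collected.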

MauriceC
Superuser
Posts: 4,070
Thanks: 2,490
Fixes: 17
Registered: ‎10-04-2007

Re: faults.plus.net down - need odd disconnects and log messages interpreted

Looks like a problem @bobpullen might be interested in?

Superusers are not staff, but they do have a direct line of communication into the business in order to raise issues, concerns and feedback from the community.

tstaddon
Rising Star
Posts: 175
Thanks: 27
Registered: ‎01-08-2007

Re: faults.plus.net down - need odd disconnects and log messages interpreted

Have to say I'm impressed: a quick turnaround on an Openreach visit (I think they were already in the neighbourhood!), where the NTE and filtered VDSL faceplate have been replaced and the port has been swapped in the cabinet. So far so good: a stable connection, and after a bit of training the speed has gone up by over 20%. I'll burn it in over the weekend.

bobpullen
Community Gaffer
Posts: 15,023
Thanks: 2,667
Fixes: 170
Registered: ‎04-04-2007

Re: faults.plus.net down - need odd disconnects and log messages interpreted


@tstaddon wrote:

Not sure if it's relevant but every time the router disconnects the logs show a CWMP connection failure.

It is almost as if this is the same issue as...

https://community.bt.com/t5/Archive-Staging/Re-Daily-disconnections-due-to-quot-https-pbthdm-bt-mo-q...

See post 5 in that thread.

I did a bit of research and the two do seem to go hand in hand, although it isn't clear whether the disconnect causes the CWMP connection failure or the CWMP failure causes the disconnect. Some people are saying that with BT hubs, two CWMP disconnects one after the other precede a disconnection.


The two do go hand in hand.

A CWMP connection request will typically be made once every 24 hours.

It will also be attempted on boot, or if your internet connection goes down and is then re-established.

On boot, it can be a bit eager and attempt communication before the Internet connection is established, thus the connection failures.
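
To separate those harmless boot-time failures from anything more interesting, a saved log could be filtered along these lines. This is a sketch, not a Plusnet tool: it assumes the bracketed number on each line is seconds since power-up (which fits the timestamps in the excerpt quoted in this thread), and both the function name and the 120-second window are arbitrary choices for illustration.

```python
import re

# Assumption: "( 45.640000)" is uptime in seconds since the hub powered up.
UPTIME_RE = re.compile(r"\(\s*(\d+\.\d+)\)\s+(.*)$")


def cwmp_errors_near_boot(lines, window_seconds=120.0):
    """Return (uptime, message) for CWMP error lines logged shortly after boot."""
    hits = []
    for line in lines:
        m = UPTIME_RE.search(line)
        if m is None:
            continue  # line carries no bracketed uptime value
        uptime = float(m.group(1))
        message = m.group(2)
        if message.startswith("CWMP:") and "error" in message and uptime <= window_seconds:
            hits.append((uptime, message))
    return hits


# Lines copied from the log in the original post.
sample = [
    "11:36:09, 21 Nov.\t( 45.640000) CWMP: session closed due to error: Could not resolve host",
    "11:36:03, 21 Nov.\t( 40.100000) CWMP: Session start now. Event code(s): '1 BOOT,4 VALUE CHANGE'",
    "11:35:41, 21 Nov.\t( 18.330000) Boot reason: watchdog reset (cause: 0x3)",
]
print(cwmp_errors_near_boot(sample))
```

Any CWMP error that does not show up in this list happened well after boot and would be worth a closer look.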

My concern stems from this in your original post:-

11:35:41, 21 Nov.	( 18.330000) Boot reason: watchdog reset (cause: 0x3)

That is indicative of the system itself restarting i.e. a potential hardware fault.

If you haven't already, I'd suggest factory resetting the router that message came from. If those restarts persist, then the hardware should be replaced.

Bob Pullen
Plusnet Product Team
If I've been helpful then please give thanks ⤵

tstaddon
Rising Star
Posts: 175
Thanks: 27
Registered: ‎01-08-2007

Re: faults.plus.net down - need odd disconnects and log messages interpreted

@bobpullen If that were the case, I would expect it to stop when I switched the router over to one I know is fully working.

Like I said, the pattern is the issue - we can almost predict to the nearest hour when it'll drop and it was happening with both routers. I have used both a BT Hub and the PN hub. Both do the same disconnect, at almost the same frequency.

I have a third router - a TP-Link Archer, currently set up in bridge mode. That is not ISP managed. If the issue resurfaces I'll test it again with that. If I'm right the problem will go away but not because the other two routers are faulty, it'll be because it isn't managed by the ISP.

BT have already replaced the master socket and filtered faceplate, moved the port in the street cabinet and tested the line thoroughly. If it still happens exactly the same way as before, with all THREE routers, then it can't be anything to do with the router or the house wiring.

bobpullen
Community Gaffer
Posts: 15,023
Thanks: 2,667
Fixes: 170
Registered: ‎04-04-2007

Re: faults.plus.net down - need odd disconnects and log messages interpreted

I can't speak for the other device - it wasn't provided by us.

I'm not saying there isn't something else at play, but I have a fairly sound knowledge of the kit we provide, and that hex error is indicative of a wireless PCI failure or an issue with the kernel not responding to interrupts.

I also know that nothing we're doing from a hardware management perspective has anything to do with your disconnections. The CWMP logs are effect, not cause.

Your drops seem largely down to a physical loss of connectivity/sync with the green cabinet (seen as grey vertical bars below):-

sync.JPG

Bob Pullen
Plusnet Product Team

tstaddon
Rising Star
Posts: 175
Thanks: 27
Registered: ‎01-08-2007

Re: faults.plus.net down - need odd disconnects and log messages interpreted

Thanks Bob.

The log is showing other issues despite a much more stable line: the PN router has one LAN port constantly toggling between connected and disconnected (with a known good device on it), and it is still rebooting with watchdog resets.

I'm tempted to swap the router, as I still have the BT HH.