Plusnet

Critical Path Anti-spam Maintenance

« Reply #16 on 30/10/2007, 14:18 »
What are the chances of re-zeroing the graphs for the CP appliances?

Because of the 10k queues earlier, the graphs' vertical resolution is now too coarse to be useful - the queues appear to have flatlined at 0, which is probably not the case ;)

In addition, it appears that mails passing through the CP boxes are still being run through the DSPAM process (on sunmxcore-09).  I'm guessing that when you're happy with the CP rollout, the sunmxcores will be relieved of this duty - allowing much more throughput?

I appreciate that the CP boxes are not 'tagging' the spam as yet.  However, perhaps an interim X-Header tag would be useful so that customers who receive mail pathed through the test system can identify potential spamminess/hamminess of the CP solution?

So far, I've received 14 mails via the CP appliances.  Some are showing X-MAA*Spam headers within the X-DSPAM-Factors header, some not. 

Anyhow, so far it's looking good.  I reckon Bob should be popping the kettle on in celebration ;)

B.
Barry Zubel : plusnet Community Site Forum Moderator
I'm a customer, not an employee
  • jelv
« Reply #17 on 30/10/2007, 14:25 »
Quote
I appreciate that the CP boxes are not 'tagging' the spam as yet.  However, perhaps an interim X-Header tag would be useful so that customers who receive mail pathed through the test system can identify potential spamminess/hamminess of the CP solution?

Can you not filter on the "X-MAA: Suspected Spam" which appears in the middle of the Received: lines?
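
For anyone wanting to try this outside a mail client, here is a minimal sketch of that check in Python. It only assumes the header looks like the "X-MAA: Suspected Spam" line quoted above; the sample message, host names and function name are made up for illustration and are not taken from a real Plusnet mail.

```python
from email import message_from_string


def is_suspected_spam(raw_message: str) -> bool:
    """Return True if any X-MAA header mentions 'Suspected Spam'."""
    msg = message_from_string(raw_message)
    # get_all() returns every occurrence of the header, wherever it sits
    # among the Received: lines; the second argument is the default.
    values = msg.get_all("X-MAA", [])
    return any("suspected spam" in value.lower() for value in values)


# Hypothetical message with the X-MAA line between two Received: lines.
sample = (
    "Received: from relay.example.net by mx.example.net; "
    "Tue, 30 Oct 2007 14:00:00 +0000\n"
    "X-MAA: Suspected Spam\n"
    "Received: from sender.example.org by relay.example.net; "
    "Tue, 30 Oct 2007 13:59:58 +0000\n"
    "Subject: Hello\n"
    "\n"
    "Body text\n"
)

print(is_suspected_spam(sample))
```

The same test can usually be expressed as a header rule in a mail client; the point is only that the header can be matched regardless of where it appears in the header block.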
jelv

Plusnet chatroom: /server usertools.plus.net   /join #usertools
Plusnet Unlimited is not without limits
« Reply #18 on 30/10/2007, 14:32 »
I R numpty.

Of course you're right jelv - I hadn't seen that little gem tucked away in the headers there :)

B.
Barry Zubel : plusnet Community Site Forum Moderator
I'm a customer, not an employee
« Reply #19 on 30/10/2007, 15:37 »
Quote
Below are the headers from a recently missed spam which came through the CP boxes. Is the X-MAA line added by CP? - if so it was right!

Yep, that's the beauty! The Exim config to get the subject line altered for these emails is likely to be done early next week.

Quote
What are the chances of re-zeroing the graphs for the CP appliances?

I don't think this is possible. It will fix itself over time though. Incidentally, nothing could be seen on the graph even before the blip! ;)

Quote
In addition, it appears that mails passing through the CP boxes are still being run through the DSPAM process (on sunmxcore-09).  I'm guessing that when you're happy with the CP rollout, the sunmxcores will be relieved of this duty - allowing much more throughput?

Remember, the Critical Path roll-out is a trial so it's quite likely that it will be removed at some point before a long term decision is made. It makes sense therefore to leave dspam running.

What we'll probably do before the trial is out is configure the Critical Path boxes to refuse the messages they identify as definite spam, instead of delivering them, for a 24 hour period or so. I'm sure that after a few days of the CP boxes tagging subject lines, customers will have no qualms about this.

Quote
I appreciate that the CP boxes are not 'tagging' the spam as yet.  However, perhaps an interim X-Header tag would be useful so that customers who receive mail pathed through the test system can identify potential spamminess/hamminess of the CP solution?

As per Jelv's suggestion you can use the X-MAA headers.

Quote
Anyhow, so far it's looking good.  I reckon Bob should be popping the kettle on in celebration ;)

You've obviously not heard about my reluctance to make tea for the team ;)
Bob Pullen
Plusnet Comms Team
Service Status :: RSS :: Email

twitter / plusnet
  • jelv
« Reply #20 on 30/10/2007, 16:05 »
Quote
You've obviously not heard about my reluctance to make tea for the team ;)

I rather suspect he had which is why he made the comment!
jelv

Plusnet chatroom: /server usertools.plus.net   /join #usertools
Plusnet Unlimited is not without limits
« Reply #21 on 30/10/2007, 16:07 »
Quote
You've obviously not heard about my reluctance to make tea for the team ;)

Quote
I rather suspect he had which is why he made the comment!

I have sources... ;)

B.
Barry Zubel : plusnet Community Site Forum Moderator
I'm a customer, not an employee
  • Jameseh
« Reply #22 on 30/10/2007, 16:08 »
Quote
I have saucers...

Typo duly amended.
twitter / Jameseh
« Reply #23 on 31/10/2007, 09:16 »
This morning we introduced the second Critical Path box to the network and put a second mail server behind it. That means we've now got sunmxcore09 and sunmxcore10 behind the anti-spam appliances.

All's still going well with the exception of a small blip you'll notice that occurred at about 2:30am. After looking into this we found that the reject log (on the mxcore) had grown to a size too big for the system to handle. We're looking at methods to better manage the log size and will deploy a fix across the platform once a safe method has been chosen.
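
As an illustration of one common way to cap log size, here is a small Python sketch using size-based rotation. It is not how the reject log on the mxcores is actually managed; the logger name, file name, size limit and backup count are all assumptions chosen to keep the demo small.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def make_reject_logger(path: str, max_bytes: int = 1024, backups: int = 3) -> logging.Logger:
    """Build a logger whose file is rotated before it exceeds max_bytes."""
    logger = logging.getLogger("reject_log_sketch")  # hypothetical name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    # Keeps the live file plus at most `backups` older files on disk.
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)
    return logger


log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "rejectlog")  # illustrative file name
logger = make_reject_logger(log_path)

# Write far more than the limit; old data is discarded as files rotate.
for i in range(200):
    logger.info("rejected message %d from 192.0.2.1", i)

sizes = {
    name: os.path.getsize(os.path.join(log_dir, name))
    for name in sorted(os.listdir(log_dir))
}
print(sizes)
```

With this approach total disk use is bounded by roughly `max_bytes * (backups + 1)`, at the cost of losing the oldest entries.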
Bob Pullen
Plusnet Comms Team
Service Status :: RSS :: Email

twitter / plusnet
« Reply #24 on 31/10/2007, 12:14 »
OT: 

I'm intrigued by the queue patterns on sunmxcore09/13 and sunmxcore16/17/18.

Now that the huge anomaly has dropped off the graph for sunmxcore09, you can clearly see that the queue goes through an hourly cycle.  The same happens, to a lesser extent, on sunmxcore13.  On the hour, every hour, there is a (significant) queue drop.  What happens at this time to cause the queue to flush faster?

The same behaviour can be seen on sunmxcore16/17/18 with one difference - it follows a 6-hourly cycle, with a queue drop occurring at 00.00, 06.00, 12.00 and 18.00.

I R Intrigued!

B.
Barry Zubel : plusnet Community Site Forum Moderator
I'm a customer, not an employee
  • jelv
« Reply #25 on 31/10/2007, 12:20 »
Let's see if you have better luck with that question than I did yesterday (last post on page 1).
jelv

Plusnet chatroom: /server usertools.plus.net   /join #usertools
Plusnet Unlimited is not without limits
« Reply #26 on 31/10/2007, 12:34 »
Barry, I'll try and get an answer for you. I suspect it's the result of some script that trawls the queues and performs some function or other; however, I shall get a definitive answer for you.

Edit: My assumption was correct, it's due to housekeeping scripts we have running on the servers to remove 'stuck' messages.
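
To show why a periodic sweep would produce the sawtooth seen on the graphs, here is a rough Python sketch. The details of the real housekeeping scripts are not given here, so everything in it is an assumption: "stuck" is taken to mean "older than a fixed age", and the `QueuedMessage` type, the one-hour threshold and the sample queue are purely illustrative.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class QueuedMessage:
    message_id: str
    queued_at: datetime


def sweep_stuck(queue, now, max_age=timedelta(hours=1)):
    """Split the queue into (kept, removed), removing messages older than max_age."""
    kept, removed = [], []
    for message in queue:
        if now - message.queued_at > max_age:
            removed.append(message)
        else:
            kept.append(message)
    return kept, removed


# Hypothetical queue at the moment the sweep runs.
now = datetime(2007, 10, 31, 12, 0)
queue = [
    QueuedMessage("a", now - timedelta(minutes=5)),
    QueuedMessage("b", now - timedelta(hours=3)),
    QueuedMessage("c", now - timedelta(minutes=59)),
]

kept, removed = sweep_stuck(queue, now)
print([m.message_id for m in kept])     # ['a', 'c']
print([m.message_id for m in removed])  # ['b']
```

If a sweep like this runs from a scheduler once an hour (or once every six hours), old messages accumulate between runs and are cleared in one go, which would show up as a sudden queue drop at each run.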

As an aside, I've just posted the following as we'll be making some config changes tomorrow morning based on our findings over the last 24 hours or so:

Quote
Maintenance Window:-
Thursday 1st November 6:00am-7:00am.

Service Affected:-
Incoming Email.

Duration of expected customer impact:-
No customer impact is expected.

Detailed description of work to be performed:-
During the initial phases of the recent Critical Path anti-spam trial, we have identified a number of minor problems with the configuration of our incoming email servers. We will be making some changes to this configuration in order to overcome these problems and further optimise the performance of the email platform. A brief summary of the changes follows:

* Spam/virus process changes - A problem was identified shortly after we introduced the first Critical Path box that resulted in a 15 minute period where the queues on the spam appliances increased dramatically. We found that this was caused by existing spam/virus processes reaching a state where they were unable to process any more files. We were able to separate these processes and fix the problem shortly after it occurred. We now need to make the same changes to the rest of our mail servers before they too are placed behind the Critical Path boxes.

* Disable RBL Checking - At the moment both the Critical Path servers and our mail servers are checking against RBLs:-
http://en.wikipedia.org/w...lackhole_List#Terminology
This is unnecessary and we need to disable the RBL checking on our servers for any email that has already been checked by the Critical Path servers.

* Logging changes - Our mail servers are logging unnecessary information. One of the servers in front of the Critical Path box reached a point last night where this caused a small rise in the queues due to the log becoming full. We will be disabling the logging of this additional information.
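
The RBL change above boils down to a simple decision per incoming connection. A minimal sketch in Python, assuming the mail server can recognise the Critical Path boxes by their connecting IP address; the addresses below are documentation-range placeholders and the function name is hypothetical, and in practice this logic would live in the mail server's own configuration rather than in a script.

```python
import ipaddress

# Placeholder addresses standing in for the Critical Path appliances.
CP_APPLIANCES = {
    ipaddress.ip_address("192.0.2.10"),
    ipaddress.ip_address("192.0.2.11"),
}


def needs_rbl_check(connecting_ip: str, trusted=CP_APPLIANCES) -> bool:
    """Return False when the mail arrived via an appliance that already did the RBL lookup."""
    return ipaddress.ip_address(connecting_ip) not in trusted


print(needs_rbl_check("192.0.2.10"))    # False: already checked upstream
print(needs_rbl_check("198.51.100.7"))  # True: direct connection, check it
```

Mail arriving directly at a server that is not yet behind the appliances still gets the lookup, so nothing goes unchecked during the rollout.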

Expected customer impact:-
No customer impact is expected.

Other Notes:-
Further details regarding the Critical Path trial can be found in the planned maintenance notice here:-
http://usertools.plus.net...us/archive/1193668404.htm

« Last Edit: 31/10/2007, 17:53 by Bob »

Bob Pullen
Plusnet Comms Team
Service Status :: RSS :: Email

twitter / plusnet
« Reply #27 on 01/11/2007, 09:17 »
The maintenance work this morning went fine and we're hoping this should put an end to the intermittent email delays we've been seeing.

There are another two delivery servers behind the Critical Path boxes now: sunmxcore11 and sunmxcore12.

If all goes well we'll probably put a fifth out on Friday, which leaves us doing two a day next week (we're only rolling out to the 11 primary mail servers initially).
Bob Pullen
Plusnet Comms Team
Service Status :: RSS :: Email

twitter / plusnet
  • jelv
« Reply #28 on 01/11/2007, 09:32 »
Quote
(We're only rolling out to the 11 primary mail servers initially).

Why? (especially as a lot of spam is targeted at mxlast)
jelv

Plusnet chatroom: /server usertools.plus.net   /join #usertools
Plusnet Unlimited is not without limits
« Reply #29 on 01/11/2007, 13:31 »
Forgive me for being a killjoy, but the queues seem to be increasing on some of the servers (9, 10, 13, 14, 15 and 17).

I admit that the affected servers do not match those behind the Critical Path boxes, but it might still be a cause for concern....

The increase seems to start at 6am, about the time of the maintenance.

« Last Edit: 01/11/2007, 13:33 by Tony W »

  • jelv
« Reply #30 on 01/11/2007, 14:02 »
The increasing queues show up well on the Mail Queues graphs. I see the mxlast servers are also affected.
jelv

Plusnet chatroom: /server usertools.plus.net   /join #usertools
Plusnet Unlimited is not without limits
« Reply #31 on 01/11/2007, 14:03 »
It does look like queues are rising across the entire platform.  My gut reaction would be that it's genuine load.  My assumption is that no maintenance was applied to mxcore09 and mxcore10, yet they're showing the same load increase, so it's possibly not caused by the maintenance itself.

However, I'm sure Bob has his eagle eye on it.

B.
Barry Zubel : plusnet Community Site Forum Moderator
I'm a customer, not an employee