Hmmmm, it's looking increasingly likely that over the last week or so some of my regular mail has been going AWOL again.

Now, my main A/C mail does NOT go anywhere remotely close to the ironport boxes as ALL spam etc. filtering is quite intentionally turned well and truly OFF. Therefore it's not a spam filtering issue but something more to do with the likes of pih-inmx19.plus.net and fhw-inmx08.plus.net et al or something downstream of them.
I'm not aware of any problems affecting mail delivery, Mike.
It could possibly be that all the servers were simply too busy to accept a message at a given time and, whilst the sender(s) should have retried later, they didn't.
I don't envisage that being the case looking at the graphing over the last few weeks. Bear in mind that the PTP graphs will look a bit odd as they show traffic to and from the new data centre which we only recently migrated to. We also had a few monitoring blips that are represented by the gaps in the graphing.
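To illustrate the "too busy, retry later" behaviour mentioned above, here is a minimal sketch of how a sending server is generally expected to treat SMTP reply codes. The function names and the delay values are purely illustrative assumptions, not the configuration of our platform or of any particular mail server.

```python
def classify_smtp_reply(code: int) -> str:
    """Map an SMTP reply code to what a sending server should do next."""
    if 200 <= code < 300:
        return "delivered"
    if 400 <= code < 500:
        return "retry"   # transient failure, e.g. the receiving server is busy
    if 500 <= code < 600:
        return "bounce"  # permanent failure, no point retrying
    return "unknown"


def retry_delays(first_delay_s: int = 1800, factor: int = 2,
                 give_up_after_s: int = 4 * 24 * 3600):
    """Yield the wait before each retry until the give-up time would be exceeded.

    All three defaults are hypothetical values chosen for illustration.
    """
    elapsed, delay = 0, first_delay_s
    while elapsed + delay <= give_up_after_s:
        yield delay
        elapsed += delay
        delay *= factor
```

A sender that gives up after the first transient (4xx) response instead of queueing the message would produce exactly the kind of silent loss described.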




Are there any 'known' issues that might be relevant here, such as whatever THIS was supposed to fix perhaps being a good reason for random mail going AWOL?
That work was to relieve congestion on one of the switches in our network. The uplink was being maxed out by traffic to the mail hosts that sit behind it. This was affecting internal DNS, which is connected to the same switch. When the mail platform was busy, we were dropping DNS UDP packets. To fix it we moved two of the mail hosts to another switch. Worth noting that when I talk about the 'mail hosts', I'm not referring to our email delivery servers.
If so then it very much looks like whatever was done on Monday either wasn't sufficient to totally fix whatever the underlying problem was or perhaps non-ironport routed mail may have been lost if the work didn't get done until very much later in the day.
The work was completed during the published maintenance window.
I can also see that spam is already ramping up quite nicely in readiness for the imminent festive season (bah humbug and all that). Are the mailcores already struggling to cope or something, given the extra load from the (presumed) recent influx of new customers?
See load graphs above.
The latest positive example I have is a pair of forum digest messages that get sent to two different F9/PN A/Cs at very slightly different times - usually just a few seconds or maybe a minute or 2 at most apart. 1 A/C uses ironport whilst the other doesn't. Both messages were received on 1 A/C at around 0015 hours yesterday (Tuesday 3rd November), delivered via ironport. Only 1 of the 2 messages was received at around the same time on the non-ironport A/C. Yeah, I know, it's usually the other way round with ironport trying to quietly eat my mail and all that but certainly not this time !!
If this happens again then provide me with a copy of the headers from the message you *did* receive and, as long as we catch it quickly enough, I can ask one of the Net Ops guys to scrape the mail logs for any sign of the message that was sent to the account it never arrived at.
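If it helps when gathering those headers, the sketch below pulls out the parts most likely to be useful for a log search: the Message-ID, the Date, and each Received hop with its timestamp. It uses only Python's standard `email` module; the function name `summarise_headers` is just an illustrative assumption, and this is not a description of how Net Ops actually search the logs.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def summarise_headers(raw_headers: str) -> dict:
    """Extract Message-ID, Date and the Received hops from raw message headers."""
    msg = message_from_string(raw_headers)
    hops = []
    for received in msg.get_all("Received", []):
        # Collapse folded header lines into a single line of text.
        text = " ".join(received.split())
        # By convention the timestamp follows the last ';' in a Received header.
        _, _, stamp = text.rpartition(";")
        try:
            when = parsedate_to_datetime(stamp.strip())
        except (TypeError, ValueError):
            when = None  # timestamp missing or not parseable
        hops.append((text, when))
    return {
        "message_id": msg.get("Message-ID"),
        "date": msg.get("Date"),
        "hops": hops,  # newest hop first, as the headers appear in the message
    }
```

Paste the full headers of the message that did arrive into a string and pass it to `summarise_headers`; the Message-ID and the hop timestamps narrow down the window of log entries worth looking at.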