<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Community Site News &#187; Incident Reports</title>
	<atom:link href="http://community.plus.net/blog/category/incident-reports/feed/" rel="self" type="application/rss+xml" />
	<link>http://community.plus.net</link>
	<description>News and Updates on the Community.</description>
	<lastBuildDate>Sat, 21 Nov 2009 00:06:34 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Datacentre Outage - NEW</title>
		<link>http://community.plus.net/blog/2008/02/21/datacentre-outage-new/</link>
		<comments>http://community.plus.net/blog/2008/02/21/datacentre-outage-new/#comments</comments>
		<pubDate>Thu, 21 Feb 2008 09:14:22 +0000</pubDate>
		<dc:creator>Chris Parr</dc:creator>
				<category><![CDATA[Incident Reports]]></category>
		<category><![CDATA[PlusNet News]]></category>

		<guid isPermaLink="false">http://community.plus.net/blog/2008/02/21/datacentre-outage-new/</guid>
		<description><![CDATA[At approximately 01:00am this morning our primary datacentre suffered a complete power outage.
This meant that most of our internal systems, portals, and email were unavailable.
Our network engineers worked to restore the primary services, and internal systems and portal access are now available.
We are continuing to experience problems with email services, and are working on restoring [...]]]></description>
			<content:encoded><![CDATA[<p>At approximately 01:00am this morning our primary datacentre suffered a complete power outage.</p>
<p>This meant that most of our internal systems, portals, and email were unavailable.</p>
<p>Our network engineers worked to restore the primary services, and internal systems and portal access are now available.</p>
<p>We are continuing to experience problems with email services, and are working on restoring mail access as our priority.</p>
<p>We will provide a further update as soon as possible, and would like to thank you for your patience whilst we resolve these issues.</p>
<p>Kind Regards,<br />
Mand Beckett<br />
Customer Support</p>
<p>Please note that we don&#8217;t usually post service status notices on to the Community Site. However, as the service status feed isn&#8217;t updating correctly to reflect the problems, we are posting this here. Any further updates will be made on our usual service status post, which can be found here:</p>
<p><a href="http://usertools.plus.net/status/archive/" rel="nofollow">http://usertools.plus.net/status/archive/</a></p>
<p>As the service status threads are working correctly this post has been removed from the front page of the Community Site.</p>
]]></content:encoded>
			<wfw:commentRss>http://community.plus.net/blog/2008/02/21/datacentre-outage-new/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Email Incident Report (Lost email - 22nd August)</title>
		<link>http://community.plus.net/blog/2007/08/30/email-incident-report/</link>
		<comments>http://community.plus.net/blog/2007/08/30/email-incident-report/#comments</comments>
		<pubDate>Thu, 30 Aug 2007 16:39:30 +0000</pubDate>
		<dc:creator>Bob Pullen</dc:creator>
				<category><![CDATA[Incident Reports]]></category>

		<guid isPermaLink="false">http://community.plus.net/blog/2007/08/30/email-incident-report/</guid>
		<description><![CDATA[The following report provides a detailed account of the incident that resulted in the loss of legitimate customer email during the day and evening of Wednesday 22nd August. We would again like to express our sincere apologies for this disruption to your email service. Everyone at PlusNet understands the importance of email, and we do [...]]]></description>
			<content:encoded><![CDATA[<p>The following report provides a detailed account of the incident that resulted in the loss of legitimate customer email during the day and evening of Wednesday 22<sup>nd</sup> August. We would again like to express our sincere apologies for this disruption to your email service. Everyone at PlusNet understands the importance of email, and we do recognise the inconvenience this problem has caused.</p>
<p>Throughout these events, we have provided regular updates via <a href="http://usertools.plus.net/status/archive/1188147129.htm" title="Service Status">Service Status</a> and have been <a href="http://community.plus.net/forum/index.php?topic=991.0">discussing the incident</a> in our Community Site discussion forums.<span id="more-9833"></span></p>
<h2>Summary</h2>
<p>After a successful internal trial, we started the work to introduce a new spam detection system to our customer e-mail platform. Following the first stage of this deployment, a previously unidentified issue occurred which required us to make a responsive change to our mail delivery platform. This was not planned as part of the original upgrade and in making the change an error was introduced to a configuration file on the mail platform. As a direct result, a large number of legitimate e-mails sent to our customers on 22nd August were incorrectly dropped from our mail platform.</p>
<p>The remainder of this report provides background and a detailed explanation as to what happened and why. It hopefully answers the remaining questions posed by our customers during the last week.</p>
<h2>Background</h2>
<p>We have, for some time now, been seeking an appropriate solution to deal with the problem of unsolicited spam email. During this time we have looked at various anti-spam measures ranging from refusing to accept email that is malformed or clearly generated by a <a href="http://en.wikipedia.org/wiki/Botnets" title="Botnet">Botnet</a>, to completely outsourcing our mail handling to a specialist spam partner who could perform higher quality spam cleansing than our existing systems are capable of.</p>
<p>One promising solution we tested was a spam detection appliance from <a href="http://www.criticalpath.net" title="Critical Path website">Critical Path</a>. This system has been in operation for over three months on ‘Gatekeeper&#8217;, our internal email platform. Gatekeeper has always been a target for spammers due to the sheer volume of PlusNet email addresses which are in the public domain, and addresses like &#8220;support@plus.net&#8221; which are in many people&#8217;s address books. The trial proved successful, with about 97% of all email Gatekeeper handles being correctly identified as spam, making this significantly more effective than any of our other anti-spam measures. On this basis, we decided to move ahead with implementing the same anti-spam system for all customer email.</p>
<p>We initially <a href="http://usergroup.plus.net/forum/index.php/topic,5002.0.html" title="Usergroup forum requesting volunteers for Critical Path">planned a small trial</a> for volunteer customers as the first stage of implementing the solution. The work to install the Critical Path appliance in front of our customer mail platform was carried out during a planned maintenance window on the morning of 22nd August. At this point the appliance was set in a pass through mode, meaning that it did not actively change or block any emails, instead recording what action it would have taken before passing the mails onto the existing mail delivery platform. This diagram showing how the platform was reconfigured provides a reference for the timeline below:</p>
<p><img src="http://www.binarybob.plus.com/cp_delivery_process_incident_report.jpg" alt="Critical Path Delivery process" align="middle" /></p>
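<p>The pass-through (monitor-only) mode described above can be illustrated with a minimal Python sketch. It is not the Critical Path appliance&#8217;s actual logic, which this report does not describe: the <code>classify</code> rule and the sample messages are hypothetical and exist only to show the idea of logging a verdict while forwarding everything unchanged.</p>

```python
def classify(message):
    """Hypothetical stand-in for the appliance's verdict on one message."""
    return "block" if "win a prize" in message.lower() else "allow"


def pass_through(messages):
    """Record the action that would have been taken, then forward every message."""
    audit_log = []
    forwarded = []
    for message in messages:
        audit_log.append((message, classify(message)))
        forwarded.append(message)  # nothing is blocked or altered in this mode
    return forwarded, audit_log


inbox = ["Meeting at 3pm", "WIN A PRIZE today"]
forwarded, audit_log = pass_through(inbox)
print(forwarded)
print(audit_log)
```

<p>In this sketch both messages are forwarded, and the audit log records that the second one would have been blocked had the filter been active.</p>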
<h2>Timeline &#8211; 22nd Aug 2007</h2>
<h4>06:30</h4>
<p>As <a href="http://usertools.plus.net/status/archive/1187607925.htm" title="Service Status Archive">announced beforehand</a>, work to deploy the new spam appliance in monitor-only mode onto our live network began early on Wednesday morning. Initially the device was activated in front of one mail server, where it was fully tested before being applied to all mail servers. At this time in the morning, with relatively low volumes of email, all tests proved successful and mail was passing through the spam appliance and into our mail delivery platform without difficulty. We knew at this stage, however, that the real test would come after 9AM when the platform began to get busier.</p>
<h4>10:00</h4>
<p>Mail queues began to form on the Spam Appliance from about 09:30. This had been anticipated to some degree, due to the normal burst in volume of mail at this time of the morning. During our internal trial and planning stage we had established mail throughput rates <em>should </em>be around four times that of the normal load on our mail platform. It was therefore expected that the queues formed during the busiest period would clear quickly once the morning mail rush was over.</p>
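<p>The reasoning behind that expectation can be sketched as simple queue arithmetic: a backlog only drains when the appliance forwards mail faster than it arrives. The figures below are invented for illustration and are not measurements from our platform.</p>

```python
def backlog_over_time(arrivals_per_min, capacity_per_min):
    """Return the queue length at the end of each minute."""
    backlog = 0
    history = []
    for arrivals in arrivals_per_min:
        # Whatever cannot be forwarded this minute stays queued.
        backlog = max(0, backlog + arrivals - capacity_per_min)
        history.append(backlog)
    return history


# Hypothetical load: a burst of 500 messages/min for three minutes, then 100/min.
arrivals = [500, 500, 500, 100, 100, 100]
healthy = backlog_over_time(arrivals, 400)  # capacity well above the quiet rate
starved = backlog_over_time(arrivals, 90)   # capacity below even the quiet rate
print(healthy)
print(starved)
```

<p>With ample capacity the backlog peaks at 300 and is gone a minute after the burst ends. With capacity below the quiet-period rate it never stops growing, which is the pattern a persistently rising queue points to.</p>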
<p>By 10AM, the amount of queuing mail on the spam device was still rising, and it was felt that further action was needed. The decision was either to roll back immediately or to investigate further and see if we could identify and resolve whatever problem was causing mail to queue. During this period we were working with the Critical Path engineers, and upon initial investigation we found that a large number of undeliverable bounce messages had built up on the Critical Path appliance.</p>
<p>The bounce messages in question were formed as a result of an existing &#8216;<a href="http://www.clamav.net/" title="Clam AV Website">Clam AV</a>&#8217; process that deals specifically with known phishing and image spam attacks. Normally, this mail is refused and the sending mail server is sent an error message to explain why (Bob Pullen explained this further in a recent <a href="http://usergroup.plus.net/forum/index.php/topic,5002.msg67082.html#msg67082" title="Usergroup forum thread">Usergroup forum post</a>). In this case, Critical Path had accepted the mails, but they were then being refused by our mail delivery platform and were queuing up on the appliance. This problem had not been identified during the load testing carried out, both in the vendor&#8217;s facility and during the local testing at PlusNet.</p>
<h4>10:15</h4>
<p>The engineer managing the Critical Path deployment held a workshop with colleagues and a plan of action was agreed. At this stage they believed strongly that a simple solution could be found which would provide a work-around to the queuing mail. Finding a solution was considered preferable to performing a full roll-back of the implementation and it was believed that this would have the least negative impact on email delivery and our customers. The idea was to change the configuration of the mail delivery platform so that it stopped rejecting the known phishing and image spam, instead accepting and silently dropping this mail. This would prevent further mails queuing on the spam appliance.</p>
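<p>The difference between the two behaviours can be shown with a toy Python model. It assumes a relay that has already accepted every message before handing it to a backend; the function name, policy labels and sample subjects are hypothetical and do not reflect our real mail server configuration.</p>

```python
def relay(messages, backend_policy):
    """Toy model of a relay handing already-accepted mail to a backend."""
    delivered = []
    stuck_on_relay = []
    for subject, known_spam in messages:
        if not known_spam:
            delivered.append(subject)
        elif backend_policy == "reject":
            # Backend refuses; the relay must hold the message and try to bounce it.
            stuck_on_relay.append(subject)
        elif backend_policy == "drop":
            # Backend accepts, then discards the message without telling anyone.
            pass
        else:
            raise ValueError("unknown policy: " + backend_policy)
    return delivered, stuck_on_relay


sample = [("invoice", False), ("fake bank login", True), ("holiday photos", False)]
rejected_run = relay(sample, "reject")
dropped_run = relay(sample, "drop")
print(rejected_run)
print(dropped_run)
```

<p>Under the &#8220;reject&#8221; policy the spam message is left sitting on the relay. Under &#8220;drop&#8221; nothing queues, but nobody is told the message has gone, so a mistake in the classification is invisible to sender and recipient alike.</p>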
<p>At this point, having tested the proposed new configuration within the PlusNet staging (test) environment and with a view to finding a quick solution, the engineering team decided not to seek formal change control in advance of making the change on the live platform. The testing had appeared to be successful and mail was passing through correctly on the test platform and being delivered without any issue. The configuration change was regarded as both urgent and low risk.</p>
<h4>11:00 &#8211; 12:00</h4>
<p>The configuration adjustment itself was simply a change within a part of the mail server configuration known as the ACL (Access control list). Once the change was made to the live mail servers the platform was monitored by checking the activity logs, mail queues and number of external connections, all of which indicated to the engineers that the changes had been applied successfully.</p>
<h4>13:00 &#8211; 16:00</h4>
<p>Having made this change, it soon became apparent that the queues on the Critical Path appliance were still rising. At this point it was clear that there was a more fundamental problem with mail throughput from the Critical Path servers, in that they were not passing messages to the mail delivery platform quickly enough. The Critical Path engineers, who had also been involved in the roll-out, were then asked to investigate and tune the configuration in order to try and resolve the problem. Several further changes were made during this period, but we continued to see increasing mail queues.</p>
<p><em>We should add that the Critical Path boxes had been tested with our normal mail volumes in the vendor&#8217;s labs. Although we believe the queues were caused by a local delivery issue, we have not been able to perform further diagnosis at this point. </em></p>
<h4>16:30</h4>
<p>As the throughput issue was still unresolved, the decision was made to re-route any new e-mail away from the Critical Path appliance so that the queues which had built up could be cleared. At this time, even though it was realised that the root cause was not the rejected spam messages, the configuration changes made to the mail delivery platform were left in place. The view was that this would allow the remaining mail to dequeue from the Critical Path boxes more quickly.</p>
<p>Continued monitoring of the logs, mail queues and external connections again indicated that mail was flowing correctly and at this point it was assumed that with time the mail platform would return to normal operation.</p>
<h4>21:00</h4>
<p>Our engineering team were alerted by the Customer Support Centre that a number of calls and tickets were being received from customers who were reporting missing e-mails. Initially this had been put down to the mail queues, but it was felt that further investigation was warranted by the on-call engineer. The engineer who investigated this issue could not initially find any problem, and after around 90 minutes it was decided to call out the engineer responsible for the Critical Path trial, who had performed the original work that morning. It was at this point that it became apparent that mail was being lost from the platform.</p>
<h4>22:30 &#8211; 23:30</h4>
<p>It was decided to systematically roll back all changes made during the day, including the ACL rule changes on the mail delivery platform. Once this had been done a large number of test messages were sent, all of which were received. This proved that the problem had been resolved for all new mail arriving on the mail platform. Some older email was still queued on the Critical Path appliances and this was cleared successfully over the following days.</p>
<h2>Investigation</h2>
<p>Once we understood that email had been lost we started a full investigation into the causes of this issue. It was quickly recognised that while all of the work for the deployment of the Critical Path appliances was planned and authorised in accordance with our change control and peer review procedures, the way we handled the first problem following the deployment was incorrect.</p>
<p>The investigation revealed that a formatting error within the ACL rule change had caused the mail delivery platform to start processing mail incorrectly. This specifically was a sequencing issue, whereby the order of the commands written into the ACL rule meant that the variable set when a message was known spam was not being reset correctly for each new mail. This resulted in legitimate messages being seen by the mail platform as known Phishing or Image spam and, because of the rule to drop instead of reject this type of mail, they were removed. Although the new configuration was checked informally by another engineer before being applied, operational procedure was breached when the changes made to the rule were not formally peer reviewed or documented via change control.</p>
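<p>The class of bug described above, a flag that is set for one message and not reset before the next, can be reproduced in a few lines of Python. This is an illustration only: it is not the ACL syntax used on our mail servers, and the <code>Message</code> type and addresses are hypothetical.</p>

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender: str
    is_known_spam: bool  # verdict of the phishing/image-spam check for this message


def deliver_buggy(messages):
    """The flag is only ever set, never cleared, so it leaks into later messages."""
    delivered = []
    drop = False  # initialised once, outside the per-message logic
    for msg in messages:
        if msg.is_known_spam:
            drop = True
        if drop:
            continue  # silently dropped; the sender is never told
        delivered.append(msg.sender)
    return delivered


def deliver_fixed(messages):
    """The flag is recomputed for every message before it is consulted."""
    delivered = []
    for msg in messages:
        drop = msg.is_known_spam  # reset per message
        if drop:
            continue
        delivered.append(msg.sender)
    return delivered


batch = [
    Message("alice@example.com", False),
    Message("phisher@example.net", True),
    Message("bob@example.com", False),
]
print(deliver_buggy(batch))
print(deliver_fixed(batch))
```

<p>In the buggy version only the first message is delivered: once the spam message sets the flag, the legitimate message that follows it is dropped too. In a sketch like this the fault only shows when legitimate mail follows spam within the same run.</p>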
<h2>Follow-Up</h2>
<p>Josh, the principal engineer working on this project (who made the fatal change), was the first to <a href="http://community.plus.net/forum/index.php?topic=991.msg9225#msg9225" title="Link to community site forum">hold his hands up and apologise</a> to customers for the impact this problem caused. The mistake itself was one that wasn&#8217;t picked up on our test platform (it is impossible to replicate the volume of email on the live platform and under less load the problematic condition was not triggered and no issue was apparent). Furthermore, the nature of the problem meant that the mail platform logs didn&#8217;t demonstrate any obvious faults (on a mail platform that handles 70 messages a second, where a minimum of 60% is known Spam, logged errors are easy enough to spot but incorrectly marked Spam messages are not).</p>
<p>The biggest procedural issue here was that the correct peer review and change control procedure was not followed, although the process was overridden for what the engineer considered to be valid reasons at the time. We are now looking to streamline both these processes with a view to making them more agile and easier to follow, especially while working reactively on problems. We plan to produce an article to explain our change control processes in the near future. Obviously there has also been an internal process with those involved in the work on that day to address this appropriately.</p>
<p>In terms of the other questions we&#8217;ve been asked, one big comment is that customers don&#8217;t want us to drop or refuse any mail at all. That misunderstands a reality of much Botnet generated email traffic today. When obvious spam is recognised it&#8217;s perfectly normal for mail providers to prevent the delivery of that mail, and this is what almost all email providers do. We will continue to reject mail which is recognised as known Spam, but it&#8217;s important to recognise that this is a different process to that of tagging suspected Spam and placing it in the Spam folder. Spam tagging is only performed on mail that has been accepted onto the mail platform because it has a valid form and we can&#8217;t be absolutely certain that it is spam.</p>
<p>Another question asked was about the type of logging we have on the mail platform, and whether that could be used to inform mail senders that their mail could have been incorrectly dropped. Unfortunately due to the nature of the error and the volume of mail involved, it is genuinely impossible for us to do this. The way that the logs record data does not allow us to see which mail was correctly handled and which was not. In addition to this, the way the ACL was configured meant that we were unable to identify sender addresses. Customers should be sure that were there a practical way for us to achieve this, we would have gone to any lengths to make it so.</p>
<p>We hope this report and the details provided here do answer the valid questions customers raised in relation to this email problem. Like everyone affected by this we are extremely disappointed to be reporting a further set-back with email. We are as committed as it is possible to be to providing a stable and quality email solution for our customers, and will provide a further update regarding our plans in this regard shortly.</p>
<p>Kind Regards,</p>
<p>Bob Pullen</p>
<p>On Behalf of Team PlusNet</p>
]]></content:encoded>
			<wfw:commentRss>http://community.plus.net/blog/2007/08/30/email-incident-report/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Webmail Incident Report - Follow-up</title>
		<link>http://community.plus.net/blog/2007/07/10/webmail-incident-report-follow-up/</link>
		<comments>http://community.plus.net/blog/2007/07/10/webmail-incident-report-follow-up/#comments</comments>
		<pubDate>Tue, 10 Jul 2007 18:09:54 +0000</pubDate>
		<dc:creator>Ian Wild</dc:creator>
				<category><![CDATA[Incident Reports]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[Webmail]]></category>

		<guid isPermaLink="false">http://community.plus.net/blog/2007/07/10/webmail-incident-report-follow-up/</guid>
		<description><![CDATA[Following-up on the webmail incident report, we pledged to provide answers to any remaining queries which the original report didn’t address. We have answered as many of those questions as we can here, although primarily for reasons of ongoing platform security there are a number of questions asked that we have not provided direct answers [...]]]></description>
			<content:encoded><![CDATA[<p>Following-up on the <a href="http://community.plus.net/comms/2007/05/23/webmail-incident-report/">webmail incident report</a>, we pledged to provide answers to any remaining queries which the original report didn’t address. We have answered as many of those questions as we can here, although primarily for reasons of ongoing platform security there are a number of questions asked that we have not provided direct answers to.</p>
<p>Since May, we have been focused on a 90 day plan which we formulated following the webmail incident. This combines a number of sub-projects, some of which were already in progress before the event (such as ensuring <a href="http://www.pcicomplianceguide.org/aboutpcicompliance.html">PCI compliance</a>), and others that have come about as the result of the adoption of stricter security standards across our operation. This has ranged from internal doors being locked down with biometric access, to the rebuilding of servers and other network elements with a view to application standardisation and code consolidation (which will have the added benefit of making future developments simpler and more robust). The project also includes publishing our data retention and privacy policies and ensuring these are implemented fully in all parts of our operation.<br />
<span id="more-9791"></span><br />
In terms of the questions we were asked, we have broken these into the following areas:</p>
<p><a href="#_Toc171856531">Questions about events leading up to the incident</a><br />
<a href="#_Toc171856532">The Vulnerability</a><br />
<a href="#_Toc171856533">Stored Data</a><br />
<a href="#_Toc171856534">Technical Questions about the impact of the incident</a><br />
<a href="#_Toc171856535">Questions about our initial response to the incident</a><br />
<a href="#_Toc171856536">Questions about changes we will make in our operation as a result of the incident</a><br />
<a href="#_Toc171856537">Other Questions</a></p>
<h2><a title="_Toc171856531" name="_Toc171856531"></a>Questions about events leading up to the incident</h2>
<h3><a title="_Toc171856532" name="_Toc171856532"></a>The Vulnerability</h3>
<p>We were asked about the specific vulnerabilities that allowed the hacker to access data held within the webmail system. The vulnerability was not unique to PlusNet’s modification of the @mail code, although our specific implementation made that vulnerability more serious once it had been exploited. At the time the incident occurred we had patched the webmail system with all the security patches known or available for @mail.</p>
<p>The nature of the exploit suggests that the attackers were already familiar with the @mail code and database structure. Coupled with the fact that we allow anyone to create a free webmail account, and to access that globally, it was possible for the attackers to get access to our webmail platform. Most other implementations of @mail are, in one way or another, more restrictive about who can access them. The attack took place entirely via the website and web-server; however, we are unable to publish more detail about the specific method used except to confirm that this was not an XSS based attack.</p>
<p>As we said in the incident report, whilst the initial exploit was something that we don’t think was easily preventable, the resulting impact could have been reduced with different technical and procedural measures. We have now implemented these measures, but will not be publishing specific details for security reasons. Whilst no network can be 100% secure, we regularly operate internal security tests and have used external security companies to perform penetration testing and external audits. The last one of these before the incident was in January 2007.</p>
<h3><a title="_Toc171856533" name="_Toc171856533"></a>Stored Data</h3>
<p>In 2004 as part of replacing our old home grown webmail application we imported all of our customer email addresses into the new @mail system. The previous webmail software had used customer contact email addresses by default. We wanted to ease the transition for webmail users to @mail and as a one off exercise we imported the contact addresses of existing customers to the new system. In hindsight we should have forced customers to set up their details again, but at the time we felt this would cause unnecessary inconvenience. It was only customer email addresses and contacts that were imported into the webmail system.</p>
<p>For our implementation of @mail, we made a decision to keep the servers entirely separate from the rest of our systems. While on one hand this limited what the attackers were able to access, the separation also resulted in changes to our main databases and mail systems never being reflected within the webmail platform itself. The outcome of this was that the details for accounts which were cancelled or mailboxes which were deleted were not removed from the webmail database. It would however have been impossible to access webmail without an active account, as authentication is performed against our main database.</p>
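<p>The kind of reconciliation that would have caught these leftover records can be sketched as a comparison between the two account lists. The function and the account names below are hypothetical and are not taken from our systems.</p>

```python
def stale_webmail_records(active_in_main_db, webmail_db_records):
    """Return webmail records with no matching active account in the main database."""
    active = set(active_in_main_db)
    return sorted(name for name in webmail_db_records if name not in active)


# Hypothetical data: two live accounts, plus two records that were never cleaned up.
main_db = ["anna", "dev"]
webmail_db = ["anna", "closed_2005", "dev", "deleted_box"]
stale = stale_webmail_records(main_db, webmail_db)
print(stale)
```

<p>Here the two records without a live account, <code>closed_2005</code> and <code>deleted_box</code>, are reported as candidates for removal.</p>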
<h2><a title="_Toc171856534" name="_Toc171856534"></a>Technical Questions about the impact of the incident</h2>
<p>We’ve been asked for details about precisely what data the attackers were able to access. Although we can’t be precise, we must make a presumption that anything stored within the @mail databases could have been taken. As described in the incident report, the databases contained customer address books, email addresses and in certain circumstances email content. Specifically, this was email not stored in the default inbox for customers who logged into webmail using the ‘POP3’ option. Mail accessed using the IMAP option was stored on our main email servers, rather than within the webmail database. No other data was stored on the webmail platform which could be regarded as customer specific information.</p>
<p>Between the time when the server was first compromised and the issue was fully resolved it is possible that other information could have been accessed through customer interaction with the affected server. We have also been asked to clarify how authentication took place on the webmail platform and whether it was possible that password data could have been obtained. No evidence to suggest this is the case has come to light, but for the above reasons it remains a possibility (which is why we advised customers to change their passwords). Authentication of mail collection occurs via our mail platform and these servers were not affected.</p>
<h2><a title="_Toc171856535" name="_Toc171856535"></a>Questions about our initial response to the incident</h2>
<p>The first tickets relating to the incident were raised on the afternoon of Saturday 5<sup>th</sup> May. They entered the Customer Support ticket pool and were dealt with in normal rotation order (oldest first), meaning we picked them up the following day. Please refer to the <a href="http://community.plus.net/comms/2007/05/23/webmail-incident-report/">original incident report</a> (opens in new window) for further details of the timing of events.</p>
<p>When we identified the Trojan redirect on the WM04 server, we checked the other Webmail servers for a similar compromise (and found none) and also checked for other malicious activities at that time. We took down and investigated what we believed to be the only compromised server and, not finding anything to suggest a further compromise had taken place, we returned the remaining webmail servers to full service. It’s important to understand that running a virus scan or detecting a compromise on a Unix server is a different thing entirely from the virus checking one might perform on a Windows PC. Although we felt we could make a quick fix and return the webmail service with minimal inconvenience, we did continue to monitor the webmail platform, and as soon as we realised that all was not well we made the decision to permanently remove the @mail platform from service.</p>
<p>During the early days of the incident our priorities were to understand and solve the webmail problem. Once we knew who had been affected we moved as quickly as we could to inform these customers about the issue. In hindsight this was not fast enough, and in future we would be in a position to react faster. Initially we used the signature detection technology offered by our Ellacoya platform to identify customers who exhibited the signs of having been affected by the Trojan, and whose data transfer profiles matched the Trojan’s signature. We phoned some and emailed all such customers with specific instructions.</p>
<p>Although we communicated with the relevant authorities throughout the incident, we didn’t formally raise the issue with the police until the 16<sup>th</sup> May. By this time our Incident Team had carried out a full and thorough forensic examination. It was only when we had sufficient evidence to reasonably suspect that a criminal act had occurred that we were in a position to report the crime.</p>
<h2><a title="_Toc171856536" name="_Toc171856536"></a>Questions about changes we will make in our operation as a result of the incident</h2>
<p>We’ve been asked why customer data was stored even when an account was closed and what we are doing to prevent that. As part of our 90 day follow-up plan we have conducted a full audit to ensure that there are no other areas where we are storing unnecessary data, and that only the information we are required to keep by law (under the Limitations Act and HMRC legislation) is retained anywhere on our systems. Until this point we maintained closed account data in a ‘destroyed’ state, which although fully compliant with law, only saw periodic purges of our archives to remove old data. Our data retention policy will be published shortly on the <a href="http://www.plus.net/support/service/policies/overview.shtml?supporta=policies">policies section of our website</a>. Ensuring that all of our systems are synchronised and not storing unnecessary information will be at the forefront of our minds during the design of all future projects.</p>
<p>A lot of questions about the new security team were asked and although we do plan to explain more about their remit in future communications, we will not go into the specifics of our security policies and procedures. The new team will enhance and augment our existing procedures for performing security audits and penetration testing. They will not be responsible for answering customer tickets directly, which is a responsibility that will remain with the CSC. If an issue is raised relating to a potential security concern, it will be escalated to the most appropriate team. Customers can be sure that we will always treat any further suggestions of a security problem very seriously.</p>
<p>Other than that, we are nearing the completion of work on delivering most of the seven commitments we made in the incident report. Of these, all work is on track except encrypted email and FTP access, which is taking longer to deploy than we first expected. We will have a further update on all of these deliverables, along with more details about the steps we are taking to combat spam in general in an additional update. Our response has focussed on current PlusNet customers, and while we accept that we have not been able to address the inconvenience caused to those who are no longer customers, we would again like to offer our most sincere apologies to all. On the back of this incident we have become determined to do absolutely everything we can to ensure that the security of our platform can never be questioned again. We have adopted a robust set of security standards and continue to focus almost all of our network and development resources on the 90 day security project which ends 20<sup>th</sup> August.</p>
<h2><a title="_Toc171856537" name="_Toc171856537"></a>Other Questions</h2>
<p>A few questions asked were not directly related to the webmail incident. One of these was our recent entry onto the data protection register. It is in fact the case that we are not obliged to register under the terms of the Data Protection Act, although we are obliged to comply with the requirements of the Act. Our registration was actually in hand prior to the webmail incident, and was purely voluntary. The other question was about the length of time it took to implement stronger passwords for customers. This was no small piece of work, and as with all of the changes and improvements made during this period we had to drop all other development work to make this happen.</p>
<p>We hope that this final posting has helped to allay remaining concerns about what happened and why. Everyone at PlusNet remains very conscious of the inconvenience and concern that arose as a result of these events. We are now focusing on reducing the amount of spam email that our customers are receiving and more information about these initiatives is being published regularly on our <a href="http://community.plus.net/">community site</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://community.plus.net/blog/2007/07/10/webmail-incident-report-follow-up/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Webmail Incident Report</title>
		<link>http://community.plus.net/blog/2007/05/23/webmail-incident-report-2/</link>
		<comments>http://community.plus.net/blog/2007/05/23/webmail-incident-report-2/#comments</comments>
		<pubDate>Wed, 23 May 2007 22:50:13 +0000</pubDate>
		<dc:creator>Ian Wild</dc:creator>
				<category><![CDATA[Incident Reports]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[Webmail]]></category>

		<guid isPermaLink="false">http://community.plus.net/blog/2007/05/23/webmail-incident-report-2/</guid>
		<description><![CDATA[The Webmail Incident Report is reproduced below. This will be distributed to customers via our newsletter soon.
Webmail Incident Report
Our customers will be aware of a recent serious security issue affecting our Webmail service. This document contains the results of our investigation and what actions we are now taking to minimise the resulting impact [...]]]></description>
			<content:encoded><![CDATA[<p>The Webmail Incident Report is reproduced below. This will be distributed to customers via our newsletter soon. <span id="more-9761"></span></p>
<h1>Webmail Incident Report</h1>
<p>Our customers will be aware of a recent serious security issue affecting our Webmail service. This document contains the results of our investigation and what actions we are now taking to minimise the resulting impact for our customers.</p>
<p>We are extremely sorry for the inconvenience and upset this may have caused our customers. We hope this report will provide some reassurance to our customers about how seriously we are taking these events.</p>
<h2>Contents</h2>
<p>1. Incident Summary</p>
<p>2. Breakdown of events</p>
<p>3. Our response and plans going forward</p>
<h2><a title="_Toc167725934" name="_Toc167725934"></a>1. Incident Summary</h2>
<p>Further details, explanations and answers to questions we have been asked are below the summary section.</p>
<h3><a title="_Toc167725935" name="_Toc167725935"></a>a. What happened?</h3>
<p>A malicious attack was performed against a vulnerability in our implementation of our Webmail software which allowed the attackers to:</p>
<p>- Take a copy of our webmail database, which contained the email addresses of customers who have used Webmail. This also included email addresses contained in customers&#8217; online address books and addresses customers had sent to and received from, using our Webmail service. Additionally the database contained some email addresses from our previous Webmail system.</p>
<p>- Change code in a web page, served by one of the six webmail servers, to present a pop-up page which, for a period of time, carried a Trojan payload. Only customers without up-to-date Windows software and with inadequate anti-virus software would have had their PCs affected by the Trojan.</p>
<h3><a title="_Toc167725936" name="_Toc167725936"></a>b. When did it happen?</h3>
<p>Our investigations have shown that the exploit was initiated at around 17:30 on the evening of Friday the 4<sup>th</sup> May, 2007. Customers started receiving spam on the evening of Sunday, 13<sup>th</sup> May 2007.</p>
<h3><a title="_Toc167725937" name="_Toc167725937"></a>c. Why did it happen?</h3>
<p>A vulnerability within our implementation of Webmail code in our portal was discovered and used by malicious attackers. </p>
<p>Our subsequent investigations found a number of vulnerabilities with our implementation of the Atmail application, including the vulnerability which had been exploited. This led to the decision we took to stop using the software entirely. </p>
<h3><a title="_Toc167725938" name="_Toc167725938"></a>d. Who is suspected of this attack?</h3>
<p>This has been the subject of a criminal investigation, which means we are not in a position to share all of the details which we are aware of. However, the timing of the attack and the sophistication of the exploit indicate a considerable amount of planning and expertise. The code was written in Russian and was of high quality.</p>
<h3><a title="_Toc167725939" name="_Toc167725939"></a>e. Was this preventable?</h3>
<p>With commercial hacking of this nature it just isn&#8217;t possible to be 100% protected. However, from this incident we have identified a number of processes and procedures that need improving, so that we can minimise the risk of future incidents and reduce the impact of any attacks. We have also increased the frequency of our scanning process to help us identify attempted intrusions in the future.</p>
<h3><a title="_Toc167725940" name="_Toc167725940"></a><a title="_Contents" name="_Contents"></a><a title="_Document_Summary" name="_Document_Summary"></a>f. What have we done to resolve the problem and prevent something similar in the future?</h3>
<p>Since the issue came to light the entire team at PlusNet has focused on the security of our network and customer data, and email system improvements. In order to resolve the issue and limit the impact on our customers we have:</p>
<p>- Undertaken a complete external security audit and rebuilt aspects of our platform that we felt didn&#8217;t meet stringent security best practices</p>
<p>- Created a dedicated PlusNet security team which is formally responsible for all aspects of data and software security on our platform</p>
<p>- Pledged to continue to proactively identify and contact customers who exhibit signs of having a malware infection</p>
<p>- Improved general security for all customers, such as the introduction of strong passwords, and additional safeguards against further attacks</p>
<p>We are also:</p>
<p>- Developing a method to allow customers to change from their current username-based email address. </p>
<p>- Improving our anti-spam platform to offer more options for mail identified as spam to be separated, reviewed via our website, and/or deleted without being delivered into the main mailbox</p>
<p>- Introducing more improvements to our help and support pages to help customers avoid spam </p>
<p>- Researching longer term options to upgrade our Squirrelmail webmail service to a higher specification that integrates with our portal and other communication tools</p>
<p>We&#8217;ll be contacting all our customers over the coming weeks to let them know about the improvements. You can also discuss proposals in the customer forums e.g. <a href="http://portal.plus.net/">http://portal.plus.net</a> </p>
<h2><a title="_Toc167725941" name="_Toc167725941"></a>2. Breakdown of events</h2>
<p>This section of the document is structured into a timeline of events, split into four key periods: 4th May &#8211; 8th May, 9th May &#8211; 12th May, 13th May &#8211; 15th May and 16th May &#8211; 21<sup>st</sup> May. We have included details of both what we knew at the time and what came to light afterwards.</p>
<p>We very much hope this will answer everyone&#8217;s questions in a direct and factual way, but would ask that customers help us, if they see areas which require further explanation, by commenting on the Blog version of this article. We will be happy to update these details on the basis of comments where we agree that this could be clearer or more informative.</p>
<blockquote><p><strong>a. Friday 4th May &#8211; Tuesday 8th May</strong></p></blockquote>
<p>Our first indication of an issue came from two tickets picked up on Sunday 6<sup>th</sup> May by our support team. These were flagged on a handover report by a Customer Support Centre (CSC) shift manager, but within the CSC at that time we were unable to replicate the problem. Initially, as both customers were using the same anti-virus software, the issue was thought to be related to a recent McAfee update which was misreporting a problem. By Bank Holiday Monday (7<sup>th</sup> May), we had received around 10 reports of the same issue but had still been unable to replicate the problem internally. The significance of the reports had not been recognised at this stage, and in our overall volume of calls and tickets these 10 reports were a tiny percentage of contacts received.</p>
<p>It has since transpired that, due to the nature of the load balancing system in use on our Webmail platform, connections were &#8216;Persistent&#8217;. This means that each computer is connected to the same server on each attempt, so a computer first assigned to an unaffected server would never hit the affected one. This resulted in tests from within our CSC all hitting an unaffected server and as such, replicating this issue from our test machines proved to be very difficult. In the early hours of Wednesday 9<sup>th</sup> May, we identified the real source of the reported problem and were able to begin taking responsive action.</p>
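<p>As a rough illustration of why persistent connections made the problem hard to reproduce, the sketch below compares a persistent scheme with a simple rotation across six servers. It is not a description of our actual load balancer: hashing the client address is an assumed persistence mechanism, and the server names other than WM04 and the test address are hypothetical.</p>

```python
import hashlib
from itertools import cycle

# Hypothetical server names; only WM04 is named in the report.
SERVERS = ["WM01", "WM02", "WM03", "WM04", "WM05", "WM06"]
AFFECTED = "WM04"


def persistent_server(client_ip: str) -> str:
    """Pick a server from a stable hash of the client address (assumed mechanism)."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(SERVERS)
    return SERVERS[index]


def servers_seen_persistent(client_ip: str, attempts: int) -> set:
    """Servers one client reaches over repeated attempts when connections persist."""
    return {persistent_server(client_ip) for _ in range(attempts)}


def servers_seen_round_robin(attempts: int) -> set:
    """Servers reached over repeated attempts when requests simply rotate."""
    rotation = cycle(SERVERS)
    return {next(rotation) for _ in range(attempts)}


if __name__ == "__main__":
    test_ip = "192.0.2.10"  # documentation-range address, purely illustrative
    # With persistence, repeated tests from one machine reach a single server.
    print(servers_seen_persistent(test_ip, 20))
    # With rotation, a handful of attempts would have reached the affected server.
    print(AFFECTED in servers_seen_round_robin(20))
```

<p>Under these assumptions a test machine only ever reaches one of the six servers, so unless it happens to be mapped to the affected server, no number of repeated tests will reproduce the problem.</p>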
<blockquote><p><strong>b. Wednesday 9th May &#8211; Saturday 12th May</strong></p></blockquote>
<p>On the morning of Wednesday 9<sup>th</sup> May we formed an incident response team made up of staff from Networks, Development and Customer Support and a priority problem (<a href="http://www.plus.net/support/service/problems/problem.php?intProblemId=42694">42694</a>) was raised. </p>
<p>A number of HTML files had been modified on the server &#8216;WM04&#8217;, which was one of our six public Webmail servers located in London. Each of these machines was a Linux server, operating the Apache Web server software and the Atmail Webmail code. The files which were changed had additional code added to them which generated an &#8216;IFRAME&#8217; pop-up containing a link to a video on a Russian website which, when played, activated the Trojan.</p>
<p>At this stage we were focussed on understanding the exploit which had resulted in customers potentially being exposed to a malicious webpage. We were not aware that customer email addresses had been obtained illegally at this point. </p>
<p>Once the significance of the problem was understood we decided to take our Webmail service offline to check the other servers for the same compromise. A Service Status notice was posted and our Webmail servers were taken offline at 16h00 on Wednesday 9<sup>th</sup> May 2007. Once we had removed the affected server, we ran a thorough check across the platform for the problem that had affected WM04, and no compromise was found on the other servers. The underlying problem was resolved by taking the WM04 server out of action, so we returned the Webmail platform to service.</p>
<p>From Wednesday afternoon, we identified customers with the Trojan data signature on their line, blocked the Trojans from getting config updates and looked for changes in customer SMTP traffic patterns. Using this data our CSC began calling customers who exhibited symptoms of the Trojan to offer advice on remedial action.</p>
<blockquote><p><strong>c. Sunday 13th May &#8211; Tuesday 15th May</strong></p></blockquote>
<p>While we were continuing to address the Trojan issue, a new issue (<a href="http://www.plus.net/support/service/problems/problem.php?intProblemId=42837">42837</a>) was raised on the evening of Sunday 13<sup>th</sup> May. This followed numerous reports of spam to mailboxes which had never received spam before. This immediately took priority over the Trojan investigation due to the number of customers affected but the two issues were worked in parallel. On initial investigation by our on-call engineers we were unable to locate the specific cause of this and none of our monitoring showed a notable increase in overall mail volume on our platform (less than 2% difference to the same period the previous week). By this time the first influx of spam seemed to have stopped.</p>
<p>At this stage we believed that we had mitigated the original Webmail security flaw and successfully dealt with the underlying issue. However, it appeared likely that those responsible for injecting the Trojan on WM04 were also behind the spam outbreak. We initially thought it unlikely that anyone could have run the type of query needed to extract this information without triggering certain alarm systems we have in place. (These are targeted at looking for unusual or large database queries.)</p>
<p>Following further investigation, at 11h00 on Monday 14<sup>th</sup> May we managed to replicate a query that could have been used to do this without triggering our monitoring. As such we began cross-checking all of the affected email addresses in order to confirm that a Webmail breach could account for all of the spam being reported. We did eventually find every email address that was reported as receiving this spam within the Webmail system. These were stored in a number of different areas such as address book contacts, account identities and the logs of received and sent mail which were stored within the Webmail system.</p>
<p>We issued an announcement alerting customers of our findings and continued with our full scale investigation into the breach. At that stage we decided to leave Webmail running, but took the decision to lock down logins to Webmail and our website to UK only IP addresses to reduce the risk of speculative follow-up attacks. </p>
<p>At this stage, we had engineers on site in London who were performing integrity checks and server forensics across the entire platform. We believed that one server (WM04) had been compromised using a vulnerability in part of the Atmail code, which was coupled with some security-related weaknesses on our own server configuration.</p>
<p>Once the affected server had been recovered for further forensic investigation, we found a malicious file disguised to look like an image file. This image file contained PHP code that allowed an attacker to run commands on our webmail server. Using this code they were able to transmit tables from the Webmail database to a remote location, and make changes to other HTML files so that visitors were presented with a pop-up page containing a Trojan. After a full sweep we also found other files of a similar nature dated from the 4th May 2007 on WM04.</p>
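<p>A sweep for files of this kind can be sketched as follows. This is a simplified heuristic written for illustration, not the tooling our engineers used: it flags any file with an image extension that contains a PHP opening tag. The list of extensions, the function names and the sample files are assumptions.</p>

```python
import tempfile
from pathlib import Path

# Assumed set of extensions to treat as images.
IMAGE_SUFFIXES = {".gif", ".jpg", ".jpeg", ".png"}
PHP_OPEN_TAG = b"<?php"


def looks_like_disguised_script(path: Path) -> bool:
    """True if a file named like an image contains a PHP opening tag."""
    if path.suffix.lower() not in IMAGE_SUFFIXES:
        return False
    return PHP_OPEN_TAG in path.read_bytes()


def find_suspect_images(root: Path) -> list:
    """Walk a directory tree and return image-named files that embed PHP."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and looks_like_disguised_script(p)
    )


def demo() -> list:
    """Build a throwaway directory with one clean and one suspect file."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # A file with a genuine PNG signature and no script content.
        (root / "photo.png").write_bytes(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16)
        # A GIF header followed by script text: the pattern being searched for.
        (root / "logo.gif").write_bytes(b"GIF89a" + b"<?php echo 'hypothetical'; ?>")
        # An ordinary script is ignored because it is not named like an image.
        (root / "index.php").write_bytes(b"<?php echo 'ordinary script'; ?>")
        return [p.name for p in find_suspect_images(root)]


if __name__ == "__main__":
    print(demo())
```

<p>A check like this only catches the simplest disguise. Obfuscated code or other scripting languages would need a broader approach, such as comparing files against known-good copies.</p>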
<p>The attacker gained information from several tables in the Webmail database, which included customer email addresses, entries in customer address books, email addresses for mail sent by customers via Webmail, and email addresses for mail received by customers via Webmail. Additionally, when we first implemented the Atmail solution in 2004, we imported customers&#8217; email addresses and contact details into the Webmail database from our main databases. We did this to make the solution more convenient for users by setting up their email account for them initially. Customers who joined us after this time only had data created in the Webmail database the first time they logged in.</p>
<p>We advised Calacode, the authors of Atmail, of the incident and they provided excellent support and have since published hotfixes to the code. The incident was confined to the Webmail database, and personal data, including payment details, was not obtained.</p>
<p>Our Networks team worked through the night on Monday/Tuesday performing further security checks and server hardening where necessary. This included turning off Website Wizard and our WiFi portal applications, which were not running the latest version of PHP and for which a quick upgrade was not possible. Overnight we also performed penetration testing (using Nessus and other tools) across our entire platform.</p>
<p>In the afternoon of Tuesday 15<sup>th</sup> May we emailed those broadband customers who were exhibiting signs of being infected by a Trojan virus. We sent an additional email on Wednesday 16<sup>th</sup> May to dial-up customers who may have logged into WM04. </p>
<p>Shortly after this we posted a detailed Service Status notice to explain to customers the progress we had made in our investigation. At this time we also felt we had enough evidence of malicious intent to notify the Police, and an incident was raised.</p>
<blockquote><p><strong>d. Wednesday 16th May &#8211; 21st May</strong></p></blockquote>
<p>Our network engineers and software developers continued working throughout the night on Tuesday/Wednesday. At this stage we were planning to rebuild our Webmail service using brand new servers configured to use the PHP version of Atmail, rather than our original Perl-based version. At the time we believed this would prove more secure, and our priority was to restore a working Webmail service. As well as building servers, work continued to focus on network hardening throughout the night.</p>
<p>By Wednesday morning we had identified further potential vulnerabilities within the newest available versions of the Atmail Webmail system and decided to take our existing Webmail out of service. We then commenced work on a replacement system using Open Source code.</p>
<p>During this time we were continuing to harden every aspect of our platform security we could, and were working closely with the BT security team to ensure that our network was as secure as technically possible. This included work on areas such as the improvement of our scripts that detect abnormal session activity and the commencement of work to allow stronger customer passwords. By Thursday afternoon we had completed the majority of these improvements and our attention turned to restoring a working Webmail service, using alternative Webmail software.</p>
<p>By Friday, a temporary Webmail solution based upon the SquirrelMail software had been built and we were in the final stages of security, penetration and functionality testing of the new servers and software. Our replacement Webmail service was released on Saturday evening. </p>
<p><a title="_Believed_source_of_the exploit" name="_Believed_source_of_the exploit"></a> </p>
<h2><a title="_Toc167725942" name="_Toc167725942"></a>3. Our response and plans going forward</h2>
<p>Throughout the Webmail issue, we have been keen to listen to our customers, and we hope this document answers many of our customers&#8217; questions about the incident. </p>
<p>As soon as we were aware of the severity of the situation we swung into action and we feel we can be proud of our response to the issue. Many members of our Networks, Development and Customer Support Teams gave up their social and family lives and gave their own time to work on implementing the things that we identified as needing to be done urgently.</p>
<p>We are determined that out of these issues will come a change in our company culture. We want to be regarded as an on-line business where security is placed at the forefront of everything we build. We are currently in negotiation with a number of potential partners who we will work closely with to ensure that our network and the software running on it always remains as secure as it is possible to be and that this is regularly audited by a third party. We are also reviewing processes to ensure that diagnosis, communication and resolution can be swifter in the future.</p>
<p>We have had literally hundreds of suggestions and requests from customers regarding ways to reduce spam and we are grateful for them all. Listed below are the current items we are able to commit to at this stage, timelines for which will be provided as soon as possible. Other customer suggestions are currently being reviewed and prioritised. We have been grateful for the response and support expressed by so many of our customers and we want to repay that by ensuring that we deliver as much of what is requested as soon as we possibly can.</p>
<p>From the suggestions we have received from customers so far we will implement the following:</p>
<p>1) The ability to change your username </p>
<p>2) Get a free .uk domain name</p>
<p>3) SSL encrypted connections for POP3 and IMAP email and FTP</p>
<p>4) Improvements to the &#8216;Manage my mailbox&#8217; tool</p>
<p>5) The ability to blackhole <a href="mailto:username@username.plus.com">username@username.plus.com</a></p>
<p>6) Publishing our spam detection rates on our website for comparison with other ISPs</p>
<p>7) Publishing our new Privacy Policy and Data Retention policies on our portal.</p>
<p>We will contact all our customers with firm timescales for the above improvements.<a title="112a06d0a9a8a1f0_424948" name="112a06d0a9a8a1f0_424948"></a></p>
<p>&#8212;</p>
<p>Again, apologies for the lateness of this one tonight &#8211; it&#8217;s been another late one for us all.</p>
<p>Regards,</p>
<p>Ian Wild</p>
]]></content:encoded>
			<wfw:commentRss>http://community.plus.net/blog/2007/05/23/webmail-incident-report-2/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
		</item>
	</channel>
</rss>
