When it comes to systems change, there are a number of notable
failures in the financial services industry:
January 2009: It was reported that IT systems engineer Rajendrasinh
B. Makwana almost brought down 4,000 critical servers with a logic
bomb embedded in scripts. The logic bomb could have cost Fannie Mae
“many millions of dollars” and was only discovered by chance by
another engineer.[i]
January 2010: It was reported that an HSBC mainframe upgrade shut
down cash machines and online banking for HSBC customers as part of
an upgrade to the One HSBC platform.[ii] This was in addition to a
similar outage in June 2009 and a further telephone banking outage in
February 2008 due to “coding” changes.[iii]
September 2010: It was reported that J.P. Morgan’s online banking
service was offline for 3 days due to third party database software
“corrupting the login process”, impacting 16 million customers.[iv]
It was reported that J.P. Morgan appeared not to have a roll-back
plan that would have let it recover while continuing business as
normal.[v]
June 2012: It was reported that the Royal Bank of Scotland was to pay
£125 million in costs related to a glitch in the CA7 batch process
scheduler, introduced as part of systems maintenance activity, that
resulted in 12 million customer accounts being frozen for almost a
week.[vi]
August 2012: It was reported that Knight Capital Group lost $440
million in 30 minutes and had 62% wiped off its stock price due to a
trading software algorithm glitch that generated erratic trades,
buying high and selling low across nearly 150 stocks.[vii] The glitch
resulted in 4 million additional trades in 550 million shares that
would not have occurred otherwise.[viii]
August 2013: It was reported that Goldman Sachs lost $100 million due
to an automated trading systems glitch that caused a number of
incorrect options trades and disrupted US exchange trading, affecting
shares with listing symbols starting with the letters H through
L.[ix] The glitch caused automated trading systems to accidentally
send indications of interest as real orders to be filled at the US
exchanges. The cause was reported to be inadequate software
testing.[x]
September 2013: It was reported that Clydesdale Bank was fined £8.9
million by the Financial Conduct Authority for failing to inform
customers of their rights after a software glitch caused the
miscalculation of repayments on over 42,500 mortgages.[xi]
Risk and associated controls
A good, actionable risk statement that captures these events is:
“Customer data leakage, corruption or system unavailability caused by
defective or malicious system changes resulting in financial losses
of UK £100 million, customer churn of 6.4 percent[xii] and regulatory
sanction by the Financial Conduct Authority and Information
Commissioner’s Office.”
This risk statement is a lower level risk that contributes to an
organisational level risk, for example:
“Loss of market share caused by eroded customer confidence in the
organisation’s information security resulting in net revenue
reduction to the order of hundreds of millions and bank share value
reduced from loss of market confidence in operational management.”
From the lower level risk statement we can then identify the risk
causes that need to be controlled. In this case we need to control
defective or malicious system changes that might result in customer
data leakage, corruption or system unavailability.
To take these in turn, we’d first need to implement a change quality
testing process to ensure that system changes are adequately tested.
This may include activities such as code quality reviews and unit,
functional, systems, integration and regression testing. An
additional step for business supporting systems would be user
acceptance testing by the business, including tests for boundary
conditions and invalid inputs at the system’s data input interfaces.
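As an illustration of what boundary condition and invalid input tests look like in practice, here is a minimal Python sketch. The function name parse_payment_amount, the helper is_rejected and the limits of 0.01 and 10,000.00 are all hypothetical, chosen only for the example. They are not taken from any of the systems discussed above.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical limits, for illustration only.
MIN_AMOUNT = Decimal("0.01")
MAX_AMOUNT = Decimal("10000.00")


def parse_payment_amount(raw):
    """Parse a user-supplied amount, raising ValueError on bad input."""
    try:
        amount = Decimal(raw.strip())
    except (InvalidOperation, AttributeError):
        # Covers non-numeric text, empty strings and non-string input.
        raise ValueError(f"not a valid amount: {raw!r}") from None
    if not amount.is_finite():
        # Decimal accepts "NaN" and "Infinity", so reject them explicitly.
        raise ValueError(f"not a finite amount: {raw!r}")
    if not (MIN_AMOUNT <= amount <= MAX_AMOUNT):
        raise ValueError(f"amount out of range: {raw!r}")
    if amount != amount.quantize(Decimal("0.01")):
        raise ValueError(f"more than two decimal places: {raw!r}")
    return amount


def is_rejected(raw):
    """Return True if the input is refused, False if it is accepted."""
    try:
        parse_payment_amount(raw)
    except ValueError:
        return True
    return False


# Boundary conditions: values exactly on, and just outside, each limit.
accepted_boundaries = ["0.01", "10000.00"]
rejected_boundaries = ["0.00", "10000.01"]

# Invalid inputs: malformed, empty, negative, non-finite, too precise.
invalid_inputs = ["abc", "", "-5", "NaN", "1.234", None]
```

The point of the sketch is the shape of the test data rather than the validation rules themselves: each limit gets a value on the boundary and a value just past it, and every class of malformed input gets at least one case.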
We’d then need to implement a change control strategy that uses
technical and administrative controls to prevent changes being made
to production or critical systems unless those changes are approved.
The approval should not be a simple tick in the box. It should
require appropriately senior stakeholder approval of changes, with
high risk changes signed off at senior executive levels within the IT
and business areas. Part of this sign-off should be that the
approvers have assured themselves that the change has been adequately
tested and is fit for purpose.
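The approval rule described above can be sketched as a simple gate. This is a minimal Python illustration under stated assumptions, not a description of any real change management tool. The names Risk, Approval, ChangeRequest and is_approved are hypothetical, as is the simplification that a lower risk change needs only one approver who has confirmed the testing.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Risk(Enum):
    LOW = "low"
    HIGH = "high"


@dataclass(frozen=True)
class Approval:
    approver: str
    area: str                # assumed to be "IT" or "business"
    senior_executive: bool   # approver holds a senior executive role
    confirmed_testing: bool  # approver has checked the test evidence


@dataclass
class ChangeRequest:
    change_id: str
    risk: Risk
    approvals: List[Approval] = field(default_factory=list)


def is_approved(change):
    """Return True only if the change has the sign-off its risk requires."""
    # A sign-off only counts if the approver confirmed the testing,
    # so a bare tick in the box is ignored.
    valid = [a for a in change.approvals if a.confirmed_testing]
    if change.risk is Risk.HIGH:
        # High risk: senior executives in BOTH IT and the business.
        areas = {a.area for a in valid if a.senior_executive}
        return {"IT", "business"} <= areas
    # Assumption for this sketch: one valid approval for lower risk.
    return len(valid) > 0


it_exec = Approval("cio", "IT", senior_executive=True, confirmed_testing=True)
biz_exec = Approval("coo", "business", senior_executive=True, confirmed_testing=True)
box_ticker = Approval("manager", "IT", senior_executive=False, confirmed_testing=False)

high_risk_ok = ChangeRequest("CHG-1", Risk.HIGH, [it_exec, biz_exec])
high_risk_it_only = ChangeRequest("CHG-2", Risk.HIGH, [it_exec])
low_risk_ticked = ChangeRequest("CHG-3", Risk.LOW, [box_ticker])
```

Here only high_risk_ok passes the gate. CHG-2 lacks business sign-off, and CHG-3 has an approval that never confirmed the testing.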
There is a further control required to make these two controls work:
a technically enforced separation of duties, so that those who
develop a change cannot themselves deploy it to the target
environment.
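A separation of duties check of this kind might look like the following Python sketch. The names Change, DEPLOYERS, authorise_deployment and is_permitted are hypothetical and exist only for the example. In a real environment the rule would normally be enforced by access controls on the target environment itself rather than by application code like this.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Change:
    change_id: str
    authors: FrozenSet[str]  # everyone who developed the change


# Hypothetical set of accounts allowed to deploy to production.
DEPLOYERS = frozenset({"ops_alice", "ops_bob"})


def authorise_deployment(change, deployer, deployers=DEPLOYERS):
    """Raise PermissionError unless the deployer is allowed to deploy."""
    if deployer not in deployers:
        raise PermissionError(f"{deployer} has no deployment rights")
    if deployer in change.authors:
        # Separation of duties: authors never deploy their own change,
        # even if they also hold deployment rights.
        raise PermissionError(f"{deployer} authored {change.change_id}")


def is_permitted(change, deployer):
    """Return True if the deployment would be authorised."""
    try:
        authorise_deployment(change, deployer)
    except PermissionError:
        return False
    return True


change = Change("CHG-1", frozenset({"dev_carol", "ops_bob"}))
```

With this data, ops_alice may deploy CHG-1, dev_carol may not because she has no deployment rights, and ops_bob may not because he helped author the change.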
To ensure these controls are adequately and effectively implemented,
there need to be clearly articulated and enforceable policies,
standards, procedures and guidelines in place. The policies and
standards need to be clear and unambiguous, have an owner and
describe the enforcement actions that will be taken if the policy or
standard is not complied with. These enforcement actions must then be
applied in all cases of non-compliance. Where a non-compliance is
expected, it needs to be pre-approved by the policy owner, clearly
highlighted to the system’s senior stakeholders and approved at the
appropriate senior executive level within the technology and business
areas involved in the change.
Endnotes
[i] Keizer, G., Ex-Fannie Mae engineer pleads innocent to server bomb
charge, United States of America, January 2009. Available at:
http://www.computerworld.com/s/article/9127157/Ex_Fannie_Mae_engineer_pleads_innocent_to_server_bomb_charge
(Accessed 6 March 2014).
[ii] ComputerWeekly.com, HSBC mainframe outage causes major HSBC
network crash, United States, January 2010. Available at:
http://www.computerweekly.com/news/1280091797/HSBC-mainframe-outage-causes-major-HSBC-network-crash
(Accessed 11 March 2014).
[iii] Ibid.
[iv] Fitzpatrick, D., J.P. Morgan Wrestles Web Snarl, United States,
September 2010. Available at:
[v] Ibid.
[vi] Flinders, K., RBS computer problem costs £125m, United States,
August 2012. Available at:
http://www.computerweekly.com/news/2240160860/RBS-computer-problem-costs-125m
(Accessed 11 March 2014).
[vii] Philips, M., Knight Shows How to Lose $440 Million in 30
Minutes, United States, August 2012. Available at:
http://www.businessweek.com/articles/2012-08-02/knight-shows-how-to-lose-440-million-in-30-minutes
(Accessed 11 March 2014).
[viii] Ibid.
[ix] Holley, E., Goldman Sachs trading error is “a warning to all”,
United States, August 2013. Available at:
http://www.bankingtech.com/161162/goldman-sachs-trading-error-is-a-warning-to-all/
(Accessed 11 March 2014).
[x] Ibid.
[xi] Nguyen, A., Clydesdale Bank fined £8.9m over mortgage system
problem, United Kingdom, September 2013. Available at:
http://www.computerworlduk.com/news/it-business/3470789/clydesdale-bank-fined-89m-over-mortgage-system-problem/
(Accessed 11 March 2014).
[xii] Figure of 6.4% customer churn comes from: Ponemon Institute,
2011 Cost of Data Breach Study: United Kingdom, United Kingdom, March
2012.