
Tuesday, October 3, 2017

Should drone inventors have thought about this risk?

I came across this article in the Wall Street Journal about how wedge-tailed eagles have turned out to be drones' worst nightmare. Here are some videos that illustrate the problem:



Being someone who works on innovation as a GRC Strategist, risk is something I think about daily. Of course, you need to be prudent and make sure you've documented all the known risks and have a plan for how to mitigate them. For example, you should patch your software when the vendor tells you there is an issue.

But how could drone inventors possibly have applied the risk formula of impact and likelihood to eagles tearing up their drones?

It's a good illustration of how innovation requires taking risks that you will only encounter when actually deploying the innovation into the real world. There are just some things that will literally fall out of the sky that you didn't think of, and a workaround will need to be designed after the fact.

Author: Malik Datardina, CPA, CA, CISA. Malik works at Auvenir as a GRC Strategist, working to transform the engagement experience for accounting firms and their clients. The opinions expressed here do not necessarily represent UWCISA, UW, Auvenir (or its affiliates), CPA Canada or anyone else.

Wednesday, June 17, 2015

Can Inadequate Disaster Recovery Planning be worse than locusts?

Why are US farmers facing a disaster?

Is it due to locusts? No.

It's due to inadequate IT disaster recovery planning.

As reported in the Wall Street Journal, the US State Department is unable to issue visas to temporary workers due to a system failure. Specifically:

"“The system that helps perform necessary security checks has suffered hardware failure,” said Niles Cole, a State Department spokesman. “Until it is repaired, no visas can be issued.” He said technicians are working around the clock to resolve the issue but couldn’t offer a timeline for when the system would be back in action.

Specifically, a central database isn't receiving biometric information from U.S. consulates world-wide, he said. Biometric data, including fingerprints, are used for security screening of applicants."

And the losses are mounting daily. Over 200 workers are sitting at the Mexican-US border waiting to be processed by the system so they can get into the US and help harvest the crops. The article reported that farmers are losing between $500,000 and $1,000,000 per day because the fruit is spoiling.

Reading this article, I had the following questions:

Why isn't there a hot site? 
Given the importance of the technology, why don't they have the ability to swap to a new piece of hardware instantaneously?

Was the security information backed up, and why is there no manual workaround? 
If it's digital information, why isn't there a manual workaround to transmit the information and circumvent the faulty hardware? The data could be manually uploaded to the central database.

Was a proper risk assessment done? When a disaster recovery plan (DRP) is created for a system, the organization must determine the Recovery Time Objective (RTO), which specifies how quickly a system will be restored after a failure. Google, for example, has an RTO of zero. To set the RTO, there needs to be an assessment of the impact of such a failure. In this case, when setting the RTO, did the risk management professionals consider that this system was critical in supporting the H-2A visa program for temporary farm workers? It should be noted that the US farmers' association had paid into this program and farmers are now suffering losses of over $500,000 per day. The outage will also reduce the number of tourist visas issued, potentially resulting in lost tourist dollars to the US.
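The impact side of that assessment can be sketched with some simple arithmetic. This is a minimal, hypothetical example - the only figures taken from the article are the $500,000-$1,000,000 per day spoilage losses; the candidate RTO values are invented for illustration:

```python
# Illustrative sketch: estimating outage impact to justify an RTO.
# Daily loss bounds are the figures reported in the article; the
# candidate RTOs below are hypothetical.

DAILY_LOSS_LOW = 500_000     # reported lower bound, USD per day
DAILY_LOSS_HIGH = 1_000_000  # reported upper bound, USD per day

def outage_cost(outage_days: float) -> tuple[float, float]:
    """Return the (low, high) estimated loss for an outage of the given length."""
    return (outage_days * DAILY_LOSS_LOW, outage_days * DAILY_LOSS_HIGH)

# Compare candidate Recovery Time Objectives (in days):
for rto_days in (0.5, 1, 3, 7):
    low, high = outage_cost(rto_days)
    print(f"RTO of {rto_days} day(s): ${low:,.0f} to ${high:,.0f} in spoilage losses")
```

Even a back-of-the-envelope table like this makes it hard to argue against investing in a hot site: a week-long outage costs millions against the crops alone, before counting lost tourist visas.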

The lesson we can learn from this is to ensure that we understand what business processes a system supports and understand the impact to those business processes should the system go down.

Sunday, January 20, 2013

Unauthorized Access to China? Value of IT Audits and Control Frameworks

Various media sites and blogs, including the BBC, picked up on the story reported by this blog about one enterprising individual who decided to apply what all the major manufacturing and service companies are doing: outsource work to cheap labour pools in China (and also India). According to the Verizon post, the individual would basically show his face at work and surf the Internet, while the developers in China were doing all the hard work. Although many have attacked him as being lazy and "scamming" the system, the reality is that many enterprises, such as Apple, depend on such strategies for their profitability. Regardless of this debate, ultimately the individual violated his agreement with the company. (I am assuming that he had standard terms of employment that required him to do the work assigned to him and not to provide his credentials to unauthorized users.)

From an Information Security Risk and Control perspective, this story is a good one for IT Audit and Security practitioners to highlight the importance of IT control frameworks, risk analysis and audits. The company that discovered the issue was reviewing its security logs. As Andrew Valentine notes in the original Verizon security blog post that reported the incident: "In early May 2012, after reading the 2012 DBIR, their IT security department decided that they should start actively monitoring logs being generated at the VPN concentrator. (As illustrated within our DBIR statistics, continual and pro-active log review happens basically never – only about 8% of breaches in 2011 were discovered by internal log review)." Effectively, the DBIR acted as a control framework: it illustrated the importance of best practices to those who read it. And this is ultimately the role of IT control frameworks. COBIT, Trust Services and ISO 27001/2 all identify the need to log access and review such access. COBIT 4.1, published by the Information Systems Audit and Control Association (ISACA), identifies the following control in its framework:


DS5.5 Security Testing, Surveillance and Monitoring
"Test and monitor the IT security implementation in a proactive way. IT security should be reaccredited in a timely manner to ensure that the approved enterprise’s information security baseline is maintained. A logging and monitoring function will enable the early prevention and/or detection and subsequent timely reporting of unusual and/or abnormal activities that may need to be addressed."

Trust Services, jointly published by AICPA and the CICA, requires the following (See the Security Principle, 3.2(g) on page 10):
 "The information security team, under the direction of the CIO, maintains access to firewall and other logs, as well as access to any storage media. Any access is logged and reviewed in accordance with the company’s IT policies."

ISO 27001/2 requires "Audit logging" under 10.10.1 (see page 5 of this sales document from Splunk, a big data company that analyzes logs). ISO does not make the standard freely available, so no direct link to the control could be provided.
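The kind of proactive VPN log review that caught this incident can be sketched in a few lines. This is a hypothetical illustration - the field names, users and sample records are all invented - but the core check is the same one that exposed the outsourcing developer: logins from a country where the employee isn't:

```python
# Hypothetical sketch of proactive VPN log review: flag logins where the
# source country does not match the employee's expected location.
# All field names, users and records below are invented for illustration.

from collections import defaultdict

expected_country = {"jdoe": "US", "asmith": "US"}

vpn_log = [
    {"user": "jdoe",   "src_country": "US", "timestamp": "2012-05-01T09:02:11"},
    {"user": "jdoe",   "src_country": "CN", "timestamp": "2012-05-01T21:45:03"},
    {"user": "asmith", "src_country": "US", "timestamp": "2012-05-01T08:57:40"},
]

def review_logs(log, expected):
    """Return log entries whose source country differs from the user's expected one."""
    anomalies = defaultdict(list)
    for entry in log:
        if entry["src_country"] != expected.get(entry["user"]):
            anomalies[entry["user"]].append(entry)
    return dict(anomalies)

suspicious = review_logs(vpn_log, expected_country)
for user, entries in suspicious.items():
    for e in entries:
        print(f"ALERT: {user} logged in from {e['src_country']} at {e['timestamp']}")
```

The point is not the code itself but that the control only works if someone actually runs the review - which, per the DBIR statistic above, happened in only about 8% of 2011 breaches.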

The other important aspect of this story is that the individuals who read Verizon's DBIR understood how the control related to a specific risk (if you read the report, the information security controls identified are linked to the risks they manage). Consequently, to get buy-in, IS assurance professionals need to link IT controls or frameworks to the risks they mitigate. Presenting controls in isolation fails to illustrate their importance. It would be interesting if ISACA could either team up with Verizon to publish the next report or actually map the report to its framework.

Finally, Verizon's work illustrates the importance of IT audit. Organizations that want to keep on top of security threats and risks need competent security and risk professionals who can investigate and analyze risks when they are identified.


Sunday, July 8, 2012

Electrical and cloud outages: Is it time to bring both on premise?

Amazon experienced an outage that affected a number of companies that rely on its cloud service. The company informed its users that the service went down due to a power outage, stating:


"On June 29, 2012 at about 8:33 PM PDT, one of the Availability Zones (AZ) in our US-EAST-1 Region experienced a power issue. While we were able to restore access to a vast majority of RDS DB Instances that were impacted by this event, some Single-AZ DB Instances in the affected AZ experienced storage inconsistency issues and access could not be restored despite our recovery efforts. These affected DB Instances have been moved into the “failed” state."


This notice was actually taken from CodeGuard (a start-up that takes snapshots of websites, enabling owners to undo unwanted changes), one of the companies affected by the outage. 


As can be expected, many will use this as an opportunity to illustrate the danger of moving from on-premise computing to the cloud. A parallel argument would be to highlight the dangers of drawing electricity from the central grid. Arguably, one is more reliant on power than on computing - so why not bring electricity "back" on premise? This is an absurd argument, but that is exactly the point. Companies, as pointed out by Nicholas Carr in The Big Switch, used to produce their own electricity, but eventually moved to rely on the grid for power. Today hardly anyone produces their own power; instead, companies keep backup generators in place to provide power should the grid go down. And that's the right question to ask: why was there inadequate backup power at Amazon? In other words, society has decided to live with the fact that electricity is delivered centrally - but has built in controls to manage the issues that may arise. 


Instead of viewing this as a black mark against cloud computing, it is important to view the discussion in the context of risk. Charles Babcock of InformationWeek published a good article on the reaction to the outage. He noted that some companies are leaving AWS in reaction to the incident. Specifically, Whatsyourprice.com (an online dating service) is moving to a hosted solution - away from the cloud. However, he also mentions Okta (an identity management service), which was unaffected by the outage because it designed its application to be fault tolerant.  
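The fault-tolerant design Okta is credited with boils down to not depending on a single Availability Zone. Here is a minimal, hypothetical sketch of that idea - the zone names are real AWS region labels, but the functions and the "health" model are invented stand-ins for a real replicated service:

```python
# Hypothetical sketch of multi-AZ fault tolerance: keep replicas in more
# than one Availability Zone and fail over when the primary is unreachable.
# The functions and health model are invented stand-ins for a real service.

class ZoneUnavailable(Exception):
    pass

def query_zone(zone: str, healthy_zones: set) -> str:
    """Stand-in for a real service call; raises if the zone is down."""
    if zone not in healthy_zones:
        raise ZoneUnavailable(zone)
    return f"response from {zone}"

def fault_tolerant_query(zones, healthy_zones):
    """Try each replica in turn instead of depending on a single AZ."""
    for zone in zones:
        try:
            return query_zone(zone, healthy_zones)
        except ZoneUnavailable:
            continue  # this zone is down: fall through to the next replica
    raise RuntimeError("all zones unavailable")

# us-east-1a suffers a power issue, but the request still succeeds:
print(fault_tolerant_query(["us-east-1a", "us-east-1b"],
                           healthy_zones={"us-east-1b"}))
```

This is the cloud equivalent of the backup generator: the outage still happens, but a control is in place so the business keeps running through it.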

In other words, companies need to focus on whether the benefits of cloud computing outweigh its risks. The cloud provides pay-as-you-go computing, giving companies with uneven workloads the ability to buy compute resources when they need them. It also gives start-ups, like CodeGuard, a chance to get their offerings into the market. Here, here and here are CodeGuard's follow-up posts on the outage - they were able to get back online and they are sticking with Amazon. And this should not be a surprise to anyone. Technology start-ups can leverage the pay-as-you-go model of cloud computing to conserve their capital and instead focus on getting their offering out. For example, the founder of Animoto points out that they went from 50-80 compute instances to 3,500 instances over three days (they were signing up 25,000 new users per hour at the peak) when their app went viral. So companies will hopefully use the cloud outage to highlight the need for good design and appropriate controls, instead of as an excuse to stick to the status quo of on-premise computing.