Govciooutlook

Resilience by Design... Technology Recovery When it Counts

By Ted Ross, General Manager, CIO, City of Los Angeles Information Technology Agency

At no time do your customers rely on you more than during a disaster. As technology leaders, we are responsible for introducing powerful, innovative tools that revolutionize the way our business or government is run. Our users and customers have grown to rely on these tools without realizing how heavily they do. Technology is at the center of every major business process (supply chains, product development, government services, payroll), and yet many organizations are unprepared to recover key technologies during a disaster. More pervasively still, technology is at the center of every interaction: communications, customer management, workflow, reporting, etc. Increasingly, every trigger, response, and decision made in your organization has technology threaded through it.

Could the Stakes be Any Higher?

Technology has made our organizations highly efficient and effective, transforming the customer/constituent experience. But of course, users won’t truly appreciate this dependency until the worst possible moment: during a disaster. That is when business continuity counts the most and when mission-critical operations must continue; otherwise, the disaster will multiply its effects and leave more casualties in its wake. In the private sector, a lack of business continuity means a tremendous loss of revenue, reputation, and market share. It can often result in a fundamental shift in the marketplace, where market leaders fall to the back of the pack. In the public sector, it becomes even more critical. It means the disappearance of core citizen services necessary for sanitation, utilities, transportation, medical services, public safety, etc. And yet, IT departments in various industries commonly show either an underappreciation of the need to prepare for the eventual disruption or a “check the box” approach to disaster recovery (DR) that offers little business continuity during a disaster.

Everyone Will be Affected, Right?

Sometimes we blame our lack of preparation on a lack of time (Who has time to prepare for disruption while growing a business?), lack of resources (Who wants to allocate precious resources when risk is uncertain?), lack of funding (Who wants to invest in emergency solutions that need to be maintained or replaced every 3 to 5 years?), or lack of clarity (“If the users would simply tell me what they need in case of an emergency, I would have it ready”). However, our organizations are relying on us. Where do you think they will look when the foundation of their processes is unavailable?

Reliance on Technology + Disaster = All Eyes on You

We Need Technology Recovery When it Counts

In the traditional world of disaster recovery, IT organizations focus on the typical measurements of success: Recovery Time Objectives (RTOs), Recovery Point Objectives (RPOs), and Service Level Agreements (SLAs). Unfortunately, this can lead to technical solutions that still fail to meet the true needs of the organization. As large-scale natural disasters such as Hurricane Sandy and localized disruptions such as cyberattacks have shown, traditional disaster recovery often focuses on restoring infrastructure rather than restoring business operations. In other words, backup servers are whirring away, but the organization still doesn’t have the tools it has come to rely on for key processes. In addition, as more IT investment is made within the business operations themselves, the need for coordinated disaster recovery grows, to prevent a disjointed response. I do not suggest that this is your opportunity to regain control over technology assets in the organization, but rather an opportunity to apply influence across the organization to ensure continuity when it really counts. If successful, you are a cross-department leader who will be the hero after a disruption. Since disasters are unpredictable and the vectors of approach are varied, you need to approach disaster recovery with a focus on critical operations and adaptive planning.

Set Reasonable Expectations Now– Let’s be clear, every user wants you to provide the exact physical and virtual environment that they are accustomed to. In a significant disaster, your users will not be sitting at their usual desks, using their usual computers, with access to all of their usual tools (e.g. shared drives, email, intranet, applications). Set reasonable expectations now and get them to focus on mission-critical operations. That will be your target.

“Mind Your Own Business”– If the role of technology is to deliver valuable outcomes to the organization, then no outcome is more important than the restoration of critical operations during a disaster. To do this, your team needs to clearly understand what the critical operations are, regardless of who manages the technology they run on. Engage the entire business, understand its strategic priorities in a disaster, and establish disaster recovery for the key technology tools it needs to perform those operations. Keep in mind, business users have a tendency to overstate the tools they require (remember, they want to re-create their office). So, be sure to understand the process and stress function, not a long list of tools.

Be Prepared For Various Degrees of Disaster– Disasters bring varying degrees of outages. Focus less on anticipating specific scenarios and more on preparing for varying levels of impact to your infrastructure. For example, always include a worst-case scenario with no-tech/low-tech solutions for 3 days. Next, consider moderate scenarios with limited functionality for 5 days. What is critical? How will you deliver it? Then, consider geographic scenarios, such as how to conduct operations from another geographic area.
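As a rough illustration of planning by impact level rather than by specific scenario, the sketch below records a fallback for each critical operation at each of the three levels described above and reports where a plan is missing. The operation names, fallback descriptions, and the `CriticalOperation` and `find_gaps` names are hypothetical; they are assumptions made for the example, not part of any particular city's plan.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Impact tiers and durations taken from the examples in the text above.
TIERS = {
    "worst_case": "no tech / low tech for 3 days",
    "moderate": "limited functionality for 5 days",
    "geographic": "operate from another geographic area",
}


@dataclass
class CriticalOperation:
    """A business operation and its fallback procedure per impact tier."""
    name: str
    # Maps a tier key from TIERS to a short description of the fallback.
    fallbacks: Dict[str, str] = field(default_factory=dict)


def find_gaps(operations: List[CriticalOperation]) -> Dict[str, List[str]]:
    """Return {operation name: [tiers with no fallback]} for incomplete plans."""
    gaps = {}
    for op in operations:
        missing = [tier for tier in TIERS if not op.fallbacks.get(tier)]
        if missing:
            gaps[op.name] = missing
    return gaps


# Hypothetical inventory, for illustration only.
operations = [
    CriticalOperation("payroll", {
        "worst_case": "paper time sheets, manual checks",
        "moderate": "hosted payroll with reduced reporting",
        "geographic": "staff work from alternate site",
    }),
    CriticalOperation("emergency dispatch", {
        "worst_case": "radio and paper logs",
    }),
]

for name, missing in find_gaps(operations).items():
    print(f"{name}: no plan yet for {', '.join(missing)}")
```

The point of the structure is that each row is a business operation rather than a server or application, so a gap shows up as an operation the organization cannot perform, not as a machine that is down.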

Test, Test, Test– There is no better gauge of preparedness than hands-on testing with exercises, at least annually. Be sure you rotate staff in the testing (you can’t expect your best staff to be accessible during a disaster). Test varying degrees of outages from year to year. Ideally, your disaster recovery platforms are already integrated with your production platforms (e.g. cloud infrastructure), so you are basically testing throughout the year and keeping your environments current.
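One simple way to make the rotation concrete is a round-robin schedule that changes both the team and the outage level for each exercise. The sketch below is a minimal example under assumed inputs; the staff names, the level labels, and the `build_exercise_schedule` function are hypothetical.

```python
from itertools import cycle
from typing import Dict, List


def build_exercise_schedule(staff: List[str], outage_levels: List[str],
                            exercises: int, team_size: int) -> List[Dict]:
    """Assign a rotating team and a rotating outage level to each exercise."""
    if team_size > len(staff):
        raise ValueError("team_size cannot exceed the number of staff")
    staff_cycle = cycle(staff)          # walk through staff in order, wrapping around
    level_cycle = cycle(outage_levels)  # vary the outage level from one exercise to the next
    schedule = []
    for number in range(1, exercises + 1):
        team = [next(staff_cycle) for _ in range(team_size)]
        schedule.append({
            "exercise": number,
            "outage_level": next(level_cycle),
            "team": team,
        })
    return schedule


# Hypothetical roster and levels, for illustration only.
schedule = build_exercise_schedule(
    staff=["Ana", "Ben", "Chi", "Dev", "Eli"],
    outage_levels=["worst_case", "moderate", "geographic"],
    exercises=3,
    team_size=2,
)
for entry in schedule:
    print(entry["exercise"], entry["outage_level"], entry["team"])
```

Because the roster is walked in order, no one is asked to run a second exercise until everyone has run a first, which avoids leaning on the same few people every time.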

Simplicity is Your Friend– Don’t over-engineer your disaster recovery model with complex processes built around anticipated disaster scenarios. Focus on simplified planning that is easy to follow, stresses continuity of critical operations, and uses flexible tools regardless of the scenario. In the midst of the “fog of war”, simplicity is your friend.

One could write extensively about disaster recovery and business continuity. However, I believe the core principles are simple. Get to know all of the business, understand their core operations and the technologies that drive them, establish disaster recovery for these key tools, and test them periodically to ensure they will be available when needed. When the eventual disruption strikes and all eyes are on you, you will be the hero and deliver technology recovery when it counts.
