Disaster recovery (DR) is all too often an afterthought in business continuity strategies. Even enterprises with complex systems and terabyte upon terabyte of sensitive data can be guilty of having outdated and untested DR plans, or no DR plan at all. An effective DR plan focuses on the technology systems supporting critical business functions; it involves a set of policies and procedures for recovering and/or continuing vital technology infrastructure and systems following any kind of disaster. Essentially, in an effective DR plan, technology systems will transition from the primary site to the DR site.

One of the biggest challenges companies face when creating DR plans is deciding between self-managed, on-premises hardware and cloud solutions. For enterprises and organizations with complex monolithic applications, the relative ease of expanding their existing on-premises solutions for disaster recovery is tempting; after all, using a cloud DR solution would require refactoring and modernization. But on-premises hardware carries some hefty risks: labor-intensive maintenance, infrastructural rigidity, potential outages, networking limitations, high latencies, and data storage and retrieval issues. SAP customers teetering between the two strategies should consider a number of important factors.

What matters to your disaster recovery strategy

There is no one-size-fits-all approach to disaster recovery. Strategies differ from application to application according to structure, function, and objective. The most successful DR plans consider the entire technology network and the company’s end-goals.

Identifying the best strategy, architecture, and toolset for your business begins with defining two targets. Your Recovery Time Objective (RTO) is how long you can afford to have your business offline. Your Recovery Point Objective (RPO) is how much data loss you can sustain before you run into compliance issues or financial losses. The smaller your RTO and RPO goals are, the more it will cost to protect the application.

Every organization, regardless of its situation and goals, also needs to determine and factor in the costs to the business while the system is offline, and the costs for data loss and re-creation.
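As a rough illustration of how those two costs combine, the sketch below multiplies an assumed outage length and an assumed data-loss window by hourly rates. The class name, field names, and all figures are hypothetical placeholders, not numbers from this article; a real estimate would come from your own finance and operations data.

```python
from dataclasses import dataclass


@dataclass
class OutageEstimate:
    downtime_hours: float            # time the system is offline (bounded by your RTO)
    data_loss_hours: float           # span of recent data that is lost (bounded by your RPO)
    downtime_cost_per_hour: float    # hypothetical: lost revenue and idle staff per hour
    recreation_cost_per_hour: float  # hypothetical: cost to re-create one hour of lost data


def outage_cost(estimate: OutageEstimate) -> float:
    """Return the combined cost of being offline and of re-creating lost data."""
    downtime = estimate.downtime_hours * estimate.downtime_cost_per_hour
    data_loss = estimate.data_loss_hours * estimate.recreation_cost_per_hour
    return downtime + data_loss


# Example with made-up figures: 8 hours offline, 4 hours of data to re-create.
example = OutageEstimate(
    downtime_hours=8,
    data_loss_hours=4,
    downtime_cost_per_hour=10_000,
    recreation_cost_per_hour=2_500,
)
print(outage_cost(example))
```

Running the same calculation for each candidate RTO/RPO pair gives a figure to weigh against the running cost of the DR setup that would achieve it.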

3 types of applications and 3 paths to DR

Depending on the application and databases involved, there are several ways of replicating data and the corresponding application configuration from the primary site to the DR site.

Path 1: RTO within days/RPO depends on function

This scenario is meant for non-critical business applications and non-production environments; it has a recovery time objective in the range of a few hours to a few days, with a recovery point objective of less than a day. In the event of a disaster, SAP systems running in Google Cloud are recovered from persistent disk snapshots, backups stored in Cloud Storage buckets, or both. New VMs for database and application servers can also be created from Compute Engine machine images (beta). In addition, SAP HANA databases can be recovered directly from Cloud Storage buckets when the SAP HANA Backint agent for Google Cloud (beta) is used for database backup. The frequency of backups for the SAP system’s database and application servers determines the RPO. A key advantage of this path is that no costs are incurred for keeping standby systems (hot or cold) running during normal operations, because new VMs are created only after a disaster occurs. Managed backup solutions from third parties such as Actifio, Commvault, and Dell EMC can also be used.
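To make the link between backup frequency and RPO concrete, here is a minimal sketch that computes how much data would be lost if a disaster struck at a given moment. The function name and the six-hour schedule are assumptions chosen for illustration; they are not part of any Google Cloud or SAP tooling.

```python
from datetime import datetime, timedelta


def data_loss_window(backup_times, disaster_time):
    """Return the time between the last backup taken before the disaster and the disaster."""
    usable = [t for t in backup_times if t <= disaster_time]
    if not usable:
        raise ValueError("no backup exists before the disaster")
    return disaster_time - max(usable)


# Hypothetical schedule: a backup every 6 hours on one day.
backups = [datetime(2024, 1, 1, hour) for hour in (0, 6, 12, 18)]
disaster = datetime(2024, 1, 1, 17, 30)

# The 18:00 backup has not happened yet, so recovery starts from 12:00.
print(data_loss_window(backups, disaster))
```

With evenly spaced backups, the worst case approaches one full backup interval, so a schedule like this one cannot promise an RPO tighter than about six hours.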

Path 2: RTO in less than one day/RPO within minutes

This path is meant for applications that a business can function without temporarily, provided there’s a reasonable recovery plan. In the event of a disaster, SAP application servers are recovered from persistent disk snapshots or Compute Engine machine images, the same approach as in the previous path. For database server recovery, the approach differs based on the type of database underlying the SAP system (SAP HANA or other databases). The SAP HANA database has an asynchronous replication feature that provides near real-time replication. For other databases, recovery relies either on the database’s own replication features or on restoring from a backup and replaying the most recently replicated logs. Because you can recover the database to any point in time up to the last replicated log, you also help protect the system from potential user error.
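The point-in-time idea can be sketched as follows: restore a backup, then replay replicated logs up to the latest available point, or stop just short of a known user error. This is a simplified model that assumes the replicated logs form an unbroken chain from the backup onward; the function names are hypothetical and do not correspond to any database's actual recovery API.

```python
from datetime import datetime


def latest_recoverable_point(backup_time, replicated_log_ends):
    """Latest point reachable by restoring the backup and replaying replicated logs."""
    later = [t for t in replicated_log_ends if t > backup_time]
    return max(later, default=backup_time)


def choose_recovery_target(backup_time, replicated_log_ends, error_time=None):
    """Pick a recovery target, stopping at a user error if one is known."""
    latest = latest_recoverable_point(backup_time, replicated_log_ends)
    if error_time is None:
        return latest
    if error_time <= backup_time:
        raise ValueError("error predates this backup; an older backup is needed")
    # Recover up to the error, but never past what the replicated logs cover.
    return min(latest, error_time)


backup = datetime(2024, 1, 1, 2, 0)
log_ends = [datetime(2024, 1, 1, 2, m) for m in (15, 30, 45)]

print(choose_recovery_target(backup, log_ends))
print(choose_recovery_target(backup, log_ends, error_time=datetime(2024, 1, 1, 2, 40)))
```

In this model the RPO is the gap between the disaster and the end of the last replicated log, which is why frequent log shipping brings it down to minutes.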

In Google Cloud, persistent disk snapshots and Compute Engine machine images can have multiregional storage locations for geo-redundancy of data. Cloud Storage buckets also offer the additional option of dual-region storage locations that combine the performance of a single region with geo-redundancy. The key trade-off in this approach is that the shorter RTO and lower RPO come with the cost of running a database server in the DR site (for data or log replication). An additional risk is a potential capacity crunch in the DR region, which could prevent you from standing up application servers within the targeted RTO. This can be mitigated either by reserving capacity (at an additional cost) or by running a non-production system, such as a quality assurance or test system, in the DR region and repurposing its capacity to recover the production system in the event of a disaster.

Path 3: RTO in minutes/RPO as close to zero as possible

This final strategy is best suited for business-critical applications. With this path, the full reservation of resources is guaranteed at the disaster recovery site. The SAP systems in the DR region are always on and configured to the same size as the source systems, which ensures that your applications will recover quickly. The lowest RTO and RPO numbers come at the cost of constantly running servers in the DR region, but Google Cloud’s pricing options, such as sustained use discounts, can help you architect a cost-effective DR strategy.
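The three paths can be summarized as a simple decision rule on the RTO and RPO targets. The sketch below is only an illustration: the function name is hypothetical, and the one-hour cutoffs are assumptions standing in for "minutes", since the article gives ranges rather than exact thresholds.

```python
from datetime import timedelta


def suggest_dr_path(rto: timedelta, rpo: timedelta) -> int:
    """Map RTO/RPO targets to one of the three DR paths described above."""
    # Path 3: RTO in minutes, always-on systems in the DR region.
    if rto < timedelta(hours=1):
        return 3
    # Path 2: RTO under a day, or an RPO in minutes that requires
    # database or log replication to the DR site.
    if rto < timedelta(days=1) or rpo < timedelta(hours=1):
        return 2
    # Path 1: recover from snapshots and backups; RPO follows backup frequency.
    return 1


print(suggest_dr_path(timedelta(minutes=15), timedelta(minutes=1)))
print(suggest_dr_path(timedelta(hours=8), timedelta(minutes=15)))
print(suggest_dr_path(timedelta(days=2), timedelta(hours=12)))
```

A rule like this is a starting point for classifying applications; the final choice still depends on the cost of downtime and data loss for each one.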

In any of the paths that you choose for DR, Google Cloud’s premium networking brings industry-leading network performance, software-defined networking, global virtual private networks, and best-in-class security, all of which enable a simplified, yet robust and reliable DR architecture.


3 paths for disaster recovery for SAP systems on Google Cloud