Written by Chuck Armstrong, Dell Storage Engineering
With the release of Storage Center OS (SCOS) 7, Dell Storage Manager (DSM) 2016 R1, and PS Series Firmware v9.0, the number of options for replication and migration has grown substantially. With so many available options, you might ask: Which one is right for my environment?
I’ve got good news! This blog post will help identify the ideal solutions for several scenarios. After reading through these, you should be able to determine which solution fits best for your environment.
Let’s start with replication.
If your environment has multiple locations, chances are pretty good that you’re using replication, or at least have plans to do so. How to best utilize replication depends on what you need to replicate, the distance over which the replication will occur, and the Recovery Point Objective (RPO) and Recovery Time Objective (RTO).
To replicate between SC Series arrays, the options from which to choose are synchronous, asynchronous, and semi-sync replication.
With synchronous replication, either high availability or high consistency can be selected as the mode of operation. Beyond that, replication can be configured with multiple sites using mixed (parallel), cascade, or hybrid as the replication topology.
When to use which mode and topology is explained below.
Synchronous replication is the only option to achieve an RPO of zero. However, distance (more precisely, the latency incurred through distance) is a limiting factor in the ability to implement synchronous replication. Synchronous replication increases application latency as a result of how replication takes place: For every write an application makes to the primary volume, that write must be replicated to the remote volume, the remote volume must send an acknowledgement back to the primary volume, and only then is the write acknowledged to the application. In other words, the combination of the latency your application can tolerate and the replication latency in your specific environment determines whether synchronous replication is a viable option.
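To make that trade-off concrete, here is a minimal sketch of the latency check in Python. It assumes a simplified serial model (local write, then link round trip, then remote write); the function names and the millisecond figures are hypothetical and not taken from any Dell tool, so measure your own environment before relying on numbers like these.

```python
def sync_write_latency_ms(local_write_ms, link_rtt_ms, remote_write_ms):
    """Estimate the latency an application sees for one synchronously replicated write.

    Simplifying assumption: the steps happen one after another, so the
    application waits for the local write, the round trip to the remote
    array, and the remote write before it receives its acknowledgement.
    """
    return local_write_ms + link_rtt_ms + remote_write_ms


def sync_is_viable(local_write_ms, link_rtt_ms, remote_write_ms, app_budget_ms):
    """Return True if the estimated write latency fits the application's budget."""
    estimate = sync_write_latency_ms(local_write_ms, link_rtt_ms, remote_write_ms)
    return estimate <= app_budget_ms


# Hypothetical figures: a nearby site (4 ms round trip) and a distant one (40 ms).
nearby_ok = sync_is_viable(1.0, 4.0, 1.0, app_budget_ms=10.0)
distant_ok = sync_is_viable(1.0, 40.0, 1.0, app_budget_ms=10.0)
```

With these made-up values, the nearby site fits a 10 ms budget and the distant site does not, which is the situation where asynchronous or semi-sync replication becomes the more realistic choice.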
If you’ve identified synchronous replication as your method, it’s time to select which mode: high availability or high consistency.
High availability mode: If the replication communication from the primary (source) volume to the remote (destination) volume is interrupted, the primary volume, and the applications using it, remain active, resulting in no interruption in user productivity. However, the data on the remote volume would become stale since it cannot receive updates from the primary volume. Following a replication communication interruption, a site failure could potentially result in data loss due to the stale data on the remote volume.
High consistency mode: This addresses communication interruptions between primary and destination volumes differently and prioritizes data consistency. If there is a communication interruption preventing replication, the primary volume stops performing writes because they cannot be performed on the remote volume. This ensures data will always be consistent between primary and remote volumes, eliminating the data-loss vulnerability. However, if volume replication cannot occur, applications using that volume will stop functioning because they cannot execute writes, resulting in an interruption in user productivity.
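The difference between the two modes can be illustrated with a small toy model. This is only a sketch of the behavior described above, not how SC Series arrays are implemented; the class, attribute, and method names are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class SyncMode(Enum):
    HIGH_AVAILABILITY = "high availability"
    HIGH_CONSISTENCY = "high consistency"


@dataclass
class SyncReplicatedVolume:
    """Toy model of a synchronously replicated volume pair (hypothetical)."""
    mode: SyncMode
    link_up: bool = True
    primary: list = field(default_factory=list)
    remote: list = field(default_factory=list)

    def write(self, block):
        """Return True if the write is accepted and acknowledged to the application."""
        if self.link_up:
            # Normal operation: the write lands on both volumes.
            self.primary.append(block)
            self.remote.append(block)
            return True
        if self.mode is SyncMode.HIGH_AVAILABILITY:
            # Link is down: keep serving the application; the remote copy goes stale.
            self.primary.append(block)
            return True
        # High consistency: refuse the write so the two copies never diverge.
        return False

    @property
    def remote_is_stale(self):
        return self.primary != self.remote
```

In this model, a high availability volume keeps accepting writes after `link_up` is set to `False` and reports `remote_is_stale` as true, while a high consistency volume rejects the same writes and its two copies stay identical.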
Asynchronous replication differs from synchronous replication in the way that writes occur on primary and remote volumes. A write from an application is committed to the primary volume and acknowledgement is immediately sent back to the application. Replication occurs at a later time (when a snapshot is taken), after which acknowledgement of the write is sent from the remote volume to the primary. This method does not increase latency in the application, but reduces consistency between the primary and remote volumes.
When the RPO allows for more flexibility, or when the distance over which replication needs to occur is too great to support synchronous replication, asynchronous replication is an option. In fact, asynchronous replication is used more often than synchronous replication, primarily due to a lower cost of infrastructure required to support asynchronous compared to synchronous replication. The RPO can be reduced by improving the connection and changing the snapshot schedules to replicate more often. The better the connection, the more data can be sent, and the more frequently that data can be sent.
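As a rough illustration of how the snapshot schedule and the connection affect RPO, the sketch below estimates a worst-case RPO. It assumes a deliberately simple model: data written just after one snapshot is protected only once the next snapshot has been taken and fully transferred. The function names and figures are hypothetical, and real links rarely deliver their full rated bandwidth.

```python
def transfer_seconds(changed_mb, link_mbps):
    """Time to send the changed data: megabytes * 8 bits, divided by megabits per second."""
    return changed_mb * 8 / link_mbps


def worst_case_rpo_seconds(snapshot_interval_s, changed_mb, link_mbps):
    """Estimate worst-case RPO as one snapshot interval plus the transfer time.

    Assumption: a write made right after a snapshot waits a full interval
    for the next snapshot, then waits for that snapshot to reach the remote site.
    """
    return snapshot_interval_s + transfer_seconds(changed_mb, link_mbps)


# Hypothetical comparison: hourly snapshots over 100 Mbps versus
# 15-minute snapshots over 1000 Mbps (less data changes per shorter interval).
hourly_rpo = worst_case_rpo_seconds(3600, changed_mb=4500, link_mbps=100)
frequent_rpo = worst_case_rpo_seconds(900, changed_mb=1125, link_mbps=1000)
```

Under these assumptions the hourly schedule works out to about 3,960 seconds and the more frequent schedule on the faster link to about 909 seconds, which matches the point above: a better connection and a tighter snapshot schedule both pull the RPO down.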
Semi-sync replication is the best of both worlds regarding synchronous and asynchronous replication. From the synchronous side, every write an application sends to the primary volume is automatically sent to the remote volume for replication, as opposed to holding it for a snapshot to trigger the replication. From the asynchronous side, the acknowledgement of the write back to the application is sent immediately after the write occurs on the primary volume, instead of waiting for acknowledgement from the remote volume before acknowledging the write to the application.
Semi-sync replication reduces the RPO to near zero without introducing the application latency incurred with synchronous replication. Semi-sync replication is as close as asynchronous replication can get to synchronous replication without actually being synchronous replication.
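The three methods differ on just two points: what triggers replication, and whether the application's acknowledgement waits for the remote volume. The sketch below captures that as a small lookup table. The names are hypothetical and the latency function is a simplified model, not a measurement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationMethod:
    name: str
    replicates_on: str          # "every write" or "snapshot"
    ack_waits_for_remote: bool  # does the application wait for the remote volume?


# Summary of the behavior described in this post.
METHODS = {
    "synchronous": ReplicationMethod("synchronous", "every write", True),
    "asynchronous": ReplicationMethod("asynchronous", "snapshot", False),
    "semi-sync": ReplicationMethod("semi-sync", "every write", False),
}


def app_write_latency_ms(method, local_write_ms, remote_round_trip_ms):
    """Latency the application sees for one write, under a simple serial model."""
    if method.ack_waits_for_remote:
        return local_write_ms + remote_round_trip_ms
    return local_write_ms
```

Read this way, semi-sync shares the "every write" trigger with synchronous replication and the immediate acknowledgement with asynchronous replication, so in this model its application latency equals the local write time alone.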
So your environment has more than two locations and you want another level of data protection? We’ve got you covered.
Mixed topology: In this topology, a volume can be replicated to two separate volumes at two separate locations. With mixed topology, one replication method can be synchronous and the other asynchronous, or both methods can be asynchronous. The mixed replication topology is useful if your environment has two failover sites. When using synchronous replication to one of the two sites, the synchronous replication rules and requirements still apply.
Cascade topology: Rather than replicating from a single source to multiple targets (mixed), the cascade replication topology replicates the source volume to a single destination volume which is then replicated to a secondary destination volume at a third location. The replication method from the source to the first destination can be either synchronous or asynchronous, but the replication from the first to the second destination can only be asynchronous. This is useful if your environment has a hot site as the first destination site, and as an added measure of protection, a cold or more distant site as a third location.
Hybrid topology: Because replication is configured on a per-volume basis, a hybrid replication topology can also be configured. Hybrid topology includes one or more volumes being replicated using the mixed (parallel) topology and one or more volumes being replicated using the cascade topology. This might be used when multiple locations are involved and different applications require different levels of protection. For example, some applications might need two hot sites, while other applications might only need one hot site with a secondary cold site configuration. In this case, the hot and cold designations are related to portions of the protected environment rather than the site location itself.
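The topology rules above can be expressed as a short validation sketch, which may be handy when planning a multi-site layout on paper. Everything here is hypothetical (names, plan format, volume labels), it models only the synchronous and asynchronous combinations this post mentions, and it is not a substitute for what DSM itself allows.

```python
SYNC, ASYNC = "sync", "async"


def valid_mixed(first, second):
    """Mixed (parallel): one source, two destinations, at most one synchronous."""
    methods = [first, second]
    return all(m in (SYNC, ASYNC) for m in methods) and methods.count(SYNC) <= 1


def valid_cascade(first_hop, second_hop):
    """Cascade: the first hop may be sync or async; the second hop must be async."""
    return first_hop in (SYNC, ASYNC) and second_hop == ASYNC


def validate_plan(plan):
    """Check a per-volume plan; a hybrid plan simply mixes both topologies.

    `plan` maps a volume name to a (topology, method_a, method_b) tuple.
    Returns the names of the volumes whose configuration breaks the rules.
    """
    checks = {"mixed": valid_mixed, "cascade": valid_cascade}
    invalid = []
    for volume, (topology, a, b) in plan.items():
        if topology not in checks or not checks[topology](a, b):
            invalid.append(volume)
    return invalid


# Hypothetical hybrid plan: one volume uses mixed, the others use cascade.
plan = {
    "erp-data": ("mixed", SYNC, ASYNC),
    "file-share": ("cascade", ASYNC, ASYNC),
    "bad-example": ("cascade", ASYNC, SYNC),
}
problems = validate_plan(plan)
```

In this example only `bad-example` is flagged, because a cascade's second hop cannot be synchronous.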
Live Volume is similar to replication in many ways, but differs in some. One similarity is that Live Volume can be configured to be synchronous or asynchronous. The major difference is that Live Volume enables mobility of your environment between SC Series storage, rather than protection from failure. The SC Series storage taking part in Live Volume can be located in the same data center or a different data center. Live Volume enables movement of the workload to a different set of servers and storage to enable hardware maintenance, which can be especially challenging for non-virtualized environments.
Although Live Volume is different from replication, a Live Volume can be used as the source volume in a replication configuration. That means the Live Volume environment can also be protected by a remote site. One replication limitation is that an asynchronous Live Volume can be the source for either a synchronous or asynchronous destination volume, whereas a synchronous Live Volume can be the source for only an asynchronous destination volume. For a deeper look into replication and Live Volume options, see the Storage Center Synchronous Replication and Live Volume Solutions Guide.
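That limitation is easy to get backwards, so here it is as a tiny lookup sketch. The function name and the "sync"/"async" labels are hypothetical; the rule itself is the one stated above.

```python
def allowed_destination_methods(live_volume_method):
    """Return the replication methods allowed when a Live Volume is the source.

    An asynchronous Live Volume can feed a synchronous or asynchronous
    destination; a synchronous Live Volume can feed only an asynchronous one.
    """
    rules = {
        "async": {"sync", "async"},
        "sync": {"async"},
    }
    if live_volume_method not in rules:
        raise ValueError(f"unknown Live Volume method: {live_volume_method!r}")
    return rules[live_volume_method]
```

For example, `"sync" in allowed_destination_methods("sync")` evaluates to `False`, reflecting that a synchronous Live Volume cannot also replicate synchronously to a further destination.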
Bi-directional, asynchronous replication between SC arrays and PS groups is a new option available with our latest software and firmware releases. This new capability enables an environment with both SC and PS storage to better utilize existing storage assets. For example, one or more locations with PS groups could replicate to an SC array at a remote site. Or, existing PS groups can be moved to a remote site as the replication target, protecting an SC array at the primary site. For additional details, see the Cross-Platform Replication solutions guide and the Cross-Platform Replication video series.
Now that both PS Series and SC Series storage platforms can be integrated into a single management environment, the ability to migrate from PS Series to SC Series storage has become more relevant. Migrating from PS Series to SC Series storage is accomplished using Thin Import. This process supports either online or offline migration of volumes to SC Series storage, and allows you to repurpose the PS array to a remote site for replication, if desired. There is much more information on this feature in the Thin Import solution guide and Thin Import video.
All this information is great, but what about your VMware environment? Should you use Live Volume or VMware Site Recovery Manager (SRM), which uses the Dell Storage Replication Adapter (SRA)?
Live Volume provides a different type of protection than SRM. Live Volume enables mobility of the virtual environment between two SC Series arrays, which can be in the same or different data centers. This would be ideal if each location supports shifts of end users. For example, the day shift workers all report to site one and the night shift workers all report to site two. In this case, using Live Volume to move the running environment from site one to site two and back again is a great solution.
Alternatively, VMware SRM enables recovery from a site failure to a secondary location. This is used to protect a primary site using a hot site. Additionally, when the updated SRA becomes available, a Live Volume environment will be able to be protected with an SRM solution.
This post only summarizes each replication method. For more detail, be sure to check out the solution guides and videos mentioned above.
Find even more information on storage topics at Dell.com/StorageResources.