High-Availability Storage Cluster

Synology HA Storage Cluster

We are building a High-Availability (HA) Storage Cluster to complement our Proxmox HA Server Cluster. Synology offers a solid HA solution that we can use for this. To use Synology's HA solution, one must have the following:

  • Two identical Synology NAS devices (we are using a pair of RS1221+ rack-mounted Synology NAS units)
  • Both NAS devices must have identical memory and disk configurations
  • Both NAS devices must have at least two network interfaces available (we are using dual 10 GbE network cards in both of our NAS devices)

The two NAS devices work in an active/standby configuration and present a single IP interface for access to storage and administration.

Synology HA Documentation

Synology provides good documentation for their HA system. Here are some useful links:

The video above provides a good overview of Synology HA and how to configure it.

Storage Cluster Hardware

Synology RS1221+ NAS

We are using a pair of Synology RS1221+ rack-mounted NAS servers. Each one is configured with the following hardware options:

Networking

Our Proxmox Cluster will connect to our HA Storage Cluster via ethernet connections. We will be storing the virtual disk drives for the VMs and LXC containers in our Proxmox Cluster on our HA Storage Cluster. Maximizing the speed and minimizing the latency of these connections is important to the overall performance of our workloads.

Each node in our Proxmox Cluster has dedicated high-speed connections (25 GbE for pve1, 10 GbE for pve2 and pve3) to a dedicated Storage VLAN. These connections are made through a UniFi switch – an Enterprise XG 24. This switch is supported by a large UPS that provides battery backup power for our Networking Rack.

Ubiquiti Enterprise XG 24 Switch

This approach minimizes latency, as the cluster's storage traffic is handled entirely by a single switch.

Ideally, we would have a pair of these switches and redundant connections to our Proxmox and HA Storage clusters to maximize reliability. While this would be a nice enhancement, we have chosen to use a single switch for cost reasons.

The NAS devices in our HA Storage Cluster are both configured with interfaces directly on our Storage VLAN. This approach ensures that the nodes in our Proxmox cluster can access the HA Storage Cluster directly without a routing hop through our firewall. We also set the MTU for this network to 9000 (Jumbo Frames) to minimize packet overhead.
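A quick back-of-the-envelope calculation shows why jumbo frames reduce overhead. The header sizes below are the typical Ethernet/IP/TCP values (assuming standard 20-byte IP and TCP headers with no options):

```python
# Per-packet header overhead for a bulk TCP transfer at two MTU sizes.
def overhead_pct(mtu: int) -> float:
    """Percentage of on-wire bytes spent on protocol headers."""
    payload = mtu - 40   # TCP payload: MTU minus 20 B IP + 20 B TCP headers
    wire = mtu + 18      # on-wire frame: MTU plus 18 B Ethernet framing
    return 100 * (1 - payload / wire)

print(f"MTU 1500: {overhead_pct(1500):.1f}% header overhead")  # 3.8%
print(f"MTU 9000: {overhead_pct(9000):.1f}% header overhead")  # 0.6%
```

Going from MTU 1500 to 9000 cuts per-packet header overhead by roughly a factor of six, and also reduces the per-packet processing load on the NAS and hypervisor CPUs.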

Storage Design

Each Synology RS1221+ in our cluster has eight 960 GB Enterprise SSDs. The performance of the resulting storage system is important as we will be storing the disks for the VMs and LXCs in our Proxmox Cluster on our HA Storage System. The following are the criteria we used to select a storage pool configuration:

  • Performance – we want to be able to saturate the 10 GbE interfaces to our HA Storage Cluster
  • Reliability – we want to be protected against single-drive failures. We will keep spare drives and use backups to manage the chance of simultaneous multiple-drive failures.
  • Storage Capacity – we want to use the available SSD storage capacity efficiently.

We considered using either a RAID-10 or RAID-5 configuration.

Storage Devices – 960 GB Enterprise SSDs

Toshiba 960 GB SSD Performance

Our SSD drives are enterprise models with good throughput and IO/s (IOPs) performance.

960 GB SSD Reliability Features

They also offer desirable reliability features, including good write-endurance and MTBF numbers. Our drives include sudden power-off protection to maintain data integrity in the event of a power failure that our UPS system cannot cover.

Performance Comparison – RAID-10 vs. RAID-5

We used a RAID performance calculator to estimate the performance of our storage system. Based on actual runtime data from the VMs and LXCs running in our Proxmox Cluster, our IO workload is almost completely dominated by write operations. This is likely because read caching on our servers satisfies most read operations from memory.

The first option we considered was RAID-10. The estimated performance for this configuration is shown below.

RAID-10 Throughput Performance

As you can see, this configuration’s throughput will more than saturate our 10 GbE connections to our HA Storage Cluster.

The next option we considered was RAID-5. The estimated performance for this configuration is shown below.

RAID-5 Throughput Performance

As you can see, performance takes a substantial hit due to the need to generate and store parity data on every write. Even so, the RAID-5 configuration should still be able to saturate our 10 GbE connections to the Storage Cluster.

The result is that the RAID-10 and RAID-5 configurations will provide the same performance level given our 10 GbE connections to our Storage Cluster.
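A simple streaming-write model illustrates why both layouts clear the 10 GbE ceiling. The per-drive write speed below is an assumed figure for illustration, not a measured value for our SSDs:

```python
# Simplified sequential-write model for an 8-drive array behind a 10 GbE link.
N_DRIVES = 8
PER_DRIVE_MBPS = 450           # assumed per-SSD sequential write speed (MB/s)
LINK_LIMIT_MBPS = 10_000 / 8   # 10 GbE ~= 1250 MB/s, ignoring protocol overhead

# RAID-10: every write goes to a mirror pair, so only half the drives add bandwidth.
raid10_write = (N_DRIVES // 2) * PER_DRIVE_MBPS   # 1800 MB/s

# RAID-5 (full-stripe writes): one drive's worth of bandwidth goes to parity.
raid5_write = (N_DRIVES - 1) * PER_DRIVE_MBPS     # 3150 MB/s

for name, mbps in [("RAID-10", raid10_write), ("RAID-5", raid5_write)]:
    print(f"{name}: ~{mbps} MB/s array write, "
          f"delivered: {min(mbps, LINK_LIMIT_MBPS):.0f} MB/s (link-limited)")
```

Under these assumptions both arrays exceed the link's ~1250 MB/s, so the network, not the RAID layout, sets the delivered throughput.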

Capacity Comparison – RAID-10 vs. RAID-5

The next step in our design process was to compare the usable storage capacity between RAID-10 and RAID-5 using Synology’s RAID Calculator.

RAID-10 vs. RAID-5 Usable Storage Capacity

Not surprisingly, the RAID-5 configuration creates roughly twice as much usable storage when compared to the RAID-10 configuration.
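The "roughly twice" figure is easy to check from first principles (nominal gigabytes, before filesystem overhead):

```python
# Usable capacity for 8 x 960 GB drives under each RAID layout.
n, size_gb = 8, 960
raid10_usable = (n // 2) * size_gb   # half the drives hold mirror copies
raid5_usable = (n - 1) * size_gb     # one drive's worth of capacity holds parity

print(f"RAID-10: {raid10_usable} GB usable")               # 3840 GB
print(f"RAID-5:  {raid5_usable} GB usable")                # 6720 GB
print(f"ratio:   {raid5_usable / raid10_usable:.2f}x")     # 1.75x
```

The exact ratio for an 8-drive pool is 1.75x, which matches the "roughly twice" result from Synology's RAID Calculator.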

Chosen Configuration

We decided to format our SSDs as a Btrfs storage pool configured as RAID-5. We chose RAID-5 for the following reasons:

  • A good balance between write performance and reliability
  • Efficient use of available SSD storage space
  • Acceptable overall reliability (single disk failures) given the following:
    • Our storage pools are fully redundant between the primary and secondary NAS pools
    • We run regular automatic snapshots, replications, and backups via Synology’s Hyper Backup as well as server-side backups via Proxmox Backup Server.

The following shows the expected IO/s (IOPs) for our storage system.

RAID-5 IOPs Performance

This level of performance should be more than adequate for our three-node cluster’s workload.
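For small random writes, the classic RAID write-penalty model (2 back-end I/Os per write for RAID-10, 4 for RAID-5's read-modify-write) gives a feel for where these IOPs numbers come from. The per-drive IOPS figure below is an assumption for illustration, not a spec for our drives:

```python
# Effective random-write IOPS under the standard RAID write-penalty model.
N_DRIVES = 8
PER_DRIVE_WRITE_IOPS = 30_000   # assumed 4K random-write IOPS per SSD (hypothetical)

def array_write_iops(n: int, per_drive: int, penalty: int) -> int:
    """Raw pool IOPS divided by the back-end I/Os each host write generates."""
    return n * per_drive // penalty

print("RAID-10:", array_write_iops(N_DRIVES, PER_DRIVE_WRITE_IOPS, 2))  # 120000
print("RAID-5: ", array_write_iops(N_DRIVES, PER_DRIVE_WRITE_IOPS, 4))  # 60000
```

Even with RAID-5's 4x write penalty, the modeled IOPS comfortably exceeds what our three-node cluster's workload demands.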

Dataset / Share Configuration

The final dataset format that we will use for our vdisks is TBD at this point. We plan to test the performance of both iSCSI LUNs and NFS shares. If these perform roughly the same for our workloads, we will use NFS to gain better support for snapshots and replication features. At present, we are using an NFS dataset to store our vdisks.
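When we run the iSCSI-vs-NFS comparison, the pattern that matters most for vdisks is small synchronous writes. A minimal measurement sketch (the mount-point paths at the bottom are placeholders, not our actual paths) might look like:

```python
import os
import tempfile
import time

def sync_write_iops(directory: str, block: int = 4096, count: int = 200) -> float:
    """Measure small O_DSYNC writes per second in `directory` -- a crude
    stand-in for the sync-write pattern VM virtual disks generate."""
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    fd = os.open(path, os.O_WRONLY | os.O_DSYNC)  # each write waits for stable storage
    buf = b"\0" * block
    start = time.perf_counter()
    for _ in range(count):
        os.write(fd, buf)
    elapsed = time.perf_counter() - start
    os.close(fd)
    os.remove(path)
    return count / elapsed

# Hypothetical mount points for the two candidate stores:
# print(sync_write_iops("/mnt/nfs-vdisks"))
# print(sync_write_iops("/mnt/iscsi-vdisks"))
```

If the two backing stores come out roughly equal on this kind of test, the tiebreaker goes to NFS for its better snapshot and replication support.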

HA Configuration

Configuring the pair of RS1221+ NAS servers for HA was straightforward. Only minimal configuration is needed on the secondary NAS to get its storage and network configurations to match the primary NAS. The process that enables HA on the primary NAS will overwrite all of the settings on the secondary NAS.

Here are the steps that we used to do this.

  • Install all of the upgrades and SSDs in both units
  • Connect both units to our network and install an ethernet connection between the two units for heartbeats and synchronization
  • Install DSM on each unit and set a static IP address for the network-facing ethernet connections (we do not set IPs for the heartbeat connections – Synology HA takes care of this)
  • Configure the network interfaces on both units to provide direct interfaces to our Storage VLAN (see the previous section)
  • Make sure that the MTU settings are identical on each unit. This includes the MTU setting for unused ethernet interfaces. We had to edit the /etc/synoinfo.conf file on each unit to set the MTU values for the inactive interfaces.
  • Ensure both units are running up-to-date versions of the DSM software
  • Configure the pair for HA (see the documentation above)
  • Complete the configuration of the cluster pair, including –
    • Shares
    • Backups
    • Snapshots and Replication
    • App installation
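The /etc/synoinfo.conf MTU step above can be sketched as a small key=value editor. The `ethN_mtu` key names here are illustrative assumptions – check the actual key names in the file on your unit before editing, and keep a backup:

```python
import re

def set_conf_value(text: str, key: str, value: str) -> str:
    """Set key="value" in synoinfo.conf-style text, replacing the existing
    line if the key is present, or appending it if not."""
    pattern = re.compile(rf'^{re.escape(key)}=.*$', re.MULTILINE)
    line = f'{key}="{value}"'
    if pattern.search(text):
        return pattern.sub(line, text)
    sep = "" if text.endswith("\n") else "\n"
    return text + sep + line + "\n"

# Example: force MTU 9000 on an existing and a missing interface entry.
conf = 'eth0_mtu="9000"\neth1_mtu="1500"\n'
conf = set_conf_value(conf, "eth1_mtu", "9000")   # existing key updated
conf = set_conf_value(conf, "eth2_mtu", "9000")   # missing key appended
print(conf)
```

The point of the exercise is simply to ensure every interface entry, active or not, carries the same MTU value on both units, since a mismatch blocks HA cluster creation.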

The following shows the completed configuration of our HA Storage Cluster.

Completed HA Cluster Configuration

The cluster uses a single IP address to present a GUI that configures and manages the primary and secondary NAS units as if they were a single NAS. The same IP address always points to the active NAS for file sharing and iSCSI I/O operations.

Voting Server

A voting server avoids split-brain scenarios, where both units in the HA cluster try to act as the master. Any server that is always reachable via ping from both NAS units in the cluster can serve as the voting server. We used the gateway of the Storage VLAN where the cluster is connected for this purpose.

Performance Benchmarking

We used the ATTO Disk Benchmarking Tool to perform benchmark tests on the complete HA cluster. The benchmarks were run from an M2 Mac Mini running macOS, which used an SMB share to access the Storage Cluster over a 10 GbE connection on the Storage VLAN.

Storage Cluster Benchmark Configuration

The following are the benchmark results –

Storage Cluster Throughput Benchmarks

The Storage Cluster’s performance is quite good, and the 10 GbE connection is saturated for 128 KB writes and larger. The slightly lower read throughput results from a combination of our SSDs’ write performance and the additional latency on writes due to the need to copy data from the primary NAS to the secondary NAS.

Storage Cluster IOPs Benchmarks

IOs/sec (IOPs) performance is important for the virtual disks of VMs and LXC containers, as they frequently perform smaller writes.

We also ran benchmarks from a VM running Windows 10 in our Proxmox Cluster. These benchmarks benefit from a number of caching and compression features in our architecture, including:

  • Write Caching with the Windows 10 OS
  • Write Caching with the iSCSI vdisk driver in Proxmox
  • Write Caching on the NAS drives in our Storage Cluster

Windows VM Disk Benchmarks

The overall performance figures for the Windows VM benchmark are quite good, exceeding the capacity of the 10 GbE connections to the Storage Cluster thanks to the caching layers listed above. The IOPs performance is also close to the specified maximum values for the RS1221+ NAS.

Windows VM IOPs Benchmarks

Failure Testing

The following scenarios were tested under a full workload –

  • Manual switchover between the active and standby NAS devices
  • Simulated network failure by disconnecting the primary NAS ethernet cable
  • Simulated active NAS failure by pulling power from the primary NAS
  • Simulated disk failure by pulling a disk from the primary NAS pool

In all cases, our system failed over in 30 seconds or less and continued handling the workload without error.

Anita's and Fred's Home Lab
