ZFS based storage appliance orchestration

This package demonstrates how to use the SSH plug-in to orchestrate replication between two ZFS based storage appliances. For this example I used FreeNAS, since it is free, lightweight, and easy to install and operate, but the workflows should work for other ZFS based appliances.

I have only basic knowledge of FreeNAS and the ZFS replication mechanism, so this is not intended as a ZFS replication best practice but rather as a kick start in getting an orchestrated replication. Once this is up and running you will be able to expand on this solution and optimize it for your needs.

Screen Shot 2012-10-22 at 2.49.01 PM.png

Prerequisites

Here are the different components and their requirements:

  • A vCO server version 4.x or higher with the SSH plug-in enabled.
  • Two FreeNAS VMs with the following requirements:
    • FreeNAS 8.3.0 (the tested version was 8.3.0 RC1, but other versions are expected to work).
    • Set the guest operating system type to FreeBSD (64-bit).
    • More than 4 GB of RAM (highly recommended to get decent ZFS performance when using deduplication).
    • A second hard disk as big as you want your storage to be. FreeNAS is installed on disk da0; we will create the storage volume on da1.

In terms of network connectivity, the SSH port must be open on the two FreeNAS appliances. In addition:

  • The vCO server must be able to SSH to the two FreeNAS appliances (both for the initial configuration and to start a replication on the source).
  • The source FreeNAS appliance must be able to SSH to the destination FreeNAS appliance (for replicating the data).
  • Using hostnames or fixed IP addresses is recommended, since a change of IP would require reconfiguring the authorizations.

FreeNAS UI based configuration

There are a few steps involved in the basic configuration of the appliances. At first I wrote a workflow automating these tasks using HTTP POST, but upgrading to a more recent version of FreeNAS broke it, and since this is a one-time operation that is helpful for understanding the different options, I decided to document the steps instead.

Authentication

The first steps are to secure the appliances and set the initial access rights for vCO.

On each VM, log in to the web interface, click on Account (first icon on the top right) and then on the Change Password tab. Set a password and click on "Change Admin password".

Screen Shot 2012-10-18 at 11.42.13 AM.png

Click on the services icon (last icon on the top left) and then on the little wrench for SSH. Configure SSH to allow root login with a password. Password authentication will only be needed during the initial vCO configuration; we will remove it afterwards.

Screen Shot 2012-10-18 at 11.45.11 AM.png

Turn the SSH service on.

Screen Shot 2012-10-18 at 2.45.32 PM.png

Storage management

Now that we have configured SSH access we need to configure the appliance storage.

Click on the storage icon in the top left bar. In the volume manager, create a new volume on da1 (select the disk).

  • Name it "vol1" (the vol1 name is used as a workflow attribute so you will not have to provide a volume name for each operation).
  • Use ZFS as the filesystem type.
  • You can enable deduplication if you want (in that case, make sure you gave more than 4 GB of RAM to the VM). Deduplication saves a lot of storage space but comes with high memory and CPU consumption; you may decide against it and allocate more disk space instead, for example when running the appliance in a provider cloud. Click on "Add Volume". A rough command-line equivalent is sketched below.
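
For reference only, creating the volume corresponds roughly to the ZFS commands below, which a workflow could send over SSH. This is a sketch: the FreeNAS volume manager also records the volume in its own configuration database, so the UI remains the safer path.

cmdCreateVolume = "zpool create -m /mnt/vol1 vol1 da1";  // pool on the second disk, mounted under /mnt/vol1
cmdEnableDedup = "zfs set dedup=on vol1";                // optional: enable deduplication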

Screen Shot 2012-10-18 at 3.14.22 PM.png

On the source appliance, and only on this one, change the volume permissions in the volume manager.

Screen Shot 2012-10-18 at 9.18.25 PM.png

Set the owner (user and group) to "nobody".

Screen Shot 2012-10-18 at 9.26.06 PM.png
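
If you prefer to script this step, it amounts to a single command over SSH (a sketch, assuming the volume is mounted under /mnt/vol1):

cmdSetOwner = "chown -R nobody:nobody /mnt/vol1";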

Sharing the storage

There are different options to share the storage:

  • iSCSI
  • NFS
  • SMB
  • AFP

For my use case NFS was the most appropriate, and the synchronization workflow was built and tested for NFS. Here is how to set up NFS on the source appliance. Again, these may not be best practices, but they are a good way to get it working quickly.

Click on the sharing icon (top left bar). In the UNIX (NFS) tab, add a UNIX NFS share. It is mandatory to set an authorized network or IP address; here I have authorized a host on my local network. Set the "Mapall" fields as below. These map incoming users to the "nobody" user and group, to which we granted read/write access on the appliance. Knowing this, you want to restrict the authorized network/hosts as much as possible. The most important field is the path, which in our case is "/mnt/vol1".

Screen Shot 2012-10-18 at 9.39.26 PM.png

On the destination appliance, add an NFS share. This one will be set to read-only for a given host or network. Here the host that will access the appliance locally has the same IP as the one that accesses my source appliance, but it is not necessarily the same host if we use two network interfaces on the storage appliance (one public-facing and one internal-facing). Set the path to "/mnt/vol1".

Screen Shot 2012-10-19 at 9.12.39 AM.png
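
For reference, the two shares correspond roughly to /etc/exports entries like the ones below. This is a sketch: FreeNAS generates this file from the UI settings, and the network values are placeholders for your own environment.

# Source appliance: read/write, incoming users mapped to nobody
/mnt/vol1 -mapall=nobody -network 192.168.1.0 -mask 255.255.255.0
# Destination appliance: read-only
/mnt/vol1 -ro -network 192.168.1.0 -mask 255.255.255.0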

All the manual configuration in the FreeNAS UI is now done. Of course, it would be possible to perform all of these settings in a workflow right after SSH access is enabled, but since this is a one-time configuration, walking through the UI gives a full view of the different options.

vCO based configuration

Part of the configuration is automated with vCO workflows. These configuration workflows are located in PSO / FreeNAS / First configuration.

Orchestrate from vCO

The FreeNAS appliance is going to be orchestrated using SSH. While we could use password-based authentication, public key authentication is more secure and more convenient.

Run the "- 1 - Register vCO public key on FreeNAS host" workflow on the source appliance.

You will have to provide the IP or hostname of the FreeNAS host and a username and password with root access.

Screen Shot 2012-10-22 at 4.01.34 PM.png

This workflow registers the vCO public key on the FreeNAS appliance. If the key was never generated, it will generate one first.
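
For the curious, the heart of this workflow looks roughly like the snippet below. This is a minimal sketch assuming the SSH plug-in's SSHSession scripting object; the variable names (freeNasHost, username, password, vcoPublicKey) are assumptions, not the workflow's actual attributes.

var session = new SSHSession(freeNasHost, username);
session.connectWithPassword(password);  // password authentication is only needed for this first step
// Append the vCO public key to root's authorized_keys so later connections can use the key pair.
session.executeCommand("mkdir -p /root/.ssh && echo '" + vcoPublicKey + "' >> /root/.ssh/authorized_keys", true);
System.log(session.getOutput());
session.disconnect();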

Repeat this operation with the destination appliance. This is required by the next workflow.

Configure replication authorizations

In the same way that we authorized vCO to SSH into the FreeNAS appliances, the source FreeNAS appliance must be able to SSH into the destination. For this, run the "- 2 - Configure replication authorizations" workflow.

Provide the hostname or IP address of both the source and the destination FreeNAS appliances.
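
Under the hood, the workflow essentially has to generate a key pair on the source appliance (if one does not already exist) and authorize its public key on the destination. A minimal sketch, assuming the SSH plug-in's SSHSession object and the key-based access set up in the previous step; all variable names are assumptions.

var src = new SSHSession(sourceHost, "root");
src.connectWithIdentity(vcoPrivateKeyPath, passphrase);
// Generate a key pair on the source if none exists yet, then read its public key.
src.executeCommand("test -f /root/.ssh/id_rsa || ssh-keygen -t rsa -N '' -f /root/.ssh/id_rsa", true);
src.executeCommand("cat /root/.ssh/id_rsa.pub", true);
var srcPublicKey = src.getOutput();
src.disconnect();
// Authorize the source appliance's key on the destination appliance.
var dst = new SSHSession(destinationHost, "root");
dst.connectWithIdentity(vcoPrivateKeyPath, passphrase);
dst.executeCommand("mkdir -p /root/.ssh && echo '" + srcPublicKey + "' >> /root/.ssh/authorized_keys", true);
dst.disconnect();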

Screen Shot 2012-10-22 at 4.13.52 PM.png

Initial replication

Here is the first true test of the solution. Make sure to copy some files onto the source appliance. If possible, start with something that can be copied relatively quickly, so you can add more files later on in incremental copies.

Once done, run the "- 3 - ZFS Create and send snapshot (Full)" workflow. Provide the hostname or IP address of both the source and the destination FreeNAS appliances.

Screen Shot 2012-10-22 at 4.24.49 PM.png

The workflow can take a long time to complete, depending on how much data has to be copied and how much bandwidth exists between the source and destination storage appliances.
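
The commands behind this full replication mirror the incremental ones shown later in this article, minus the -i flag and the last-snapshot bookkeeping. Roughly, reusing the same variable names as the incremental workflow:

cmd1 = "zfs snapshot " + sourceVolumeName + "@" + sourceSnapshotName;
cmd2 = "zfs send " + sourceVolumeName + "@" + sourceSnapshotName + " | " + remoteExecute + " zfs receive -F " + destinationVolumeName;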

Disable SSH password authentication

Once you have everything working up to this point, you can disable password-based authentication, since it is not used for the replication process. Disabling SSH password authentication makes your storage appliances more secure.
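
On a stock FreeBSD system, disabling password authentication amounts to the sshd_config settings below followed by an sshd restart. Note that FreeNAS regenerates /etc/ssh/sshd_config from its configuration database, which is why the workflow or the web UI is the reliable way to persist the change.

# In /etc/ssh/sshd_config
PasswordAuthentication no
ChallengeResponseAuthentication no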

Screen Shot 2012-10-22 at 9.34.23 PM.png

Alternatively, you can use the FreeNAS web user interface (see the second screenshot in this article, the "Allow password authentication" setting). In my case I prefer to do it from either the vCO client or the vSphere Web Client.

Operations

Once the initial replication has completed, you will want to run the incremental replication process whenever there are changes on the source storage appliance. The workflow is called "ZFS create and send snapshot (incremental)".

You may trigger this workflow:

  • If the changes on the source storage appliance are triggered by a workflow
  • If the changes on the source storage appliance are monitored by a policy
  • Using a vCO scheduled task (a scheduling sketch follows below)
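
As a sketch of the scheduled task option, a vCO script could schedule the incremental workflow along these lines. The workflow ID, the input parameter names, and the Workflow.schedule usage are assumptions to adapt to your environment.

var wf = Server.getWorkflowWithId("<incremental workflow id>");  // placeholder ID
var props = new Properties();
props.put("sourceHost", "freenas-source.example.com");           // assumed input names
props.put("destinationHost", "freenas-destination.example.com");
// Run the incremental replication one hour from now.
wf.schedule(props, new Date(new Date().getTime() + 3600 * 1000));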

Screen Shot 2012-10-22 at 9.26.16 PM.png

Here are the steps in this workflow:

  • Create a snapshot on the source storage appliance
  • Stop the NFS server on the destination storage appliance
  • Send an incremental snapshot from the source storage appliance to the destination one
  • Delete the snapshot created on the source storage appliance
  • Delete the snapshot copied on the destination storage appliance
  • Restart the NFS server on the destination storage appliance

Here are the commands:

// remoteExecute is assumed to be an SSH prefix such as "ssh root@<destinationHost>"
cmd1 = "zfs snapshot " + sourceVolumeName + "@" + sourceSnapshotName;  // snapshot the source volume
cmd2 = remoteExecute + " /etc/rc.d/nfsd stop";  // stop the NFS server on the destination
// send the delta between the last replicated snapshot and the new one
cmd3 = "zfs send -i " + sourceVolumeName + "@" + lastSnapshot + " " + sourceVolumeName + "@" + sourceSnapshotName + " | " + remoteExecute + " zfs receive -F " + destinationVolumeName;
cmd4 = "zfs destroy " + sourceVolumeName + "@" + sourceSnapshotName;  // delete the snapshot on the source
cmd5 = remoteExecute + " zfs destroy " + sourceVolumeName + "@" + sourceSnapshotName;  // delete the snapshot on the destination
cmd6 = remoteExecute + " /etc/rc.d/nfsd start";  // restart the NFS server on the destination

All these operations are performed from the vCO server on the source appliance, which in turn sends commands to the destination one.

There are a lot of ways in which this could be improved.

For example, if you have low bandwidth and spare CPU resources, you can use bzip2 compression when sending the incremental snapshots, as sketched below.
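
For instance, cmd3 could become the following, compressing on the source and decompressing on the destination. This is a sketch: it assumes bzip2 is available on both appliances (it ships with FreeBSD) and quotes the remote pipeline so it runs entirely on the destination.

cmd3 = "zfs send -i " + sourceVolumeName + "@" + lastSnapshot + " " + sourceVolumeName + "@" + sourceSnapshotName + " | bzip2 -c | " + remoteExecute + " \"bunzip2 -c | zfs receive -F " + destinationVolumeName + "\"";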

Another example: here I am assuming the destination storage can be offline during the replication (in my case I do not need the destination storage data until my replication workflow is finished). There are techniques using snapshot clones that allow the destination storage to stay online.

On the workflow side I did not implement:

  • Rolling back operations in case of failure: for example, restarting nfsd and deleting a snapshot if it could not be sent.
  • Enforcing that only a single instance of this workflow runs at a given time for the same source and destination. Make sure not to start two of these at the same time: it would almost certainly fail, and you might have to clean things up manually.

Disclaimer: These sample workflows are provided AS IS; they are not considered production quality and are not officially supported. Use at your own risk. Feel free to modify, expand, and share your contributions.
