Why GlusterFS should not be integrated with OpenStack

OpenStack is one of the most promising and fastest growing tools for building clouds. Even though it is able to configure most of the components necessary for hosting virtual machines (VMs), such as hypervisors, storage servers and networking, it still depends on third-party software. In this article we would like to focus on GlusterFS, which is one of the supported back-ends in OpenStack Block Storage (Cinder).

Typically, instances are booted from an image file stored in Glance, which provides services for managing virtual machine images. The compute node downloads the image, puts it on a local disk and boots a VM from it. This method makes it impossible to use the highly desired live migration, which is very useful for performing maintenance on compute nodes; shared storage is necessary to support it. Here comes GlusterFS.

Instead of booting an instance from an image, it is possible to boot it from a volume. Cinder is responsible for managing such volumes in the back-end storage system. Now, if the volume is stored on shared storage, it is possible to live migrate the instance.
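
In practice this could look roughly like the following (the IDs, instance and host names and flavor below are made up, and the exact client flags differ between OpenStack releases):

# cinder create --image-id <image-id> --display-name vm1-root 10
# nova boot --flavor m1.small --block-device-mapping vda=<volume-id>:::0 vm1
# nova live-migration vm1 compute-02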

GlusterFS is very easy to understand and configure. All data is stored in volumes which consist of bricks. Bricks are basically directories stored on nodes. Each node can have many bricks. Here is an example of a replicated volume with 2 bricks:

# gluster volume info cinder

Volume Name: cinder
Type: Replicate
Volume ID: 1872c922-87c9-4060-80f3-f414acc0b033
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.255.128.254:/srv/glusterfs/cinder
Brick2: 10.255.128.1:/srv/glusterfs/cinder
Options Reconfigured:
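
For completeness, a replicated volume like the one above can be created and started with two commands run on one of the GlusterFS nodes (a sketch reusing the addresses and brick paths from the output above):

# gluster volume create cinder replica 2 10.255.128.254:/srv/glusterfs/cinder 10.255.128.1:/srv/glusterfs/cinder
# gluster volume start cinder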

Even though it is easy to install and manage, there are many disadvantages to going with GlusterFS.

GlusterFS volume != Cinder volume

The first thing that may come to an administrator's mind is that creating a volume in Cinder leads to creating a volume in GlusterFS as well. One could not be more wrong. Cinder uses a single GlusterFS volume (it might use more) for storing Cinder volumes as regular or sparse files. This implementation has many implications for managing volumes.
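
One can see this directly on a node where the driver has mounted the share: every Cinder volume is just a file named after its UUID (the mount point and UUID below are made up; the actual path depends on the glusterfs_mount_point_base option):

# ls -lh /var/lib/cinder/mnt/glusterfs/
-rw-rw-rw- 1 root root 10G Oct 10 12:00 volume-4f1d2c3b-0d5f-4d5a-9b2e-3c7a1e2b9a10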

Creating volumes from snapshots, other volumes or images is impossible

It is quite inconvenient to manage volumes that consist of regular files stored on GlusterFS. There is no support for snapshotting at all. Right now the GlusterFS developers are working on file snapshotting, which will probably be available in GlusterFS 3.5, and no one knows when it will land in Cinder.

Moreover, it is also not possible to snapshot a whole GlusterFS volume. Even if it were, it would be quite useless, considering that all Cinder volumes are stored on a single GlusterFS volume.

Given that we cannot create snapshots, it is also difficult to implement creating volumes from other volumes while they are in use. If a volume is not in use, cloning it is just a matter of copying its file, but even that is not implemented in Cinder.

Creating volumes from images works the same way as cloning volumes, except that the image must first be fetched from Glance. Some optimizations were done in Havana, but it still requires copying files locally.
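
Under the hood, creating a volume from an image on this back end boils down to downloading the image from Glance and converting it into the volume file, roughly like this (the driver does this for you; the IDs and paths below are made up):

# glance image-download --file /tmp/image.qcow2 <image-id>
# qemu-img convert -O raw /tmp/image.qcow2 /var/lib/cinder/mnt/glusterfs/volume-<uuid>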

Lack of copy-on-write

Unfortunately, the previous sections lead to a very important conclusion: there is no support for copy-on-write volumes at all. There is no hierarchy among volumes, so common data is not stored only once on disk. It also means that even if the Cinder developers decide to implement creating volumes from snapshots, other volumes or images, they will have to rely on the qcow2 format rather than on future support for such features in GlusterFS.

Some work has already been done, but you will still not find it in Grizzly, and no one knows how solid it will be once it is released in Havana.
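
To give an idea of what copy-on-write on top of plain files looks like, qcow2 backing files let a new image reference a base file and store only the blocks that differ. This is a sketch of the general technique, not of what any released Cinder driver does:

# qemu-img create -f qcow2 -b volume-base volume-clone
# qemu-img info volume-clone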

Mounting a GlusterFS volume is poorly handled

A GlusterFS volume may be mounted with a command like this:

# mount.glusterfs 10.255.128.1:/volume-name -o backupvolfile-server=10.255.128.254 mount-point

The client tries to mount the volume volume-name at the mount point mount-point by connecting to the server 10.255.128.1, using the GlusterFS native protocol. If that server is unavailable, the one given by the backupvolfile-server parameter is used instead. Cinder, however, does not use backupvolfile-server at all, so if 10.255.128.1 is not available, mounting the volume will fail. Things get worse if the volume consists of bricks spread across many servers: there is no way to specify them all. A fix is in progress.

In Cinder the issue is worked around by mounting all shares specified in the glusterfs_shares_config file, where one can list all of the bricks. It is definitely not elegant and requires changes whenever bricks are added to or removed from a volume.
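
Concretely, glusterfs_shares_config points to a plain text file with one share per line, so the workaround is to list the same volume through every server (the addresses are taken from the example volume above; the file location is just a common convention):

# grep glusterfs /etc/cinder/cinder.conf
volume_driver = cinder.volume.drivers.glusterfs.GlusterfsDriver
glusterfs_shares_config = /etc/cinder/shares.conf
# cat /etc/cinder/shares.conf
10.255.128.1:/cinder
10.255.128.254:/cinder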

Lack of support for login/password authentication

The Gluster CLI tool does not support login/password authentication for volumes; only IP-based authentication is possible. Since IP addresses are quite easy to spoof, a separate L2 network for this traffic, or at least somewhat more sophisticated filtering of network traffic, is required.
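
The only knob available is IP-based access control on the volume itself, for example (the address pattern is made up and would have to match your storage network):

# gluster volume set cinder auth.allow 10.255.128.*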

This is rather odd to us: when managing a volume one can see that such parameters exist, and yet, more than 2.5 years after the ticket was opened, the feature is still not implemented.

A volume can be mounted on many machines with R/W access

Concurrent read/write access to a volume from many clients is actually one of the advantages of GlusterFS, and it might be very useful in some scenarios. Imagine a set of application servers which run a website for storing and viewing pictures. Each of them runs on a separate VM and requires access to a shared directory where all the images are stored. In such a situation GlusterFS fits perfectly. The same happens when compute nodes access Cinder volumes: all of them mount the GlusterFS volume and read from and write to the volume files stored there.

Unfortunately, attaching the same volume to many VMs is not yet supported by OpenStack.

How can we make things better?

GlusterFS is not going to be our first choice as a Cinder back-end. It might be very useful for multi-attach volumes, but its lack of basic volume operations forces us to reach for additional tools such as the qcow2 image format.

Fortunately, there are other solutions. In our experience, one of the most promising open source storage systems is Ceph. It not only provides all the functionality that the OpenStack/GlusterFS combination lacks, but is also much better designed, which makes working with it very pleasant. For more information we recommend Sebastien Han's blog together with Ceph's documentation.

