OpenStack is one of the most promising and fastest growing tools for building clouds. Even though it is able to configure most of the necessary components for hosting virtual machines (VM) such as hypervisors, storage servers and networking, it still depends on third-party software. In this article we would like to focus on GlusterFS which is one of the supported back-ends in OpenStack Block Storage (Cinder).
Typically, instances are booted from an image file stored in Glance, which provides services for managing virtual machine images. The compute node downloads the image, puts it on a local disk and boots a VM. This method makes it impossible to use the highly desired live migration, which is very useful for performing maintenance of compute nodes. Shared storage is necessary to support it. This is where GlusterFS comes in.
Instead of booting an instance from an image, it is possible to boot it from a volume. Cinder is responsible for managing such volumes in the back-end storage system. Now, if the volume is stored on shared storage, it is possible to perform live migration of the instance.
GlusterFS is very easy to understand and configure. All data is stored in volumes which consist of bricks. Bricks are basically directories stored on nodes. Each node can have many bricks. Here is an example of a replicated volume with 2 bricks:
# gluster volume info cinder

Volume Name: cinder
Type: Replicate
Volume ID: 1872c922-87c9-4060-80f3-f414acc0b033
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.255.128.254:/srv/glusterfs/cinder
Brick2: 10.255.128.1:/srv/glusterfs/cinder
Options Reconfigured:
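A volume like the one above could have been created with commands along these lines. This is a sketch using the hostnames and brick paths from the example output; it assumes glusterd is already running on both nodes and has not been verified against a live cluster:

```shell
# Make the second node a peer of the first (run on 10.255.128.254)
gluster peer probe 10.255.128.1

# Create a 2-way replicated volume from one brick on each node
gluster volume create cinder replica 2 \
    10.255.128.254:/srv/glusterfs/cinder \
    10.255.128.1:/srv/glusterfs/cinder

# Start the volume so clients can mount it
gluster volume start cinder
```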
Even though it is easy to install and manage there are many disadvantages to going with GlusterFS.
GlusterFS volume != Cinder volume
The first thing that may come to an administrator's mind: creating a volume in Cinder leads to creating a volume in GlusterFS as well. One could not be more wrong. Cinder uses a single GlusterFS volume (it might use more) for storing Cinder volumes as regular or sparse files. This implementation has many implications for managing volumes.
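The effect is easy to see on the filesystem: a Cinder volume is just a (possibly sparse) file sitting on the mounted share. The sketch below uses a plain local directory and a made-up file name to show what "volume as a sparse file" means; on a real deployment the file would live under Cinder's GlusterFS mount point instead:

```shell
# Stand-in for the mounted GlusterFS share (hypothetical path)
mkdir -p /tmp/fake-glusterfs-share

# Create a sparse file the way a thin-provisioned volume is created:
# large apparent size, almost no blocks allocated
truncate -s 1G /tmp/fake-glusterfs-share/volume-demo

# Apparent size is 1 GiB ...
ls -lh /tmp/fake-glusterfs-share/volume-demo
# ... but actual disk usage is close to zero
du -h /tmp/fake-glusterfs-share/volume-demo
```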
Creating volumes from snapshots, other volumes or images is impossible
It is quite inconvenient to manage volumes that consist of regular files stored on GlusterFS. There is no support for snapshotting at all. Right now the GlusterFS developers are working on file snapshotting, which will probably be available in GlusterFS 3.5, and no one knows when it will land in Cinder.
Moreover, it is also not possible to snapshot a whole GlusterFS volume. Even if it were, it would be of little use, considering that all Cinder volumes are stored on one GlusterFS volume.
Since we cannot create snapshots, it is also difficult to implement creating volumes from other volumes while they are in use. If a volume is not in use, it is just a matter of copying its file, but that is not implemented in Cinder anyway.
Creating volumes from images works the same way as cloning volumes, except that the image must first be fetched from Glance. Some optimizations have been made in Havana, but it still requires copying files locally.
Lack of copy-on-write
Unfortunately, the previous sections lead to a very important conclusion: there is no support for copy-on-write volumes at all. There is no hierarchy among volumes, so common data is not stored only once on disk. It also means that even if the Cinder developers decide to implement creating volumes from snapshots, other volumes or images, they will have to rely on the qcow format rather than on future support for such features in GlusterFS.
Some work has already been done, but you will not find it in Grizzly, and no one knows how solid it will be once it is released in Havana.
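The qcow approach mentioned above boils down to creating a new image that references a read-only backing file: reads fall through to the base, and only blocks that diverge are written to the new file. A sketch with hypothetical file names, assuming qemu-img is installed (it is not verified here):

```shell
# Base image holding the common data
qemu-img create -f qcow2 base.qcow2 1G

# A copy-on-write clone: reads fall through to base.qcow2,
# writes go to clone.qcow2 only
qemu-img create -f qcow2 -b base.qcow2 -F qcow2 clone.qcow2

# Shows the backing file relationship
qemu-img info clone.qcow2
```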
Mounting GlusterFS volume is poorly handled
All GlusterFS volumes may be mounted with a command:
# mount.glusterfs 10.255.128.1:/volume-name -o backupvolfile-server=10.255.128.254 mount-point
The client tries to mount the volume volume-name at mount-point, connecting to the brick at 10.255.128.1. The GlusterFS native protocol is used. If something goes wrong, the other server is used (provided by the backupvolfile-server parameter). Cinder does not use backupvolfile-server at all, so if 10.255.128.1 is not available, mounting the volume will fail. Things get worse if the volume consists of many bricks; there is no way to specify them all.
In Cinder the issue is resolved by mounting all shares specified in the glusterfs_shares_config file. One can put all bricks there. It is definitely not elegant and requires changes whenever the bricks of a volume are manipulated.
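For the example volume shown earlier, the configuration could look like this. This is a sketch: the file paths follow the defaults of the Grizzly/Havana-era GlusterFS driver and may differ per deployment.

```
# /etc/cinder/glusterfs_shares -- one share per line;
# listing both nodes works around the single-server mount problem
10.255.128.254:/cinder
10.255.128.1:/cinder
```

And the relevant part of cinder.conf:

```
[DEFAULT]
volume_driver = cinder.volume.drivers.glusterfs.GlusterfsDriver
glusterfs_shares_config = /etc/cinder/glusterfs_shares
```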
Lack of support for login/password authentication
The Gluster CLI tool does not support login/password authentication for volumes. Only IP-based authentication is possible. Since it is quite easy to spoof an IP address, a separate L2 network for this communication, or at least somewhat more sophisticated ways of filtering network traffic, is required.
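The only knob available is the per-volume IP-based access control. A sketch, with the subnet chosen to match the earlier examples:

```shell
# Allow only clients from the storage subnet to mount the volume
gluster volume set cinder auth.allow "10.255.128.*"

# Reject everyone else (the default behaviour, shown for clarity)
gluster volume set cinder auth.reject "*"
```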
This is rather odd to us. When managing volumes one can observe that such parameters exist, and yet, more than 2.5 years after the ticket was opened, the feature is still not implemented.
Volume can be mounted on many machines with R/W access
Concurrent access to a volume from many clients with read/write access is actually one of the advantages of GlusterFS. It might be very useful in some scenarios. Imagine a set of application servers which run a website for storing and viewing pictures. Each of them runs on a separate VM and requires access to a shared directory where all images are stored. In such a situation GlusterFS fits perfectly. The same happens when compute nodes access Cinder volumes: all of them mount the GlusterFS volume and read/write the volume files stored there.
Unfortunately, attaching the same volume to many VMs is not yet supported by OpenStack.
How can we make things better?
GlusterFS is not going to be our first choice as a Cinder back-end. It might be very useful for multi-attach volumes, but its lack of basic operations on volumes forces one to search for additional tools such as the qcow image format.
Fortunately, there are other solutions. From our experience, one of the most promising open source storage systems is Ceph. It not only provides all the functionality that OpenStack with GlusterFS lacks, but is also much better designed, which makes working with it very pleasant. For more information we recommend Sebastien Han's blog together with Ceph's documentation.