Release notes for Gluster 3.11.1

This is a bugfix release. The release notes for 3.11.0 contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.11 stable release.

Major changes, features and limitations addressed in this release

Improved disperse performance

The fix for bug #1456259 changes the way messages are read and processed from the socket layer on the Gluster client. This has shown performance improvements on disperse volumes, and is applicable to other volume types as well where there may be multiple applications or users accessing the same mount point.

Group settings for enabling negative lookup caching provided

The ability to serve negative lookups from cache was added in 3.11.0; with this release, a group volume set option is added to make enabling this feature easier.

See group-nl-cache for more details.
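
As a rough illustration (assuming the option group shipped with this release is named nl-cache, as the group-nl-cache reference above suggests), enabling the cache on a volume should reduce to a single set command instead of toggling each related option individually:

    gluster volume set <volname> group nl-cache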

Gluster fuse now implements the "-o auto_unmount" feature

libfuse has an auto_unmount option which, if enabled, ensures that the file system is unmounted when the FUSE server terminates, by running a separate monitor process that performs the unmount. This release implements that option and behavior for glusterfs.
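
For example (a sketch; the server name server1, volume testvol, and mount point are placeholders, and the exact pass-through of the option depends on the mount.glusterfs helper in your installation), the option can be supplied at mount time:

    mount -t glusterfs -o auto_unmount server1:/testvol /mnt/glusterfs

With auto_unmount enabled, the monitor process unmounts the mount point if the glusterfs client process terminates.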

Note that "auto unmount" (robust or not) is a leaky abstraction, as the kernel cannot guarantee that at the path where the FUSE fs is mounted is actually the toplevel mount at the time of the umount(2) call, for multiple reasons, among others, see: - fuse-devel: "fuse: feasible to distinguish between umount and abort?" - https://github.com/libfuse/libfuse/issues/122

Major issues

  1. Expanding a gluster volume that is sharded may cause file corruption
    • Sharded volumes are typically used for VM images; if such volumes are expanded or possibly contracted (i.e., add/remove bricks and rebalance, as shown below), there are reports of VM images getting corrupted.
    • The status of this bug can be tracked here: #1465123
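
For reference, the expand-and-rebalance sequence reported to trigger the issue is the standard one (shown only to clarify which operations to avoid on sharded volumes until the bug is resolved; volume and brick names are placeholders):

    gluster volume add-brick <volname> <new-brick> ...
    gluster volume rebalance <volname> start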

Bugs addressed

Bugs addressed since release-3.11.0 are listed below.

  • #1456259: limited throughput with disperse volume over small number of bricks
  • #1457058: glusterfs client crash on io-cache.so(__ioc_page_wakeup+0x44)
  • #1457289: tierd listens to a port.
  • #1457339: DHT: slow readdirp performance
  • #1457616: "split-brain observed [Input/output error]" error messages in samba logs during parallel rm -rf
  • #1457901: nlc_lookup_cbk floods logs
  • #1458570: [brick multiplexing] detach a brick if posix health check thread complaints about underlying brick
  • #1458664: [Geo-rep]: METADATA errors are seen even though everything is in sync
  • #1459090: all: spelling errors (debian package maintainer)
  • #1459095: extras/hook-scripts: non-portable shell syntax (debian package maintainer)
  • #1459392: possible repeatedly recursive healing of same file with background heal not happening when IO is going on
  • #1459759: Glusterd segmentation fault in ' _Unwind_Backtrace' while running peer probe
  • #1460647: posix-acl: Whitelist virtual ACL xattrs
  • #1460894: Rebalance estimate time sometimes shows negative values
  • #1460895: Upcall missing invalidations
  • #1460896: [Negative Lookup Cache]Need a single group set command for enabling all required nl cache options
  • #1460898: Enabling parallel-readdir causes dht linkto files to be visible on the mount,
  • #1462121: [GNFS+EC] Unable to release the lock when the other client tries to acquire the lock on the same file
  • #1462127: [Bitrot]: Inconsistency seen with 'scrub ondemand' - fails to trigger scrub
  • #1462636: Use of force with volume start, creates brick directory even it is not present
  • #1462661: lk fop succeeds even when lock is not acquired on at least quorum number of bricks
  • #1463250: with AFR now making both nodes to return UUID for a file will result in georep consuming more resources