Release notes for Gluster 6.0

This is a major release that includes a range of code improvements and stability fixes along with a few features as noted below.

A selection of the key features and changes are documented in this page. A full list of bugs that have been addressed is included further below.

Announcements
Major changes and features
Major issues
Bugs addressed in the release

Announcements

Releases that receive maintenance updates post release 6 are, 4.1 and 5 (reference)
Release 6 will receive maintenance updates around the 10th of every month for the first 3 months post release (i.e Apr'19, May'19, Jun'19). Post the initial 3 months, it will receive maintenance updates every 2 months till EOL. (reference)
A series of features/xlators have been deprecated in release 6 as follows, for upgrade procedures from volumes that use these features to release 6 refer to the release 6 upgrade guide.

This deprecation was announced at the gluster-users list here.

Features deprecated:

Block device (bd) xlator
Decompounder feature
Crypt xlator
Symlink-cache xlator
Stripe feature
Tiering support (tier xlator and changetimerecorder)

Major changes and features

Highlights

Several stability fixes addressing,
coverity, clang-scan, address sanitizer and valgrind reported issues
removal of unused and hence, deprecated code and features
Client side inode garbage collection
This release addresses one of the major concerns regarding FUSE mount process memory footprint, by introducing client side inode garbage collection
See standalone section for more details
Performance Improvements
--auto-invalidation on FUSE mounts to leverage kernel page cache more effectively

Features are categorized into the following sections,

Management
Standalone
Developer

Management

NOTE: There have been several stability improvements around the brick multiplexing feature

GlusterD2

GlusterD2 (or GD2, in short) was planned as the next generation management service for Gluster project.

Currently, GD2s main focus is not replacing glusterd, but to serve as a thin management layer when using gluster with container orchestration systems.

There is no specific update around GD2 provided as a part of this release.

Standalone

1. client-side inode garbage collection via LRU list

A FUSE mount's inode cache can now be limited to a maximum number, thus reducing the memory footprint of FUSE mount processes.

See the lru-limit option in man 8 mount.glusterfs for details.

NOTE: Setting this to a low value (say less than 4000), will evict inodes from FUSE and Gluster caches at a much faster rate, and can cause performance degrades. The setting has to be determined based on the available client memory and required performance.

glusterfind tool has an added option "--type", to be used with the "--full" option. The option supports finding and listing files or directories only, and defaults to both if not specified.

Example usage with the pre and query commands are given below,

Pre command (reference):
Lists both files and directories in OUTFILE: glusterfind pre SESSION_NAME VOLUME_NAME OUTFILE
Lists only files in OUTFILE: glusterfind pre SESSION_NAME VOLUME_NAME OUTFILE --type f
Lists only directories in OUTFILE: glusterfind pre SESSION_NAME VOLUME_NAME OUTFILE --type d
Query command:
Lists both files and directories in OUTFILE: glusterfind query VOLUME_NAME --full OUTFILE
Lists only files in OUTFILE: glusterfind query VOLUME_NAME --full --type f OUTFILE
Lists only directories in OUTFILE: glusterfind query VOLUME_NAME --full --type d OUTFILE

3. FUSE mounts are enhanced to handle interrupts to blocked lock requests

FUSE mounts are enhanced to handle interrupts to blocked locks.

For example, scripts using the flock (man 1 flock) utility without the -n(nonblock) option against files on a FUSE based gluster mount, can now be interrupted when the lock is not granted in time or using the -w option with the same utility.

4. Optimized/pass-through distribute functionality for 1-way distributed volumes

NOTE: There are no user controllable changes with this feature

The distribute xlator now skips unnecessary checks and operations when the distribute count is one for a volume, resulting in improved performance.

5. Options introduced to disable invalidations of kernel page cache

For workloads, where multiple FUSE client mounts do not concurrently operate on any files in the volume, it is now possible to maintain a longer duration kernel page cache using the following options in conjunction,

Setting --auto-invalidation option to "no" on the glusterfs FUSE mount process
Disabling the volume option performance.global-cache-invalidation

This enables better performance as the data is served from the kernel page cache where possible.

Previously all GlusterFS volumes were being exported by default via smb.conf in a Samba-CTDB setup. This includes creating a share section for CTDB lock volume too which is not recommended. Along with few syntactical errors these scripts failed to execute in a non-Samba setup in the absence of necessary configuration and binary files.

Hereafter newly created GlusterFS volumes are not exported as SMB share via Samba unless either of 'user.cifs' or 'user.smb' volume set options are enabled on the volume. The existing GlusterFS volume share sections in smb.conf will remain unchanged.

7. ctime feature is enabled by default

The ctime feature which maintains (c/m) time consistency across replica and disperse subvolumes is enabled by default.

Also, with this release, a single option is provided to enable/disable ctime feature,

#gluster vol set <volname> ctime <on/off>

NOTE: The time information used is from clients, hence it's required that clients are synced with respect to their times, using NTP or other such means.

Limitations:

Mounting gluster volume with time attribute options (noatime, realatime...) is not supported with this feature
This feature does not guarantee consistent time for directories if the hashed sub-volume for the directory is down
Directory listing is not supported with this feature, and may report inconsistent time information
Older files created before upgrade, would witness update of ctime upon accessing after upgrade BUG:1593542

Developer

1. Gluster code can be compiled and executed using TSAN

While configuring the sources for a build use the extra option --enable-tsan to enable thread sanitizer based builds.

2. gfapi: A class of APIs have been enhanced to return pre/post gluster_stat information

A set of apis have been enhanced to return pre/post gluster_stat information. Applications using gfapi would need to adapt to the newer interfaces to compile against release-6 apis. Pre-compiled applications, or applications using the older API SDK will continue to work as before.

Major issues

None

Bugs addressed

Bugs addressed since release-5 are listed below.

#1138841: allow the use of the CIDR format with auth.allow
#1236272: socket: Use newer system calls that provide better interface/performance on Linux/*BSD when available
#1243991: "gluster volume set group " is not in the help text
#1285126: RFE: GlusterFS NFS does not implement an all_squash volume setting
#1343926: port-map: let brick choose its own port
#1364707: Remove deprecated stripe xlator
#1427397: script to strace processes consuming high CPU
#1467614: Gluster read/write performance improvements on NVMe backend
#1486532: need a script to resolve backtraces
#1511339: In Replica volume 2*2 when quorum is set, after glusterd restart nfs server is coming up instead of self-heal daemon
#1535495: Add option -h and --help to gluster cli
#1535528: Gluster cli show no help message in prompt
#1560561: systemd service file enhancements
#1560969: Garbage collect inactive inodes in fuse-bridge
#1564149: Agree upon a coding standard, and automate check for this in smoke
#1564890: mount.glusterfs: can't shift that many
#1575836: logic in S30samba-start.sh hook script needs tweaking
#1579788: Thin-arbiter: Have the state of volume in memory
#1582516: libgfapi: glfs init fails on afr volume with ctime feature enabled
#1590385: Refactor dht lookup code
#1593538: ctime: Access time is different with in same replica/EC volume
#1596787: glusterfs rpc-clnt.c: error returned while attempting to connect to host: (null), port 0
#1598345: gluster get-state command is crashing glusterd process when geo-replication is configured
#1600145: [geo-rep]: Worker still ACTIVE after killing bricks
#1605056: [RHHi] Mount hung and not accessible
#1605077: If a node disconnects during volume delete, it assumes deleted volume as a freshly created volume when it is back online
#1608512: cluster.server-quorum-type help text is missing possible settings
#1624006: /var/run/gluster/metrics/ wasn't created automatically
#1624332: [Thin-arbiter]: Add tests for thin arbiter feature
#1624724: ctime: Enable ctime feature by default and also improve usability by providing single option to enable
#1624796: mkdir -p fails with "No data available" when root-squash is enabled
#1625850: tests: fixes to bug-1015990-rep.t
#1625961: Writes taking very long time leading to system hogging
#1626313: fix glfs_fini related problems
#1626610: [USS]: Change gf_log to gf_msg
#1626994: split-brain observed on parent dir
#1627610: glusterd crash in regression build
#1627620: SAS job aborts complaining about file doesn't exist
#1628194: tests/dht: Additional tests for dht operations
#1628605: One client hangs when another client loses communication with bricks during intensive write I/O
#1628664: Update op-version from 4.2 to 5.0
#1629561: geo-rep: geo-rep config set fails to set rsync-options
#1630368: Low Random write IOPS in VM workloads
#1630798: Add performance options to virt profile
#1630804: libgfapi-python: test_listdir_with_stat and test_scandir failure on release 5 branch
#1630922: glusterd crashed and core generated at gd_mgmt_v3_unlock_timer_cbk after huge number of volumes were created
#1631128: rpc marks brick disconnected from glusterd & volume stop transaction gets timed out
#1631357: glusterfsd keeping fd open in index xlator after stop the volume
#1631886: Update database profile settings for gluster
#1632161: [Disperse] : Set others.eager-lock on for ec-1468261.t test to pass
#1632236: Provide indication at the console or in the logs about the progress being made with changelog processing.
#1632503: FUSE client segfaults when performance.md-cache-statfs is enabled for a volume
#1632717: EC crashes when running on non 64-bit architectures
#1632889: 'df' shows half as much space on volume after upgrade to RHGS 3.4
#1633926: Script to collect system-stats
#1634102: MAINTAINERS: Add sunny kumar as a peer for snapshot component
#1634220: md-cache: some problems of cache virtual glusterfs ACLs for ganesha
#1635050: [SNAPSHOT]: with brick multiplexing, snapshot restore will make glusterd send wrong volfile
#1635145: I/O errors observed on the application side after the creation of a 'linkto' file
#1635480: Correction for glusterd memory leak because use "gluster volume status volume_name --detail" continuesly (cli)
#1635593: glusterd crashed in cleanup_and_exit when glusterd comes up with upgrade mode.
#1635688: Keep only the valid (maintained/supported) components in the build
#1635820: Seeing defunt translator and discrepancy in volume info when issued from node which doesn't host bricks in that volume
#1635863: Gluster peer probe doesn't work for IPv6
#1636570: Cores due to SIGILL during multiplex regression tests
#1636631: Issuing a "heal ... full" on a disperse volume causes permanent high CPU utilization.
#1637196: Disperse volume 'df' usage is extremely incorrect after replace-brick.
#1637249: gfid heal does not happen when there is no source brick
#1637802: data-self-heal in arbiter volume results in stale locks.
#1637934: glusterfsd is keeping fd open in index xlator
#1638453: Gfid mismatch seen on shards when lookup and mknod are in progress at the same time
#1639599: Improve support-ability of glusterfs
#1640026: improper checking to avoid identical mounts
#1640066: [Stress] : Mismatching iatt in glustershd logs during MTSH and continous IO from Ganesha mounts
#1640165: io-stats: garbage characters in the filenames generated
#1640489: Invalid memory read after freed in dht_rmdir_readdirp_cbk
#1640495: [GSS] Fix log level issue with brick mux
#1640581: [AFR] : Start crawling indices and healing only if both data bricks are UP in replica 2 (thin-arbiter)
#1641344: Spurious failures in bug-1637802-arbiter-stale-data-heal-lock.t
#1642448: EC volume getting created without any redundant brick
#1642597: tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t failing
#1642800: socket: log voluntary socket close/shutdown and EOF on socket at INFO log-level
#1642807: remove 'tier' translator from build and code
#1642810: remove glupy from code and build
#1642850: glusterd: raise default transport.listen-backlog to 1024
#1642865: geo-rep: geo-replication gets stuck after file rename and gfid conflict
#1643349: [OpenSSL] : auth.ssl-allow has no option description.
#1643402: [Geo-Replication] Geo-rep faulty sesion because of the directories are not synced to slave.
#1643519: Provide an option to silence glfsheal logs
#1643929: geo-rep: gluster-mountbroker status crashes
#1643932: geo-rep: On gluster command failure on slave, worker crashes with python3
#1643935: cliutils: geo-rep cliutils' usage of Popen is not python3 compatible
#1644129: Excessive logging in posix_update_utime_in_mdata
#1644164: Use GF_ATOMIC ops to update inode->nlookup
#1644629: [rpcsvc] Single request Queue for all event threads is a performance bottleneck
#1644755: CVE-2018-14651 glusterfs: glusterfs server exploitable via symlinks to relative paths [fedora-all]
#1644756: CVE-2018-14653 glusterfs: Heap-based buffer overflow via "gf_getspec_req" RPC message [fedora-all]
#1644757: CVE-2018-14659 glusterfs: Unlimited file creation via "GF_XATTR_IOSTATS_DUMP_KEY" xattr allows for denial of service [fedora-all]
#1644758: CVE-2018-14660 glusterfs: Repeat use of "GF_META_LOCK_KEY" xattr allows for memory exhaustion [fedora-all]
#1644760: CVE-2018-14654 glusterfs: "features/index" translator can create arbitrary, empty files [fedora-all]
#1644763: CVE-2018-14661 glusterfs: features/locks translator passes an user-controlled string to snprintf without a proper format string resulting in a denial of service [fedora-all]
#1645986: tests/bugs/glusterd/optimized-basic-testcases-in-cluster.t failing in distributed regression
#1646104: [Geo-rep]: Faulty geo-rep sessions due to link ownership on slave volume
#1646728: [snapview-server]:forget glfs handles during inode forget
#1646869: gNFS crashed when processing "gluster v status [vol] nfs clients"
#1646892: Portmap entries showing stale brick entries when bricks are down
#1647029: can't enable shared-storage
#1647074: when peer detach is issued, throw a warning to remount volumes using other cluster IPs before proceeding
#1647651: gfapi: fix bad dict setting of lease-id
#1648237: Bumping up of op-version times out on a scaled system with ~1200 volumes
#1648298: dht_revalidate may not heal attrs on the brick root
#1648687: Incorrect usage of local->fd in afr_open_ftruncate_cbk
#1648768: Tracker bug for all leases related issues
#1649709: profile info doesn't work when decompounder xlator is not in graph
#1650115: glusterd requests are timing out in a brick multiplex setup
#1650389: rpc: log flooding with ENODATA errors
#1650403: Memory leaks observed in brick-multiplex scenario on volume start/stop loop
#1650893: fails to sync non-ascii (utf8) file and directory names, causes permanently faulty geo-replication state
#1651059: [OpenSSL] : Retrieving the value of "client.ssl" option,before SSL is set up, fails .
#1651165: Race in per-thread mem-pool when a thread is terminated
#1651431: Resolve memory leak at the time of graph init
#1651439: gluster-NFS crash while expanding volume
#1651463: glusterd can't regenerate volfiles in container storage upgrade workflow
#1651498: [geo-rep]: Failover / Failback shows fault status in a non-root setup
#1651584: [geo-rep]: validate the config checkpoint date and fail if it is not is exact format hh:mm:ss
#1652118: default cluster.max-bricks-per-process to 250
#1652430: glusterd fails to start, when glusterd is restarted in a loop for every 45 seconds while volume creation is in-progress
#1652852: "gluster volume get" doesn't show real default value for server.tcp-user-timeout
#1652887: Geo-rep help looks to have a typo.
#1652911: Add no-verify and ssh-port n options for create command in man page
#1653277: bump up default value of server.event-threads
#1653359: Self-heal:Improve heal performance
#1653565: tests/geo-rep: Add arbiter volume test case
#1654138: Optimize for virt store fails with distribute volume type
#1654181: glusterd segmentation fault: glusterd_op_ac_brick_op_failed (event=0x7f44e0e63f40, ctx=0x0) at glusterd-op-sm.c:5606
#1654187: [geo-rep]: RFE - Make slave volume read-only while setting up geo-rep (by default)
#1654270: glusterd crashed with seg fault possibly during node reboot while volume creates and deletes were happening
#1654521: io-stats outputs json numbers as strings
#1654805: Bitrot: Scrub status say file is corrupted even it was just created AND 'path' in the output is broken
#1654917: cleanup resources in server_init in case of failure
#1655050: automatic split resolution with size as policy should not work on a directory which is in metadata splitbrain
#1655052: Automatic Splitbrain with size as policy must not resolve splitbrains when both the copies are of same size
#1655827: [Glusterd]: Glusterd crash while expanding volumes using heketi
#1655854: Converting distribute to replica-3/arbiter volume fails
#1656100: configure.ac does not enforce automake --foreign
#1656264: Fix tests/bugs/shard/zero-flag.t
#1656348: Commit c9bde3021202f1d5c5a2d19ac05a510fc1f788ac causes ls slowdown
#1656517: [GSS] Gluster client logs filling with 0-glusterfs-socket: invalid port messages
#1656682: brick memory consumed by volume is not getting released even after delete
#1656771: [Samba-Enhancement] Need for a single group command for setting up volume options for samba
#1656951: cluster.max-bricks-per-process 250 not working as expected
#1657607: Convert nr_files to gf_atomic in posix_private structure
#1657744: quorum count not updated in nfs-server vol file
#1657783: Rename of a file leading to stale reads
#1658045: Resolve memory leak in mgmt_pmap_signout_cbk
#1658116: python2 to python3 compatibilty issues
#1659327: 43% regression in small-file sequential read performance
#1659432: Memory leak: dict_t leak in rda_opendir
#1659708: Optimize by not stopping (restart) selfheal deamon (shd) when a volume is stopped unless it is the last volume
#1659857: change max-port value in glusterd vol file to 60999
#1659868: glusterd : features.selinux was missing in glusterd-volume-set file
#1659869: improvements to io-cache
#1659971: Setting slave volume read-only option by default results in failure
#1660577: [Ganesha] Ganesha failed on one node while exporting volumes in loop
#1660701: Use adaptive mutex in rpcsvc_program_register to improve performance
#1661214: Brick is getting OOM for tests/bugs/core/bug-1432542-mpx-restart-crash.t
#1662089: NL cache: fix typos
#1662264: thin-arbiter: Check with thin-arbiter file before marking new entry change log
#1662368: [ovirt-gluster] Fuse mount crashed while deleting a 1 TB image file from ovirt
#1662679: Log connection_id in statedump for posix-locks as well for better debugging experience
#1662906: Longevity: glusterfsd(brick process) crashed when we do volume creates and deletes
#1663077: memory leak in mgmt handshake
#1663102: Change default value for client side heal to off for replicate volumes
#1663223: profile info command is not displaying information of bricks which are hosted on peers
#1663243: rebalance status does not display localhost statistics when op-version is not bumped up
#1664122: do not send bit-rot virtual xattrs in lookup response
#1664124: Improve information dumped from io-threads in statedump
#1664551: Wrong description of localtime-logging in manpages
#1664647: dht: Add NULL check for stbuf in dht_rmdir_lookup_cbk
#1664934: glusterfs-fuse client not benefiting from page cache on read after write
#1665038: glusterd crashed while running "gluster get-state glusterd odir /get-state"
#1665332: Wrong offset is used in offset for zerofill fop
#1665358: allow regression to not run tests with nfs, if nfs is disabled.
#1665363: Fix incorrect definition in index-mem-types.h
#1665656: testcaes glusterd/add-brick-and-validate-replicated-volume-options.t is crash while brick_mux is enable
#1665826: [geo-rep]: Directory renames not synced to slave in Hybrid Crawl
#1666143: Several fixes on socket pollin and pollout return value
#1666833: move few recurring logs to DEBUG level.
#1667779: glusterd leaks about 1GB memory per day on single machine of storage pool
#1667804: Unable to delete directories that contain linkto files that point to itself.
#1667905: dict_leak in __glusterd_handle_cli_uuid_get function
#1668190: Block hosting volume deletion via heketi-cli failed with error "target is busy" but deleted from gluster backend
#1668268: Unable to mount gluster volume
#1669077: [ovirt-gluster] Fuse mount crashed while creating the preallocated image
#1669937: Rebalance : While rebalance is in progress , SGID and sticky bit which is set on the files while file migration is in progress is seen on the mount point
#1670031: performance regression seen with smallfile workload tests
#1670253: Writes on Gluster 5 volumes fail with EIO when "cluster.consistent-metadata" is set
#1670259: New GFID file recreated in a replica set after a GFID mismatch resolution
#1671213: core: move "dict is NULL" logs to DEBUG log level
#1671637: geo-rep: Issue with configparser import
#1672205: 'gluster get-state' command fails if volume brick doesn't exist.
#1672818: GlusterFS 6.0 tracker
#1673267: Fix timeouts so the tests pass on AWS
#1673972: insufficient logging in glusterd_resolve_all_bricks
#1674364: glusterfs-fuse client not benefiting from page cache on read after write
#1676429: distribute: Perf regression in mkdir path
#1677260: rm -rf fails with "Directory not empty"
#1678570: glusterfs FUSE client crashing every few days with 'Failed to dispatch handler'
#1679004: With parallel-readdir enabled, deleting a directory containing stale linkto files fails with "Directory not empty"
#1679275: dht: fix double extra unref of inode at heal path
#1679965: Upgrade from glusterfs 3.12 to gluster 4/5 broken
#1679998: GlusterFS can be improved
#1680020: Integer Overflow possible in md-cache.c due to data type inconsistency
#1680585: remove glupy from code and build
#1680586: Building RPM packages with _for_fedora_koji_builds enabled fails on el6
#1683008: glustereventsd does not start on Ubuntu 16.04 LTS
#1683506: remove experimental xlators informations from glusterd-volume-set.c
#1683716: glusterfind: revert shebangs to #!/usr/bin/python3
#1683880: Multiple shd processes are running on brick_mux environmet
#1683900: Failed to dispatch handler
#1684029: upgrade from 3.12, 4.1 and 5 to 6 broken
#1684777: gNFS crashed when processing "gluster v profile [vol] info nfs"
#1685771: glusterd memory usage grows at 98 MB/h while being monitored by RHGSWA
#1686364: [ovirt-gluster] Rolling gluster upgrade from 3.12.5 to 5.3 led to shard on-disk xattrs disappearing
#1686399: listing a file while writing to it causes deadlock
#1686875: packaging: rdma on s390x, unnecessary ldconfig scriptlets
#1687248: Error handling in /usr/sbin/gluster-eventsapi produces IndexError: tuple index out of range
#1687672: [geo-rep]: Checksum mismatch when 2x2 vols are converted to arbiter
#1688218: Brick process has coredumped, when starting glusterd