X-Git-Url: https://git.whamcloud.com/?p=fs%2Flustre-release.git;a=blobdiff_plain;f=lustre%2FChangeLog;h=ac222b10871f552162c09ee7f38756271a229cdd;hp=c89792fa526af0fdc8dae93a9fc8d96ea4070bf4;hb=789f19e876e24c61209f80baa3c0e431e161c7c7;hpb=ea6598377a832b594e09a286eae8b3571145188c diff --git a/lustre/ChangeLog b/lustre/ChangeLog index c89792f..ac222b1 100644 --- a/lustre/ChangeLog +++ b/lustre/ChangeLog @@ -1,16 +1,2851 @@ tbd Cluster File Systems, Inc. + * version 1.6.1 + * Support for kernels: + 2.6.9-42.0.10.EL (RHEL 4) + 2.6.5-7.283 (SLES 9) + 2.6.12.6 vanilla (kernel.org) + 2.6.16.27-0.9 (SLES 10) + * Client support for unpatched kernels: + (see https://mail.clusterfs.com/wikis/lustre/PatchlessClient) + 2.6.16 - 2.6.19 vanilla (kernel.org) + 2.6.9-42.0.8EL (RHEL 4) + * Recommended e2fsprogs version: 1.39.cfs6 + * bug fixes + * Note that reiserfs quotas are temporarily disabled on SLES 10 in this + kernel. + +Severity : minor +Bugzilla : 11512 +Description: Remove write from health_check, add configure option +Details : While an OSS is under a heavy ost_destroy load, reading the + proc entry /proc/fs/lustre/health_check can take an unreasonably + long time. This disrupts our ability to effectively monitor + the health of the filesystem. (LLNL) + +Severity : enhancement +Bugzilla : 11548 +Description: Add LNET router traceability for debug purposes +Details : If a checksum failure occurs with a router as part of the + IO path, the NID of the last router that forwarded the bulk data + is printed so it can be identified. + +Severity : normal +Frequency : rare +Bugzilla : 11315 +Description: OST "spontaneously" evicts client; client has imp_pingable == 0 +Details : Due to a race condition, liblustre clients were occasionally + evicted incorrectly. + +Severity : enhancement +Bugzilla : 10997 +Description: lfs setstripe uses optional parameters instead of positional + parameters.
+ +Severity : enhancement +Bugzilla : 10651 +Description: Nanosecond timestamp support for ldiskfs +Details : The on-disk ldiskfs filesystem has added support for nanosecond + resolution timestamps. There is not yet support for this at + the Lustre filesystem level. + +Severity : normal +Frequency : during server recovery +Bugzilla : 11203 +Description: MDS failing to send precreate requests due to OSCC_FLAG_RECOVERING +Details : Requests with the rq_no_resend flag set did not wake l_wait_event + when they timed out. + +Severity : minor +Frequency : nfs export on patchless client +Bugzilla : 11970 +Description: connectathon hangs when testing nfs export over patchless client +Details : A disconnected dentry cannot be found with lookup, so we do not need + to unhash it or make it invalid. + +Bugzilla : 11757 +Description: fix llapi_lov_get_uuids() to allow many OSTs to be returned +Details : Change llapi_lov_get_uuids() to read the UUIDs from /proc instead + of using an ioctl. This allows lfsck for > 160 OSTs to succeed. + +Severity : minor +Frequency : rare +Bugzilla : 11546 +Description: open req refcounting wrong on reconnect +Details : If a reconnect happened between getting the open reply from the + server and the call to mdc_set_replay_data in ll_file_open, we would + schedule replay for an unreferenced request that we are about to free. + A subsequent close would crash in a variety of ways. + Check that the request is still eligible for replay in + mdc_set_replay_data(). + +-------------------------------------------------------------------------------- + +2007-05-03 Cluster File Systems, Inc. + * version 1.6.0.1 + * bug fixes + +Severity : normal +Frequency : on some architectures +Bugzilla : 12404 +Description: 1.6 client sometimes fails to mount from a 1.4 MDT +Details : Uninitialized flags sometimes cause configuration commands to + be skipped. + +-------------------------------------------------------------------------------- + +2007-04-19 Cluster File Systems, Inc.
+ * version 1.6.0 + * CONFIGURATION CHANGE. This version of Lustre WILL NOT + INTEROPERATE with older versions automatically. In many cases a + special upgrade step is needed. Please read the + user documentation before upgrading any part of a 1.4.x system. + * WARNING: Lustre configuration and startup changes are required with + this release. See https://mail.clusterfs.com/wikis/lustre/MountConf + for details. + * Support for kernels: + 2.4.21-47.0.1.EL (RHEL 3) + 2.6.5-7.283 (SLES 9) + 2.6.9-42.0.10.EL (RHEL 4) + 2.6.12.6 vanilla (kernel.org) + 2.6.16.27-0.9 (SLES10) + * Client support for unpatched kernels: + (see https://mail.clusterfs.com/wikis/lustre/PatchlessClient) + 2.6.16 - 2.6.19 vanilla (kernel.org) + 2.6.9-42.0.8EL (RHEL 4) + * Recommended e2fsprogs version: 1.39.cfs6 + * Note that reiserfs quotas are disabled on SLES 10 in this kernel + * bug fixes + +Severity : enhancement +Bugzilla : 8007 +Description: MountConf +Details : Lustre configuration is now managed via mkfs and mount + commands instead of lmc and lconf. New obd types (MGS, MGC) + are added for dynamic configuration management. See + https://mail.clusterfs.com/wikis/lustre/MountConf for + details. + +Severity : enhancement +Bugzilla : 4482 +Description: dynamic OST addition +Details : OSTs can now be added to a live filesystem + +Severity : enhancement +Bugzilla : 9851 +Description: startup order invariance +Details : MDTs and OSTs can be started in any order. Clients only + require the MDT to complete startup. + +Severity : enhancement +Bugzilla : 4899 +Description: parallel, asynchronous orphan cleanup +Details : orphan cleanup is now performed in separate threads for each + OST, allowing parallel non-blocking operation. 
+ +Severity : enhancement +Bugzilla : 9862 +Description: optimized stripe assignment +Details : stripe assignments are now made based on ost space available, + ost previous usage, and OSS previous usage, in order to try + to optimize storage space and networking resources. + +Severity : enhancement +Bugzilla : 4226 +Description: Permanently set tunables +Details : All writable /proc/fs/lustre tunables can now be permanently + set on a per-server basis, at mkfs time or on a live + system. + +Severity : enhancement +Bugzilla : 10547 +Description: Lustre message v2 +Details : Add lustre message format v2. + +Severity : enhancement +Bugzilla : 9866 +Description: client OST exclusion list +Details : Clients can be started with a list of OSTs that should be + declared "inactive" for known non-responsive OSTs. + +Severity : normal +Bugzilla : 12123 +Description: ENOENT returned for valid filehandle during dbench. +Details : Check if a directory has children when invalidating dentries + associated with an inode during lock cancellation. This fixes + an incorrect ENOENT sometimes seen for valid filehandles during + testing with dbench. + +Severity : minor +Frequency : SFS test only (otherwise harmless) +Bugzilla : 6062 +Description: SPEC SFS validation failure on NFS v2 over lustre. +Details : Changes the blocksize for regular files to be 2x RPC size, + and not depend on stripe size. + +Severity : enhancement +Bugzilla : 10088 +Description: fine-grained SMP locking inside DLM +Details : Improve DLM performance on SMP systems by removing the single + per-namespace lock and replace it with per-resource locks. + +Severity : enhancement +Bugzilla : 9332 +Description: don't hold multiple extent locks at one time +Details : To avoid client eviction during large writes, locks are not + held on multiple stripes at one time or for very large writes. + Otherwise, clients can block waiting for a lock on a failed OST + while holding locks on other OSTs and be evicted. 
+ +Severity : enhancement +Bugzilla : 9293 +Description: Multiple MD RPCs in flight. +Details : Further unserialise some read-only MDT RPCs - learn about intents. + To avoid overloading the MDT, introduce a limit on the number of + MDT RPCs in flight for a single client and add /proc controls + to adjust this limit. + +Severity : enhancement +Bugzilla : 22484 +Description: client read/write statistics +Details : Add client read/write call usage stats for performance + analysis of user processes. + /proc/fs/lustre/llite/*/offset_stats shows non-sequential + file access. extents_stats shows chunk size distribution. + extents_stats_per_process shows chunk size distribution per + user process. + +Severity : enhancement +Bugzilla : 22485 +Description: per-client statistics on server +Details : Add ldlm and operations statistics for each client in + /proc/fs/lustre/mds|obdfilter/*/exports/ + +Severity : enhancement +Bugzilla : 22486 +Description: improved MDT statistics +Details : Add detailed MDT operations statistics in + /proc/fs/lustre/mds/*/stats + +Severity : enhancement +Bugzilla : 10968 +Description: VFS operations stats +Details : Add client VFS call stats, trackable by pid, ppid, or gid + /proc/fs/lustre/llite/*/vfs_ops_stats + /proc/fs/lustre/llite/*/vfs_track_[pid|ppid|gid] + +Severity : minor +Frequency : always +Bugzilla : 6380 +Description: Fix client-side osc byte counters +Details : The osc read/write byte counters in + /proc/fs/lustre/osc/*/stats are now working + +Severity : minor +Frequency : always as root on SLES +Bugzilla : 10667 +Description: Failure when copying files with lustre special EAs. +Details : The client side now always returns success for setxattr calls on + lustre special xattrs (currently only "trusted.lov"). + +Severity : minor +Frequency : always +Bugzilla : 10345 +Description: Refcount LNET uuids +Details : The global LNET uuid list grew linearly with every startup; + refcount repeated list entries instead of always adding to + the list.
+ +Severity : enhancement +Bugzilla : 2258 +Description: Dynamic service threads +Details : Within a small range, start extra service threads + automatically when the request queue builds up. + +Severity : major +Frequency : mixed-endian client/server environments +Bugzilla : 11214 +Description: mixed-endian crashes +Details : The new msg_v2 system had some failures in mixed-endian + environments. + +Severity : enhancement +Bugzilla : 11229 +Description: Easy OST removal +Details : OSTs can be permanently deactivated with e.g. 'lctl + conf_param lustre-OST0001.osc.active=0' + +Severity : enhancement +Bugzilla : 11335 +Description: MGS proc entries +Details : Added basic proc entries for the MGS showing what filesystems + are served. + +Severity : enhancement +Bugzilla : 10998 +Description: provide MGS failover +Details : Added config lock reacquisition after MGS server failover. + +Severity : enhancement +Bugzilla : 11461 +Description: add Linux 2.4 support +Details : Added support for RHEL 2.4.21 kernel for 1.6 servers and clients + +Severity : normal +Bugzilla : 11330 +Description: a large application tries to do I/O to the same resource and dies + in the middle of it. +Details : Check the req->rq_arrival time after the call to + ost_brw_lock_get(), but before we do anything about + processing it and sending the BULK transfer request. This + should help move old stale pending locks off the queue as + quickly as obd_timeout. + +Severity : major +Frequency : when an incorrect nid is specified during startup +Bugzilla : 10734 +Description: ptlrpc connect to nonexistent node causes kernel crash +Details : LNET can't be re-entered from an event callback, which + happened when we expired a message after the export had been + cleaned up. Instead, hand the zombie cleanup off to another + thread.
+ +Severity : enhancement +Bugzilla : 10902 +Description: plain/inodebits lock performance improvement +Details : Group plain/inodebits locks in the granted list by their request + modes and bits policy, improving the performance of searches through + the granted list. + +Severity : major +Frequency : only if OST filesystem is corrupted +Bugzilla : 9829 +Description: client incorrectly hits assertion in ptlrpc_replay_req() +Details : for a short time RPCs with bulk IO are in the replay list, + but replay of bulk IOs is unimplemented. If the OST filesystem + is corrupted due to disk cache incoherency and then replay is + started it is possible to trip an assertion. Avoid putting + committed RPCs into the replay list at all to avoid this issue. + +Severity : major +Frequency : liblustre (e.g. catamount) on a large cluster with >= 8 OSTs/OSS +Bugzilla : 11684 +Description: System hang on startup +Details : This bug allowed the liblustre (e.g. catamount) client to + return to the app before handling all startup RPCs. This + could leave the node unresponsive to lustre network traffic + and manifested as a server ptllnd timeout. + +Severity : enhancement +Bugzilla : 11667 +Description: Add "/proc/sys/lustre/debug_peer_on_timeout" +Details : liblustre environment variable: LIBLUSTRE_DEBUG_PEER_ON_TIMEOUT + boolean to control whether to print peer debug info when a + client's RPC times out. + +Severity : minor +Frequency : only for kernels with patches from Lustre below 1.4.3 +Bugzilla : 11248 +Description: Remove old rdonly API +Details : Remove old rdonly API, which has been unused since at least + lustre 1.4.3 + +Severity : major +Frequency : only for devices with external journals +Bugzilla : 10719 +Description: Set external device read-only also +Details : During a commanded failover stop, we set the disk device + read-only while the server shuts down. We now also set any + external journal device read-only at the same time.
+ +Severity : minor +Frequency : when upgrading from 1.4 while trying to change parameters +Bugzilla : 11692 +Description: The wrong (new) MDC name was used when setting parameters for + upgraded MDTs. Also allows changing of OSC (and MDC) + parameters if --writeconf is specified at tunefs upgrade time. + +Severity : major +Frequency : when setting specific ost indices +Bugzilla : 11149 +Description: QOS code breaks on skipped indices +Details : Add checks for missing OST indices in the QOS code, so OSTs + created with --index need not be sequential. + +Severity : enhancement +Bugzilla : 11264 +Description: Add uninit_groups feature to ldiskfs2 to speed up e2fsck +Details : The uninit_groups feature works in conjunction with the kernel + filesystem code (ldiskfs2 only) and e2fsprogs-1.39-cfs6 to speed + up the pass1 processing of e2fsck. This is a read-only feature + in ldiskfs2 only, so older kernels and current ldiskfs cannot + mount filesystems that have had this feature enabled. + +Severity : enhancement +Bugzilla : 10816 +Description: Improve multi-block allocation algorithm to avoid fragmentation +Details : The mballoc3 code (ldiskfs2 only) adds new mechanisms to improve + allocation locality and avoid filesystem fragmentation. + +------------------------------------------------------------------------------ + +2007-04-01 Cluster File Systems, Inc. + * version 1.4.10 + * Support for kernels: + 2.4.21-47.0.1.EL (RHEL 3) + 2.6.5-7.283 (SLES 9) + 2.6.9-42.0.10.EL (RHEL 4) + 2.6.12.6 vanilla (kernel.org) + 2.6.16.27-0.9 (SLES 10) + * Recommended e2fsprogs version: 1.39.cfs5 + + * Note that reiserfs quotas are disabled on SLES 10 in this kernel + * bug fixes + +Severity : normal +Frequency : always +Bugzilla : 3244 +Description: Addition of EXT3_FEATURE_RO_COMPAT_DIR_NLINK flag for + > 32000 subdirectories +Details : Add EXT3_FEATURE_RO_COMPAT_DIR_NLINK flag to + EXT3_FEATURE_RO_COMPAT_SUPP. This flag will be set whenever + the subdirectory count crosses 32000.
This will help e2fsck + correctly handle more than 32000 subdirectories. + +Severity : major +Frequency : liblustre (e.g. catamount) on a large cluster with >= 8 OSTs/OSS +Bugzilla : 11684 +Description: System hang on startup +Details : This bug allowed the liblustre (e.g. catamount) client to + return to the app before handling all startup RPCs. This + could leave the node unresponsive to lustre network traffic + and manifested as a server ptllnd timeout. + +Severity : enhancement +Bugzilla : 11667 +Description: Add "/proc/sys/lustre/debug_peer_on_timeout" + (liblustre environment variable: LIBLUSTRE_DEBUG_PEER_ON_TIMEOUT) + boolean to control whether to print peer debug info when a + client's RPC times out. + +Severity : normal +Frequency : always +Bugzilla : 10214 +Description: make O_SYNC work on 2.6 kernels +Details : 2.6 kernels use a different method for marking pages for write, + so code was added to lustre to make O_SYNC work. + +Severity : minor +Frequency : always +Bugzilla : 11110 +Description: Failure to close file and release space on NFS +Details : Put inode details into lock acquired in ll_intent_file_open. + Use mdc_intent_lock in ll_intent_open to properly + detect all kinds of errors unhandled by mdc_enqueue. + +Severity : major +Frequency : rare +Bugzilla : 10866 +Description: proc file read during shutdown sometimes raced obd removal, + causing node crash +Details : Add lock to prevent obd access after proc file removal. + +Severity : normal +Frequency : Only for files larger than 4GB on 32-bit clients. +Bugzilla : 11237 +Description: improperly doing page alignment of locks +Details : Modify lustre core code to use CFS_PAGE_* defines instead of + PAGE_*. Make CFS_PAGE_MASK a 64-bit mask. + +Severity : normal +Frequency : rarely +Bugzilla : 11203 +Description: RPCs being resent when they shouldn't be +Details : Some RPCs that should not be resent are being resent. This + can cause inconsistencies in the RPC state machine. Do not + resend such requests.
+ +Severity : normal +Frequency : rare, only with NFS export +Bugzilla : 11669 +Description: Crash on NFS re-export node +Details : under very unusual load conditions an assertion is hit in + ll_intent_file_open() + +Severity : major +Frequency : only if OST filesystem is corrupted +Bugzilla : 9829 +Description: client incorrectly hits assertion in ptlrpc_replay_req() +Details : for a short time RPCs with bulk IO are in the replay list, + but replay of bulk IOs is unimplemented. If the OST filesystem + is corrupted due to disk cache incoherency and then replay is + started it is possible to trip an assertion. Avoid putting + committed RPCs into the replay list at all to avoid this issue. + +Severity : normal +Frequency : always +Bugzilla : 10901 +Description: large O_DIRECT requests fail under memory pressure/fragmentation +Details : Large single O_DIRECT read and write calls can fail to allocate + a sufficiently large buffer to process the request. In case of + allocation failure the allocation is retried with a smaller + buffer and broken into smaller requests. + +Severity : enhancement +Bugzilla : 11563 +Description: Add -o localflock option to simulate old noflock behaviour. +Details : This provides local-only coherency for flock/fcntl locks. + +Severity : normal +Frequency : always +Bugzilla : 11090 +Description: versioning check is incomplete +Details : Check the version difference between client and server and + report an error if the gap is too big. + +Severity : major +Bugzilla : 11710 +Frequency : always +Description: add support for PG_writeback bit +Details : Add support for the PG_writeback bit to Lustre, for more careful + handling of the page cache in 2.6 kernels. This also fixes some + deadlocks and removes the hack that made O_SYNC work with 2.6 kernels.
+ +Severity : enhancement +Bugzilla : 11264 +Description: Add uninit_groups feature to ldiskfs2 to speed up e2fsck +Details : The uninit_groups feature works in conjunction with the kernel + filesystem code (ldiskfs2 only) and e2fsprogs-1.39-cfs6 to speed + up the pass1 processing of e2fsck. This is a read-only feature + in ldiskfs2 only, so older kernels and current ldiskfs cannot + mount filesystems that have had this feature enabled. + +Severity : enhancement +Bugzilla : 10816 +Description: Improve multi-block allocation algorithm to avoid fragmentation +Details : The mballoc3 code (ldiskfs2 only) adds new mechanisms to improve + allocation locality and avoid filesystem fragmentation. + +------------------------------------------------------------------------------ + +2007-02-09 Cluster File Systems, Inc. + * version 1.4.9 + * Support for kernels: + 2.6.9-42.0.3EL (RHEL 4) + 2.6.5-7.276 (SLES 9) + 2.4.21-47.0.1.EL (RHEL 3) + 2.6.12.6 vanilla (kernel.org) + * Recommended e2fsprogs version: 1.39.cfs2-0 + + * The backwards-compatible /proc/sys/portals symlink has been removed + in this release. Before upgrading, please ensure that you change + any configuration scripts or /etc/sysctl.conf files that access + /proc/sys/portals/* or sysctl portals.* to use the corresponding + entry in /proc/sys/lnet or sysctl lnet.*. This change can be made + in advance of the upgrade on any system running Lustre 1.4.6 or + newer, since /proc/sys/lnet was added in that version. + * Note that reiserfs quotas are disabled on SLES 10 in this kernel + * bug fixes + +Severity : minor +Frequency : only when quota is used +Bugzilla : 11286 +Description: avoid scanning export list for quota master +Details : Change the algorithms to avoid scanning export list in order + to improve the efficiency. 
+ +Severity : critical +Frequency : MDS failover only, very rarely +Bugzilla : 11125 +Description: "went back in time" messages on mds failover +Details : The greatest transno may be lost when the current operation + finishes with an error (transno==0) and the client's last_rcvd + record is over-written. Save the greatest transno in the + mds_last_transno for this case. + +Severity : minor +Frequency : always for specific kernels and striping counts +Bugzilla : 11042 +Description: client may get "Matching packet too big" without ACL support +Details : Clients compiled without CONFIG_FS_POSIX_ACL get an error message + when trying to access files in certain configurations. The + clients should in fact be denied when mounting because they do + not understand ACLs. + +Severity : major +Frequency : Cray XT3 with more than 4000 clients and multiple jobs +Bugzilla : 10906 +Description: many clients connecting with IO in progress causes connect timeouts +Details : Avoid synchronous journal commits to avoid delays caused by many + clients connecting/disconnecting when bulk IO is in progress. + Queue liblustre connect requests on OST_REQUEST_PORTAL instead of + OST_IO_PORTAL to avoid delays behind potentially many pending + slow IO requests. + +Severity : normal +Frequency : occasionally with multiple writers to a single file +Bugzilla : 11081 +Description: shared writes to file may result in wrong size reported by stat() +Details : Allow growing of kms when extent lock is cancelled + +Severity : minor +Frequency : always with random mmap IO to multi-striped file +Bugzilla : 10919 +Description: mmap write might be lost if we are writing to a 'hole' in stripe +Details : Only if the hole is at the end of OST object so that kms is too + small. Fix is to increase kms accordingly in ll_nopage. 
+ +Severity : normal +Frequency : rare, only if OST filesystem is inconsistent with MDS filesystem +Bugzilla : 11211 +Description: writes to a missing object would leak memory on the OST +Details : If there is an inconsistency between the MDS and OST filesystems, + such that the MDS references an object that doesn't exist, writes + to that object will leak memory due to incorrect cleanup in the + error handling path, eventually running out of memory on the OST. + +Severity : minor +Frequency : rare +Bugzilla : 11040 +Description: Creating a symlink that is too long causes lustre errors +Details : Check symlink and name lengths before sending requests to MDS. + +Severity : normal +Frequency : only if flock is enabled (not on by default) +Bugzilla : 11415 +Description: posix locks not released on fd closure on 2.6.9+ +Details : We failed to add posix locks to the list of inode locks on 2.6.9+ + kernels. This caused such locks not to be released on fd close, + followed by assertions about still-used locks on fs unmount. + +Severity : minor +Frequency : MDS failover only, very rarely +Bugzilla : 11277 +Description: clients may get ASSERTION(granted_lock != NULL) +Details : When a request was taking a long time and a client was resending + a getattr by name lock request, there were multiple lock requests + with the same client lock handle, and + mds_getattr_name->fixup_handle_for_resent_request found one of the + lock handles but later failed with ASSERTION(granted_lock != NULL). + +Severity : major +Frequency : rare +Bugzilla : 10891 +Description: handle->h_buffer_credits > 0, assertion failure +Details : h_buffer_credits is zero after truncate, causing assertion + failure. This patch extends the transaction or creates a new + one after truncate. + +Severity : normal +Frequency : NFS re-export or patchless client +Bugzilla : 11179, 10796 +Description: Crash on NFS re-export node (__d_move) +Details : We do not want to hash the dentry if we don't have a lock.
+ But if this dentry is later used in d_move, we'd hit the uninitialised + list head d_hash, so we just init the d_hash field but + leave the dentry unhashed. + +Severity : normal +Frequency : NFS re-export or patchless client +Bugzilla : 11135 +Description: NFS exports have problems with symbolic links +Details : lustre client didn't properly install dentry when re-exported + to NFS or running patchless client. + +Severity : normal +Frequency : NFS re-export or patchless client +Bugzilla : 10796 +Description: Various nfs/patchless fixes. +Details : Fix reuse of a disconnected alias in the lookup process - this fixes + the warning "find_exported_dentry: npd != pd", + fix permission error with open files on nfs, + fix applying umask when revalidating. + +Severity : normal +Frequency : occasional +Bugzilla : 11191 +Description: Crash on NFS re-export node +Details : calling clear_page() on the wrong pointer triggered oops in + generic_mapping_read(). + +Severity : normal +Frequency : rarely, using O_DIRECT IO +Bugzilla : 10903 +Description: unaligned directio crashes client with LASSERT +Details : check for unaligned buffers before trying any requests. + +Severity : major +Frequency : rarely, using CFS RAID5 patches in non-standard kernel series +Bugzilla : 11313 +Description: stale data returned from RAID cache +Details : If only a small amount of IO is done to the RAID device before + reading it again it is possible to get stale data from the RAID + cache instead of reading it from disk. + +Severity : normal +Frequency : always for sles10 kernel +Bugzilla : 10947 +Description: sles10 support +Details : ll_follow_link: compile fixes and use of nd_set_link + under newer kernels.
+ +Severity : major +Frequency : depends on arch, kernel and compiler version, always on sles10 + kernel and x86_64 +Bugzilla : 11562 +Description: recursive or deep enough symlinks cause stack overflow +Details : getting rid of large stack-allocated variable in + __vfs_follow_link + +Severity : minor +Frequency : depends on hardware +Bugzilla : 11540 +Description: lustre write performance loss in the SLES10 kernel +Details : the performance loss is caused by using of write barriers in the + ext3 code. The SLES10 kernel turns barrier support on by + default. The fix is to undo that change for ldiskfs. + + +------------------------------------------------------------------------------ + +2006-12-09 Cluster File Systems, Inc. + * version 1.4.8 + * Support for kernels: + 2.6.9-42.0.3EL (RHEL 4) + 2.6.5-7.276 (SLES 9) + 2.4.21-47.0.1.EL (RHEL 3) + 2.6.12.6 vanilla (kernel.org) + * bug fixes + +Severity : major +Frequency : quota enabled and large files being deleted +Bugzilla : 10707 +Description: releasing more than 4GB of quota at once hangs OST +Details : If a user deletes more than 4GB of files on a single OST it + will cause the OST to spin in an infinite loop. Release + quota in < 4GB chunks, or use a 64-bit value for 1.4.7.1+. + +Severity : minor +Frequency : rare +Bugzilla : 10845 +Description: statfs data retrieved from /proc may be stale or zero +Details : When reading per-device statfs data from /proc, in the + {kbytes,files}_{total,free,avail} files, it may appear + as zero or be out of date. + +Severity : minor +Frequency : systems with MD RAID1 external journal devices +Bugzilla : 10832 +Description: lconf's call to blkid is confused by RAID1 journal devices +Details : Use the "blkid -l" flag to locate the MD RAID device instead + of returning all block devices that match the journal UUID. 
+ +Severity : normal +Frequency : always, for aggregate stripe size over 4GB +Bugzilla : 10725 +Description: "lfs setstripe" fails assertion when setting 4GB+ stripe width +Details : Using "lfs setstripe" to set stripe size * stripe count over 4GB + will fail the kernel with "ASSERTION(lsm->lsm_xfersize != 0)" + +Severity : minor +Frequency : always if "lfs find" used on a local file/directory +Bugzilla : 10864 +Description: "lfs find" segfaults if used on a local file/directory +Details : The case where a directory component was not specified wasn't + handled correctly. Handle this properly. + +Severity : normal +Frequency : always on ppc64 +Bugzilla : 10634 +Description: the write to an ext3 filesystem mounted with mballoc got stuck +Details : ext3_mb_generate_buddy() uses find_next_bit() which does not + perform endianness conversion. + +Severity : major +Frequency : rarely (truncate to non-zero file size after write under load) +Bugzilla : 10730, 10687 +Description: Files padded with zeros to next 4K multiple +Details : With filesystems mounted using the "extents" option (2.6 kernels) + it is possible that files that are truncated to a non-zero size + immediately after being written are filled with zero bytes beyond + the truncated size. No file data is lost. + +Severity : enhancement +Bugzilla : 10452 +Description: Allow recovery/failover for liblustre clients. +Details : liblustre clients were unaware of failover configurations until + now. + +Severity : enhancement +Bugzilla : 10743 +Description: user file locks should fail when not mounting with flock option +Details : Set up an error-returning stub in ll_file_operations.lock field + to prevent incorrect behaviour when client is mounted without + flock option. Also, set up properly f_op->flock field for + RHEL4 kernels. 
+ +Severity : minor +Frequency : always on ia64 +Bugzilla : 10905 +Description: "lfs df" loops on printing out MDS statfs information +Details : The obd_ioctl_data was not initialized and in some systems + this caused a failure during the ioctl that did not return + an error. Initialize the struct and return an error on failure. + +Severity : minor +Frequency : SLES 9 only +Bugzilla : 10667 +Description: Error when copying files with lustre special EAs as root +Details : The client side now always returns success for setxattr calls on + lustre special xattrs (currently only "trusted.lov"). + +Severity : normal +Frequency : rarely on clusters with both ia64+i386 clients +Bugzilla : 10672 +Description: ia64+i686 clients doing shared IO on the same file may LBUG +Details : In rare cases when both ia64+i686 (or other mixed-PAGE_SIZE) + clients are doing concurrent writes to the same file it is + possible that the ia64 clients may LASSERT because the OST + extent locks are not PAGE_SIZE aligned. Ensure that grown + locks are always aligned on the request boundary. + +Severity : normal +Frequency : specific use, occasional +Bugzilla : 7040 +Description: Overwriting in use executable truncates on-disk binary image +Details : If one node attempts to overwrite an executable in use by + another node, we now correctly return ETXTBSY instead of + truncating the file. + +Severity : enhancement +Bugzilla : 4900 +Description: Async OSC create to avoid blocking unnecessarily. +Details : If an OST had no remaining objects, the system would block in + create when it needed a new object on this OST. Now, always use + pre-created objects when available, instead of blocking on an + empty osc while others are not empty. If we must block, we block + for the shortest possible period of time.
+ +Severity : normal +Frequency : rare +Bugzilla : 2707 +Description: chmod on Lustre root is propagated to other clients +Details : Re-validate root's dentry in ll_lookup_it to avoid having it + invalid by the follow_mount time. + +Severity : minor +Frequency : liblustre clients only +Bugzilla : 10883 +Description: Race in 'instant cancel' lock handling could lead to such locks + never being granted in case of SMP MDS +Details : Do not destroy not yet granted but cbpending locks in + handle_enqueue + +Severity : minor +Frequency : replay/resend of open +Bugzilla : 10991 +Description: non null lock assertion failure in mds_intent_policy +Details : Trying to replay/resend lockless open requests resulted in + mds_open() returning 0 with no lock. Now it sets a flag if + a lock is going to be returned. + +Severity : enhancement +Bugzilla : 10889 +Description: Checksum enhancements +Details : New checksum enhancements allow for resending RPCs that failed + checksum checks. + +Severity : enhancement +Bugzilla : 7376 +Description: Tunables on number of dirty pages in cache +Details : Allow setting a limit on the number of dirty pages cached. + +Severity : normal +Frequency : rare +Bugzilla : 10643 +Description: client crash on unmount - lock still has references +Details : In some error handling cases it was possible to leak a lock + reference on a client while accessing a file. This was not + harmful to the client during operation, but would cause the + client to crash when the filesystem is unmounted. + +Severity : normal +Frequency : specific case, rare +Bugzilla : 10921 +Description: ETXTBSY on mds though file not in use by client +Details : ETXTBSY is no longer incorrectly returned when attempting to + chmod or chown a directory that the user previously tried to + execute or a currently-executing binary.
+ +Severity : major +Frequency : extremely rare except on liblustre-based clients +Bugzilla : 10480 +Description: Lustre space not freed when files are deleted +Details : Clean up open-unlinked files after client eviction. Previously + the unlink was skipped and the files remained as orphans. + +Severity : normal +Frequency : rare +Bugzilla : 10999 +Description: OST failure "would be an LBUG" in waiting_locks_callback() +Details : In some cases it was possible to send a blocking callback to + a client doing a glimpse, even though that client didn't get + a lock granted. When the glimpse lock is cancelled on the OST + the freed lock is left on the waiting list and corrupted the list. + +Severity : major +Frequency : all core dumps +Bugzilla : 11103 +Description: Broke core dumps to lustre +Details : Negative dentry may be unhashed if parent does not have UPDATE + lock, but some callers, e.g. do_coredump, expect dentry to be + hashed after successful create, hash it in ll_create_it. + +------------------------------------------------------------------------------ + +2006-09-13 Cluster File Systems, Inc. + * version 1.4.7.1 + * Support for kernels: + 2.6.9-42.0.2.EL (RHEL 4) + 2.6.5-7.276 (SLES 9) + 2.4.21-40.EL (RHEL 3) + 2.6.12.6 vanilla (kernel.org) + * bug fix + +Severity : major +Frequency : always on RHEL 3 +Bugzilla : 10867 +Description: Number of open files grows over time +Details : The number of open files grows over time, whether or not + Lustre is started. This was due to a filp leak introduced + by one of our kernel patches. + +------------------------------------------------------------------------------ + +2006-08-20 Cluster File Systems, Inc. 
+ * version 1.4.7 + * Support for kernels: + 2.6.9-42.EL (RHEL 4) + 2.6.5-7.267 (SLES 9) + 2.4.21-40.EL (RHEL 3) + 2.6.12.6 vanilla (kernel.org) + * bug fixes + +Severity : major +Frequency : rare +Bugzilla : 5719, 9635, 9792, 9684 +Description: OST (or MDS) trips assertions in (re)connection under heavy load +Details : If a server is under heavy load and cannot reply to new + connection requests before the client resends the (re)connect, + the connection handling code can behave badly if two service + threads are concurrently handling separate (re)connections from + the same client. Add better locking to the connection handling + code, and ensure that only a single connection will be processed + for a given client UUID, even if the lock is dropped. + +Severity : enhancement +Bugzilla : 3627 +Description: add TCP zero-copy support to kernel +Details : Add support to the kernel TCP stack to allow zero-copy bulk + sends if the hardware supports scatter-gather and checksumming. + This allows socklnd to do client-write and server-read more + efficiently and reduce CPU utilization from skbuf copying. + +Severity : minor +Frequency : only if NFS exporting from client +Bugzilla : 10258 +Description: NULL pointer deref in ll_iocontrol() if chattr mknod file +Details : If setting attributes on a file created under NFS that had + never been opened it would be possible to oops the client + if the file had no objects. + +Severity : major +Frequency : rare +Bugzilla : 9326, 10402, 10897 +Description: client crash in ptlrpcd_wake() thread when sending async RPC +Details : It is possible that ptlrpcd_wake() dereferences a freed async + RPC. In rare cases the ptlrpcd thread already processed the RPC + before ptlrpcd_wake() was called and the request was freed.
+ +Severity : minor +Frequency : always for liblustre +Bugzilla : 10290 +Description: liblustre client does MDS+OSTs setattr RPC for each write +Details : When doing a write from a liblustre client, the client + incorrectly issued an RPC to the MDS and each OST the file was + striped over in order to update the timestamps. When writing + with small chunks and many clients this could overwhelm the MDS + with RPCs. In all cases it would slow down the write because + these RPCs are unnecessary. + +Severity : enhancement +Bugzilla : 9340 +Description: allow number of MDS service threads to be changed at module load +Details : It is now possible to change the number of MDS service threads + running. Adding "options mds mds_num_threads={N}" to the MDS's + /etc/modprobe.conf will set the number of threads for the next + time Lustre is restarted (assuming the "mds" module is also + reloaded at that time). The default number of threads will + stay the same, 32 for most systems. + +Severity : major +Frequency : rare +Bugzilla : 10300 +Description: OST crash if filesystem is unformatted or corrupt +Details : If an OST is started on a device that has never been formatted + or if the filesystem is corrupt and cannot even mount then the + error handling cleanup routines would dereference a NULL pointer. + +Severity : normal +Frequency : rare +Bugzilla : 10047 +Description: NULL pointer deref in llap_from_page. +Details : get_cache_page_nowait can return a page with NULL (or otherwise + incorrect) mapping if the page was truncated/reclaimed while it was + searched for. Check for this condition and skip such pages when + doing readahead. Introduce extra check to llap_from_page() to + verify page->mapping->host is non-NULL (so page is not anonymous). + +Severity : minor +Frequency : Sometimes when using sys_sendfile +Bugzilla : 7020 +Description: "page not covered by a lock" warnings from ll_readpage +Details : sendfile called ll_readpage without right page locks present. 
+ Now we introduce ll_file_sendfile, which does the necessary locking + around the call to generic_file_sendfile() much like we do in + ll_file_read(). + +Severity : normal +Frequency : with certain MDS communication failures at client mount time +Bugzilla : 10268 +Description: NULL pointer deref after failed client mount +Details : a client connection request may be delayed by the network layer + and not be sent until after the PTLRPC layer has timed out the + request. If the client fails the mount immediately it will try + to clean up before the network times out the request. Add a + reference from the request import to the obd device and delay + the cleanup until the network drops the request. + +Severity : normal +Frequency : occasionally during client (re)connect +Bugzilla : 9387 +Description: assertion failure during client (re)connect +Details : processing a client connection request may be delayed by the + client or server longer than the client connect timeout. This + causes the client to resend the connection request. If the + original connection request is replied to in this interval, the + client may trip an assertion failure in ptlrpc_connect_interpret() + which thought it would be the only running connect process. + +Severity : normal +Frequency : only with obd_echo servers and clients that are rebooted +Bugzilla : 10140 +Description: kernel BUG accessing uninitialized data structure +Details : When running an obd_echo server it did not start the ping_evictor + thread, and when a client was evicted an uninitialized data + structure was accessed. Start the ping_evictor in the RPC + service startup instead of the OBD startup. + +Severity : enhancement +Bugzilla : 10193 (patchless) +Description: Remove dependency on various unexported kernel interfaces. +Details : No longer need reparent_to_init, exit_mm, exit_files, + sock_getsockopt, filemap_populate, FMODE_EXEC, put_filp.
+ +Severity : minor +Frequency : rare (only users of deprecated and unsupported LDAP config) +Bugzilla : 9337 +Description: write_conf for zeroconf mount queried LDAP incorrectly for client +Details : LDAP apparently contains 'lustreName' attributes instead of + 'name'. A simple remapping of the name is sufficient. + +Severity : major +Frequency : rare (only with non-default dump_on_timeout debug enabled) +Bugzilla : 10397 +Description: waiting_locks_callback trips kernel BUG if client is evicted +Details : Running with the dump_on_timeout debug flag turned on makes + it possible that the waiting_locks_callback() can try to dump + the Lustre kernel debug logs from an interrupt handler. Defer + this log dumping to the expired_lock_main() thread. + +Severity : enhancement +Bugzilla : 10420 +Description: Support NFS exporting on 2.6 kernels. +Details : Implement non-rawops metadata methods for NFS server to use without + changing NFS server code. + +Severity : normal +Frequency : very rare (synthetic metadata workload only) +Bugzilla : 9974 +Description: two racing renames might cause an MDS thread to deadlock +Details : Running the "racer" program may cause one MDS thread to rename + a file from being the source of a rename to being the target of + a rename at exactly the same time that another thread is doing + so, and the second thread has already enqueued these locks after + doing a lookup of the target and is trying to relock them in + order. Ensure that we don't try to re-lock the same resource. + +Severity : major +Frequency : only very large systems with liblustre clients +Bugzilla : 7304 +Description: slow eviction of liblustre clients with the "evict_by_nid" RPC +Details : Use asynchronous set_info RPCs to send the "evict_by_nid" to + all OSTs in parallel. This allows the eviction of stale liblustre + clients to proceed much faster than if they were done in series, + and also offers similar improvements for other set_info RPCs. 
+ +Severity : minor +Frequency : common +Bugzilla : 10265 +Description: excessive CPU usage during initial read phase on client +Details : During the initial read phase on a client, it would aggressively + retry readahead on the file, consuming too much CPU and impacting + performance (since 1.4.5.8). Improve the readahead algorithm + to avoid this, and also improve some other common cases (read + of small files in particular, where "small" is files smaller than + /proc/fs/lustre/llite/*/max_read_ahead_whole_mb, 2MB by default). + +Severity : minor +Frequency : rare +Bugzilla : 10450 +Description: MDS crash when receiving packet with unknown intent. +Details : Do not LBUG in unknown intent case, just return -EFAULT + +Severity : enhancement +Bugzilla : 9293, 9385 +Description: MDS RPCs are serialised on client. This is unnecessary for some. +Details : Do not serialize getattr (non-intent version) and statfs. + +Severity : minor +Frequency : occasional, when OST network is overloaded/intermittent +Bugzilla : 10416 +Description: client evicted by OST after bulk IO timeout +Details : If a client sends a bulk IO request (read or write) the OST + may evict the client if it is unresponsive to its data GET/PUT + request. This is incorrect if the network is overloaded (takes + too long to transfer the RPC data) or dropped the OST GET/PUT + request. There is no need to evict the client at all, since + the pinger and/or lock callbacks will handle this, and the + client can restart the bulk request. + +Severity : minor +Frequency : Always when mmapping file with no objects +Bugzilla : 10438 +Description: client crashes when mmapping file with no objects +Details : Check that we actually have objects in a file before doing any + operations on objects in ll_vm_open, ll_vm_close and + ll_glimpse_size.
+ +Severity : minor +Frequency : Rare +Bugzilla : 10484 +Description: Request leak when working with deleted CWD +Details : Introduce advanced request refcount tracking for requests + referenced from lustre intent. + +Severity : Enhancement +Bugzilla : 10482 +Description: Cache open file handles on client. +Details : MDS now will return special lock along with openhandle, if + requested and client is allowed to hold openhandle, even if unused, + until such a lock is revoked. Helps NFS a lot, since NFS opens and + closes files for every read/write operation. + +Severity : Enhancement +Bugzilla : 9291 +Description: Cache open negative dentries on client when possible. +Details : Guard negative dentries with UPDATE lock on parent dir, drop + negative dentries on lock revocation. + +Severity : minor +Frequency : Always +Bugzilla : 10510 +Description: Remounting a client read-only wasn't possible with a zconf mount +Details : It wasn't possible to remount a client read-only with llmount. + +Severity : enhancement +Description: Include MPICH 1.2.6 Lustre ADIO interface patch +Details : In lustre/contrib/ or /usr/share/lustre in RPM a patch for + MPICH is included to add Lustre-specific ADIO interfaces. + This is based closely on the UFS ADIO layer and only differs + in file creation, in order to allow the OST striping to be set. + This is user-contributed code and not supported by CFS. + +Severity : minor +Frequency : Always +Bugzilla : 9486 +Description: extended inode attributes (immutable, append-only) work improperly + when 2.4 and 2.6 kernels are used on client/server or vice versa +Details : Introduce kernel-independent values for these flags. + +Severity : enhancement +Frequency : Always +Bugzilla : 10248 +Description: Allow fractional MB tunings for lustre in /proc/ filesystem. +Details : Many of the /proc/ tunables can only be tuned at a megabyte + granularity. Fractional MB granularity is now supported, + which is very useful for low-memory systems.
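As an illustration of the fractional-MB tunables entry (bug 10248) above, here is a minimal Python sketch of how a decimal megabyte string could be converted to a byte count using integer arithmetic only. The helper name `parse_fractional_mb` is hypothetical and is not a Lustre function; the actual /proc parsing code may work differently.

```python
MB = 1 << 20  # bytes per megabyte, as used for the /proc tunables


def parse_fractional_mb(text: str) -> int:
    """Convert a decimal megabyte string such as "1.5" to whole bytes.

    Hypothetical helper for illustration only. It avoids floating point,
    truncating any remainder below one byte.
    """
    whole, _, frac = text.strip().partition(".")
    if not (whole or frac) or not (whole + frac).isdecimal():
        raise ValueError(f"not a non-negative decimal number: {text!r}")
    value = int(whole or "0") * MB
    if frac:
        # Scale the fractional digits by 10**len(frac), e.g. "25" -> 25/100 MB.
        value += int(frac) * MB // 10 ** len(frac)
    return value


if __name__ == "__main__":
    for sample in ("2", "1.5", "0.25"):
        print(sample, "MB =", parse_fractional_mb(sample), "bytes")
```

Integer arithmetic is used here because kernel code generally avoids floating point; that motivation is an assumption about the design, not something stated in the entry.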
+ +Severity : enhancement +Bugzilla : 9292 +Description: Getattr by fid +Details : Getting a file attributes by its fid, obtaining UPDATE|LOOKUP + locks, avoids extra getattr rpc requests to MDS, allows '/' to + have locks and avoids getattr rpc requests for it on every stat. + +Severity : major +Frequency : Always, for filesystems larger than 2TB +Bugzilla : 6191 +Description: ldiskfs crash at mount for filesystem larger than 2TB with mballoc +Details : Kernel kmalloc limits allocations to 128kB and this prevents + filesystems larger than 2TB from being mounted with mballoc enabled. + +Severity : critical +Frequency : Always, for 32-bit kernel without CONFIG_LBD and filesystem > 2TB +Bugzilla : 6191 +Description: filesystem corruption for non-standard kernels and very large OSTs +Details : If a 32-bit kernel is compiled without CONFIG_LBD enabled and a + filesystem larger than 2TB is mounted then the kernel will + silently corrupt the start of the filesystem. CONFIG_LBD is + enabled for all CFS-supported kernels, but the possibility of + this happening with a modified kernel config exists. + +Severity : enhancement +Bugzilla : 10462 +Description: add client O_DIRECT support for 2.6 kernels +Details : It is now possible to do O_DIRECT reads and writes to files + in the Lustre client mountpoint on 2.6 kernel clients. + +Severity : enhancement +Bugzilla : 10446 +Description: parallel glimpse, setattr, statfs, punch, destroy requests +Details : Sends glimpse, setattr, statfs, punch, destroy requests to OSTs in + parallel, not waiting for response from every OST before sending + a rpc to the next OST. + +Severity : minor +Frequency : rare +Bugzilla : 10150 +Description: setattr vs write race when updating file timestamps +Details : Client processes that update a file timestamp into the past + right after writing to the file (e.g.
tar) it is possible that + the updated file modification time can be reset to the current + time due to a race between processing the setattr and write RPC. + +Severity : enhancement +Bugzilla : 10318 +Description: Bring 'lfs find' closer in line with regular Linux find. +Details : lfs find util supports -atime, -mtime, -ctime, -maxdepth, -print, + -print0 options and obtains all the needed info through the lustre + ioctls. + +Severity : enhancement +Bugzilla : 6221 +Description: support up to 1024 configured devices on one node +Details : change obd_dev array from statically allocated to dynamically + allocated structs as they are first used to reduce memory usage + +Severity : minor +Frequency : rare +Bugzilla : 10437 +Description: Flush dirty partially truncated pages during truncate +Details : Immediately flush partially truncated pages in filter_setattr, + this way we completely avoid having any pages in page cache on OST + and can retire ugly workarounds during writes to flush such pages. + +Severity : minor +Frequency : rare +Bugzilla : 10409 +Description: i_sem vs transaction deadlock in mds_obd_destroy during unlink. +Details : protect inode from truncation within vfs_unlink() context + just take a reference before calling vfs_unlink() and release it + when parent's i_sem is free. + +Severity : minor +Frequency : always, if extents are used on OSTs +Bugzilla : 10703 +Description: index ei_leaf_hi (48-bit extension) is not zeroed in extent index +Details : OSTs using the extents format would not zero the high 16 bits of + the index physical block number. This is not a problem for any + OST filesystems smaller than 16TB, and no kernels support ext3 + filesystems larger than 16TB yet. This is fixed in 1.4.7 (all + new/modified files) and can be fixed for existing filesystems + with e2fsprogs-1.39-cfs1.
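The 16TB threshold in the ei_leaf_hi entry (bug 10703) above can be checked with a short worked example. The Python sketch below assumes 4 KiB filesystem blocks (an assumption; the entry does not state the block size) and uses hypothetical parameter names `leaf_lo` and `leaf_hi` for the low 32 bits and high 16 bits of the 48-bit physical block number. It is an illustration, not the ldiskfs or e2fsprogs code.

```python
BLOCK_SIZE = 4096       # assumption: 4 KiB filesystem blocks
LO_MASK = 0xFFFFFFFF    # low 32 bits of the physical block number
HI_MASK = 0xFFFF        # high 16 bits (the 48-bit extension)
TIB = 1 << 40


def physical_block(leaf_lo: int, leaf_hi: int) -> int:
    """Combine the 32-bit low part and 16-bit high part into one block number."""
    return ((leaf_hi & HI_MASK) << 32) | (leaf_lo & LO_MASK)


def needs_high_bits(fs_bytes: int) -> bool:
    """True if the highest block index of the filesystem exceeds 32 bits."""
    highest_block = (fs_bytes - 1) // BLOCK_SIZE
    return highest_block > LO_MASK


if __name__ == "__main__":
    # With 4 KiB blocks, 2**32 blocks is exactly 16 TiB, so every block
    # number below that size fits in the low 32 bits.
    print(needs_high_bits(16 * TIB))               # False
    print(needs_high_bits(16 * TIB + BLOCK_SIZE))  # True
    # A stale (non-zeroed) high part changes the decoded block number.
    print(hex(physical_block(0x1000, 0x0000)))     # 0x1000
    print(hex(physical_block(0x1000, 0xBEEF)))     # 0xbeef00001000
```

Under this assumption a non-zeroed high part is harmless only as long as nothing reads it, which matches the entry's note that filesystems smaller than 16TB are unaffected.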
+ +Severity : minor +Frequency : rare +Bugzilla : 9387 +Description: import connection selection may be incorrect if timer wraps +Details : Using a 32-bit jiffies timer with HZ=1000 may cause backup + import connections to be ignored if the 32-bit jiffies counter + wraps. Use a 64-bit jiffies counter. + +Severity : major +Frequency : during server recovery +Bugzilla : 10479 +Description: crash after server is denying duplicate export +Details : If clients are resending connect requests to the server, the + server refuses to allow a client to connect multiple times. + Fixed a bug in the handling of this case. + +Severity : minor +Frequency : very large clusters immediately after boot +Bugzilla : 10083 +Description: LNET request buffers exhausted under heavy short-term load +Details : If a large number of client requests are generated on a service + that has previously never seen so many requests it is possible + that the request buffer growth cannot keep up with the spike in + demand. Instead of dropping incoming requests, they are held in + the LND until the RPC service can accept more requests. + +Severity : minor +Frequency : Sometimes during replay +Bugzilla : 9314 +Description: Assertion failure in ll_local_open after replay. +Details : If replay happened on an open request reply before we were able + to set replay handler, reply will become not swabbed tripping the + assertion in ll_local_open. Now we set the handler right after + recognising of open request + +Severity : minor +Frequency : very rare +Bugzilla : 10584 +Description: kernel reports "badness in vsnprintf" +Details : Reading from the "recovery_status" /proc file in small chunks + may cause a negative length in lprocfs_obd_rd_recovery_status() + call to vsnprintf() (which is otherwise harmless). Exit early + if there is no more space in the output buffer. 
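For the 32-bit jiffies entry (bug 9387) above, a small Python sketch shows how soon a 32-bit tick counter wraps at HZ=1000 and why a plain comparison of wrapped values gives the wrong answer. The function names are hypothetical and the comparison is a simplified model, not the Lustre import-selection code.

```python
HZ = 1000            # ticks per second, as in the entry
MASK32 = 0xFFFFFFFF  # a 32-bit counter keeps only these bits


def wrap_period_days(bits: int, hz: int = HZ) -> float:
    """Days until a counter of the given width wraps at the given tick rate."""
    return (1 << bits) / hz / 86400


def is_newer_naive32(a_ticks: int, b_ticks: int) -> bool:
    """Compare two timestamps after truncating both to 32 bits."""
    return (a_ticks & MASK32) > (b_ticks & MASK32)


def is_newer_64(a_ticks: int, b_ticks: int) -> bool:
    """Compare two timestamps held in a counter wide enough not to wrap."""
    return a_ticks > b_ticks


if __name__ == "__main__":
    print(round(wrap_period_days(32), 2))  # about 49.71 days
    older = 2**32 - 10   # just before the 32-bit wrap
    newer = 2**32 + 10   # just after the wrap
    print(is_newer_naive32(newer, older))  # False: the wrapped value looks older
    print(is_newer_64(newer, older))       # True
```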
+ +Severity : enhancement +Bugzilla : 2259 +Description: clear OBD RPC statistics by writing to them +Details : It is now possible to clear the OBD RPC statistics by writing + to the "stats" file. + +Severity : minor +Frequency : rare +Bugzilla : 10641 +Description: Client mtime is not the same on different clients after utimes +Details : In some cases, the client was using the utimes() syscall on + a file cached on another node. The clients now validate the + ctime from the MDS + OSTs to determine which one is right. + +Severity : minor +Frequency : always +Bugzilla : 10611 +Description: Inability to activate failout mode +Details : lconf script incorrectly assumed that in python string's numeric + value is used in comparisons. + +Severity : minor +Frequency : always with multiple stripes per file +Bugzilla : 10671 +Description: Inefficient object allocation for multi-stripe files +Details : When selecting which OSTs to stripe files over, for files with + a stripe count that divides evenly into the number of OSTs, + the MDS is always picking the same starting OST for each file. + Return the OST selection heuristic to the original design. + +Severity : minor +Frequency : rare +Bugzilla : 10673 +Description: mount failures may take full timeout to return an error +Details : Under some heavy load conditions it is possible that a + failed mount can wait for the full obd_timeout interval, + possibly several minutes, before reporting an error. + Instead return an error as soon as the status is known. + +------------------------------------------------------------------------------ + +2006-02-14 Cluster File Systems, Inc. + * version 1.4.6 + * WIRE PROTOCOL CHANGE. This version of Lustre networking WILL NOT + INTEROPERATE with older versions automatically. Please read the + user documentation before upgrading any part of a live system. + * WARNING: Lustre networking configuration changes are required with + this release.
See https://bugzilla.clusterfs.com/show_bug.cgi?id=10052 + for details. + * bug fixes + * Support for kernels: + 2.6.9-22.0.2.EL (RHEL 4) + 2.6.5-7.244 (SLES 9) + 2.6.12.6 vanilla (kernel.org) + + +Severity : enhancement +Bugzilla : 7981/8208 +Description: Introduced Lustre Networking (LNET) +Details : LNET is new networking infrastructure for Lustre, it includes + a reorganized network configuration mode (see the user + documentation for full details) as well as support for routing + between different network fabrics. Lustre Networking Devices + (LNDs) for the supported network fabrics have also been + created for this new infrastructure. + +Severity : enhancement +Description: Introduced Access control lists +Details : clients can set ACLs on files and directories in order to have + more fine-grained permissions than the standard Unix UGO+RWX. + The MDS must be started with the "-o acl" mount option. + +Severity : enhancement +Description: Introduced filesystem quotas +Details : Administrators may now establish per-user quotas on the + filesystem. + +Severity : enhancement +Bugzilla : 7982 +Description: Configuration change for the XT3 + The PTLLND is now used to run Lustre over Portals on the XT3 + The configure option(s) --with-cray-portals are no longer used. + Rather --with-portals= is used to + enable building on the XT3. In addition to enable XT3 specific + features the option --enable-cray-xt3 must be used. + +Severity : major +Frequency : rare +Bugzilla : 7407 +Description: Running on many-way SMP OSTs can trigger oops in llcd_send() +Details : A race between allocating a new llcd and re-getting the llcd_lock + allowed another thread to grab newly-allocated llcd. 
+ +Severity : enhancement +Bugzilla : 7116 +Description: 2.6 OST async journal commit and locking fix to improve performance +Details : The filter_direct_io()+filter_commitrw_write() journal commits for + 2.6 kernels are now async as they already were in 2.4 kernels so + that they can commit concurrently with the network bulk transfer. + For block-allocated files the filter allocation semaphore is held + to avoid filesystem fragmentation during allocation. BKL lock + removed for 2.6 xattr operations where it is no longer needed. + +Severity : minor +Frequency : rare +Bugzilla : 8320 +Description: lconf incorrectly determined whether two IP networks could talk +Details : In some more complicated routing and multiple-network + configurations, lconf will avoid trying to make a network + connection to a disjoint part of the IP space. It was doing the + math incorrectly for one set of cases. + +Severity : major +Frequency : rare +Bugzilla : 7359 +Description: Fix for potential infinite loop processing records in an llog. +Details : If an llog record is corrupted/zeroed, it is possible to loop + forever in llog_process(). Validate the llog record length + and skip the remainder of the block on error. + +Severity : minor +Frequency : occasional (liblustre only) +Bugzilla : 6363 +Description: liblustre could not open files whose last component is a symlink +Details : sysio_path_walk() would incorrectly pass the open intent to + intermediate path components. 
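The llog entry (bug 7359) above describes validating a record length before advancing through a block. A minimal Python sketch of that pattern follows; the record layout used here (a 4-byte little-endian length prefix that counts the header itself) is hypothetical and chosen only for illustration, and it is not the real llog record format.

```python
import struct

# Hypothetical record header: one 4-byte little-endian total length.
HEADER = struct.Struct("<I")


def walk_records(block: bytes) -> list:
    """Return record payloads, stopping at the first invalid length.

    A zeroed or corrupted length would otherwise leave the offset
    unchanged (or move it out of range), so the loop would never end.
    """
    payloads = []
    offset = 0
    while offset + HEADER.size <= len(block):
        (length,) = HEADER.unpack_from(block, offset)
        if length < HEADER.size or offset + length > len(block):
            break  # invalid length: skip the remainder of this block
        payloads.append(block[offset + HEADER.size:offset + length])
        offset += length
    return payloads


if __name__ == "__main__":
    good = HEADER.pack(8) + b"abcd"
    zeroed = bytes(8)  # a zeroed record, length field reads as 0
    print(walk_records(good + zeroed + good))  # [b'abcd']
```

In this sketch the valid record after the zeroed one is not returned, because the rest of the block is skipped once an invalid length is seen, as the entry describes.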
+ +Severity : minor +Frequency : rare (liblustre only with non-standard tuning) +Bugzilla : 7201 (7350) +Description: Tuning the MDC DLM LRU size to zero triggers client LASSERT +Details : llu_lookup_finish_locks() tries to set lock data on a lock + after it has been released, only do this for referenced locks + +Severity : enhancement +Bugzilla : 7328 +Description: specifying an (invalid) directory default stripe_size of -1 + would reset the directory default striping +Details : stripe_size -1 was used internally to signal directory stripe + removal, now use "all default" to signal dir stripe removal + as a directory striping of "all default" is not useful + +Severity : minor +Frequency : common for large clusters running liblustre clients +Bugzilla : 7198 +Description: doing an ls when liblustre clients are running is slow +Details : sending a glimpse AST to a liblustre client waits for every AST + to time out, as liblustre clients will not respond. Since they + cannot cache data we refresh the OST lock LVB from disk instead. + +Severity : enhancement +Bugzilla : 7198 +Description: doing an ls at the same time as file IO can be slow +Details : enqueue and other "small" requests can be blocked behind many + large IO requests. Create a new OST IO portal for non-IO + requests so they can be processed faster. + +Severity : minor +Frequency : rare (only HPUX clients mounting unsupported re-exported NFS vol) +Bugzilla : 5781 +Description: an HPUX NFS client would get -EACCESS when ftruncate()ing a newly + created file with mode 000 +Details : the Linux NFS server relies on an MDS_OPEN_OWNEROVERRIDE hack to + allow an ftruncate() as a non-root user to a file with mode 000. 
+ Lustre now respects this flag to disable mode checks when + truncating a file owned by the user + +Severity : minor +Frequency : liblustre-only, when liblustre client dies unexpectedly or becomes + busy +Bugzilla : 7313 +Description: Revoking locks from clients that went dead or catatonic might take + a lot of time. +Details : The new lock flag FL_CANCEL_ON_BLOCK used by liblustre makes + cancellation of such locks instant on servers without waiting for + any reply from clients. Clients drop these locks when cancel + notification from server is received without replying. + +Severity : minor +Frequency : liblustre-only, when liblustre client dies or becomes busy +Bugzilla : 7311 +Description: Doing ls on Linux clients can take a long time with active + liblustre clients +Details : Liblustre client cannot handle ASTs in timely manner, so avoid + granting such locks to it in the first place if possible. Locks + are taken by proxy on the OST during the read or write and + dropped immediately afterward. Add connect flags handling, do + not grant locks to liblustre clients for glimpse ASTs. + +Severity : enhancement +Bugzilla : 6252 +Description: Improve read-ahead algorithm to avoid excessive IO for random reads +Details : Existing read-ahead algorithm is tuned for the case of streamlined + sequential reads and behaves badly with applications doing random + reads. Improve it by reading ahead at least the read region, and + avoiding excessively large RPCs for small reads. + +Severity : enhancement +Bugzilla : 8330 +Description: Creating more than 1000 files for a single job may cause a load + imbalance on the OSTs if there are also a large number of OSTs. +Details : qos_prep_create() uses an OST index reseed value that is an + even multiple of the number of available OSTs so that if the + reseed happens in the middle of the object allocation it will + still utilize the OSTs as uniformly as possible.
+ +Severity : major +Frequency : rare +Bugzilla : 8322 +Description: OST or MDS may oops in ping_evictor_main() +Details : ping_evictor_main() drops obd_dev_lock if deleting a stale export + but doesn't restart at beginning of obd_exports_timed list + afterward. + +Severity : enhancement +Bugzilla : 7304 +Description: improve by-nid export eviction on the MDS and OST +Details : allow multiple exports with the same NID to be evicted at one + time without re-searching the exports list. + +Severity : major +Frequency : rare, only with supplementary groups enabled on SMP 2.6 kernels +Bugzilla : 7273 +Description: MDS may oops in groups_free() +Details : in rare race conditions a newly allocated group_info struct is + freed again, and this can be NULL. The 2.4 compatibility code + for groups_free() checked for a NULL pointer, but 2.6 did not. + +Severity : minor +Frequency : common for liblustre clients doing little filesystem IO +Bugzilla : 9352, 7313 +Description: server may evict liblustre clients accessing contended locks +Details : if a client is granted a lock or receives a completion AST + with a blocking AST already set it would not reply to the AST + for LDLM_FL_CANCEL_ON_BLOCK locks. It now replies to such ASTs. + +Severity : minor +Frequency : lfs setstripe, only systems with more than 160 OSTs +Bugzilla : 9440 +Description: unable to set striping with a starting offset beyond OST 160 +Details : llapi_create_file() incorrectly limited the starting stripe + index to the maximum single-file stripe count. 
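The starting-offset entry (bug 9440) above comes down to which upper bound the starting stripe index is checked against. The Python sketch below contrasts the two checks; both function names are hypothetical, the value 160 is taken from the entry, and the real llapi_create_file() validation may include other conditions not shown here.

```python
MAX_STRIPE_COUNT = 160  # maximum single-file stripe count, per the entry


def start_index_ok_old(start_index: int) -> bool:
    """Old behavior as described: bounded by the maximum stripe count."""
    return 0 <= start_index < MAX_STRIPE_COUNT


def start_index_ok_fixed(start_index: int, ost_count: int) -> bool:
    """Sketch of the intended check: bounded by the number of OSTs."""
    return 0 <= start_index < ost_count


if __name__ == "__main__":
    ost_count = 256  # example system with more than 160 OSTs
    print(start_index_ok_old(200))               # False: wrongly rejected
    print(start_index_ok_fixed(200, ost_count))  # True
```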
+ +Severity : minor +Frequency : LDAP users only +Bugzilla : 6163 +Description: lconf did not handle in-kernel recovery with LDAP properly +Details : lconf/LustreDB get_refs() is searching the wrong namespace + +Severity : enhancement +Bugzilla : 7342 +Description: bind OST threads to NUMA nodes to improve performance +Details : all OST threads are uniformly bound to CPUs on a single NUMA + node and do their allocations there to localize memory access + +Severity : enhancement +Bugzilla : 7979 +Description: llmount can determine client NID directly from Myrinet (GM) +Details : the client NID code from gmnalnid was moved directly into + llmount, removing the need to use this or specifying the + client NID explicitly when mounting GM clients with zeroconf + +Severity : minor +Frequency : if client is started with down MDS +Bugzilla : 7184 +Description: if client is started with down MDS mount hangs in ptlrpc_queue_wait +Details : Having an LWI_INTR() wait event (interruptible, but no timeout) + will wait indefinitely in ptlrpc_queue_wait->l_wait_event() after + ptlrpc_import_delayed_req() because we didn't check if the + request was interrupted, and we also didn't break out of the + event loop if there was no timeout + +Severity : major +Frequency : rare +Bugzilla : 5047 +Description: data loss during non-page-aligned writes to a single file from + both multiple nodes and multiple threads on one node at same time +Details : updates to KMS and lsm weren't protected by common lock. Resulting + inconsistency led to false short-reads, that were cached and later + used by ->prepare_write() to fill in partially written page, + leading to data loss. 
+ +Severity : minor +Frequency : always, if lconf --abort_recovery used +Bugzilla : 7047 +Description: lconf --abort_recovery fails with 'Operation not supported' +Details : lconf was attempting to abort recovery on the MDT device and not + the MDS device + +Severity : enhancement +Bugzilla : 9445 +Description: remove cleanup logs +Details : replace lconf-generated cleanup logs with lustre internal + cleanup routines. Eliminates the need for client-cleanup and + mds-cleanup logs. + +Severity : enhancement +Bugzilla : 8592 +Description: add support for EAs (user and system) on lustre filesystems +Details : it is now possible to store extended attributes in the Lustre + client filesystem, and with the user_xattr mount option it + is possible to allow users to store EAs on their files also + +Severity : enhancement +Bugzilla : 7293 +Description: Add possibility (config option) to show minimal available OST free + space. +Details : When compiled with --enable-mindf configure option, statfs(2) + (and so, df) will return the least free space available from + all OSTs as the amount of free space on the FS, instead of the sum + of the free space of all OSTs. + +Severity : enhancement +Bugzilla : 7311 +Description: do not expand extent locks acquired on OST-side +Details : Modify ldlm_extent_policy() to not expand local locks, acquired + by server: they are not cached anyway. + +Severity : major +Frequency : when mmap is used/binaries executed from Lustre +Bugzilla : 9482 +Description: Unmap pages before throwing them away from read cache. +Details : llap_shrink cache now attempts to unmap pages before discarding + them (if unmapping failed - do not discard). SLES9 kernel has + extra checks that trigger if this unmapping is not done first.
+ +Severity : minor +Frequency : rare +Bugzilla : 6034 +Description: lconf didn't resolve symlinks before checking to see whether a + given mountpoint was already in use + +Severity : minor +Frequency : when migrating failover services +Bugzilla : 6395, 9514 +Description: When migrating a subset of services from a node (e.g. failback + from a failover service node) the remaining services would + time out and evict clients. +Details : lconf --force (implied by --failover) sets the global obd_timeout + to 5 seconds in order to quickly disconnect, but this caused + other RPCs to time out too quickly. Do not change the global + obd_timeout for force cleanup, only set it for DISCONNECT RPCs. + +Severity : enhancement +Frequency : if MDS is started with down OST +Bugzilla : 9439,5706 +Description: Allow startup/shutdown of an MDS without depending on the + availability of the OSTs. +Details : Asynchronously call mds_lov_synchronize during MDS startup. + Add appropriate locking and lov-osc refcounts for safe + cleaning. Add osc abort_inflight calls in case the + synchronize never started. + +Severity : minor +Frequency : occasional (Cray XT3 only) +Bugzilla : 7305 +Description: root not authorized to access files in CRAY_PORTALS environment +Details : The client process capabilities were not honoured on the MDS in + a CRAY_PORTALS/CRAY_XT3 environment. If the file had previously + been accessed by an authorized user then root was able to access + the file on the local client also. The root user capabilities + are now allowed on the MDS, as this environment has secure UID. + +Severity : minor +Frequency : occasional +Bugzilla : 6449 +Description: ldiskfs "too long searching" message happens too often +Details : A debugging message (otherwise harmless) prints too often on + the OST console. This has been reduced to only happen when + there are fragmentation problems on the filesystem. 
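The symlink problem noted for lconf under bug 6034 above comes down to comparing paths before they are resolved. A minimal Python sketch of the idea follows; the function name and the list of already-mounted paths are assumptions made for illustration, not taken from lconf itself.

```python
import os


def is_already_mounted(mountpoint, mounted_paths):
    """Return True if mountpoint names the same location as a mounted path.

    Both sides are passed through os.path.realpath() so that a symlink
    and its target compare as equal.
    mounted_paths: hypothetical list of mountpoints currently in use.
    """
    resolved = os.path.realpath(mountpoint)
    return any(os.path.realpath(path) == resolved for path in mounted_paths)
```

Comparing the raw strings instead would treat /mnt/link and /mnt/real as different mountpoints even when one is a symlink to the other.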
+ +Severity : minor +Frequency : rare +Bugzilla : 9598 +Description: Division by zero in statfs when all OSCs are inactive +Details : lov_get_stripecnt() returns zero due to incorrect order of checks, + lov_statfs divides by value returned by lov_get_stripecnt(). + +Severity : minor +Frequency : common +Bugzilla : 9489, 3273 +Description: First write from each client to each OST was only 4kB in size, + to initialize client writeback cache, which caused sub-optimal + RPCs and poor layout on disk for the first written file. +Details : Clients now request an initial cache grant at (re)connect time + so that they can start streaming writes to the cache right + away and always do full-sized RPCs if there is enough data. + If the OST is rebooted the client also re-establishes its grant + so that client cached writes will be honoured under the grant. + +Severity : minor +Frequency : common +Bugzilla : 7198 +Description: Slow ls (and stat(2) syscall) on files residing on IO-loaded OSTs +Details : Now I/O RPCs go to a different portal number and (presumably) fast + lock requests (and glimpses) and other RPCs get their own service + thread pool that should be able to service those RPCs + immediately. + +Severity : enhancement +Bugzilla : 7417 +Description: Ability to exchange lustre version between client and servers and + issue warnings at client side if client is too old. Also for + liblustre clients there is ability to refuse connection of too old + clients. +Details : New 'version' field is added to connect data structure that is + filled with version info. That info is later checked by server and + by client. + +Severity : minor +Frequency : rare, liblustre only. +Bugzilla : 9296, 9581 +Description: Two simultaneous writes from liblustre at offset within same page + might proceed at the same time overwriting each other with stale + data. +Details : I/O lock within llu_file_prwv was released too early, before data + actually was hitting the wire.
Extended lock-holding time until + server acknowledges receiving data. + +Severity : minor +Frequency : extremely rare. Never observed in practice. +Bugzilla : 9652 +Description: avoid generating lustre_handle cookie of 0. +Details : class_handle_hash() generates handle cookies by incrementing + global counter, and can hit 0 occasionally (this is unlikely, but + not impossible, because initial value of cookie counter is + selected randomly). Value of 0 is used as a sentinel meaning + "unassigned handle" --- avoid it. Also coalesce two critical + sections in this function into one. + +Severity : enhancement +Bugzilla : 9528 +Description: allow liblustre clients to delegate truncate locking to OST +Details : To avoid overhead of locking, liblustre client instructs OST to + take extent lock in ost_punch() on client's behalf. New connection + flag is added to handle backward compatibility. + +Severity : enhancement +Bugzilla : 4928, 7341, 9758 +Description: allow number of OST service threads to be specified +Details : a module parameter allows the number of OST service threads + to be specified via "options ost ost_num_threads={N}" in the + OSS's /etc/modules.conf or /etc/modprobe.conf. + +Severity : major +Frequency : rare +Bugzilla : 6146, 9635, 9895 +Description: servers crash with bad pointer in target_handle_connect() +Details : In rare cases when a client is reconnecting it was possible that + the connection request was the last reference for that export. + We would temporarily drop the export reference and get a new + one, but this may have been the last reference and the export + was just destroyed. Get new reference before dropping old one. + +Severity : enhancement +Frequency : if client is started with failover MDS +Bugzilla : 9818 +Description: Allow multiple MDS hostnames in the mount command +Details : Try to read the configuration from all specified MDS + hostnames during a client mount in case the "primary" + MDS is down.
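The cookie rule from the bug 9652 entry above (an incrementing counter with a random start, where 0 is reserved as the "unassigned handle" sentinel) can be sketched in Python as follows. The class and method names are hypothetical, and the sketch does not reproduce class_handle_hash(); it only shows the skip-zero step done inside a single critical section.

```python
import random
import threading


class HandleCookieGenerator:
    """Hands out non-zero cookies from an incrementing counter."""

    def __init__(self, start=None, bits=64):
        self._mask = (1 << bits) - 1
        # Random starting point unless the caller pins one (useful for tests).
        if start is None:
            self._counter = random.getrandbits(bits)
        else:
            self._counter = start & self._mask
        self._lock = threading.Lock()

    def next_cookie(self):
        # One critical section covers both the increment and the zero check.
        with self._lock:
            self._counter = (self._counter + 1) & self._mask
            if self._counter == 0:
                # 0 means "unassigned handle", so never hand it out.
                self._counter = 1
            return self._counter
```

Because the starting value is random, a wrap to 0 is unlikely but possible, which is why the check is needed at all.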
+ +Severity : enhancement +Bugzilla : 9297 +Description: Stop sending data to evicted clients as soon as possible. +Details : Check if the client we are about to send or are sending data to + was evicted already. (Check is done every second of waiting, + for which l_wait_event interface was extended to allow checking + of exit condition at specified intervals). + +Severity : minor +Frequency : rare, normally only when NFS exporting is done from client +Bugzilla : 9301 +Description: 'bad disk LOV MAGIC: 0x00000000' error when chown'ing files + without objects +Details : Make mds_get_md() recognise empty md case and set lmm size to 0. + +Severity : minor +Frequency : always, if srand() is called before liblustre initialization +Bugzilla : 9794 +Description: Liblustre uses system PRNG disturbing its usage by user application +Details : Introduce internal to lustre fast and high-quality PRNG for + lustre usage and make liblustre and some other places in generic + lustre code to use it. + +Severity : enhancement +Bugzilla : 9477, 9557, 9870 +Description: Verify that the MDS configuration logs are updated when xml is +Details : Check if the .xml configuration logs are newer than the config + logs stored on the MDS and report an error if this is the case. + Request --write-conf, or allow starting with --old_conf. + +Severity : enhancement +Bugzilla : 6034 +Description: Handle symlinks in the path when checking if Lustre is mounted. +Details : Resolve intermediate symlinks when checking if a client has + mounted a filesystem to avoid duplicate client mounts. + +Severity : minor +Frequency : rare +Bugzilla : 9309 +Description: lconf can hit an error exception but still return success. +Details : The lconf command catches the Command error exception at the top + level script context and will exit with the associated exit + status, but doesn't ensure that this exit status is non-zero. 
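The lconf exit-status problem in the bug 9309 entry above can be illustrated with a short Python sketch. The CommandError class here is a stand-in written for this example, and run() and exit_status_for() are hypothetical helpers; the point is only that a caught error must never map to exit status 0.

```python
class CommandError(Exception):
    """Stand-in for a command failure that carries an exit status."""

    def __init__(self, message, rc=0):
        super().__init__(message)
        self.rc = rc


def exit_status_for(error):
    # Use the status carried by the error, but never report success (0)
    # for a failure.
    return error.rc if error.rc else 1


def run(main):
    """Call main() and return the exit status the script should use."""
    try:
        main()
    except CommandError as err:
        return exit_status_for(err)
    return 0
```

A top-level script would then finish with sys.exit(run(main)), so that an error whose status happens to be 0 still produces a non-zero exit.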
+ +Severity : minor +Frequency : rare +Bugzilla : 9493 +Description: failure of ptlrpc thread startup can cause oops +Details : Starting a ptlrpc service thread can fail if there are a large + number of threads or the server memory is very fragmented. + Handle this without oopsing. + +Severity : minor +Frequency : always, only if liblustre and non-default acceptor port was used +Bugzilla : 9933 +Description: liblustre cannot connect to servers with non-default acceptor port +Details : tcpnal_set_default_params() was not called and was therefore + ignoring the environment variable TCPNAL_PORT, as well as other + TCPNAL_ environment variables + +Severity : minor +Frequency : rare +Bugzilla : 9923 +Description: two objects could be created on the same OST for a single file +Details : If an OST is down, in some cases it was possible to create two + objects on a single OST for a single file. No problems other + than potential performance impact and spurious error messages. + +Severity : minor +Frequency : rare +Bugzilla : 5681, 9562 +Description: Client may oops in ll_unhash_aliases +Details : Client dcache may become inconsistent in race condition. + In some cases "getcwd" can fail if the current directory is + modified. + +Severity : minor +Frequency : always +Bugzilla : 9942 +Description: Inode refcounting problems in NFS export code +Details : link_raw functions used to call d_instantiate without obtaining + extra inode reference first. + +Severity : minor +Frequency : rare +Bugzilla : 9942, 9903 +Description: Referencing freed requests leading to crash, memleaks with NFS. +Details : We used to require that call to ll_revalidate_it was always + followed by ll_lookup_it. Also with revalidate_special() it is + possible to call ll_revalidate_it() twice for the same dentry + even if first occurrence returned success. This fix changes the + semantics of the DISP_ENQ_COMPLETE disposition flag to mean there + is an extra reference on a request referred from the intent.
+ ll_intent_release() then releases such a request. + +Severity : minor +Frequency : rare, normally benchmark loads only +Bugzilla : 1443 +Description: unlinked inodes were kept in memory on the client +Details : If a client is repeatedly creating and unlinking files it + can accumulate a lot of stale inodes in the inode slab cache. + If there is no other client load running this can cause the + client node to run out of memory. Instead flush old inodes + from client cache that have the same inode number as a new inode. + +Severity : minor +Frequency : SLES9 2.6.5 kernel and long filenames only +Bugzilla : 9969, 10379 +Description: utime reports stale NFS file handle +Details : SLES9 uses out-of-dentry names in some cases, which confused + the lustre dentry revalidation. Change it to always use the + in-dentry qstr. + +Severity : major +Frequency : rare, unless heavy write-truncate concurrency is continuous +Bugzilla : 4180, 6984, 7171, 9963, 9331 +Description: OST becomes very slow and/or deadlocked during object unlink +Details : filter_destroy() was holding onto the parent directory lock + while truncating+unlinking objects. For very large objects this + may block other threads for a long time and slow overall OST + responsiveness. It may also be possible to get a lock ordering + deadlock in this case, or run out of journal credits because of + the combined truncate+unlink. Solution is to do object truncate + first in one transaction without parent lock, and then do the + final unlink in a new transaction with the parent lock. This + reduces the lock hold time dramatically. + +Severity : major +Frequency : rare, 2.4 kernels only +Bugzilla : 9967 +Description: MDS or OST cleanup may trip kernel BUG when dropping kernel lock +Details : mds_cleanup() and filter_cleanup() need to drop the kernel lock + before unmounting their filesystem in order to avoid deadlock. 
+ The kernel_locked() function in 2.4 kernels only checks whether + the kernel lock is held, not whether it is this process that is + holding it as 2.6 kernels do. + +Severity : major +Frequency : rare +Bugzilla : 9635 +Description: MDS or OST may oops/LBUG if a client is connecting multiple times +Details : The client ptlrpc code may be trying to reconnect to a down + server before a previous connection attempt has timed out. + Increase the reconnect interval to be longer than the connection + timeout interval to avoid sending duplicate connections to + servers. + +Severity : minor +Frequency : echo_client brw_test command +Bugzilla : 9919 +Description: fix echo_client to work with OST preallocated code +Details : OST preallocation code (5137) didn't take echo_client IO path + into account: echo_client calls filter methods outside of any + OST thread and, hence, there is no per-thread preallocated + pages and buffers to use. Solution: hijack pga pages for IO. As + a byproduct, this avoids unnecessary data copying. + +Severity : minor +Frequency : rare +Bugzilla : 3555, 5962, 6025, 6155, 6296, 9574 +Description: Client can oops in mdc_commit_close() after open replay +Details : It was possible for the MDS to return an open request with no + transaction number in mds_finish_transno() if the client was + evicted, but without actually returning an error. Clients + would later try to replay that open and may trip an assertion + Simplify the client close codepath, and always return an error + from the MDS in case the open is not successful. + +Severity : major +Frequency : rare, 2.6 OSTs only +Bugzilla : 10076 +Description: OST may deadlock under high load on fragmented files +Details : If there was a heavy load and highly-fragmented OST filesystems + it was possible to have all the OST threads deadlock waiting on + allocation of biovecs, because the biovecs were not released + until the entire RPC IO was completed. 
Instead, release biovecs + as soon as they are complete to ensure forward IO progress. + +Severity : enhancement +Bugzilla : 9578 +Description: Support for specifying external journal device at mount +Details : If an OST or MDS device is formatted with an external journal + device, this device major/minor is stored in the ext3 superblock + and may not be valid for failover. Allow detecting and + specifying the external journal at mount time. + +Severity : major +Frequency : rare +Bugzilla : 10235 +Description: Mounting an MDS with pending unlinked files may cause oops +Details : target_finish_recovery() calls mds_postrecov() which returned + the number of orphans unlinked. mds_lov_connect->mds_postsetup() + considers this an error and immediately begins cleaning up the + lov, just after starting the mds_lov process + +Severity : enhancement +Bugzilla : 9461 +Description: Implement 'lfs df' to report actual free space on per-OST basis +Details : Add sub-command 'df' on 'lfs' to report the disk space usage of + MDS/OSDs. Usage: lfs df [-i][-h]. Command Options: '-i' to report + usage of objects; '-h' to report in human readable format. + +------------------------------------------------------------------------------ + +2005-08-26 Cluster File Systems, Inc. + * version 1.4.5 + * bug fixes + +Severity : major +Frequency : rare +Bugzilla : 7264 +Description: Mounting an ldiskfs file system with mballoc may crash OST node. +Details : ldiskfs mballoc code may reference an uninitialized buddy struct + at startup during orphan unlinking. Instead, skip buddy update + before setup, as it will be regenerated after recovery is complete. + +Severity : minor +Frequency : rare +Bugzilla : 7039 +Description: If an OST is inactive, its locks might reference stale inodes. +Details : lov_change_cbdata() must iterate over all namespaces, even if + they are inactive to clear inode references from the lock. 
+ +Severity : enhancement +Frequency : occasional, if non-standard max_dirty_mb used +Bugzilla : 7138 +Description: Client will block write RPCs if not enough grant +Details : If a client has max_dirty_mb smaller than max_rpcs_in_flight, + then the client will block writes while waiting for another RPC + to complete instead of consuming its dirty limit. With this change + we get improved performance when max_dirty_mb is small. + +Severity : enhancement +Bugzilla : 3389, 6253 +Description: Add support for supplementary groups on the MDS. +Details : The MDS has an upcall /proc/fs/lustre/mds/{mds}/group_upcall + (set to /usr/sbin/l_getgroups if enabled) which will do MDS-side + lookups for user supplementary groups into a cache. + +Severity : minor +Bugzilla : 7278 +Description: O_CREAT|O_EXCL open flags in liblustre always return -EEXIST +Details : Make libsysio not enforce O_EXCL by clearing the flag; + for liblustre O_EXCL is enforced by the MDS. + +Severity : minor +Bugzilla : 6455 +Description: readdir never returns NULL in liblustre. +Details : Corrected llu_iop_getdirentries logic, to return offset of next + dentry in struct dirent. + +Severity : minor +Bugzilla : 7137 +Frequency : liblustre only, depends on application IO pattern +Description: liblustre clients evicted if not contacting servers +Details : Don't put liblustre clients into the ping_evictor list, so + they will not be evicted by the pinger ever. + +Severity : enhancement +Bugzilla : 6902 +Description: Add ability to evict clients by NID from MDS. +Details : By echoing "nid:$NID" string into + /proc/fs/lustre/mds/.../evict_client, the client with a NID equal to + $NID is instantly evicted from this MDS and from all active + OSTs connected to it. + +Severity : minor +Bugzilla : 7198 +Description: Do not query file size twice, somewhat slowing stat(2) calls. +Details : lookup_it_finish() used to query file size from OSTs that was not + needed.
+ +Severity : minor +Bugzilla : 6237 +Description: service threads change working directory to that of init +Details : Starting lustre service threads may pin the working directory + of the parent thread, making that filesystem busy. Threads + now change to the working directory of init to avoid this. + +Severity : minor +Bugzilla : 6827 +Frequency : during shutdown only +Description: shutdown with a failed MDS or OST can cause unmount to hang +Details : Don't resend DISCONNECT messages in ptlrpc_disconnect_import() + if server is down. + +Severity : minor +Bugzilla : 7331 +Frequency : 2.6 only +Description: chmod/chown may include an extra supplementary group +Details : ll{,u}_mdc_pack_op_data() does not properly initialize the + supplementary group and if none is specified this is used. + +Severity : minor +Bugzilla : 5479 (6816) +Frequency : rare +Description: Racing open + rm can assert client in mdc_set_open_replay_data() +Details : If lookup is in progress on a file that is unlinked we might try + to revalidate the inode and fail in revalidate after lookup is + complete and ll_file_open() enqueues the open again but + it_open_error() was not checking DISP_OPEN_OPEN errors correctly. + +Severity : minor +Frequency : always, if lconf --abort_recovery used +Bugzilla : 7047 +Description: lconf --abort_recovery fails with 'Operation not supported' +Details : lconf was attempting to abort recovery on the MDT device and not + the MDS device + +------------------------------------------------------------------------------ + +2005-08-08 Cluster File Systems, Inc. 
+ * version 1.4.4 + * bug fixes + +Severity : major +Frequency : rare (only unsupported configurations with a node running as an + OST and a client) +Bugzilla : 6514, 5137 +Description: Mounting a Lustre file system on a node running as an OST could + lead to deadlocks +Details : OSTs now preallocate memory needed to write out data at + startup, instead of when needed, to avoid having to + allocate memory in possibly low memory situations. + Specifically, if the file system is mounted on an OST, + memory pressure could force it to try to write out data, + which it needed to allocate memory to do. Due to the low + memory, it would be unable to do so and the node would + become unresponsive. + +Severity : enhancement +Bugzilla : 7015 +Description: Addition of lconf --service command line option +Details : lconf now accepts a '--service ' option, which is + shorthand for 'lconf --group --select =' + +Severity : enhancement +Bugzilla : 6101 +Description: Failover mode is now the default for OSTs. +Details : By default, OSTs will now run in failover mode. To return to + the old behaviour, add '--failout' to the lmc line for OSTs. + +Severity : enhancement +Bugzilla : 1693 +Description: Health checks are now provided for MDS and OSTs +Details : Additional detailed health check information on MDS and OSTs + is now provided through the procfs health_check value. + +Severity : minor +Frequency : occasional, depends on IO load +Bugzilla : 4466 +Description: Disk fragmentation on the OSTs could eventually cause slowdowns + after numerous create/delete cycles +Details : The ext3 inode allocation policy would not allocate new inodes + very well on the OSTs because there are no new directories + being created. Instead we look for groups with free space if + the parent directories are nearly full. + +Severity : major +Bugzilla : 6302 +Frequency : rare +Description: Network or server problems during mount may cause partially + mounted clients instead of returning an error.
+Details : The config llog parsing code may overwrite the error return + code during mount error handling, returning success instead + of an error. + +Severity : minor +Bugzilla : 6422 +Frequency : rare +Description: MDS can fail to allocate large reply buffers +Details : After long uptimes the MDS can fail to allocate large reply + buffers (e.g. zconf client mount config records) due to memory + fragmentation or consumption by the buffer cache. Preallocate + some large reply buffers so that these replies can be sent even + under memory pressure. + +Severity : minor +Bugzilla : 6266 +Frequency : rare (liblustre) +Description: fsx running with liblustre complained that using truncate() to + extend the file doesn't work. This patch corrects that issue. +Details : This is the liblustre equivalent of the fix for bug 6196. Fixes + ATTR_SIZE and lsm use in llu_setattr_raw. + +Severity : critical +Bugzilla : 6866 +Frequency : rare, only 2.6 kernels +Description: Unusual file access patterns on the MDS may result in inode + data being lost in very rare circumstances. +Details : Bad interaction between the ea-in-inode patch and the "no-read" + code in the 2.6 kernel caused the inode and/or EA data not to + be read from disk, causing single-file corruption. + +Severity : critical +Bugzilla : 6998 +Frequency : rare, only 2.6 filesystems using extents +Description: Heavy concurrent write and delete load may cause data corruption. +Details : It was possible under high-load situations to have an extent + metadata block in the block device cache from a just-unlinked + file overwrite a newly-allocated data block. We now unmap any + metadata buffers that alias just-allocated data blocks. + +Severity : minor +Bugzilla : 7241 +Frequency : filesystems with default stripe_count larger than 77 +Description: lconf+mke2fs fail when formatting filesystem with > 77 stripes +Details : lconf specifies an inode size of 4096 bytes when the default + stripe_count is larger than 77. 
This conflicts with the default + inode density of 1 per 4096 bytes. Allocate smaller inodes in + this case to avoid pinning too much memory for large EAs. + +------------------------------------------------------------------------------ + +2005-07-07 Cluster File Systems, Inc. + * version 1.4.3 + * bug fixes + +Severity : minor +Frequency : rare (extremely heavy IO load with hundreds of clients) +Bugzilla : 6172 +Description: Client is evicted, gets IO error writing to file +Details : lock ordering changes for bug 5492 reintroduced bug 3267 and + caused clients to be evicted for AST timeouts. The fixes in + bug 5192 mean we no longer need to have such short AST timeouts + so ldlm_timeout has been increased. + +Severity : major +Frequency : occasional during --force or --failover shutdown under load +Bugzilla : 5949, 4834 +Description: Server oops/LBUG if stopped with --force or --failover under load +Details : a collection of import/export refcount and cleanup ordering + issues fixed for safer force cleanup + +Severity : major +Frequency : only filesystems larger than 120 OSTs +Bugzilla : 5990, 6223 +Description: lfs getstripe would oops on a very large filesystem +Details : lov_getconfig used kfree on vmalloc'd memory + +Severity : minor +Frequency : only filesystems exporting via NFS to Solaris 10 clients +Bugzilla : 6242, 6243 +Description: reading from files that had been truncated to a non-zero size + but never opened returned no data +Details : ll_file_read() reads zeros from no-object files to EOF + +Severity : major +Frequency : rare +Bugzilla : 6200 +Description: A bug in MDS/OSS recovery could cause the OSS to fail an assertion +Details : There's little harm in aborting MDS/OSS recovery and letting it + try again, so I removed the LASSERT and return an error instead. 
+ +Severity : enhancement +Bugzilla : 5902 +Description: New debugging infrastructure for tracking down data corruption +Details : The I/O checksum code was replaced to: (a) control it at runtime, + (b) cover more of the client-side code path, and (c) try to narrow + down where problems occurred + +Severity : major +Frequency : rare +Bugzilla : 3819, 4364, 4397, 6313 +Description: Racing close and eviction MDS could cause assertion in mds_close +Details : It was possible to get multiple mfd references during close and + client eviction, leading to one thread referencing a freed mfd. + +Severity : enhancement +Bugzilla : 3262, 6359 +Description: Attempts to reconnect to servers are now more aggressive. +Details : This builds on the enhanced upcall-less recovery that was added + in 1.4.2. When trying to reconnect to servers, clients will + now try each server in the failover group every 10 seconds. By + default, clients would previously try one server every 25 seconds. + +Severity : major +Frequency : rare +Bugzilla : 6371 +Description: After recovery, certain operations trigger a failed + assertion on a client. +Details : Failing over an mds, using lconf -d --failover, while a + client was doing a readdir() call would cause the client to + LBUG after recovery completed and the readdir() was resent. + +Severity : enhancement +Bugzilla : 6296 +Description: Default groups are now added by lconf +Details : You can now run lconf --group without having to + manually add groups with lmc. + +Severity : major +Frequency : occasional +Bugzilla : 6412 +Description: Nodes with an elan id of 0 trigger a failed assertion + +Severity : minor +Frequency : always when accessing e.g. tty/console device nodes +Bugzilla : 3790 +Description: tty and some other devices nodes cannot be used on lustre +Details : file's private_data field is used by device data and lustre + values in there got lost. New field was added to struct file to + store fs-specific private data.
+ +Severity : minor +Frequency : when exporting Lustre via NFS +Bugzilla : 5275 +Description: NFSD failed occasionally when looking up a path component +Details : NFSD is looking up ".." which was broken in ext3 directories + that had grown large enough to become hashed. + +Severity : minor +Frequency : Clusters with multiple interfaces not on the same subnet +Bugzilla : 5541 +Description: Nodes will repeatedly try to reconnect to an interface which it + cannot reach and report an error to the log. +Details : Extra peer list entries will be created by lconf with some peers + unreachable. lconf now validates the peer before adding it. + +Severity : major +Frequency : Only if a default stripe is set on the filesystem root. +Bugzilla : 6367 +Description: Setting a default stripe on the filesystem root prevented the + filesystem from being remounted. +Details : The client was sending extra request flags in the root getattr + request and did not allocate a reply buffer for the dir EA. + +Severity : major +Frequency : occasional, higher if lots of files are accessed by one client +Bugzilla : 6159, 6097 +Description: Client trips assertion regarding lsm mismatch/magic +Details : While revalidating inodes the VFS looks up inodes with ifind() + and in rare cases can find an inode that is being freed. + The ll_test_inode() code will free the lsm during ifind() + when it finds an existing inode and then the VFS later attaches + this free lsm to a new inode. + +Severity : major +Frequency : rare +Bugzilla : 6422, 7030 +Description: MDS deadlock between mkdir and client eviction +Details : Creating a new file via mkdir or mknod (starting a transaction + and getting the ns lock) can deadlock with client eviction + (gets ns lock and trying to finish a synchronous transaction). + +Severity : minor +Frequency : occasional +Description: While starting a server, the fsfilt_ext3 module could not be + loaded. +Details : CFS's improved ext3 filesystem is named ldiskfs for 2.6 + kernels. 
Previously, lconf would still use the ext3 name + when trying to load modules. Now, it will correctly use + ext3 on 2.4 and ldiskfs on 2.6. + +Severity : enhancement +Description: The default stripe count has been changed to 1 +Details : The interpretation of the default stripe count (0, to lfs + or lmc) has been changed to mean striping across a single + OST, rather than all available. For general usage we have + found a stripe count of 1 or 2 works best. + +Severity : enhancement +Description: Add support for compiling against Cray portals. +Details : Conditional compiling for some areas that are different + on Cray Portals. + +Severity : major +Frequency : occasional +Bugzilla : 6409, 6834 +Description: Creating files with an explicit stripe count may lead to + a failed assertion on the MDS +Details : If some OSTs are full or unavailable, creating files may + trigger a failed assertion on the MDS. Now, Lustre will + try to use other servers or return an error to the + client. + +Severity : minor +Frequency : occasional +Bugzilla : 6469 +Description: Multiple concurrent overlapping read+write on multiple SMP nodes + caused lock timeout during readahead (since 1.4.2). +Details : Processes doing readahead might match a lock that hasn't been + granted yet if there are overlapping and conflicting lock + requests. The readahead process waits on ungranted lock + (original lock is CBPENDING), while OST waits for that process + to cancel CBPENDING read lock and eventually evicts client. + +Severity : enhancement +Bugzilla : 6931 +Description: Initial enabling of flock support for clients +Details : Implements fcntl advisory locking and file status functions. + This feature is provided as an optional mount flag (default + off), and is NOT CURRENTLY SUPPORTED. Not all types of record + locking are implemented yet, and those that are are not guaranteed + to be completely correct in production environments. + mount -t lustre -o [flock|noflock] ... 
+ +Severity : major +Frequency : occasional +Bugzilla : 6198 +Description: OSTs running 2.4 kernels but with extents enabled might trip an + assertion in the ext3 JBD (journaling) layer. +Details : The b_committed_data struct is protected by the big kernel lock + in 2.4 kernels, serializing journal_commit_transaction() and + ext3_get_block_handle->ext3_new_block->find_next_usable_block() + access to this struct. In 2.6 kernels there is finer grained + locking to improve SMP performance of the JBD layer. + +Severity : minor +Bugzilla : 6147 +Description: Changes the "SCSI I/O Stats" kernel patch to default to "enabled" + +----------------------------------------------------------------------------- + +2005-05-05 Cluster File Systems, Inc. + * version 1.4.2 + NOTE: Lustre 1.4.2 uses a network protocol that is incompatible with + previous versions of Lustre. Please update all servers and clients to + version 1.4.2 or later at the same time. You must also run + "lconf --write-conf {config}.xml" on the MDS while it is stopped + to update the configuration logs.
+ * bug fixes + - fix for HPUX NFS client breakage when NFS exporting Lustre (5781) + - mdc_enqueue does not need max_mds_easize request buffer on send (5707) + - swab llog records of type '0' so we get proper header size/idx (5861) + - send llog cancel req to DLM cancel portal instead of cb portal (5515) + - fix rename of one directory over another leaking an inode (5953) + - avoid SetPageDirty on 2.6 (5981) + - don't re-add just-being-destroyed locks to the waiting list (5653) + - when creating new directories, inherit the parent's custom + striping settings if present (3048) + - flush buffers from cache before direct IO in 2.6 obdfilter (4982) + - don't hold i_size_sem in ll_nopage() and ll_ap_refresh_count (6077) + - don't hold client locks on temporary worklist from l_lru (5666) + - handle IO errors in 2.6 obdfilter bio completion routine (6046) + - automatically evict dead clients (5921) + - Update file size properly in create+truncate+fstat case (6196) + - Do not unhash mountpoint dentries, do not allow removal of + mountpoints (5907) + - Avoid lock ordering deadlock issue with write/truncate (6203,5654) + - reserve enough journal credits in fsfilt_start_log for setattr (4554) + - ldlm_enqueue freed-export error path would always LBUG (6149,6184) + - don't reference lr_lvb_data until after we hold lr_lvb_sem (6170) + - don't overwrite last_rcvd if there is a *_client_add() error (6086) + - Correctly handle reads of files with no objects (6243) + - lctl recover will also mark a device active if deactivate used (5933) + * miscellanea + - by default create 1 inode per 4kB space on MDS, per 16kB on OSTs + - allow --write-conf on an MDS with different nettype than client (5619) + - don't write config llogs to MDS for mounts not from that MDS (5617) + - lconf should create multiple TCP connections from a client (5201) + - init scripts are now turned off by default; run chkconfig --on + lustre and chkconfig --on lustrefs to use them + - upcalls are no longer
needed for clients to recover to failover + servers (3262) + - add --abort-recovery option to lconf to abort recovery on device + startup (6017) + - add support for an arbitrary number of OSTs (3026) + - Quota support protocol changes. + - forward compatibility changes to wire structs (6007) + - rmmod NALs that might be loaded because of /etc/modules.conf (6133) + - support for mountfsoptions and clientoptions to the Lustre LDAP (5873) + - improved "lustre status" script + - initialize blocksize for non-regular files (6062) + - added --disable-server and --disable-client configure options (5782) + - introduce a lookup cache for lconf to avoid repeated DB scans (6204) + - Vanilla 2.4.29 support + - increase maximum number of obd devices to 520 (6242) + - remove the tcp-zero-copy patch from the suse-2.4 series (5902) + - Quadrics Elan drivers are now included for the RHEL 3 2.4.21 and + SLES 9 2.6.5 kernels + - limit stripes per file to 160 (the maximum EA size) (6093) + +2005-03-22 Cluster File Systems, Inc. + * version 1.4.1 + * bug fixes + - don't LASSERT in ll_release on NULL lld with NFS export (4655, 5760) + - hold NS lock when calling handle_ast_error->del_waiting_lock (5746) + - fix setattr mtime regression from lovcleanup merge (4829, 5669) + - workaround for 2.6 crash in ll_unhash_aliases (5687, 5210) + - small ext3 extents cleanups and fixes (5733) + - improved mballoc code, several small races and bugs fixed (5733, 5638) + - kernel version 43 - fix remove_suid bugs in both 2.4 and 2.6 (5695) + - avoid needless client->OST connect, fix handle mismatch (5317) + - fix DLM error path that led to out-of-sync client, long delays (5779) + - support common vfs-enforced mount options (nodev,nosuid,noexec) (5637) + - fix several locking issues related to i_size (5492,5624,5654,5672) + - don't move pending lock onto export if it is already evicted (5683) + - fix kernel oops when creating .foo in unlinked directory (5548) + - fix deadlock in obdfilter statistics vs. 
object create (5811) + - use time_{before,after} to avoid timer jiffies wrap (5882) + - shutdown --force/--failover stability (3607,3651,4797,5203,4834) + - Do not leak request if server was not able to process it (5154) + - If mds_open unable to find parent dir, make that negative lookup (5154) + - don't create new directories with extent-mapping (5909, 5936) + * miscellania + - fix lustre/lustrefs init scripts for SuSE (patch from Scali, 5702) + - don't hold the pinger_sem in ptlrpc_pinger_sending_on_import + - change obd_increase_kms to obd_adjust_kms (up or down) (5654) + - lconf, lmc search both /usr/lib and /usr/lib64 for Python libs (5800) + - support for RHEL4 kernel on i686 (5773) + - provide error messages when incompatible logs are encountered (5898) + +2005-02-18 Cluster File Systems, Inc. + * version 1.4.0.10 (1.4.1 release candidate 1) + * bug fixes + - don't keep a lock reference when lock is not granted (4238) + - unsafe list practices (rarely) led to infinite eviction loop (4908) + - add per-fs limit of Lustre pages in page cache, avoid OOM (4699) + - drop import inflight refcount on signal_completed_replay error (5255) + - unlock page after async write error during send (3677) + - handle missing objects in filter_preprw_read properly (5265) + - no transno return for symlink open, don't save no-transno open (3440) + - don't try to complete elan receive that already failed (4012) + - free RPC server reply state on error (5406) + - clean up thread from ptlrpc_start_thread() on error (5160) + - readahead could read extra page into cache that wasn't ejected (5388) + - prevent races in class_attach/setup/cleanup/detach (5260) + - don't dereference de->d_inode after l_dput of de (5458) + - use "int" for stripe value returned from lock_to_stripe (5544) + - mballoc allocation and error-checking fixes in 2.6 (5504) + - block device patches to fix I/O request sizes in 2.6 (5482) + - look up hostnames for IB nals (5602) + - 2.6 changed lock ordering of 2 
semaphores, caused deadlock (5654) + - don't start multiple acceptors for the same port (5277) + - fix incorrect LASSERT in mds_getattr_name (5635) + - export a proc file for general "ping" checking (5628) + - fix "lfs check" to not block when the MDS is down (5628) + * miscellania + - service request history (4965) + - put {ll,lov,osc}_async_page structs in a single slab (4699) + - create an "evict_client" /proc entry on OSTs, like the MDS has + - fix mount usage message, return errors per mount(8) (5168) + - change grep [] to grep "[]" in tests so they work in more UMLs + - fix ppc64/x86_64 spec to use %{_libdir} instead of /usr/lib (5389) + - remove ancient LOV_MAGIC_V0 EA support (5047) + - add "disk I/Os in flight" and "I/O req time" stats in obdfilter + - align r/w RPCs to PTLRPC_MAX_BRW_SIZE boundary for performance (3451) + - allow readahead allocations to fail when low on memory (5383) + - mmap locking landed again, after considerable improvement (2828) + - add get_hostaddr() to lustreDB.py for LDAP support (5459) + +2004-11-23 Cluster File Systems, Inc. + * version 1.4.0 + * bug fixes + - send OST transaction number in read/write reply to free req (4966) + - don't ASSERT in ptl_send_rpc() if we run out of memory (5119) + - lock /proc/sys/portals/routes internal state, avoiding oops (4827) + - the watchdog thread now runs as interruptible (5246) + - flock/lockf fixes (but it's still disabled, pending 5135) + - don't use EXT3 constants in llite code (5094) + - memory shortage at startup could cause assertion (5176) + * miscellania + - reorganization of lov code + - single portals codebase + - Infiniband NAL + - add extents/mballoc support (5025) + - direct I/O reads in the obdfilter (4048) + - kernel patches from LNXI for 2.6 (bluesmoke, perfctr, mtd, kexec) + +tbd Cluster File Systems, Inc. 
+ * version 1.2.9 + * bug fixes + - send OST transaction number in read/write reply to free req (4966) + - don't ASSERT in ptl_send_rpc() if we run out of memory (5119) + - lock /proc/sys/portals/routes internal state, avoiding oops (4827) + - the watchdog thread now runs as interruptible (5246) + - handle missing objects in filter_preprw_read properly (5265) + - unsafe list practices (rarely) led to infinite eviction loop (4908) + - drop import inflight refcount on signal_completed_replay error (5255) + - unlock page after async write error during send (3677) + - return original error code on reconstructed replies (3761) + - no transno return for symlink open, don't save no-transno open (3440) + * miscellania + - add pid to ldlm debugging output (4922) + - bump the watchdog timeouts -- we can't handle 30sec yet + - extra debugging for orphan dentry/inode bug (5259) + +2004-11-16 Cluster File Systems, Inc. + * version 1.2.8 + * bug fixes + - fix TCP_NODELAY bug, which caused extreme perf regression (5134) + - allocate qswnal tx descriptors singly to avoid fragmentation (4504) + - don't LBUG on obdo_alloc() failure, use OBD_SLAB_ALLOC() (4800) + - fix NULL dereference in /proc/sys/portals/routes (4827) + - allow failed mdc_close() operations to be interrupted (4561) + - stop precreate on OST before MDS would time out on it (4778) + - don't send partial-page writes before EOF from client (4410) + - discard client grant for sub-page writes on large-page clients (4520) + - don't free dentries not owned by NFS code, check generation (4806) + - fix lsm leak if mds_create_objects() fails (4801) + - limit debug_daemon file size, always print CERROR messages (4789) + - use transno after validating reply (3892) + - process timed out requests if import state changes (3754) + - update mtime on OST during writes, return in glimpse (4829) + - add mkfsoptions to LDAP (4679) + - use ->max_readahead method instead of zapping global ra (5039) + - don't interrupt __l_wait_event() 
during strace + * miscellania + - add software watchdogs to catch hung threads quickly (4941) + - make lustrefs init script start after nfs is mounted + - fix CWARN/ERROR duplication (4930) + - return async write errors to application if possible (2248) + - add /proc/sys/portal/memused (bytes allocated by PORTALS_ALLOC) + - print NAL number in %x format (4645) + - update barely-supported suse-2.4.21-171 series (4842) + - support for sles 9 %post scripts + - support for building 2.6 kernel-source packages + - support for sles km_* packages + +2004-10-07 Cluster File Systems, Inc. + * version 1.2.7 + * bug fixes + - ignore -ENOENT errors in osc_destroy (3639) + - notify osc create thread that OSC is being cleaned up (4600) + - add nettype argument for llmount in #5d in conf-sanity.sh (3936) + - reconstruct ost_handle() like mds_handle() (4657) + - create a new thread to do import eviction to avoid deadlock (3969) + - let lconf resolve symlinked-to devices (4629) + - don't unlink "objects" from directory with default EA (4554) + - hold socknal file ref over connect in case target is down (4394) + - allow more than 32000 subdirectories in a single directory (3244) + - fix blocks count for O_DIRECT writes (3751) + - OST returns ENOSPC from object create when no space left (4539) + - don't send truncate RPC if file size isn't changing (4410) + - limit OSC precreate to 1/2 of value OST considers bogus (4778) + - bind to privileged port in socknal and tcpnal (3689) + * miscellania + - rate limit CERROR/CWARN console message to avoid overload (4519) + - GETFILEINFO dir ioctl returns LOV EA + MDS stat in 1 call (3327) + - basic mmap support (3918) + - kernel patch series update from b1_4 (4711) + +2004-09-16 Cluster File Systems, Inc. 
+ * version 1.2.6 + * bug fixes + - avoid crash during MDS cleanup with OST shut down (2775) + - fix loi_list_lock/oig_lock inversion on interrupted IO (4136) + - don't use bad inodes on the MDS (3744) + - dynamic object preallocation to improve recovery speed (4236) + - don't hold spinlock over lock dumping or change debug flags (4401) + - don't zero obd_dev when it is force cleaned (3651) + - print grants to console if they go negative (4431) + - "lctl deactivate" will stop automatic recovery attempts (3406) + - look for existing locks in ldlm_handle_enqueue() (3764) + - don't resolve lock handle twice in recovery avoiding race (4401) + - revalidate should check working dir is a directory (4134) + * miscellania + - don't always mark "slow" obdfilter messages as errors (4418) + +2004-08-24 Cluster File Systems, Inc. + * version 1.2.5 + * bug fixes + - don't close LustreDB during write_conf until it is done (3860) + - fix typo in lconf for_each_profile (3821) + - allow dumping logs from multiple threads at one time (3820) + - don't allow multiple threads in OSC recovery (3812) + - fix debug_size parameters (3864) + - fix mds_postrecov to initialize import for llog ctxt (3121) + - replace config semaphore with spinlock (3306) + - be sure to send a reply for a CANCEL rpc with bad export (3863) + - don't allow enqueue to complete on a destroyed export (3822) + - down write_lock before checking llog header bitmap (3825) + - recover from lock replay timeout (3764) + - up llog sem before sending rpc (3652) + - reduce ns lock hold times when setting kms (3267) + - change a dlm LBUG to LASSERTF, to maybe learn something (4228) + - fix NULL deref and obd_dev leak on setup error (3312) + - replace some LBUG about llog ops with error handling (3841) + - don't match INVALID dentries from d_lookup and spin (3784) + - hold dcache_lock while marking dentries INVALID and hashing (4255) + - fix invalid assertion in ptlrpc_set_wait (3880) + * miscellania + - add libwrap support for 
the TCP acceptor (3996) + - add /proc/sys/portals/routes for non-root route listing (3994) + - allow setting MDS UUID in .xml (2580) + - print the stack of a process that LBUGs (4228) + +2004-07-14 Cluster File Systems, Inc. + * version 1.2.4 + * bug fixes + - don't cleanup request in ll_file_open() on failed MDS open (3430) + - make sure to unset replay flag from failed open requests (3440) + - if default stripe count is 0, use OST count for inode size (3636) + - update parent mtime/ctime on client for create/unlink (2611) + - drop dentry ref in ext3_add_link from open_connect_dentry (3266) + - free recovery state on server during a forced cleanup (3571) + - unregister_reply for resent reqs (3063) + - loop back devices mounting and status check on 2.6 (3563) + - fix resource-creation race that can provoke i_size == 0 (3513) + - don't try to use bad inodes returned from MDS/OST fs lookup (3688) + - more debugging for page-accounting assertion (3746) + - return -ENOENT instead of asserting if ost getattr+unlink race (3558) + - avoid deadlock after precreation failure (3758) + - fix race and lock order deadlock in orphan handling (3450, 3750) + - add validity checks when grabbing inodes from l_ast_data (3599) + * miscellania + - add /proc/.../recovery_status to obdfilter (3428) + - lightweight CDEBUG infrastructure, debug daemon (3668) + - change default OSC RPC parameters to be better on small clusters + - turn off OST read cache for files smaller than 32MB + - install man pages and include them in rpms (3100) + - add new init script for (un)mounting lustre filesystems (2593) + - run chkconfig in %post for init scripts (3701) + - drop scimac NAL (unmaintained) + +2004-06-17 Cluster File Systems, Inc. 
+ * version 1.2.3 + * bug fixes + - clean kiobufs before and after use (3485) + - strip trailing '/'s before comparing paths with /proc/mounts (3486) + - remove assertions to work around "in-flight rpcs" recovery bug (3063) + - change init script to fail more clearly if not run as root (1528) + - allow clients to reconnect during replay (1742) + - fix ns_lock/i_sem lock ordering deadlock for kms update (3477) + - don't do DNS lookups on NIDs too small for IP addresses (3442) + - re-awaken ptlrpcd if new requests arrive during check_set (3554) + - fix cond_resched (3554) + - only evict unfinished clients after recovery (3515) + - allow bulk resend, prevent data loss (3570) + - dynamic ptlrpc request buffer allocation (2102) + - don't allow unlinking open directory if it isn't empty (2904) + - set MDS/OST threads to umask 0 to not clobber client modes (3359) + - remove extraneous obd dereference causing LASSERT failure (3334) + - don't use get_cycles() when creating temp. files on the mds (3156) + - hold i_sem when setting i_size in ll_extent_lock() (3564) + - handle EEXIST for set-stripe, set proper directory name (3336) + * miscellania + - servers can dump a log evicting a client - lustre.dump_on_timeout=1 + - fix ksocknal_fmb_callback() error messages (2918) + +2004-05-27 Cluster File Systems, Inc. 
+ * version 1.2.2 + * bug fixes + - don't copy lvb into (possibly NULL) reply on error (2983) + - don't deref dentry after dput, don't free lvb on error (2922) + - use the kms to determine writeback rpc length (2947) + - increment oti_logcookies when osc is inactive (2948) + - update client's i_blocks count via lvb messages (2543) + - handle intent open/close of special files properly (1557) + - mount MDS with errors=remount-ro, like obdfilter (2009) + - initialize lock handle to avoid ASSERT on error cleanup (3057) + - don't use cancelling-locks' kms values (2947) + - use highest lock extent for kms, not last one (2925) + - don't dereference ERR_PTR() dentry in error handling path (3107) + - fix thread race in portals_debug_dumplog() (3122) + - create lprocfs device entries at setup instead of at attach (1519) + - common AST error handler, don't evict client on completion race (3145) + - zero nameidata in detach_mnt in 2.6 (3118) + - verify d_inode after revalidate_special is valid in 2.6 (3116) + - use lustre_put_super() to handle zconf unmounts in 2.6 (3064) + - initialize RPC timeout timer earlier for 2.6 (3219) + - don't dereference NULL reply buffer if mdc_close was never sent (2410) + - print nal/nid for unknown nid (3258) + - additional checks for oscc recovery before doing precreate (3284) + - fix ll_extent_lock() error return code for 64-bit systems (3043) + - don't crash in mdc_close for bad permissions on open (3285) + - zero i_rdev for non-device files (3147) + - clear page->private before handing to FS, better assertion (3119) + - tune the read pipeline (3236) + - fix incorrect decref of invalidated dentry (2350) + - provide read-ahead stats and refine rpc in flight stats (3328) + - don't hold journal transaction open across create RPC (3313) + - update atime on MDS at close time (3265) + - close LDAP connection when recovering to avoid server load (3315) + - update iopen-2.6 patch with fixes from 2399,2517,2904 (3301) + - don't leak open file on MDS 
after open resend (3325) + - serialize filter_precreate and filter_destroy_precreated (3329) + - loop device shouldn't call sync_dev() for nul device (3092) + - clear page cache after eviction (2766) + - resynchronize MDS->OST in background (2824) + - refuse to mount the same filesystem twice on same mountpoint (3394) + - allow llmount to create routes for mounting behind routers (3320) + - push lock cancellation to blocking thread for glimpse ASTs (3409) + - don't call osc_set_data_with_check() for TEST_LOCK matches (3159) + - fix rare problem with rename on htree directories (3417) + * miscellania + - allow default OST striping configuration per directory (1414) + - fix compilation for qswnal for 2.6 kernels (3125) + - increase maximum number of MDS request buffers for large systems + - change liblustreapi to be useful for external progs like lfsck (3098) + - increase local configuration timeout for slow disks (3353) + - allow configuring ldlm AST timeout - lustre.ldlm_timeout= + +2004-03-22 Cluster File Systems, Inc. 
+ * version 1.2.1 + * bug fixes + - fixes for glimpse AST timeouts / incorrectly 0-sized files (2818) + - don't overwrite extent policy data in reply if lock was blocked (2901) + - drop filter export grants atomically with removal from device (2663) + - del obd_self_export from work_list in class_disconnect_exports (2908) + - don't LBUG if MDS recovery times out during orphan cleanup (2530) + - swab reply message in mdc_close, other PPC fixes (2464) + - fix destroying of named logs (2325) + - overwrite old logs when running lconf --write_conf (2264) + - bump LLOG_CHUNKSIZE to 8k to allow for larger clusters (2306) + - fix race in target_handle_connect (2898) + - mds_reint_create() should take same inode create lock (2926) + - correct journal credits calculated for CANCEL_UNLINK_LOG (2931) + - don't close files for self_export to avoid uninitialized obd (2936) + - allow MDS with the same name as client node (2939) + - hold dentry reference for closed log files for unlink (2325) + - reserve space for all logs during transactions (2059) + - don't evict page beyond end of stripe extent (2925) + - don't oops on a deleted current working directory (2399) + - handle hard links to targets without a parent properly (2517) + - don't dereference NULL lock when racing during eviction (2867) + - don't grow lock extents when lots of conflicting locks (2919) + +2004-03-04 Cluster File Systems, Inc. 
* version 1.2.0 * bug fixes + - account for cache space usage on clients to avoid data loss (974) + - lfsck support in lustre kernel code (2349) - reduce journal credits needed for BRW writes (2370) - orphan handling to avoid losing space on client/server crashes - ptlrpcd can be blocked, stopping ALL progress (2477) + - use lock value blocks to assist in proper KMS, faster stat (1021) + - takes i_sem instead of DLM locks internally on obdfilter (2720) - recovery for initial connections (2355) - - orphan recovery problems in b_eq (1934) + - fixes for mds_cleanup_orphans (1934) + - abort_recovery crashes MDS in b_eq (mds_unlink_orphan) (2584) - block all file creations until orphan recovery completes (1901) - client remove rq_connection from request struct (2423) - conf-sanity test_5, proper cleanup in umount log not availale (2640) - recovery timer race (2670) - mdc_close recovey bug (2532) + - ptlrpc cleanup bug (2710) + - mds timeout on local locks (2588) + - namespace lock held during RPCs (2431) + - handle interrupted sync write properly (2503) + - don't try to handle a message that hasn't been replied to (2699) + - client assert failure during cleanup after abort recovery (2701) + - leak mdc device after failed mount (2712) + - ptlrpc_check_set allows timedout requests to complete (2714) + - wait for inflight reqs when ptlrpcd finishes (2710) + - make sure unregistered services are removed from the srv_list + - reset bulk XID's when resending them (caught by 1138 test) + - unregister_bulk after timeout + - fix lconf error (2694) + - handle write after unfinished setstripe, stripe-only getstripe (2388) + - readahead locks pages, leaves pending causing memory pressure (2673) + - increase OST request buffers to 4096 on large machines (2729) + - fix up permission of existing directories in simple_mkdir (2661) + - init deleted item, add assertions ptlrpc_abort_inflight() (2725) + - don't assign transno to errored transactions (2742) + - don't delete objects on OST 
if given a bogus objid from MDS (2751) + - handle large client PAGE_SIZE readdir on small PAGE_SIZE MDS (2777) + - if rq_no_resend, then timeout request after recovery (2432) + - fix MDS llog_logid record size, 64-bit array alignment (2733) + - don't call usermode_helper from ptlrpcd, DEFAULT upcall (2773) + - put magic in mount.lustre data, check for bad/NULL mount data (2529) + - MDS recovery shouldn't delete objects that it has given out (2730) + - if enqueue arrives after completion, don't clobber LVB (2819) + - don't unlock pages twice when trigger_group_io returns error (2814) + - don't deref NULL rq_repmsg if ldlm_handle_enqueue failed (2822) + - don't write pages to disk if there was an error (1450) + - don't ping imports that have recovery disabled (2676) + - take buffered bytes into account when balancing socknal conn (2817) + - hold a DLM lock over readdir always, use truncate_inode_pages (2706) + - reconnect unlink llog connection after MDS reconnects to OST (2816) + - remove little-endian swabbing of llog records (1987) + - set/limit i_blksize to LL_MAX_BLKSIZE on client (2884) + - retry reposting request buffers if they fail (1191) + - grow extent at grant time to avoid granting a revoked lock (2809) + - lock revoke doesn't evict page if covered by a second lock (2765) + - disable VM readahead to avoid reading outside lock extents (2805) + * miscellania + - return LL_SUPER_MAGIC from statfs for the filesystem type (1972) + - updated kernel patches for hp-2.4.20 kernel (2681) 2004-02-07 Cluster File Systems, Inc. * version 1.0.4 @@ -28,6 +2863,7 @@ tbd Cluster File Systems, Inc. - print out dotted-quad IP addresses in the socknal (2302) * miscellania - additional debugging for MDS client eviction problem (2443) + - fix mkfsoptions support for osts (2603, 2604) 2004-01-27 Cluster File Systems, Inc. * version 1.0.3 @@ -253,7 +3089,7 @@ tbd Cluster File Systems, Inc. 
- return 0 from revalidate2 if ll_intent_lock returns -EINTR (912) - fix leak in bulk IO when only partially completed (899, 900, 926) - fix O_DIRECT for ia64 (55) - - (almost) eliminate Lustre-kernel-thread effects on load average (722) + - (almost) eliminate Lustre-kernel-thread effects on load average (722) - C-z after timeout could hang a process forever; fixed (977) * Features - client-side I/O cache (678, 924, 929, 941, 970) @@ -267,7 +3103,7 @@ tbd Cluster File Systems, Inc. - Fix ldlm_lock_match on the MDS to avoid matching remote locks (592) - Fix fsfilt_extN_readpage() to read a full page of directory entries, or fake the remainder if PAGE_SIZE != blocksize (500) - - Avoid extra mdc_getattr() in ll_intent_lock when possible (534, 604) + - Avoid extra mdc_getattr() in ll_intent_lock when possible (534, 604) - Fix imbalanced LOV object allocation and out-of-bound access (469) - Most intent operations were removed, in favour of a new RPC mode that does a single RPC to the server and bypasses most of the VFS @@ -375,9 +3211,9 @@ tbd Cluster File Systems, Inc. - fix dbench 2, extN refcount problem (170, 258, 356, 418) - fix double-O_EXCL intent crash (424) - avoid sending multiple lock CANCELs (352) - * Features + * Features - MDS can do multi-client recovery (modulo bugs in new code) - * Documentation + * Documentation - many updates, edits, cleanups 2002-11-18 Phil Schwan @@ -401,12 +3237,12 @@ tbd Cluster File Systems, Inc. 
- properly abstracted the echo client - OSC locked 1 byte too many; fixed - rewrote brw callback code: - - fixed recovery bugs related to LOVs (306) - - fixed too-many-pages-in-one-write crash (191) - - fixed (again) crash in sync_io_timeout (214) - - probably fixed callback-related race (385) + - fixed recovery bugs related to LOVs (306) + - fixed too-many-pages-in-one-write crash (191) + - fixed (again) crash in sync_io_timeout (214) + - probably fixed callback-related race (385) * protocol change - - Add capability to MDS protocol + - Add capability to MDS protocol - LDLM cancellations and callbacks on different portals 2002-10-28 Andreas Dilger @@ -495,7 +3331,7 @@ tbd Cluster File Systems, Inc. * add hard link support * change obdfile creation method * kernel patch changed - + 2002-09-19 Peter Braam * version 0_5_9 * bug fix @@ -561,8 +3397,8 @@ tbd Cluster File Systems, Inc. * small changes in the DLM wire protocol 2002-07-25 Peter J. Braam - * version 0_5_1 with some initial stability, - * locking on MD and file I/O. + * version 0_5_1 with some initial stability, + * locking on MD and file I/O. * documentation updates * several bug fixes since 0.5.0 * small changes in wire protocol @@ -596,4 +3432,4 @@ tbd Cluster File Systems, Inc. * move forward to latest Lustre kernel 2002-06-25 Peter Braam - * release version v0_4_1. Hopefully stable on single node use. + * release version v0_4_1. Hopefully stable on single node use.