-TBA
+TBD Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.6.0
+ * CONFIGURATION CHANGE. This version of Lustre WILL NOT
+ INTEROPERATE with older versions automatically. In many cases a
+ special upgrade step is needed. Please read the
+ user documentation before upgrading any part of a 1.4.x system.
+ * WARNING: Lustre configuration and startup changes are required with
+ this release. See https://mail.clusterfs.com/wikis/lustre/MountConf
+ for details.
+ * Support for kernels:
+ 2.6.9-42.0.3EL (RHEL 4)
+ 2.6.5-7.276 (SLES 9)
+ 2.4.21-47.0.1.EL (RHEL 3)
+ 2.6.12.6 vanilla (kernel.org)
+ 2.6.16.21-0.8 (SLES10)
+ * Client support for unpatched kernels:
+ (see https://mail.clusterfs.com/wikis/lustre/PatchlessClient)
+ 2.6.16 - 2.6.19 vanilla (kernel.org)
+ 2.6.9-42.0.3EL (RHEL 4)
+ * Recommended e2fsprogs version: 1.39.cfs2-0
+ * bug fixes
+
+Severity : enhancement
+Bugzilla : 8007
+Description: MountConf
+Details : Lustre configuration is now managed via mkfs and mount
+ commands instead of lmc and lconf. New obd types (MGS, MGC)
+ are added for dynamic configuration management. See
+ https://mail.clusterfs.com/wikis/lustre/MountConf for
+ details.
+
+Severity : enhancement
+Bugzilla : 4482
+Description: dynamic OST addition
+Details : OSTs can now be added to a live filesystem
+
+Severity : enhancement
+Bugzilla : 9851
+Description: startup order invariance
+Details : MDTs and OSTs can be started in any order. Clients only
+ require the MDT to complete startup.
+
+Severity : enhancement
+Bugzilla : 4899
+Description: parallel, asynchronous orphan cleanup
+Details : orphan cleanup is now performed in separate threads for each
+ OST, allowing parallel non-blocking operation.
+
+Severity : enhancement
+Bugzilla : 9862
+Description: optimized stripe assignment
+Details : stripe assignments are now made based on ost space available,
+ ost previous usage, and OSS previous usage, in order to try
+ to optimize storage space and networking resources.
+
+Severity : enhancement
+Bugzilla : 4226
+Description: Permanently set tunables
+Details : All writable /proc/fs/lustre tunables can now be permanently
+ set on a per-server basis, at mkfs time or on a live
+ system.
+
+Severity : enhancement
+Bugzilla : 10547
+Description: Lustre message v2
+Details : Add lustre message format v2.
+
+Severity : enhancement
+Bugzilla : 9866
+Description: client OST exclusion list
+Details : Clients can be started with a list of OSTs that should be
+ declared "inactive" for known non-responsive OSTs.
+
+Severity : minor
+Frequency : SFS test only (otherwise harmless)
+Bugzilla : 6062
+Description: SPEC SFS validation failure on NFS v2 over lustre.
+Details : Changes the blocksize for regular files to be 2x RPC size,
+ and not depend on stripe size.
+
+Severity : enhancement
+Bugzilla : 9293
+Description: Multiple MD RPCs in flight.
+Details : Further unserialise some read-only MDS RPCs - learn about intents.
+ To avoid overloading the MDS, introduce a limit on the number of
+ MDS RPCs in flight for a single client and add /proc controls
+ to adjust this limit.
+
+Severity : enhancement
+Bugzilla : 22484
+Description: client read/write statistics
+Details : Add client read/write call usage stats for performance
+ analysis of user processes.
+ /proc/fs/lustre/llite/*/offset_stats shows non-sequential
+ file access. extents_stats shows chunk size distribution.
+ extents_stats_per_process shows chunk size distribution per
+ user process.
+
+Severity : enhancement
+Bugzilla : 22485
+Description: per-client statistics on server
+Details : Add ldlm and operations statistics for each client in
+ /proc/fs/lustre/mds|obdfilter/*/exports/
+
+Severity : enhancement
+Bugzilla : 22486
+Description: mds statistics
+Details : Add detailed mds operations statistics in
+ /proc/fs/lustre/mds/*/stats
+
+Severity : enhancement
+Bugzilla : 10968
+Description: VFS operations stats
+Details : Add client VFS call stats, trackable by pid, ppid, or gid
+ /proc/fs/lustre/llite/*/vfs_ops_stats
+ /proc/fs/lustre/llite/*/track_[pid|ppid|gid]
+
+Severity : minor
+Frequency : always
+Bugzilla : 6380
+Description: Fix client-side osc byte counters
+Details : The osc read/write byte counters in
+ /proc/fs/lustre/osc/*/stats are now working
+
+Severity : minor
+Frequency : always as root on SLES
+Bugzilla : 10667
+Description: Failure of copying files with lustre special EAs.
+Details : The client side always returns success for setxattr calls on
+ lustre special xattrs (currently only "trusted.lov").
+
+Severity : minor
+Frequency : always
+Bugzilla : 10345
+Description: Refcount LNET uuids
+Details : The global LNET uuid list grew linearly with every startup;
+ refcount repeated list entries instead of always adding to
+ the list.
+
+Severity : enhancement
+Bugzilla : 2258
+Description: Dynamic service threads
+Details : Within a small range, start extra service threads
+ automatically when the request queue builds up.
+
+Severity : major
+Frequency : mixed-endian client/server environments
+Bugzilla : 11214
+Description: mixed-endian crashes
+Details : The new msg_v2 system had some failures in mixed-endian
+ environments.
+
+Severity : enhancement
+Bugzilla : 11229
+Description: Easy OST removal
+Details : OSTs can be permanently deactivated with e.g. 'lctl
+ conf_param lustre-OST0001.osc.active=0'
+
+Severity : enhancement
+Bugzilla : 11335
+Description: MGS proc entries
+Details : Added basic proc entries for the MGS showing what filesystems
+ are served.
+
+Severity : enhancement
+Bugzilla : 10998
+Description: provide MGS failover
+Details : Added config lock reacquisition after MGS server failover.
+
+Severity : enhancement
+Bugzilla : 11461
+Description: add Linux 2.4 support
+Details : Added support for RHEL 2.4.21 kernel for 1.6 servers and clients
+
+Severity : normal
+Bugzilla : 11330
+Description: a large application tries to do I/O to the same resource and dies
+ in the middle of it.
+Details : Check the req->rq_arrival time after the call to
+ ost_brw_lock_get(), but before we do anything about
+ processing it & sending the BULK transfer request. This
+ should help move old stale pending locks off the queue as
+ quickly as obd_timeout.
+
+Severity : major
+Frequency : when an incorrect nid is specified during startup
+Bugzilla : 10734
+Description: ptlrpc connect to non-existent node causes kernel crash
+Details : LNET can't be re-entered from an event callback, which
+ happened when we expire a message after the export has been
+ cleaned up. Instead, hand the zombie cleanup off to another
+ thread.
+
+Severity : enhancement
+Bugzilla : 10902
+Description: plain/inodebits lock performance improvement
+Details : Group plain/inodebits locks in the granted list by request mode
+ and bits policy, improving the performance of searches through
+ the granted list.
+
+Severity : major
+Frequency : only if OST filesystem is corrupted
+Bugzilla : 9829
+Description: client incorrectly hits assertion in ptlrpc_replay_req()
+Details : for a short time RPCs with bulk IO are in the replay list,
+ but replay of bulk IOs is unimplemented. If the OST filesystem
+ is corrupted due to disk cache incoherency and then replay is
+ started it is possible to trip an assertion. Avoid putting
+ committed RPCs into the replay list at all to avoid this issue.
+
+Severity : minor
+Frequency : only for kernels with patches from Lustre below 1.4.3
+Bugzilla : 11248
+Description: Remove old rdonly API
+Details : Remove the old rdonly API, which has been unused since at
+ least lustre 1.4.3
+
+------------------------------------------------------------------------------
+
+TBD Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.4.10
+ * Support for kernels:
+ 2.6.9-42.0.3EL (RHEL 4)
+ 2.6.5-7.276 (SLES 9)
+ 2.4.21-47.0.1.EL (RHEL 3)
+ 2.6.12.6 vanilla (kernel.org)
+ * Recommended e2fsprogs version: 1.39.cfs2-0
+
+Severity : normal
+Frequency : always
+Bugzilla : 10214
+Description: make O_SYNC work on 2.6 kernels
+Details : 2.6 kernels use a different method to mark pages for write,
+ so code is added to lustre to make O_SYNC work.
+
+Severity : minor
+Frequency : always
+Bugzilla : 11110
+Description: Failure to close file and release space on NFS
+Details : Put inode details into lock acquired in ll_intent_file_open.
+ Use mdc_intent_lock in ll_intent_open to properly
+ detect all kinds of errors unhandled by mdc_enqueue
+
+Severity : major
+Frequency : rare
+Bugzilla : 10866
+Description: proc file read during shutdown sometimes raced obd removal,
+ causing node crash
+Details : Add lock to prevent obd access after proc file removal.
+
+Severity : normal
+Frequency : Only for files larger than 4GB on 32-bit clients.
+Bugzilla : 11237
+Description: improperly doing page alignment of locks
+Details : Modify lustre core code to use CFS_PAGE_* defines instead of
+ PAGE_*. Make CFS_PAGE_MASK 64bit long.
+
+Severity : normal
+Frequency : rarely
+Bugzilla : 11203
+Description: RPCs being resent when they shouldn't be
+Details : Some RPCs that should not be resent are being resent. This
+ can cause inconsistencies in the RPC state machine. Do not
+ resend such requests.
+
+Severity : normal
+Frequency : rare, only with NFS export
+Bugzilla : 11669
+Description: Crash on NFS re-export node
+Details : under very unusual load conditions an assertion is hit in
+ ll_intent_file_open()
+
+------------------------------------------------------------------------------
+
+TBD Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.4.9
+ * Support for kernels:
+ 2.6.9-42.0.3EL (RHEL 4)
+ 2.6.5-7.276 (SLES 9)
+ 2.4.21-40.0.1.EL (RHEL 3)
+ 2.6.12.6 vanilla (kernel.org)
+ * bug fixes
+
+Severity : critical
+Frequency : rare
+Bugzilla : 11125
+Description: "went back in time" messages on mds failover
+Details : The greatest transno may be lost when the current operation
+ finishes with an error (transno==0) and the client's last_rcvd
+ record is over-written. Save the greatest transno in the
+ mds_last_transno for this case.
+
+Severity : minor
+Frequency : always for specific kernels and striping counts
+Bugzilla : 11042
+Description: client may get "Matching packet too big" without ACL support
+Details : Clients compiled without CONFIG_FS_POSIX_ACL get an error message
+ when trying to access files in certain configurations. The
+ clients should in fact be denied when mounting because they do
+ not understand ACLs.
+
+Severity : major
+Frequency : Cray XT3 with more than 4000 clients and multiple jobs
+Bugzilla : 10906
+Description: many clients connecting with IO in progress causes connect timeouts
+Details : Avoid synchronous journal commits to avoid delays caused by many
+ clients connecting/disconnecting when bulk IO is in progress.
+ Queue liblustre connect requests on OST_REQUEST_PORTAL instead of
+ OST_IO_PORTAL to avoid delays behind potentially many pending
+ slow IO requests.
+
+Severity : normal
+Frequency : occasionally with multiple writers to a single file
+Bugzilla : 11081
+Description: shared writes to file may result in wrong size reported by stat()
+Details : Allow growing of kms when extent lock is cancelled
+
+Severity : minor
+Frequency : always with random mmap IO to multi-striped file
+Bugzilla : 10919
+Description: mmap write might be lost if we are writing to a 'hole' in stripe
+Details : Only if the hole is at the end of OST object so that kms is too
+ small. Fix is to increase kms accordingly in ll_nopage.
+
+Severity : normal
+Frequency : rare, only if OST filesystem is inconsistent with MDS filesystem
+Bugzilla : 11211
+Description: writes to a missing object would leak memory on the OST
+Details : If there is an inconsistency between the MDS and OST filesystems,
+ such that the MDS references an object that doesn't exist, writes
+ to that object will leak memory due to incorrect cleanup in the
+ error handling path, eventually running out of memory on the OST.
+
+Severity : minor
+Frequency : rare
+Bugzilla : 11040
+Description: Creating too long symlink causes lustre errors
+Details : Check symlink and name lengths before sending requests to MDS.
+
+Severity : normal
+Frequency : only if flock is enabled (not on by default)
+Bugzilla : 11415
+Description: posix locks not released on fd closure on 2.6.9+
+Details : We failed to add posix locks to the list of inode locks on 2.6.9+
+ kernels. This caused such locks not to be released on fd close,
+ followed by assertions on fs unmount about still-used locks.
+
+Severity : minor
+Frequency : MDS failover only, very rarely
+Bugzilla : 11277
+Description: clients may get ASSERTION(granted_lock != NULL)
+Details : When a request was taking a long time and a client was resending
+ a getattr by name lock request, there were multiple lock
+ requests with the same client lock handle and
+ mds_getattr_name->fixup_handle_for_resent_request found one
+ of the lock handles but later failed with
+ ASSERTION(granted_lock != NULL).
+
+Severity : major
+Frequency : rare
+Bugzilla : 10891
+Description: handle->h_buffer_credits > 0, assertion failure
+Details : h_buffer_credits is zero after truncate, causing assertion
+ failure. This patch extends the transaction or creates a new
+ one after truncate.
+
+Severity : normal
+Frequency : NFS re-export or patchless client
+Bugzilla : 11179, 10796
+Description: Crash on NFS re-export node (__d_move)
+Details : We do not want to hash the dentry if we don't have a lock.
+ But if this dentry is later used in d_move, we'd hit uninitialised
+ list head d_hash, so we just do this to init d_hash field but
+ leave dentry unhashed.
+
+Severity : normal
+Frequency : NFS re-export or patchless client
+Bugzilla : 11135
+Description: NFS export has problems with symbolic links
+Details : lustre client didn't properly install dentry when re-exported
+ to NFS or running patchless client.
+
+Severity : normal
+Frequency : NFS re-export or patchless client
+Bugzilla : 10796
+Description: Various nfs/patchless fixes.
+Details : Fix reuse of a disconnected alias in the lookup process (this
+ fixes the warning "find_exported_dentry: npd != pd"), and fix a
+ permission error with open files on NFS.
+
+Severity : normal
+Frequency : occasional
+Bugzilla : 11191
+Description: Crash on NFS re-export node
+Details : calling clear_page on the wrong pointer triggered an oops in
+ generic_mapping_read().
+
+Severity : normal
+Frequency : rarely, using O_DIRECT IO
+Bugzilla : 10903
+Description: unaligned directio crashes client with LASSERT
+Details : check for unaligned buffers before trying any requests.
+
+Severity : major
+Frequency : rarely, using CFS RAID5 patches in non-standard kernel series
+Bugzilla : 11313
+Description: stale data returned from RAID cache
+Details : If only a small amount of IO is done to the RAID device before
+ reading it again it is possible to get stale data from the RAID
+ cache instead of reading it from disk.
+
+Severity : major
+Frequency : depends on arch, kernel and compiler version, always on sles10
+ kernel and x86_64
+Bugzilla : 11562
+Description: recursive or deep enough symlinks cause stack overflow
+Details : getting rid of large stack-allocated variable in
+ __vfs_follow_link
+
+Severity : minor
+Frequency : depends on hardware
+Bugzilla : 11540
+Description: lustre write performance loss in the SLES10 kernel
+Details : the performance loss is caused by the use of write barriers in the
+ ext3 code. The SLES10 kernel turns barrier support on by
+ default. The fix is to undo that change for ldiskfs.
+
+------------------------------------------------------------------------------
+
+2006-12-09 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.4.8
+ * Support for kernels:
+ 2.6.9-42.0.3EL (RHEL 4)
+ 2.6.5-7.276 (SLES 9)
+ 2.4.21-47.0.1.EL (RHEL 3)
+ 2.6.12.6 vanilla (kernel.org)
+ * bug fixes
+
+Severity : major
+Frequency : quota enabled and large files being deleted
+Bugzilla : 10707
+Description: releasing more than 4GB of quota at once hangs OST
+Details : If a user deletes more than 4GB of files on a single OST it
+ will cause the OST to spin in an infinite loop. Release
+ quota in < 4GB chunks, or use a 64-bit value for 1.4.7.1+.
+
+Severity : minor
+Frequency : rare
+Bugzilla : 10845
+Description: statfs data retrieved from /proc may be stale or zero
+Details : When reading per-device statfs data from /proc, in the
+ {kbytes,files}_{total,free,avail} files, it may appear
+ as zero or be out of date.
+
+Severity : normal
+Frequency : always, for aggregate stripe size over 4GB
+Bugzilla : 10725
+Description: "lfs setstripe" fails assertion when setting 4GB+ stripe width
+Details : Using "lfs setstripe" to set stripe size * stripe count over 4GB
+ will fail the kernel with "ASSERTION(lsm->lsm_xfersize != 0)"
+
+Severity : minor
+Frequency : always if "lfs find" used on a local file/directory
+Bugzilla : 10864
+Description: "lfs find" segfaults if used on a local file/directory
+Details : The case where a directory component was not specified wasn't
+ handled correctly. Handle this properly.
+
+Severity : normal
+Frequency : always on ppc64
+Bugzilla : 10634
+Description: the write to an ext3 filesystem mounted with mballoc got stuck
+Details : ext3_mb_generate_buddy() uses find_next_bit() which does not
+ perform endianness conversion.
+
+Severity : major
+Frequency : rarely (truncate to non-zero file size after write under load)
+Bugzilla : 10730, 10687
+Description: Files padded with zeros to next 4K multiple
+Details : With filesystems mounted using the "extents" option (2.6 kernels)
+ it is possible that files that are truncated to a non-zero size
+ immediately after being written are filled with zero bytes beyond
+ the truncated size. No file data is lost.
+
+Severity : enhancement
+Frequency : liblustre only
+Bugzilla : 10452
+Description: Allow recovery/failover for liblustre clients.
+Details : liblustre clients were unaware of failover configurations until
+ now.
+
+Severity : enhancement
+Bugzilla : 10743
+Description: user file locks should fail when not mounting with flock option
+Details : Set up an error-returning stub in ll_file_operations.lock field
+ to prevent incorrect behaviour when client is mounted without
+ flock option. Also, set up properly f_op->flock field for
+ RHEL4 kernels.
+
+Severity : minor
+Frequency : always on ia64
+Bugzilla : 10905
+Description: "lfs df" loops on printing out MDS statfs information
+Details : The obd_ioctl_data was not initialized and in some systems
+ this caused a failure during the ioctl that did not return
+ an error. Initialize the struct and return an error on failure.
+
+Severity : minor
+Frequency : SLES 9 only
+Bugzilla : 10667
+Description: Error of copying files with lustre special EAs as root
+Details : The client side always returns success for setxattr calls on
+ lustre special xattrs (currently only "trusted.lov").
+
+Severity : normal
+Frequency : rarely on clusters with both ia64+i386 clients
+Bugzilla : 10672
+Description: ia64+i686 clients doing shared IO on the same file may LBUG
+Details : In rare cases when both ia64+i686 (or other mixed-PAGE_SIZE)
+ clients are doing concurrent writes to the same file it is
+ possible that the ia64 clients may LASSERT because the OST
+ extent locks are not PAGE_SIZE aligned. Ensure that grown
+ locks are always aligned on the request boundary.
+
+Severity : normal
+Frequency : specific use, occasional
+Bugzilla : 7040
+Description: Overwriting in use executable truncates on-disk binary image
+Details : If one node attempts to overwrite an executable in use by
+ another node, we now correctly return ETXTBSY instead of
+ truncating the file.
+
+Severity : normal
+Frequency : rare
+Bugzilla : 2707
+Description: chmod on Lustre root is propagated to other clients
+Details : Re-validate root's dentry in ll_lookup_it to avoid having it
+ invalid by the follow_mount time.
+
+Severity : minor
+Frequency : rare
+Bugzilla : 10883
+Description: Race in 'instant cancel' lock handling could lead to such locks
+ never to be granted in case of SMP MDS
+Details : Do not destroy not yet granted but cbpending locks in
+ handle_enqueue
+
+Severity : minor
+Frequency : replay/resend of open
+Bugzilla : 10991
+Description: non null lock assertion failure in mds_intent_policy
+Details : Trying to replay/resend lockless open requests resulted in
+ mds_open() returning 0 with no lock. Now it sets a flag if
+ a lock is going to be returned.
+
+Severity : enhancement
+Bugzilla : 10889
+Description: Checksum enhancements
+Details : New checksum enhancements allow for resending RPCs that failed
+ checksum checks.
+
+Severity : enhancement
+Bugzilla : 7376
+Description: Tunables on number of dirty pages in cache
+Details : Allow setting a limit on the number of dirty pages cached.
+
+Severity : normal
+Frequency : rare
+Bugzilla : 10643
+Description: client crash on unmount - lock still has references
+Details : In some error handling cases it was possible to leak a lock
+ reference on a client while accessing a file. This was not
+ harmful to the client during operation, but would cause the
+ client to crash when the filesystem is unmounted.
+
+Severity : normal
+Frequency : specific case, rare
+Bugzilla : 10921
+Description: ETXTBSY on mds though file not in use by client
+Details : ETXTBSY is no longer incorrectly returned when attempting to
+ chmod or chown a directory that the user previously tried to
+ execute or a currently-executing binary.
+
+Severity : major
+Frequency : extremely rare except on liblustre-based clients
+Bugzilla : 10480
+Description: Lustre space not freed when files are deleted
+Details : Clean up open-unlinked files after client eviction. Previously
+ the unlink was skipped and the files remained as orphans.
+
+Severity : normal
+Frequency : rare
+Bugzilla : 10999
+Description: OST failure "would be an LBUG" in waiting_locks_callback()
+Details : In some cases it was possible to send a blocking callback to
+ a client doing a glimpse, even though that client didn't get
+ a lock granted. When the glimpse lock is cancelled on the OST
+ the freed lock is left on the waiting list and corrupts the list.
+
+Severity : major
+Frequency : all core dumps
+Bugzilla : 11103
+Description: Broke core dumps to lustre
+Details : Negative dentry may be unhashed if parent does not have UPDATE
+ lock, but some callers, e.g. do_coredump, expect dentry to be
+ hashed after successful create, hash it in ll_create_it.
+
+------------------------------------------------------------------------------
+
+2006-09-13 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.4.7.1
+ * Support for kernels:
+ 2.6.9-42.0.2.EL (RHEL 4)
+ 2.6.5-7.276 (SLES 9)
+ 2.4.21-40.EL (RHEL 3)
+ 2.6.12.6 vanilla (kernel.org)
+ * bug fix
+
+Severity : major
+Frequency : always on RHEL 3
+Bugzilla : 10867
+Description: Number of open files grows over time
+Details : The number of open files grows over time, whether or not
+ Lustre is started. This was due to a filp leak introduced
+ by one of our kernel patches.
+
+------------------------------------------------------------------------------
+
+2006-08-20 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.4.7
+ * Support for kernels:
+ 2.6.9-42.EL (RHEL 4)
+ 2.6.5-7.267 (SLES 9)
+ 2.4.21-40.EL (RHEL 3)
+ 2.6.12.6 vanilla (kernel.org)
+ * bug fixes
+
+Severity : major
+Frequency : rare
+Bugzilla : 5719, 9635, 9792, 9684
+Description: OST (or MDS) trips assertions in (re)connection under heavy load
+Details : If a server is under heavy load and cannot reply to new
+ connection requests before the client resends the (re)connect,
+ the connection handling code can behave badly if two service
+ threads are concurrently handling separate (re)connections from
+ the same client. Add better locking to the connection handling
+ code, and ensure that only a single connection will be processed
+ for a given client UUID, even if the lock is dropped.
+
+Severity : enhancement
+Bugzilla : 3627
+Description: add TCP zero-copy support to kernel
+Details : Add support to the kernel TCP stack to allow zero-copy bulk
+ sends if the hardware supports scatter-gather and checksumming.
+ This allows socklnd to do client-write and server-read more
+ efficiently and reduce CPU utilization from skbuf copying.
+
+Severity : minor
+Frequency : only if NFS exporting from client
+Bugzilla : 10258
+Description: NULL pointer deref in ll_iocontrol() if chattr mknod file
+Details : If setting attributes on a file created under NFS that had
+ never been opened it would be possible to oops the client
+ if the file had no objects.
+
+Severity : major
+Frequency : rare
+Bugzilla : 9326, 10402, 10897
+Description: client crash in ptlrpcd_wake() thread when sending async RPC
+Details : It is possible that ptlrpcd_wake() dereferences a freed async
+ RPC. In rare cases the ptlrpcd thread already processed the RPC
+ before ptlrpcd_wake() was called and the request was freed.
+
+Severity : minor
+Frequency : always for liblustre
+Bugzilla : 10290
+Description: liblustre client does MDS+OSTs setattr RPC for each write
+Details : When doing a write from a liblustre client, the client
+ incorrectly issued an RPC to the MDS and each OST the file was
+ striped over in order to update the timestamps. When writing
+ with small chunks and many clients this could overwhelm the MDS
+ with RPCs. In all cases it would slow down the write because
+ these RPCs are unnecessary.
+
+Severity : enhancement
+Bugzilla : 9340
+Description: allow number of MDS service threads to be changed at module load
+Details : It is now possible to change the number of MDS service threads
+ running. Adding "options mds mds_num_threads={N}" to the MDS's
+ /etc/modprobe.conf will set the number of threads for the next
+ time Lustre is restarted (assuming the "mds" module is also
+ reloaded at that time). The default number of threads will
+ stay the same, 32 for most systems.
+
+Severity : major
+Frequency : rare
+Bugzilla : 10300
+Description: OST crash if filesystem is unformatted or corrupt
+Details : If an OST is started on a device that has never been formatted
+ or if the filesystem is corrupt and cannot even mount then the
+ error handling cleanup routines would dereference a NULL pointer.
+
+Severity : normal
+Frequency : rare
+Bugzilla : 10047
+Description: NULL pointer deref in llap_from_page.
+Details : get_cache_page_nowait can return a page with NULL (or otherwise
+ incorrect) mapping if the page was truncated/reclaimed while it was
+ searched for. Check for this condition and skip such pages when
+ doing readahead. Introduce extra check to llap_from_page() to
+ verify page->mapping->host is non-NULL (so page is not anonymous).
+
+Severity : minor
+Frequency : Sometimes when using sys_sendfile
+Bugzilla : 7020
+Description: "page not covered by a lock" warnings from ll_readpage
+Details : sendfile called ll_readpage without the right page locks present.
+ Introduce ll_file_sendfile, which does the necessary locking
+ around the call to generic_file_sendfile(), much like we do in
+ ll_file_read().
+
+Severity : normal
+Frequency : with certain MDS communication failures at client mount time
+Bugzilla : 10268
+Description: NULL pointer deref after failed client mount
+Details : a client connection request may be delayed by the network layer
+ and not be sent until after the PTLRPC layer has timed out the
+ request. If the client fails the mount immediately it will try
+ to clean up before the network times out the request. Add a
+ reference from the request import to the obd device and delay
+ the cleanup until the network drops the request.
+
+Severity : normal
+Frequency : occasionally during client (re)connect
+Bugzilla : 9387
+Description: assertion failure during client (re)connect
+Details : processing a client connection request may be delayed by the
+ client or server longer than the client connect timeout. This
+ causes the client to resend the connection request. If the
+ original connection request is replied to in this interval, the
+ client may trip an assertion failure in ptlrpc_connect_interpret()
+ which thought it would be the only running connect process.
+
+Severity : normal
+Frequency : only with obd_echo servers and clients that are rebooted
+Bugzilla : 10140
+Description: kernel BUG accessing uninitialized data structure
+Details : When running an obd_echo server it did not start the ping_evictor
+ thread, and when a client was evicted an uninitialized data
+ structure was accessed. Start the ping_evictor in the RPC
+ service startup instead of the OBD startup.
+
+Severity : enhancement
+Bugzilla : 10193 (patchless)
+Description: Remove dependency on various unexported kernel interfaces.
+Details : No longer need reparent_to_init, exit_mm, exit_files,
+ sock_getsockopt, filemap_populate, FMODE_EXEC, put_filp.
+
+Severity : minor
+Frequency : rare (only users of deprecated and unsupported LDAP config)
+Bugzilla : 9337
+Description: write_conf for zeroconf mount queried LDAP incorrectly for client
+Details : LDAP apparently contains 'lustreName' attributes instead of
+ 'name'. A simple remapping of the name is sufficient.
+
+Severity : major
+Frequency : rare (only with non-default dump_on_timeout debug enabled)
+Bugzilla : 10397
+Description: waiting_locks_callback trips kernel BUG if client is evicted
+Details : Running with the dump_on_timeout debug flag turned on makes
+ it possible that the waiting_locks_callback() can try to dump
+ the Lustre kernel debug logs from an interrupt handler. Defer
+ this log dumping to the expired_lock_main() thread.
+
+Severity : enhancement
+Bugzilla : 10420
+Description: Support NFS exporting on 2.6 kernels.
+Details : Implement non-rawops metadata methods for NFS server to use without
+ changing NFS server code.
+
+Severity : normal
+Frequency : very rare (synthetic metadata workload only)
+Bugzilla : 9974
+Description: two racing renames might cause an MDS thread to deadlock
+Details : Running the "racer" program may cause one MDS thread to rename
+ a file from being the source of a rename to being the target of
+ a rename at exactly the same time that another thread is doing
+ so, and the second thread has already enqueued these locks after
+ doing a lookup of the target and is trying to relock them in
+ order. Ensure that we don't try to re-lock the same resource.
+
+Severity : major
+Frequency : only very large systems with liblustre clients
+Bugzilla : 7304
+Description: slow eviction of liblustre clients with the "evict_by_nid" RPC
+Details : Use asynchronous set_info RPCs to send the "evict_by_nid" to
+ all OSTs in parallel. This allows the eviction of stale liblustre
+ clients to proceed much faster than if they were done in series,
+ and also offers similar improvements for other set_info RPCs.
+
+Severity : minor
+Frequency : common
+Bugzilla : 10265
+Description: excessive CPU usage during initial read phase on client
+Details : During the initial read phase on a client, it would aggressively
+ retry readahead on the file, consuming too much CPU and impacting
+ performance (since 1.4.5.8). Improve the readahead algorithm
+ to avoid this, and also improve some other common cases (read
+ of small files in particular, where "small" is files smaller than
+ /proc/fs/lustre/llite/*/max_read_ahead_whole_mb, 2MB by default).
+
+Severity : minor
+Frequency : rare
+Bugzilla : 10450
+Description: MDS crash when receiving packet with unknown intent.
+Details : Do not LBUG in unknown intent case, just return -EFAULT
+
+Severity : enhancement
+Bugzilla : 9293, 9385
+Description: MDS RPCs are serialised on client. This is unnecessary for some.
+Details : Do not serialize getattr (non-intent version) and statfs.
+
+Severity : minor
+Frequency : occasional, when OST network is overloaded/intermittent
+Bugzilla : 10416
+Description: client evicted by OST after bulk IO timeout
+Details : If a client sends a bulk IO request (read or write) the OST
+ may evict the client if it is unresponsive to its data GET/PUT
+ request. This is incorrect if the network is overloaded (takes
+ too long to transfer the RPC data) or dropped the OST GET/PUT
+ request. There is no need to evict the client at all, since
+ the pinger and/or lock callbacks will handle this, and the
+ client can restart the bulk request.
+
+Severity : minor
+Frequency : Always when mmapping file with no objects
+Bugzilla : 10438
+Description: client crashes when mmapping file with no objects
+Details : Check that we actually have objects in a file before doing any
+ operations on objects in ll_vm_open, ll_vm_close and
+ ll_glimpse_size.
+
+Severity : minor
+Frequency : Rare
+Bugzilla : 10484
+Description: Request leak when working with deleted CWD
+Details : Introduce advanced request refcount tracking for requests
+ referenced from lustre intent.
+
+Severity : Enhancement
+Bugzilla : 10482
+Description: Cache open file handles on client.
+Details : If requested, the MDS now returns a special lock along with the
+ open handle, and the client is allowed to hold the open handle,
+ even if unused, until that lock is revoked. This helps NFS a lot,
+ since NFS opens and closes files for every read/write operation.
+
+Severity : Enhancement
+Bugzilla : 9291
+Description: Cache open negative dentries on client when possible.
+Details : Guard negative dentries with UPDATE lock on parent dir, drop
+ negative dentries on lock revocation.
+
+Severity : minor
+Frequency : Always
+Bugzilla : 10510
+Description: Remounting a client read-only wasn't possible with a zconf mount
+Details : It wasn't possible to remount a client read-only with llmount.
+
+Severity : enhancement
+Description: Include MPICH 1.2.6 Lustre ADIO interface patch
+Details : In lustre/contrib/ or /usr/share/lustre in RPM a patch for
+ MPICH is included to add Lustre-specific ADIO interfaces.
+ This is based closely on the UFS ADIO layer and only differs
+ in file creation, in order to allow the OST striping to be set.
+ This is user-contributed code and not supported by CFS.
+
+Severity : minor
+Frequency : Always
+Bugzilla : 9486
+Description: extended inode attributes (immutable, append-only) work improperly
+ when 2.4 and 2.6 kernels are used on client/server or vice versa
+Details : Introduce kernel-independent values for these flags.
+
+Severity : enhancement
+Frequency : Always
+Bugzilla : 10248
+Description: Allow fractional MB tunings for lustre in /proc/ filesystem.
+Details : Many of the /proc/ tunables can only be tuned at a megabyte
+ granularity. Fractional MB granularity is now supported, which
+ is very useful for low-memory systems.
+
+Severity : enhancement
+Bugzilla : 9292
+Description: Getattr by fid
+Details : Getting a file's attributes by its fid, obtaining UPDATE|LOOKUP
+ locks, avoids extra getattr rpc requests to MDS, allows '/' to
+ have locks and avoids getattr rpc requests for it on every stat.
+
+Severity : major
+Frequency : Always, for filesystems larger than 2TB
+Bugzilla : 6191
+Description: ldiskfs crash at mount for filesystem larger than 2TB with mballoc
+Details : Kernel kmalloc limits allocations to 128kB and this prevents
+ filesystems larger than 2TB from being mounted with mballoc enabled.
+
+Severity : critical
+Frequency : Always, for 32-bit kernel without CONFIG_LBD and filesystem > 2TB
+Bugzilla : 6191
+Description: filesystem corruption for non-standard kernels and very large OSTs
+Details : If a 32-bit kernel is compiled without CONFIG_LBD enabled and a
+ filesystem larger than 2TB is mounted then the kernel will
+ silently corrupt the start of the filesystem. CONFIG_LBD is
+ enabled for all CFS-supported kernels, but the possibility of
+ this happening with a modified kernel config exists.
+
+Severity : enhancement
+Bugzilla : 10462
+Description: add client O_DIRECT support for 2.6 kernels
+Details : It is now possible to do O_DIRECT reads and writes to files
+ in the Lustre client mountpoint on 2.6 kernel clients.
+
+Severity : enhancement
+Bugzilla : 10446
+Description: parallel glimpse, setattr, statfs, punch, destroy requests
+Details : Send glimpse, setattr, statfs, punch, destroy requests to OSTs in
+ parallel, without waiting for a response from each OST before
+ sending an RPC to the next OST.
+
+Severity : minor
+Frequency : rare
+Bugzilla : 10150
+Description: setattr vs write race when updating file timestamps
+Details : If a client process updates a file timestamp into the past
+ right after writing to the file (e.g. tar), it is possible that
+ the updated file modification time is reset to the current
+ time due to a race between processing the setattr and write RPCs.
+
+Severity : enhancement
+Bugzilla : 10318
+Description: Bring 'lfs find' closer in line with regular Linux find.
+Details : lfs find util supports -atime, -mtime, -ctime, -maxdepth, -print,
+ -print0 options and obtains all the needed info through the lustre
+ ioctls.
+
+Severity : enhancement
+Bugzilla : 6221
+Description: support up to 1024 configured devices on one node
+Details : change obd_dev array from statically allocated to dynamically
+ allocated structs as they are first used to reduce memory usage
+
+Severity : minor
+Frequency : rare
+Bugzilla : 10437
+Description: Flush dirty partially truncated pages during truncate
+Details : Immediately flush partially truncated pages in filter_setattr;
+ this way we completely avoid having any pages in page cache on OST
+ and can retire ugly workarounds during writes to flush such pages.
+
+Severity : minor
+Frequency : rare
+Bugzilla : 10409
+Description: i_sem vs transaction deadlock in mds_obd_destroy during unlink.
+Details : protect inode from truncation within vfs_unlink() context:
+ just take a reference before calling vfs_unlink() and release it
+ when parent's i_sem is free.
+
+Severity : minor
+Frequency : always, if extents are used on OSTs
+Bugzilla : 10703
+Description: index ei_leaf_hi (48-bit extension) is not zeroed in extent index
+Details : OSTs using the extents format would not zero the high 16 bits of
+ the index physical block number. This is not a problem for any
+ OST filesystems smaller than 16TB, and no kernels support ext3
+ filesystems larger than 16TB yet. This is fixed in 1.4.7 (all
+ new/modified files) and can be fixed for existing filesystems
+ with e2fsprogs-1.39-cfs1.
+
+Severity : minor
+Frequency : rare
+Bugzilla : 9387
+Description: import connection selection may be incorrect if timer wraps
+Details : Using a 32-bit jiffies timer with HZ=1000 may cause backup
+ import connections to be ignored if the 32-bit jiffies counter
+ wraps. Use a 64-bit jiffies counter.
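The wrap described above can be illustrated with a minimal Python sketch. It is not the Lustre code: the helper names (`to_jiffies32`, `is_more_recent_32`, `is_more_recent_64`) are hypothetical, and it only shows how a naive comparison of 32-bit tick counts misorders two events that straddle the wrap, while a 64-bit counter keeps them in order.

```python
HZ = 1000                      # ticks per second, as in the entry above
MASK32 = (1 << 32) - 1         # a 32-bit counter keeps only the low 32 bits
MASK64 = (1 << 64) - 1         # a 64-bit counter keeps the low 64 bits


def to_jiffies32(ticks):
    """Truncate a tick count the way a 32-bit counter would."""
    return ticks & MASK32


def is_more_recent_32(a_ticks, b_ticks):
    """Naive comparison of 32-bit counters; misorders events across a wrap."""
    return to_jiffies32(a_ticks) > to_jiffies32(b_ticks)


def is_more_recent_64(a_ticks, b_ticks):
    """Same comparison on 64-bit counters; no wrap for realistic uptimes."""
    return (a_ticks & MASK64) > (b_ticks & MASK64)


# Whole seconds until a 32-bit counter wraps at HZ=1000 (a bit under 50 days).
wrap_seconds = (1 << 32) // HZ

before_wrap = (1 << 32) - 10   # an event 10 ticks before the wrap
after_wrap = (1 << 32) + 10    # a later event, 10 ticks after the wrap
```

With these values the 32-bit comparison reports the later event as older, which is the kind of misordering that could cause a connection to be skipped.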
+
+Severity : major
+Frequency : during server recovery
+Bugzilla : 10479
+Description: crash after server is denying duplicate export
+Details : If clients are resending connect requests to the server, the
+ server refuses to allow a client to connect multiple times.
+ Fixed a bug in the handling of this case.
+
+Severity : minor
+Frequency : very large clusters immediately after boot
+Bugzilla : 10083
+Description: LNET request buffers exhausted under heavy short-term load
+Details : If a large number of client requests are generated on a service
+ that has previously never seen so many requests it is possible
+ that the request buffer growth cannot keep up with the spike in
+ demand. Instead of dropping incoming requests, they are held in
+ the LND until the RPC service can accept more requests.
+
+Severity : minor
+Frequency : Sometimes during replay
+Bugzilla : 9314
+Description: Assertion failure in ll_local_open after replay.
+Details : If replay happened on an open request reply before we were able
+ to set the replay handler, the reply would not be swabbed, tripping
+ the assertion in ll_local_open. Now we set the handler right after
+ recognising an open request.
+
+Severity : minor
+Frequency : very rare
+Bugzilla : 10584
+Description: kernel reports "badness in vsnprintf"
+Details : Reading from the "recovery_status" /proc file in small chunks
+ may cause a negative length in lprocfs_obd_rd_recovery_status()
+ call to vsnprintf() (which is otherwise harmless). Exit early
+ if there is no more space in the output buffer.
+
+Severity : enhancement
+Bugzilla : 2259
+Description: clear OBD RPC statistics by writing to them
+Details : It is now possible to clear the OBD RPC statistics by writing
+ to the "stats" file.
+
+Severity : minor
+Frequency : rare
+Bugzilla : 10641
+Description: Client mtime is not the same on different clients after utimes
+Details : In some cases, the client was using the utimes() syscall on
+ a file cached on another node. The clients now validate the
+ ctime from the MDS + OSTs to determine which one is right.
+
+Severity : minor
+Frequency : always
+Bugzilla : 10611
+Description: Inability to activate failout mode
+Details : The lconf script incorrectly assumed that Python uses a string's
+ numeric value in comparisons.
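A minimal Python sketch of this class of bug follows. The function names and the option value are hypothetical and are not taken from lconf; the sketch only shows that a string is never equal to an integer in Python, so the option must be converted before it is compared.

```python
def failout_requested_buggy(option):
    """Compare the raw option string with an integer.

    A str never equals an int in Python, so this is False even for "1".
    """
    return option == 1


def failout_requested_fixed(option):
    """Convert the option string to an integer before comparing."""
    return int(option) == 1
```

With the buggy form the mode can never be activated from a string-valued option, which matches the symptom in the description.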
+
+Severity : minor
+Frequency : always with multiple stripes per file
+Bugzilla : 10671
+Description: Inefficient object allocation for multi-stripe files
+Details : When selecting which OSTs to stripe files over, for files with
+ a stripe count that divides evenly into the number of OSTs,
+ the MDS is always picking the same starting OST for each file.
+ Return the OST selection heuristic to the original design.
+
+Severity : minor
+Frequency : rare
+Bugzilla : 10673
+Description: mount failures may take full timeout to return an error
+Details : Under some heavy load conditions it is possible that a
+ failed mount can wait for the full obd_timeout interval,
+ possibly several minutes, before reporting an error.
+ Instead return an error as soon as the status is known.
+
+------------------------------------------------------------------------------
+
+2006-02-14 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.4.6
+ * WIRE PROTOCOL CHANGE. This version of Lustre networking WILL NOT
+ INTEROPERATE with older versions automatically. Please read the
+ user documentation before upgrading any part of a live system.
+ * WARNING: Lustre networking configuration changes are required with
+ this release. See https://bugzilla.clusterfs.com/show_bug.cgi?id=10052
+ for details.
+ * bug fixes
+ * Support for kernels:
+ 2.6.9-22.0.2.EL (RHEL 4)
+ 2.6.5-7.244 (SLES 9)
+ 2.6.12.6 vanilla (kernel.org)
+
+
+Severity : enhancement
+Bugzilla : 7981/8208
+Description: Introduced Lustre Networking (LNET)
+Details : LNET is a new networking infrastructure for Lustre. It includes
+ a reorganized network configuration mode (see the user
+ documentation for full details) as well as support for routing
+ between different network fabrics. Lustre Networking Devices
+ (LNDs) for the supported network fabrics have also been
+ created for this new infrastructure.
+
+Severity : enhancement
+Description: Introduced Access control lists
+Details : clients can set ACLs on files and directories in order to have
+ more fine-grained permissions than the standard Unix UGO+RWX.
+ The MDS must be started with the "-o acl" mount option.
+
+Severity : enhancement
+Description: Introduced filesystem quotas
+Details : Administrators may now establish per-user quotas on the
+ filesystem.
+
+Severity : enhancement
+Bugzilla : 7982
+Description: Configuration change for the XT3
+Details : The PTLLND is now used to run Lustre over Portals on the XT3.
+ The configure option(s) --with-cray-portals are no longer used.
+ Rather --with-portals=<path-to-portals-includes> is used to
+ enable building on the XT3. In addition, to enable XT3-specific
+ features the option --enable-cray-xt3 must be used.
+
+Severity : major
+Frequency : rare
+Bugzilla : 7407
+Description: Running on many-way SMP OSTs can trigger oops in llcd_send()
+Details : A race between allocating a new llcd and re-getting the llcd_lock
+ allowed another thread to grab newly-allocated llcd.
+
+Severity : enhancement
+Bugzilla : 7116
+Description: 2.6 OST async journal commit and locking fix to improve performance
+Details : The filter_direct_io()+filter_commitrw_write() journal commits for
+ 2.6 kernels are now async as they already were in 2.4 kernels so
+ that they can commit concurrently with the network bulk transfer.
+ For block-allocated files the filter allocation semaphore is held
+ to avoid filesystem fragmentation during allocation. BKL lock
+ removed for 2.6 xattr operations where it is no longer needed.
+
+Severity : minor
+Frequency : rare
+Bugzilla : 8320
+Description: lconf incorrectly determined whether two IP networks could talk
+Details : In some more complicated routing and multiple-network
+ configurations, lconf will avoid trying to make a network
+ connection to a disjoint part of the IP space. It was doing the
+ math incorrectly for one set of cases.
+
+Severity : major
+Frequency : rare
+Bugzilla : 7359
+Description: Fix for potential infinite loop processing records in an llog.
+Details : If an llog record is corrupted/zeroed, it is possible to loop
+ forever in llog_process(). Validate the llog record length
+ and skip the remainder of the block on error.
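The validation step can be sketched in Python as follows. The record layout (a 4-byte little-endian length header that counts itself), the block size and the function name are assumptions made for illustration; they are not the llog on-disk format or the llog_process() implementation.

```python
import struct

HDR = struct.Struct("<I")   # hypothetical header: 4-byte record length
CHUNK = 64                  # hypothetical block size in bytes


def process_block(block):
    """Return the payloads of valid records, stopping at the first bad length."""
    records = []
    offset = 0
    while offset + HDR.size <= len(block):
        (length,) = HDR.unpack_from(block, offset)
        # A zeroed length would never advance the offset (an endless loop),
        # and an oversized one would run past the end of the block.
        if length < HDR.size or offset + length > len(block):
            break           # skip the remainder of this block
        records.append(block[offset + HDR.size:offset + length])
        offset += length
    return records
```

Without the length check, a zeroed record leaves `offset` unchanged on every pass, which is the infinite loop the entry describes.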
+
+Severity : minor
+Frequency : occasional (liblustre only)
+Bugzilla : 6363
+Description: liblustre could not open files whose last component is a symlink
+Details : sysio_path_walk() would incorrectly pass the open intent to
+ intermediate path components.
+
+Severity : minor
+Frequency : rare (liblustre only with non-standard tuning)
+Bugzilla : 7201 (7350)
+Description: Tuning the MDC DLM LRU size to zero triggers client LASSERT
+Details : llu_lookup_finish_locks() tries to set lock data on a lock
+ after it has been released, only do this for referenced locks
+
+Severity : enhancement
+Bugzilla : 7328
+Description: specifying an (invalid) directory default stripe_size of -1
+ would reset the directory default striping
+Details : stripe_size -1 was used internally to signal directory stripe
+ removal, now use "all default" to signal dir stripe removal
+ as a directory striping of "all default" is not useful
+
+Severity : minor
+Frequency : common for large clusters running liblustre clients
+Bugzilla : 7198
+Description: doing an ls when liblustre clients are running is slow
+Details : sending a glimpse AST to a liblustre client waits for every AST
+ to time out, as liblustre clients will not respond. Since they
+ cannot cache data we refresh the OST lock LVB from disk instead.
+
+Severity : enhancement
+Bugzilla : 7198
+Description: doing an ls at the same time as file IO can be slow
+Details : enqueue and other "small" requests can be blocked behind many
+ large IO requests. Create a new OST IO portal for non-IO
+ requests so they can be processed faster.
+
+Severity : minor
+Frequency : rare (only HPUX clients mounting unsupported re-exported NFS vol)
+Bugzilla : 5781
+Description: an HPUX NFS client would get -EACCES when ftruncate()ing a newly
+ created file with mode 000
+Details : the Linux NFS server relies on an MDS_OPEN_OWNEROVERRIDE hack to
+ allow an ftruncate() as a non-root user to a file with mode 000.
+ Lustre now respects this flag to disable mode checks when
+ truncating a file owned by the user
+
+Severity : minor
+Frequency : liblustre-only, when liblustre client dies unexpectedly or becomes
+ busy
+Bugzilla : 7313
+Description: Revoking locks from clients that went dead or catatonic might take
+ a lot of time.
+Details : A new lock flag, FL_CANCEL_ON_BLOCK, used by liblustre makes
+ cancellation of such locks instant on servers without waiting for
+ any reply from clients. Clients drop these locks without replying
+ when a cancel notification from the server is received.
+
+Severity : minor
+Frequency : liblustre-only, when liblustre client dies or becomes busy
+Bugzilla : 7311
+Description: Doing ls on Linux clients can take a long time with active
+ liblustre clients
+Details : Liblustre clients cannot handle ASTs in a timely manner, so avoid
+ granting such locks to them in the first place if possible. Locks
+ are taken by proxy on the OST during the read or write and
+ dropped immediately afterward. Add connect flags handling, do
+ not grant locks to liblustre clients for glimpse ASTs.
+
+Severity : enhancement
+Bugzilla : 6252
+Description: Improve read-ahead algorithm to avoid excessive IO for random reads
+Details : The existing read-ahead algorithm is tuned for the case of
+ streamlined sequential reads and behaves badly with applications
+ doing random reads. Improve it by reading ahead at least the read
+ region, and avoiding excessively large RPCs for small reads.
+
+Severity : enhancement
+Bugzilla : 8330
+Description: Creating more than 1000 files for a single job may cause a load
+ imbalance on the OSTs if there are also a large number of OSTs.
+Details : qos_prep_create() uses an OST index reseed value that is an
+ even multiple of the number of available OSTs so that if the
+ reseed happens in the middle of the object allocation it will
+ still utilize the OSTs as uniformly as possible.
+
+Severity : major
+Frequency : rare
+Bugzilla : 8322
+Description: OST or MDS may oops in ping_evictor_main()
+Details : ping_evictor_main() drops obd_dev_lock if deleting a stale export
+ but doesn't restart at beginning of obd_exports_timed list
+ afterward.
+
+Severity : enhancement
+Bugzilla : 7304
+Description: improve by-nid export eviction on the MDS and OST
+Details : allow multiple exports with the same NID to be evicted at one
+ time without re-searching the exports list.
+
+Severity : major
+Frequency : rare, only with supplementary groups enabled on SMP 2.6 kernels
+Bugzilla : 7273
+Description: MDS may oops in groups_free()
+Details : in rare race conditions a newly allocated group_info struct is
+ freed again, and this can be NULL. The 2.4 compatibility code
+ for groups_free() checked for a NULL pointer, but 2.6 did not.
+
+Severity : minor
+Frequency : common for liblustre clients doing little filesystem IO
+Bugzilla : 9352, 7313
+Description: server may evict liblustre clients accessing contended locks
+Details : if a client is granted a lock or receives a completion AST
+ with a blocking AST already set it would not reply to the AST
+ for LDLM_FL_CANCEL_ON_BLOCK locks. It now replies to such ASTs.
+
+Severity : minor
+Frequency : lfs setstripe, only systems with more than 160 OSTs
+Bugzilla : 9440
+Description: unable to set striping with a starting offset beyond OST 160
+Details : llapi_create_file() incorrectly limited the starting stripe
+ index to the maximum single-file stripe count.
+
+Severity : minor
+Frequency : LDAP users only
+Bugzilla : 6163
+Description: lconf did not handle in-kernel recovery with LDAP properly
+Details : lconf/LustreDB get_refs() is searching the wrong namespace
+
+Severity : enhancement
+Bugzilla : 7342
+Description: bind OST threads to NUMA nodes to improve performance
+Details : all OST threads are uniformly bound to CPUs on a single NUMA
+ node and do their allocations there to localize memory access
+
+Severity : enhancement
+Bugzilla : 7979
+Description: llmount can determine client NID directly from Myrinet (GM)
+Details : the client NID code from gmnalnid was moved directly into
+ llmount, removing the need to use this or specifying the
+ client NID explicitly when mounting GM clients with zeroconf
+
+Severity : minor
+Frequency : if client is started with down MDS
+Bugzilla : 7184
+Description: if client is started with down MDS mount hangs in ptlrpc_queue_wait
+Details : Having an LWI_INTR() wait event (interruptible, but no timeout)
+ will wait indefinitely in ptlrpc_queue_wait->l_wait_event() after
+ ptlrpc_import_delayed_req() because we didn't check if the
+ request was interrupted, and we also didn't break out of the
+ event loop if there was no timeout
+
+Severity : major
+Frequency : rare
+Bugzilla : 5047
+Description: data loss during non-page-aligned writes to a single file from
+ both multiple nodes and multiple threads on one node at same time
+Details : updates to KMS and lsm weren't protected by a common lock. The
+ resulting inconsistency led to false short reads that were cached
+ and later used by ->prepare_write() to fill in a partially written
+ page, leading to data loss.
+
+Severity : minor
+Frequency : always, if lconf --abort_recovery used
+Bugzilla : 7047
+Description: lconf --abort_recovery fails with 'Operation not supported'
+Details : lconf was attempting to abort recovery on the MDT device and not
+ the MDS device
+
+Severity : enhancement
+Bugzilla : 9445
+Description: remove cleanup logs
+Details : replace lconf-generated cleanup logs with lustre internal
+ cleanup routines. Eliminates the need for client-cleanup and
+ mds-cleanup logs.
+
+Severity : enhancement
+Bugzilla : 8592
+Description: add support for EAs (user and system) on lustre filesystems
+Details : it is now possible to store extended attributes in the Lustre
+ client filesystem, and with the user_xattr mount option it
+ is possible to allow users to store EAs on their files also
+
+Severity : enhancement
+Bugzilla : 7293
+Description: Add possibility (config option) to show minimal available OST free
+ space.
+Details : When compiled with the --enable-mindf configure option, statfs(2)
+ (and so, df) will return the smallest free space available on any
+ OST as the amount of free space on the FS, instead of the sum of
+ the free space on all OSTs.
+
+Severity : enhancement
+Bugzilla : 7311
+Description: do not expand extent locks acquired on OST-side
+Details : Modify ldlm_extent_policy() to not expand local locks, acquired
+ by server: they are not cached anyway.
+
+Severity : major
+Frequency : when mmap is used/binaries executed from Lustre
+Bugzilla : 9482
+Description: Unmap pages before throwing them away from read cache.
+Details : llap_shrink cache now attempts to unmap pages before discarding
+ them (if unmapping failed - do not discard). SLES9 kernel has
+ extra checks that trigger if this unmapping is not done first.
+
+Severity : minor
+Frequency : rare
+Bugzilla : 6034
+Description: lconf didn't resolve symlinks before checking to see whether a
+ given mountpoint was already in use
+
+Severity : minor
+Frequency : when migrating failover services
+Bugzilla : 6395, 9514
+Description: When migrating a subset of services from a node (e.g. failback
+ from a failover service node) the remaining services would
+ time out and evict clients.
+Details : lconf --force (implied by --failover) sets the global obd_timeout
+ to 5 seconds in order to quickly disconnect, but this caused
+ other RPCs to time out too quickly. Do not change the global
+ obd_timeout for force cleanup, only set it for DISCONNECT RPCs.
+
+Severity : enhancement
+Frequency : if MDS is started with down OST
+Bugzilla : 9439,5706
+Description: Allow startup/shutdown of an MDS without depending on the
+ availability of the OSTs.
+Details : Asynchronously call mds_lov_synchronize during MDS startup.
+ Add appropriate locking and lov-osc refcounts for safe
+ cleaning. Add osc abort_inflight calls in case the
+ synchronize never started.
+
+Severity : minor
+Frequency : occasional (Cray XT3 only)
+Bugzilla : 7305
+Description: root not authorized to access files in CRAY_PORTALS environment
+Details : The client process capabilities were not honoured on the MDS in
+ a CRAY_PORTALS/CRAY_XT3 environment. If the file had previously
+ been accessed by an authorized user then root was able to access
+ the file on the local client also. The root user capabilities
+ are now allowed on the MDS, as this environment has secure UID.
+
+Severity : minor
+Frequency : occasional
+Bugzilla : 6449
+Description: ldiskfs "too long searching" message happens too often
+Details : A debugging message (otherwise harmless) prints too often on
+ the OST console. This has been reduced to only happen when
+ there are fragmentation problems on the filesystem.
+
+Severity : minor
+Frequency : rare
+Bugzilla : 9598
+Description: Division by zero in statfs when all OSCs are inactive
+Details : lov_get_stripecnt() returns zero due to incorrect order of checks,
+ lov_statfs divides by value returned by lov_get_stripecnt().
+
+Severity : minor
+Frequency : common
+Bugzilla : 9489, 3273
+Description: First write from each client to each OST was only 4kB in size,
+ to initialize client writeback cache, which caused sub-optimal
+ RPCs and poor layout on disk for the first written file.
+Details : Clients now request an initial cache grant at (re)connect time
+ so that they can start streaming writes to the cache right
+ away and always do full-sized RPCs if there is enough data.
+ If the OST is rebooted the client also re-establishes its grant
+ so that client cached writes will be honoured under the grant.
+
+Severity : minor
+Frequency : common
+Bugzilla : 7198
+Description: Slow ls (and stat(2) syscall) on files residing on IO-loaded OSTs
+Details : I/O RPCs now go to a different portal number, and (presumably)
+ fast lock requests (and glimpses) and other RPCs get their own
+ service thread pool that should be able to service those RPCs
+ immediately.
+
+Severity : enhancement
+Bugzilla : 7417
+Description: Ability to exchange lustre version between client and servers and
+ issue warnings at client side if client is too old. Also for
+ liblustre clients there is ability to refuse connection of too old
+ clients.
+Details : New 'version' field is added to connect data structure that is
+ filled with version info. That info is later checked by server and
+ by client.
+
+Severity : minor
+Frequency : rare, liblustre only.
+Bugzilla : 9296, 9581
+Description: Two simultaneous writes from liblustre at offset within same page
+ might proceed at the same time overwriting each other with stale
+ data.
+Details : I/O lock within llu_file_prwv was released too early, before data
+ actually was hitting the wire. Extended lock-holding time until
+ server acknowledges receiving data.
+
+Severity : minor
+Frequency : extremely rare. Never observed in practice.
+Bugzilla : 9652
+Description: avoid generating lustre_handle cookie of 0.
+Details : class_handle_hash() generates handle cookies by incrementing a
+ global counter, and can hit 0 occasionally (this is unlikely, but
+ not impossible, because the initial value of the cookie counter is
+ selected randomly). Value of 0 is used as a sentinel meaning
+ "unassigned handle" --- avoid it. Also coalesce two critical
+ sections in this function into one.
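A minimal Python sketch of the idea, assuming a 64-bit cookie incremented by one: the class and method names are hypothetical, and the real class_handle_hash() may use a different increment. The counter update and the zero check share a single critical section.

```python
import random
import threading

COOKIE_MASK = (1 << 64) - 1    # cookies are modelled as 64-bit values


class HandleAllocator:
    """Hands out non-zero cookies from a randomly seeded counter."""

    def __init__(self, start=None):
        self._lock = threading.Lock()
        if start is None:
            start = random.getrandbits(64)   # random initial value
        self._next = start & COOKIE_MASK

    def new_cookie(self):
        with self._lock:                     # one critical section
            self._next = (self._next + 1) & COOKIE_MASK
            if self._next == 0:              # 0 means "unassigned handle"
                self._next = 1               # so skip over it
            return self._next
```

Seeding the counter just below the wrap point shows the skip: the cookie after the maximum value is 1, never 0.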
+
+Severity : enhancement
+Bugzilla : 9528
+Description: allow liblustre clients to delegate truncate locking to OST
+Details : To avoid overhead of locking, liblustre client instructs OST to
+ take extent lock in ost_punch() on client's behalf. New connection
+ flag is added to handle backward compatibility.
+
+Severity : enhancement
+Bugzilla : 4928, 7341, 9758
+Description: allow number of OST service threads to be specified
+Details : a module parameter allows the number of OST service threads
+ to be specified via "options ost ost_num_threads={N}" in the
+ OSS's /etc/modules.conf or /etc/modprobe.conf.
+
+Severity : major
+Frequency : rare
+Bugzilla : 6146, 9635, 9895
+Description: servers crash with bad pointer in target_handle_connect()
+Details : In rare cases when a client is reconnecting it was possible that
+ the connection request was the last reference for that export.
+ We would temporarily drop the export reference and get a new
+ one, but this may have been the last reference and the export
+ was just destroyed. Get new reference before dropping old one.
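The ordering fix can be sketched with a toy reference count in Python. The `Export` class and both helper functions are hypothetical stand-ins, not the target_handle_connect() code; they only show why taking the new reference before dropping the old one is safe.

```python
class Export:
    """Toy export whose only reference is held by the connect request."""

    def __init__(self):
        self.refcount = 1
        self.destroyed = False

    def get(self):
        if self.destroyed:
            raise RuntimeError("export used after destroy")
        self.refcount += 1
        return self

    def put(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.destroyed = True    # last reference gone


def reconnect_buggy(export):
    export.put()                     # may drop the last reference
    return export.get()              # the export may already be destroyed


def reconnect_fixed(export):
    new_ref = export.get()           # take the new reference first
    export.put()                     # the old one cannot be the last now
    return new_ref
```

In the buggy order the count briefly reaches zero, so the export is destroyed before the new reference is taken.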
+
+Severity : enhancement
+Frequency : if client is started with failover MDS
+Bugzilla : 9818
+Description: Allow multiple MDS hostnames in the mount command
+Details : Try to read the configuration from all specified MDS
+ hostnames during a client mount in case the "primary"
+ MDS is down.
+
+Severity : enhancement
+Bugzilla : 9297
+Description: Stop sending data to evicted clients as soon as possible.
+Details : Check if the client we are about to send or are sending data to
+ was evicted already. (Check is done every second of waiting,
+ for which l_wait_event interface was extended to allow checking
+ of exit condition at specified intervals).
+
+Severity : minor
+Frequency : rare, normally only when NFS exporting is done from client
+Bugzilla : 9301
+Description: 'bad disk LOV MAGIC: 0x00000000' error when chown'ing files
+ without objects
+Details : Make mds_get_md() recognise empty md case and set lmm size to 0.
+
+Severity : minor
+Frequency : always, if srand() is called before liblustre initialization
+Bugzilla : 9794
+Description: Liblustre uses system PRNG disturbing its usage by user application
+Details : Introduce a fast, high-quality PRNG internal to lustre and make
+ liblustre and some other places in generic lustre code use it.
+
+Severity : enhancement
+Bugzilla : 9477, 9557, 9870
+Description: Verify that the MDS configuration logs are updated when xml is
+Details : Check if the .xml configuration logs are newer than the config
+ logs stored on the MDS and report an error if this is the case.
+ Request --write-conf, or allow starting with --old_conf.
+
+Severity : enhancement
+Bugzilla : 6034
+Description: Handle symlinks in the path when checking if Lustre is mounted.
+Details : Resolve intermediate symlinks when checking if a client has
+ mounted a filesystem to avoid duplicate client mounts.
+
+Severity : minor
+Frequency : rare
+Bugzilla : 9309
+Description: lconf can hit an error exception but still return success.
+Details : The lconf command catches the Command error exception at the top
+ level script context and will exit with the associated exit
+ status, but doesn't ensure that this exit status is non-zero.
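A minimal Python sketch of the problem and one possible guard follows. `CommandError` here is a stand-in class and `exit_status` is a hypothetical helper; neither is copied from lconf. The point is that an exception carrying a status of 0 must still map to a non-zero exit code.

```python
class CommandError(Exception):
    """Stand-in for a command failure that carries an exit status."""

    def __init__(self, message, rc=0):
        super().__init__(message)
        self.rc = rc


def exit_status(main):
    """Run main() and return the process exit status to report."""
    try:
        main()
    except CommandError as err:
        # An error must never be reported as success, even if rc is 0.
        return err.rc if err.rc else 1
    return 0
```

A caller would pass the result to `sys.exit()`, so a failed run is visible to the invoking script.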
+
+Severity : minor
+Frequency : rare
+Bugzilla : 9493
+Description: failure of ptlrpc thread startup can cause oops
+Details : Starting a ptlrpc service thread can fail if there are a large
+ number of threads or the server memory is very fragmented.
+ Handle this without oopsing.
+
+Severity : minor
+Frequency : always, only if liblustre and non-default acceptor port was used
+Bugzilla : 9933
+Description: liblustre cannot connect to servers with non-default acceptor port
+Details : tcpnal_set_default_params() was not called and was therefore
+ ignoring the environment variable TCPNAL_PORT, as well as other
+ TCPNAL_ environment variables
+
+Severity : minor
+Frequency : rare
+Bugzilla : 9923
+Description: two objects could be created on the same OST for a single file
+Details : If an OST is down, in some cases it was possible to create two
+ objects on a single OST for a single file. No problems other
+ than potential performance impact and spurious error messages.
+
+Severity : minor
+Frequency : rare
+Bugzilla : 5681, 9562
+Description: Client may oops in ll_unhash_aliases
+Details : Client dcache may become inconsistent in race condition.
+ In some cases "getcwd" can fail if the current directory is
+ modified.
+
+Severity : minor
+Frequency : always
+Bugzilla : 9942
+Description: Inode refcounting problems in NFS export code
+Details : link_raw functions used to call d_instantiate without obtaining
+ extra inode reference first.
+
+Severity : minor
+Frequency : rare
+Bugzilla : 9942, 9903
+Description: Referencing freed requests leading to crash, memleaks with NFS.
+Details : We used to require that call to ll_revalidate_it was always
+ followed by ll_lookup_it. Also with revalidate_special() it is
+ possible to call ll_revalidate_it() twice for the same dentry
+ even if first occurrence returned success. This fix changes the
+ semantics of the DISP_ENQ_COMPLETE disposition flag to mean there is
+ an extra reference on a request referred from the intent.
+ ll_intent_release() then releases such a request.
+
+Severity : minor
+Frequency : rare, normally benchmark loads only
+Bugzilla : 1443
+Description: unlinked inodes were kept in memory on the client
+Details : If a client is repeatedly creating and unlinking files it
+ can accumulate a lot of stale inodes in the inode slab cache.
+ If there is no other client load running this can cause the
+ client node to run out of memory. Instead flush old inodes
+ from client cache that have the same inode number as a new inode.
+
+Severity : minor
+Frequency : SLES9 2.6.5 kernel and long filenames only
+Bugzilla : 9969, 10379
+Description: utime reports stale NFS file handle
+Details : SLES9 uses out-of-dentry names in some cases, which confused
+ the lustre dentry revalidation. Change it to always use the
+ in-dentry qstr.
+
+Severity : major
+Frequency : rare, unless heavy write-truncate concurrency is continuous
+Bugzilla : 4180, 6984, 7171, 9963, 9331
+Description: OST becomes very slow and/or deadlocked during object unlink
+Details : filter_destroy() was holding onto the parent directory lock
+ while truncating+unlinking objects. For very large objects this
+ may block other threads for a long time and slow overall OST
+ responsiveness. It may also be possible to get a lock ordering
+ deadlock in this case, or run out of journal credits because of
+ the combined truncate+unlink. Solution is to do object truncate
+ first in one transaction without parent lock, and then do the
+ final unlink in a new transaction with the parent lock. This
+ reduces the lock hold time dramatically.
+
+Severity : major
+Frequency : rare, 2.4 kernels only
+Bugzilla : 9967
+Description: MDS or OST cleanup may trip kernel BUG when dropping kernel lock
+Details : mds_cleanup() and filter_cleanup() need to drop the kernel lock
+ before unmounting their filesystem in order to avoid deadlock.
+ The kernel_locked() function in 2.4 kernels only checks whether
+ the kernel lock is held, not whether it is this process that is
+ holding it as 2.6 kernels do.
+
+Severity : major
+Frequency : rare
+Bugzilla : 9635
+Description: MDS or OST may oops/LBUG if a client is connecting multiple times
+Details : The client ptlrpc code may be trying to reconnect to a down
+ server before a previous connection attempt has timed out.
+ Increase the reconnect interval to be longer than the connection
+ timeout interval to avoid sending duplicate connections to
+ servers.
+
+Severity : minor
+Frequency : echo_client brw_test command
+Bugzilla : 9919
+Description: fix echo_client to work with OST preallocated code
+Details : OST preallocation code (5137) didn't take echo_client IO path
+ into account: echo_client calls filter methods outside of any
+ OST thread and, hence, there is no per-thread preallocated
+ pages and buffers to use. Solution: hijack pga pages for IO. As
+ a byproduct, this avoids unnecessary data copying.
+
+Severity : minor
+Frequency : rare
+Bugzilla : 3555, 5962, 6025, 6155, 6296, 9574
+Description: Client can oops in mdc_commit_close() after open replay
+Details : It was possible for the MDS to return an open request with no
+ transaction number in mds_finish_transno() if the client was
+ evicted, but without actually returning an error. Clients
+ would later try to replay that open and may trip an assertion.
+ Simplify the client close codepath, and always return an error
+ from the MDS in case the open is not successful.
+
+Severity : major
+Frequency : rare, 2.6 OSTs only
+Bugzilla : 10076
+Description: OST may deadlock under high load on fragmented files
+Details : If there was a heavy load and highly-fragmented OST filesystems
+ it was possible to have all the OST threads deadlock waiting on
+ allocation of biovecs, because the biovecs were not released
+ until the entire RPC IO was completed. Instead, release biovecs
+ as soon as they are complete to ensure forward IO progress.
+
+Severity : enhancement
+Bugzilla : 9578
+Description: Support for specifying external journal device at mount
+Details : If an OST or MDS device is formatted with an external journal
+ device, this device major/minor is stored in the ext3 superblock
+ and may not be valid for failover. Allow detecting and
+ specifying the external journal at mount time.
+
+Severity : major
+Frequency : rare
+Bugzilla : 10235
+Description: Mounting an MDS with pending unlinked files may cause oops
+Details : target_finish_recovery() calls mds_postrecov() which returned
+ the number of orphans unlinked. mds_lov_connect->mds_postsetup()
+ considers this an error and immediately begins cleaning up the
+ lov, just after starting the mds_lov process.
+
+Severity : enhancement
+Bugzilla : 9461
+Description: Implement 'lfs df' to report actual free space on per-OST basis
+Details : Add sub-command 'df' on 'lfs' to report the disk space usage of
+ MDS/OSDs. Usage: lfs df [-i][-h]. Command Options: '-i' to report
+ usage of objects; '-h' to report in human readable format.
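A minimal Python sketch of how a monitoring script might invoke the new
sub-command. Only the usage string 'lfs df [-i][-h]' comes from the entry
above; the helper names build_lfs_df_argv and run_lfs_df are hypothetical,
and the output format of 'lfs df' is not assumed or parsed here.

```python
import subprocess


def build_lfs_df_argv(inodes=False, human_readable=False):
    """Build the argument vector for 'lfs df [-i][-h]'."""
    argv = ["lfs", "df"]
    if inodes:
        argv.append("-i")  # report usage of objects instead of space
    if human_readable:
        argv.append("-h")  # report in human readable format
    return argv


def run_lfs_df(inodes=False, human_readable=False):
    """Run 'lfs df' and return its raw text output.

    Requires the lfs utility on a node with Lustre mounted; raises
    subprocess.CalledProcessError if the command exits with an error.
    """
    result = subprocess.run(
        build_lfs_df_argv(inodes, human_readable),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Building the argument vector separately keeps the option handling checkable
on machines that have no Lustre installation.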
+
+------------------------------------------------------------------------------
+
+2005-08-26 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.4.5
+ * bug fixes
+
+Severity : major
+Frequency : rare
+Bugzilla : 7264
+Description: Mounting an ldiskfs file system with mballoc may crash OST node.
+Details : ldiskfs mballoc code may reference an uninitialized buddy struct
+ at startup during orphan unlinking. Instead, skip buddy update
+ before setup, as it will be regenerated after recovery is complete.
+
+Severity : minor
+Frequency : rare
+Bugzilla : 7039
+Description: If an OST is inactive, its locks might reference stale inodes.
+Details : lov_change_cbdata() must iterate over all namespaces, even if
+ they are inactive to clear inode references from the lock.
+
+Severity : enhancement
+Frequency : occasional, if non-standard max_dirty_mb used
+Bugzilla : 7138
+Description: Client will block write RPCs if not enough grant
+Details : If a client has max_dirty_mb smaller than max_rpcs_in_flight,
+ then the client will block writes while waiting for another RPC
+ to complete instead of consuming its dirty limit. With this
+ change we get improved performance when max_dirty_mb is small.
+
+Severity : enhancement
+Bugzilla : 3389, 6253
+Description: Add support for supplementary groups on the MDS.
+Details : The MDS has an upcall /proc/fs/lustre/mds/{mds}/group_upcall
+ (set to /usr/sbin/l_getgroups if enabled) which will do MDS-side
+ lookups for user supplementary groups into a cache.
+
+Severity : minor
+Bugzilla : 7278
+Description: O_CREAT|O_EXCL open flags in liblustre always return -EEXIST
+Details : Make libsysio not enforce O_EXCL by clearing the flag; for
+ liblustre, O_EXCL is enforced by the MDS.
+
+Severity : minor
+Bugzilla : 6455
+Description: readdir never returns NULL in liblustre.
+Details : Corrected llu_iop_getdirentries logic, to return offset of next
+ dentry in struct dirent.
+
+Severity : minor
+Bugzilla : 7137
+Frequency : liblustre only, depends on application IO pattern
+Description: liblustre clients evicted if not contacting servers
+Details : Don't put liblustre clients into the ping_evictor list, so
+ they will not be evicted by the pinger ever.
+
+Severity : enhancement
+Bugzilla : 6902
+Description: Add ability to evict clients by NID from MDS.
+Details : Echoing the string "nid:$NID" into
+ /proc/fs/lustre/mds/.../evict_client instantly evicts the client
+ whose NID equals $NID from this MDS and from all active OSTs
+ connected to it.
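A minimal Python sketch of the write described above, for use from an
administrative script. The function name evict_client_by_nid is
hypothetical. The elided part of the proc path is deliberately not filled
in: the caller passes the full path to the evict_client file for its own
MDS.

```python
def evict_client_by_nid(evict_client_path, nid):
    """Write 'nid:<NID>' to the given evict_client proc file.

    evict_client_path: full path of the evict_client entry for one MDS
                       (supplied by the caller, not guessed here).
    nid:               NID of the client to evict, as a string.
    """
    with open(evict_client_path, "w") as proc_file:
        proc_file.write("nid:%s" % nid)
```

The call needs sufficient privileges to write to the proc entry, and any
OSError raised by the write is left for the caller to handle.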
+
+Severity : minor
+Bugzilla : 7198
+Description: Do not query file size twice, somewhat slowing stat(2) calls.
+Details : lookup_it_finish() used to query the file size from OSTs even
+ when it was not needed.
+
+Severity : minor
+Bugzilla : 6237
+Description: service threads change working directory to that of init
+Details : Starting lustre service threads may pin the working directory
+ of the parent thread, making that filesystem busy. Threads
+ now change to the working directory of init to avoid this.
+
+Severity : minor
+Bugzilla : 6827
+Frequency : during shutdown only
+Description: shutdown with a failed MDS or OST can cause unmount to hang
+Details : Don't resend DISCONNECT messages in ptlrpc_disconnect_import()
+ if server is down.
+
+Severity : minor
+Bugzilla : 7331
+Frequency : 2.6 only
+Description: chmod/chown may include an extra supplementary group
+Details : ll{,u}_mdc_pack_op_data() does not properly initialize the
+ supplementary group and if none is specified this is used.
+
+Severity : minor
+Bugzilla : 5479 (6816)
+Frequency : rare
+Description: Racing open + rm can assert client in mdc_set_open_replay_data()
+Details : If lookup is in progress on a file that is unlinked we might try
+ to revalidate the inode and fail in revalidate after lookup is
+ complete and ll_file_open() enqueues the open again but
+ it_open_error() was not checking DISP_OPEN_OPEN errors correctly.
+
+Severity : minor
+Frequency : always, if lconf --abort_recovery used
+Bugzilla : 7047
+Description: lconf --abort_recovery fails with 'Operation not supported'
+Details : lconf was attempting to abort recovery on the MDT device and not
+ the MDS device
+
+------------------------------------------------------------------------------
+
+2005-08-08 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.4.4
+ * bug fixes
+
+Severity : major
+Frequency : rare (only unsupported configurations with a node running as an
+ OST and a client)
+Bugzilla : 6514, 5137
+Description: Mounting a Lustre file system on a node running as an OST could
+ lead to deadlocks
+Details : OSTs now preallocate the memory needed to write out data at
+ startup, instead of when needed, to avoid having to
+ allocate memory in possibly low memory situations.
+ Specifically, if the file system is mounted on an OST,
+ memory pressure could force it to try to write out data,
+ which it needed to allocate memory to do. Due to the low
+ memory, it would be unable to do so and the node would
+ become unresponsive.
+
+Severity : enhancement
+Bugzilla : 7015
+Description: Addition of lconf --service command line option
+Details : lconf now accepts a '--service <arg>' option, which is
+ shorthand for 'lconf --group <arg> --select <arg>=<hostname>'
+
+Severity : enhancement
+Bugzilla : 6101
+Description: Failover mode is now the default for OSTs.
+Details : By default, OSTs will now run in failover mode. To return to
+ the old behaviour, add '--failout' to the lmc line for OSTs.
+
+Severity : enhancement
+Bugzilla : 1693
+Description: Health checks are now provided for MDS and OSTs
+Details : Additional detailed health check information on MDS and OSTs
+ is now provided through the procfs health_check value.
+
+Severity : minor
+Frequency : occasional, depends on IO load
+Bugzilla : 4466
+Description: Disk fragmentation on the OSTs could eventually cause slowdowns
+ after numerous create/delete cycles
+Details : The ext3 inode allocation policy would not allocate new inodes
+ very well on the OSTs because there are no new directories
+ being created. Instead we look for groups with free space if
+ the parent directories are nearly full.
+
+Severity : major
+Bugzilla : 6302
+Frequency : rare
+Description: Network or server problems during mount may cause partially
+ mounted clients instead of returning an error.
+Details : The config llog parsing code may overwrite the error return
+ code during mount error handling, returning success instead
+ of an error.
+
+Severity : minor
+Bugzilla : 6422
+Frequency : rare
+Description: MDS can fail to allocate large reply buffers
+Details : After long uptimes the MDS can fail to allocate large reply
+ buffers (e.g. zconf client mount config records) due to memory
+ fragmentation or consumption by the buffer cache. Preallocate
+ some large reply buffers so that these replies can be sent even
+ under memory pressure.
+
+Severity : minor
+Bugzilla : 6266
+Frequency : rare (liblustre)
+Description: fsx running with liblustre complained that using truncate() to
+ extend the file doesn't work. This patch corrects that issue.
+Details : This is the liblustre equivalent of the fix for bug 6196. Fixes
+ ATTR_SIZE and lsm use in llu_setattr_raw.
+
+Severity : critical
+Bugzilla : 6866
+Frequency : rare, only 2.6 kernels
+Description: Unusual file access patterns on the MDS may result in inode
+ data being lost in very rare circumstances.
+Details : Bad interaction between the ea-in-inode patch and the "no-read"
+ code in the 2.6 kernel caused the inode and/or EA data not to
+ be read from disk, causing single-file corruption.
+
+Severity : critical
+Bugzilla : 6998
+Frequency : rare, only 2.6 filesystems using extents
+Description: Heavy concurrent write and delete load may cause data corruption.
+Details : It was possible under high-load situations to have an extent
+ metadata block in the block device cache from a just-unlinked
+ file overwrite a newly-allocated data block. We now unmap any
+ metadata buffers that alias just-allocated data blocks.
+
+Severity : minor
+Bugzilla : 7241
+Frequency : filesystems with default stripe_count larger than 77
+Description: lconf+mke2fs fail when formatting filesystem with > 77 stripes
+Details : lconf specifies an inode size of 4096 bytes when the default
+ stripe_count is larger than 77. This conflicts with the default
+ inode density of 1 per 4096 bytes. Allocate smaller inodes in
+ this case to avoid pinning too much memory for large EAs.
+
+------------------------------------------------------------------------------
+
+2005-07-07 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.4.3
+ * bug fixes
+
+Severity : minor
+Frequency : rare (extremely heavy IO load with hundreds of clients)
+Bugzilla : 6172
+Description: Client is evicted, gets IO error writing to file
+Details : lock ordering changes for bug 5492 reintroduced bug 3267 and
+ caused clients to be evicted for AST timeouts. The fixes in
+ bug 5192 mean we no longer need to have such short AST timeouts
+ so ldlm_timeout has been increased.
+
+Severity : major
+Frequency : occasional during --force or --failover shutdown under load
+Bugzilla : 5949, 4834
+Description: Server oops/LBUG if stopped with --force or --failover under load
+Details : a collection of import/export refcount and cleanup ordering
+ issues fixed for safer force cleanup
+
+Severity : major
+Frequency : only filesystems larger than 120 OSTs
+Bugzilla : 5990, 6223
+Description: lfs getstripe would oops on a very large filesystem
+Details : lov_getconfig used kfree on vmalloc'd memory
+
+Severity : minor
+Frequency : only filesystems exporting via NFS to Solaris 10 clients
+Bugzilla : 6242, 6243
+Description: reading from files that had been truncated to a non-zero size
+ but never opened returned no data
+Details : ll_file_read() reads zeros from no-object files to EOF
+
+Severity : major
+Frequency : rare
+Bugzilla : 6200
+Description: A bug in MDS/OSS recovery could cause the OSS to fail an assertion
+Details : There's little harm in aborting MDS/OSS recovery and letting it
+ try again, so I removed the LASSERT and return an error instead.
+
+Severity : enhancement
+Bugzilla : 5902
+Description: New debugging infrastructure for tracking down data corruption
+Details : The I/O checksum code was replaced to: (a) control it at runtime,
+ (b) cover more of the client-side code path, and (c) try to narrow
+ down where problems occurred
+
+Severity : major
+Frequency : rare
+Bugzilla : 3819, 4364, 4397, 6313
+Description: Racing close and eviction on the MDS could cause assertion in mds_close
+Details : It was possible to get multiple mfd references during close and
+ client eviction, leading to one thread referencing a freed mfd.
+
+Severity : enhancement
+Bugzilla : 3262, 6359
+Description: Attempts to reconnect to servers are now more aggressive.
+Details : This builds on the enhanced upcall-less recovery that was added
+ in 1.4.2. When trying to reconnect to servers, clients will
+ now try each server in the failover group every 10 seconds. By
+ default, clients would previously try one server every 25 seconds.
+
+Severity : major
+Frequency : rare
+Bugzilla : 6371
+Description: After recovery, certain operations trigger a failed
+ assertion on a client.
+Details : Failing over an mds, using lconf -d --failover, while a
+ client was doing a readdir() call would cause the client to
+ LBUG after recovery completed and the readdir() was resent.
+
+Severity : enhancement
+Bugzilla : 6296
+Description: Default groups are now added by lconf
+Details : You can now run lconf --group <servicename> without having to
+ manually add groups with lmc.
+
+Severity : major
+Frequency : occasional
+Bugzilla : 6412
+Description: Nodes with an elan id of 0 trigger a failed assertion
+
+Severity : minor
+Frequency : always when accessing e.g. tty/console device nodes
+Bugzilla : 3790
+Description: tty and some other device nodes cannot be used on lustre
+Details : file's private_data field is used by device data and lustre
+ values in there got lost. New field was added to struct file to
+ store fs-specific private data.
+
+Severity : minor
+Frequency : when exporting Lustre via NFS
+Bugzilla : 5275
+Description: NFSD failed occasionally when looking up a path component
+Details : NFSD is looking up ".." which was broken in ext3 directories
+ that had grown large enough to become hashed.
+
+Severity : minor
+Frequency : Clusters with multiple interfaces not on the same subnet
+Bugzilla : 5541
+Description: Nodes will repeatedly try to reconnect to an interface which
+ they cannot reach and report an error to the log.
+Details : Extra peer list entries will be created by lconf with some peers
+ unreachable. lconf now validates the peer before adding it.
+
+Severity : major
+Frequency : Only if a default stripe is set on the filesystem root.
+Bugzilla : 6367
+Description: Setting a default stripe on the filesystem root prevented the
+ filesystem from being remounted.
+Details : The client was sending extra request flags in the root getattr
+ request and did not allocate a reply buffer for the dir EA.
+
+Severity : major
+Frequency : occasional, higher if lots of files are accessed by one client
+Bugzilla : 6159, 6097
+Description: Client trips assertion regarding lsm mismatch/magic
+Details : While revalidating inodes the VFS looks up inodes with ifind()
+ and in rare cases can find an inode that is being freed.
+ The ll_test_inode() code will free the lsm during ifind()
+ when it finds an existing inode and then the VFS later attaches
+ this free lsm to a new inode.
+
+Severity : major
+Frequency : rare
+Bugzilla : 6422, 7030
+Description: MDS deadlock between mkdir and client eviction
+Details : Creating a new file via mkdir or mknod (starting a transaction
+ and getting the ns lock) can deadlock with client eviction
+ (gets ns lock and trying to finish a synchronous transaction).
+
+Severity : minor
+Frequency : occasional
+Description: While starting a server, the fsfilt_ext3 module could not be
+ loaded.
+Details : CFS's improved ext3 filesystem is named ldiskfs for 2.6
+ kernels. Previously, lconf would still use the ext3 name
+ when trying to load modules. Now, it will correctly use
+ ext3 on 2.4 and ldiskfs on 2.6.
+
+Severity : enhancement
+Description: The default stripe count has been changed to 1
+Details : The interpretation of the default stripe count (0, to lfs
+ or lmc) has been changed to mean striping across a single
+ OST, rather than all available. For general usage we have
+ found a stripe count of 1 or 2 works best.
+
+Severity : enhancement
+Description: Add support for compiling against Cray portals.
+Details : Conditional compiling for some areas that are different
+ on Cray Portals.
+
+Severity : major
+Frequency : occasional
+Bugzilla : 6409, 6834
+Description: Creating files with an explicit stripe count may lead to
+ a failed assertion on the MDS
+Details : If some OSTs are full or unavailable, creating files may
+ trigger a failed assertion on the MDS. Now, Lustre will
+ try to use other servers or return an error to the
+ client.
+
+Severity : minor
+Frequency : occasional
+Bugzilla : 6469
+Description: Multiple concurrent overlapping read+write on multiple SMP nodes
+ caused lock timeout during readahead (since 1.4.2).
+Details : Processes doing readahead might match a lock that hasn't been
+ granted yet if there are overlapping and conflicting lock
+ requests. The readahead process waits on ungranted lock
+ (original lock is CBPENDING), while OST waits for that process
+ to cancel CBPENDING read lock and eventually evicts client.
+
+Severity : enhancement
+Bugzilla : 6931
+Description: Initial enabling of flock support for clients
+Details : Implements fcntl advisory locking and file status functions.
+ This feature is provided as an optional mount flag (default
+ off), and is NOT CURRENTLY SUPPORTED. Not all types of record
+ locking are implemented yet, and those that are implemented are
+ not guaranteed to be completely correct in production environments.
+ mount -t lustre -o [flock|noflock] ...
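A minimal Python sketch of the kind of fcntl advisory locking an
application would attempt on a client mounted with the flock option. It
uses only the standard fcntl module; the helper names try_exclusive_lock
and unlock are hypothetical. How a noflock mount responds to these calls is
not stated in the entry above, so any other OSError is simply re-raised.

```python
import errno
import fcntl


def try_exclusive_lock(fd):
    """Try to take a non-blocking exclusive advisory lock on the whole file.

    Returns True on success and False if another process holds a
    conflicting lock. The descriptor must be open for writing.
    """
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError as exc:
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return False
        raise  # any other failure is left for the caller to handle


def unlock(fd):
    """Release the advisory lock taken by try_exclusive_lock()."""
    fcntl.lockf(fd, fcntl.LOCK_UN)
```

Locks of this kind are per process, so a second call from the same process
does not conflict with the first; contention only shows up between
different processes.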
+
+Severity : major
+Frequency : occasional
+Bugzilla : 6198
+Description: OSTs running 2.4 kernels but with extents enabled might trip an
+ assertion in the ext3 JBD (journaling) layer.
+Details : The b_committed_data struct is protected by the big kernel lock
+ in 2.4 kernels, serializing journal_commit_transaction() and
+ ext3_get_block_handle->ext3_new_block->find_next_usable_block()
+ access to this struct. In 2.6 kernels there is finer grained
+ locking to improve SMP performance of the JBD layer.
+
+Severity : minor
+Bugzilla : 6147
+Description: Changes the "SCSI I/O Stats" kernel patch to default to "enabled"
+
+-----------------------------------------------------------------------------
+
+2005-05-05 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.4.2
+ NOTE: Lustre 1.4.2 uses a network protocol that is incompatible with
+ previous versions of Lustre. Please update all servers and clients to
+ version 1.4.2 or later at the same time. You must also run
+ "lconf --write-conf {config}.xml" on the MDS while it is stopped
+ to update the configuration logs.
+ * bug fixes
+ - fix for HPUX NFS client breakage when NFS exporting Lustre (5781)
+ - mdc_enqueue does not need max_mds_easize request buffer on send (5707)
+ - swab llog records of type '0' so we get proper header size/idx (5861)
+ - send llog cancel req to DLM cancel portal instead of cb portal (5515)
+ - fix rename of one directory over another leaking an inode (5953)
+ - avoid SetPageDirty on 2.6 (5981)
+ - don't re-add just-being-destroyed locks to the waiting list (5653)
+ - when creating new directories, inherit the parent's custom
+ striping settings if present (3048)
+ - flush buffers from cache before direct IO in 2.6 obdfilter (4982)
+ - don't hold i_size_sem in ll_nopage() and ll_ap_refresh_count (6077)
+ - don't hold client locks on temporary worklist from l_lru (5666)
+ - handle IO errors in 2.6 obdfilter bio completion routine (6046)
+ - automatically evict dead clients (5921)
+ - Update file size properly in create+truncate+fstat case (6196)
+ - Do not unhash mountpoint dentries, do not allow removal of
+ mountpoints (5907)
+ - Avoid lock ordering deadlock issue with write/truncate (6203,5654)
+ - reserve enough journal credits in fsfilt_start_log for setattr (4554)
+ - ldlm_enqueue freed-export error path would always LBUG (6149,6184)
+ - don't reference lr_lvb_data until after we hold lr_lvb_sem (6170)
+ - don't overwrite last_rcvd if there is a *_client_add() error (6086)
+ - Correctly handle reads of files with no objects (6243)
+ - lctl recover will also mark a device active if deactivate used (5933)
+ * miscellanea
+ - by default create 1 inode per 4kB space on MDS, per 16kB on OSTs
+ - allow --write-conf on an MDS with different nettype than client (5619)
+ - don't write config llogs to MDS for mounts not from that MDS (5617)
+ - lconf should create multiple TCP connections from a client (5201)
+ - init scripts are now turned off by default; run chkconfig --on
+ lustre and chkconfig --on lustrefs to use them
+ - upcalls are no longer needed for clients to recover to failover
+ servers (3262)
+ - add --abort-recovery option to lconf to abort recovery on device
+ startup (6017)
+ - add support for an arbitrary number of OSTs (3026)
+ - Quota support protocol changes.
+ - forward compatibility changes to wire structs (6007)
+ - rmmod NALs that might be loaded because of /etc/modules.conf (6133)
+ - support for mountfsoptions and clientoptions to the Lustre LDAP (5873)
+ - improved "lustre status" script
+ - initialize blocksize for non-regular files (6062)
+ - added --disable-server and --disable-client configure options (5782)
+ - introduce a lookup cache for lconf to avoid repeated DB scans (6204)
+ - Vanilla 2.4.29 support
+ - increase maximum number of obd devices to 520 (6242)
+ - remove the tcp-zero-copy patch from the suse-2.4 series (5902)
+ - Quadrics Elan drivers are now included for the RHEL 3 2.4.21 and
+ SLES 9 2.6.5 kernels
+ - limit stripes per file to 160 (the maximum EA size) (6093)
+
+2005-03-22 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.4.1
+ * bug fixes
+ - don't LASSERT in ll_release on NULL lld with NFS export (4655, 5760)
+ - hold NS lock when calling handle_ast_error->del_waiting_lock (5746)
+ - fix setattr mtime regression from lovcleanup merge (4829, 5669)
+ - workaround for 2.6 crash in ll_unhash_aliases (5687, 5210)
+ - small ext3 extents cleanups and fixes (5733)
+ - improved mballoc code, several small races and bugs fixed (5733, 5638)
+ - kernel version 43 - fix remove_suid bugs in both 2.4 and 2.6 (5695)
+ - avoid needless client->OST connect, fix handle mismatch (5317)
+ - fix DLM error path that led to out-of-sync client, long delays (5779)
+ - support common vfs-enforced mount options (nodev,nosuid,noexec) (5637)
+ - fix several locking issues related to i_size (5492,5624,5654,5672)
+ - don't move pending lock onto export if it is already evicted (5683)
+ - fix kernel oops when creating .foo in unlinked directory (5548)
+ - fix deadlock in obdfilter statistics vs. object create (5811)
+ - use time_{before,after} to avoid timer jiffies wrap (5882)
+ - shutdown --force/--failover stability (3607,3651,4797,5203,4834)
+ - Do not leak request if server was not able to process it (5154)
+ - If mds_open unable to find parent dir, make that negative lookup (5154)
+ - don't create new directories with extent-mapping (5909, 5936)
+ * miscellanea
+ - fix lustre/lustrefs init scripts for SuSE (patch from Scali, 5702)
+ - don't hold the pinger_sem in ptlrpc_pinger_sending_on_import
+ - change obd_increase_kms to obd_adjust_kms (up or down) (5654)
+ - lconf, lmc search both /usr/lib and /usr/lib64 for Python libs (5800)
+ - support for RHEL4 kernel on i686 (5773)
+ - provide error messages when incompatible logs are encountered (5898)
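The 1.4.1 fix for bug 5882 above switches to the kernel's time_before()
and time_after() helpers, which compare tick counts by the sign of their
difference so that the result stays correct when the counter wraps. A
minimal Python emulation of that idea, assuming a 32-bit counter
(COUNTER_BITS is an assumption for illustration; the kernel helpers are C
macros operating on unsigned long):

```python
COUNTER_BITS = 32  # assumed counter width for this illustration
MASK = (1 << COUNTER_BITS) - 1
SIGN_BIT = 1 << (COUNTER_BITS - 1)


def time_after(a, b):
    """True if tick a is later than tick b on a wrapping counter.

    Equivalent to checking that the signed difference (b - a) is negative.
    """
    return ((b - a) & MASK) >= SIGN_BIT


def time_before(a, b):
    """True if tick a is earlier than tick b on a wrapping counter."""
    return time_after(b, a)
```

With a deadline set shortly before the wrap (0xFFFFFFF0) and a current
tick shortly after it (5), a plain "5 > 0xFFFFFFF0" comparison is false,
while time_after(5, 0xFFFFFFF0) is true. The comparison is only meaningful
while the two values are less than half the counter range apart.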
+
+2005-02-18 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.4.0.10 (1.4.1 release candidate 1)
+ * bug fixes
+ - don't keep a lock reference when lock is not granted (4238)
+ - unsafe list practices (rarely) led to infinite eviction loop (4908)
+ - add per-fs limit of Lustre pages in page cache, avoid OOM (4699)
+ - drop import inflight refcount on signal_completed_replay error (5255)
+ - unlock page after async write error during send (3677)
+ - handle missing objects in filter_preprw_read properly (5265)
+ - no transno return for symlink open, don't save no-transno open (3440)
+ - don't try to complete elan receive that already failed (4012)
+ - free RPC server reply state on error (5406)
+ - clean up thread from ptlrpc_start_thread() on error (5160)
+ - readahead could read extra page into cache that wasn't ejected (5388)
+ - prevent races in class_attach/setup/cleanup/detach (5260)
+ - don't dereference de->d_inode after l_dput of de (5458)
+ - use "int" for stripe value returned from lock_to_stripe (5544)
+ - mballoc allocation and error-checking fixes in 2.6 (5504)
+ - block device patches to fix I/O request sizes in 2.6 (5482)
+ - look up hostnames for IB nals (5602)
+ - 2.6 changed lock ordering of 2 semaphores, caused deadlock (5654)
+ - don't start multiple acceptors for the same port (5277)
+ - fix incorrect LASSERT in mds_getattr_name (5635)
+ - export a proc file for general "ping" checking (5628)
+ - fix "lfs check" to not block when the MDS is down (5628)
+ * miscellanea
+ - service request history (4965)
+ - put {ll,lov,osc}_async_page structs in a single slab (4699)
+ - create an "evict_client" /proc entry on OSTs, like the MDS has
+ - fix mount usage message, return errors per mount(8) (5168)
+ - change grep [] to grep "[]" in tests so they work in more UMLs
+ - fix ppc64/x86_64 spec to use %{_libdir} instead of /usr/lib (5389)
+ - remove ancient LOV_MAGIC_V0 EA support (5047)
+ - add "disk I/Os in flight" and "I/O req time" stats in obdfilter
+ - align r/w RPCs to PTLRPC_MAX_BRW_SIZE boundary for performance (3451)
+ - allow readahead allocations to fail when low on memory (5383)
+ - mmap locking landed again, after considerable improvement (2828)
+ - add get_hostaddr() to lustreDB.py for LDAP support (5459)
+
+2004-11-23 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.4.0
+ * bug fixes
+ - send OST transaction number in read/write reply to free req (4966)
+ - don't ASSERT in ptl_send_rpc() if we run out of memory (5119)
+ - lock /proc/sys/portals/routes internal state, avoiding oops (4827)
+ - the watchdog thread now runs as interruptible (5246)
+ - flock/lockf fixes (but it's still disabled, pending 5135)
+ - don't use EXT3 constants in llite code (5094)
+ - memory shortage at startup could cause assertion (5176)
+ * miscellanea
+ - reorganization of lov code
+ - single portals codebase
+ - Infiniband NAL
+ - add extents/mballoc support (5025)
+ - direct I/O reads in the obdfilter (4048)
+ - kernel patches from LNXI for 2.6 (bluesmoke, perfctr, mtd, kexec)
+
+tbd Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.2.9
+ * bug fixes
+ - send OST transaction number in read/write reply to free req (4966)
+ - don't ASSERT in ptl_send_rpc() if we run out of memory (5119)
+ - lock /proc/sys/portals/routes internal state, avoiding oops (4827)
+ - the watchdog thread now runs as interruptible (5246)
+ - handle missing objects in filter_preprw_read properly (5265)
+ - unsafe list practices (rarely) led to infinite eviction loop (4908)
+ - drop import inflight refcount on signal_completed_replay error (5255)
+ - unlock page after async write error during send (3677)
+ - return original error code on reconstructed replies (3761)
+ - no transno return for symlink open, don't save no-transno open (3440)
+ * miscellanea
+ - add pid to ldlm debugging output (4922)
+ - bump the watchdog timeouts -- we can't handle 30sec yet
+ - extra debugging for orphan dentry/inode bug (5259)
+
+2004-11-16 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.2.8
+ * bug fixes
+ - fix TCP_NODELAY bug, which caused extreme perf regression (5134)
+ - allocate qswnal tx descriptors singly to avoid fragmentation (4504)
+ - don't LBUG on obdo_alloc() failure, use OBD_SLAB_ALLOC() (4800)
+ - fix NULL dereference in /proc/sys/portals/routes (4827)
+ - allow failed mdc_close() operations to be interrupted (4561)
+ - stop precreate on OST before MDS would time out on it (4778)
+ - don't send partial-page writes before EOF from client (4410)
+ - discard client grant for sub-page writes on large-page clients (4520)
+ - don't free dentries not owned by NFS code, check generation (4806)
+ - fix lsm leak if mds_create_objects() fails (4801)
+ - limit debug_daemon file size, always print CERROR messages (4789)
+ - use transno after validating reply (3892)
+ - process timed out requests if import state changes (3754)
+ - update mtime on OST during writes, return in glimpse (4829)
+ - add mkfsoptions to LDAP (4679)
+ - use ->max_readahead method instead of zapping global ra (5039)
+ - don't interrupt __l_wait_event() during strace
+ * miscellanea
+ - add software watchdogs to catch hung threads quickly (4941)
+ - make lustrefs init script start after nfs is mounted
+ - fix CWARN/ERROR duplication (4930)
+ - return async write errors to application if possible (2248)
+ - add /proc/sys/portal/memused (bytes allocated by PORTALS_ALLOC)
+ - print NAL number in %x format (4645)
+ - update barely-supported suse-2.4.21-171 series (4842)
+ - support for sles 9 %post scripts
+ - support for building 2.6 kernel-source packages
+ - support for sles km_* packages
+
+2004-10-07 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.2.7
+ * bug fixes
+ - ignore -ENOENT errors in osc_destroy (3639)
+ - notify osc create thread that OSC is being cleaned up (4600)
+ - add nettype argument for llmount in #5d in conf-sanity.sh (3936)
+ - reconstruct ost_handle() like mds_handle() (4657)
+ - create a new thread to do import eviction to avoid deadlock (3969)
+ - let lconf resolve symlinked-to devices (4629)
+ - don't unlink "objects" from directory with default EA (4554)
+ - hold socknal file ref over connect in case target is down (4394)
+ - allow more than 32000 subdirectories in a single directory (3244)
+ - fix blocks count for O_DIRECT writes (3751)
+ - OST returns ENOSPC from object create when no space left (4539)
+ - don't send truncate RPC if file size isn't changing (4410)
+ - limit OSC precreate to 1/2 of value OST considers bogus (4778)
+ - bind to privileged port in socknal and tcpnal (3689)
+ * miscellania
+ - rate limit CERROR/CWARN console message to avoid overload (4519)
+ - GETFILEINFO dir ioctl returns LOV EA + MDS stat in 1 call (3327)
+ - basic mmap support (3918)
+ - kernel patch series update from b1_4 (4711)
+
+2004-09-16 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.2.6
+ * bug fixes
+ - avoid crash during MDS cleanup with OST shut down (2775)
+ - fix loi_list_lock/oig_lock inversion on interrupted IO (4136)
+ - don't use bad inodes on the MDS (3744)
+ - dynamic object preallocation to improve recovery speed (4236)
+ - don't hold spinlock over lock dumping or change debug flags (4401)
+ - don't zero obd_dev when it is force cleaned (3651)
+ - print grants to console if they go negative (4431)
+ - "lctl deactivate" will stop automatic recovery attempts (3406)
+ - look for existing locks in ldlm_handle_enqueue() (3764)
+ - don't resolve lock handle twice in recovery avoiding race (4401)
+ - revalidate should check working dir is a directory (4134)
+ * miscellania
+ - don't always mark "slow" obdfilter messages as errors (4418)
+
+2004-08-24 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.2.5
+ * bug fixes
+ - don't close LustreDB during write_conf until it is done (3860)
+ - fix typo in lconf for_each_profile (3821)
+ - allow dumping logs from multiple threads at one time (3820)
+ - don't allow multiple threads in OSC recovery (3812)
+ - fix debug_size parameters (3864)
+ - fix mds_postrecov to initialize import for llog ctxt (3121)
+ - replace config semaphore with spinlock (3306)
+ - be sure to send a reply for a CANCEL rpc with bad export (3863)
+ - don't allow enqueue to complete on a destroyed export (3822)
+ - down write_lock before checking llog header bitmap (3825)
+ - recover from lock replay timeout (3764)
+ - up llog sem before sending rpc (3652)
+ - reduce ns lock hold times when setting kms (3267)
+ - change a dlm LBUG to LASSERTF, to maybe learn something (4228)
+ - fix NULL deref and obd_dev leak on setup error (3312)
+ - replace some LBUG about llog ops with error handling (3841)
+ - don't match INVALID dentries from d_lookup and spin (3784)
+ - hold dcache_lock while marking dentries INVALID and hashing (4255)
+ - fix invalid assertion in ptlrpc_set_wait (3880)
+ * miscellania
+ - add libwrap support for the TCP acceptor (3996)
+ - add /proc/sys/portals/routes for non-root route listing (3994)
+ - allow setting MDS UUID in .xml (2580)
+ - print the stack of a process that LBUGs (4228)
+
+2004-07-14 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.2.4
+ * bug fixes
+ - don't cleanup request in ll_file_open() on failed MDS open (3430)
+ - make sure to unset replay flag from failed open requests (3440)
+ - if default stripe count is 0, use OST count for inode size (3636)
+ - update parent mtime/ctime on client for create/unlink (2611)
+ - drop dentry ref in ext3_add_link from open_connect_dentry (3266)
+ - free recovery state on server during a forced cleanup (3571)
+ - unregister_reply for resent reqs (3063)
+ - loop back devices mounting and status check on 2.6 (3563)
+ - fix resource-creation race that can provoke i_size == 0 (3513)
+ - don't try to use bad inodes returned from MDS/OST fs lookup (3688)
+ - more debugging for page-accounting assertion (3746)
+ - return -ENOENT instead of asserting if ost getattr+unlink race (3558)
+ - avoid deadlock after precreation failure (3758)
+ - fix race and lock order deadlock in orphan handling (3450, 3750)
+ - add validity checks when grabbing inodes from l_ast_data (3599)
+ * miscellania
+ - add /proc/.../recovery_status to obdfilter (3428)
+ - lightweight CDEBUG infrastructure, debug daemon (3668)
+ - change default OSC RPC parameters to be better on small clusters
+ - turn off OST read cache for files smaller than 32MB
+ - install man pages and include them in rpms (3100)
+ - add new init script for (un)mounting lustre filesystems (2593)
+ - run chkconfig in %post for init scripts (3701)
+ - drop scimac NAL (unmaintained)
+
+2004-06-17 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.2.3
+ * bug fixes
+ - clean kiobufs before and after use (3485)
+ - strip trailing '/'s before comparing paths with /proc/mounts (3486)
+ - remove assertions to work around "in-flight rpcs" recovery bug (3063)
+ - change init script to fail more clearly if not run as root (1528)
+ - allow clients to reconnect during replay (1742)
+ - fix ns_lock/i_sem lock ordering deadlock for kms update (3477)
+ - don't do DNS lookups on NIDs too small for IP addresses (3442)
+ - re-awaken ptlrpcd if new requests arrive during check_set (3554)
+ - fix cond_resched (3554)
+ - only evict unfinished clients after recovery (3515)
+ - allow bulk resend, prevent data loss (3570)
+ - dynamic ptlrpc request buffer allocation (2102)
+ - don't allow unlinking open directory if it isn't empty (2904)
+ - set MDS/OST threads to umask 0 to not clobber client modes (3359)
+ - remove extraneous obd dereference causing LASSERT failure (3334)
+ - don't use get_cycles() when creating temp. files on the mds (3156)
+ - hold i_sem when setting i_size in ll_extent_lock() (3564)
+ - handle EEXIST for set-stripe, set proper directory name (3336)
+ * miscellania
+ - servers can dump a log evicting a client - lustre.dump_on_timeout=1
+ - fix ksocknal_fmb_callback() error messages (2918)
+
+2004-05-27 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.2.2
+ * bug fixes
+ - don't copy lvb into (possibly NULL) reply on error (2983)
+ - don't deref dentry after dput, don't free lvb on error (2922)
+ - use the kms to determine writeback rpc length (2947)
+ - increment oti_logcookies when osc is inactive (2948)
+ - update client's i_blocks count via lvb messages (2543)
+ - handle intent open/close of special files properly (1557)
+ - mount MDS with errors=remount-ro, like obdfilter (2009)
+ - initialize lock handle to avoid ASSERT on error cleanup (3057)
+ - don't use cancelling-locks' kms values (2947)
+ - use highest lock extent for kms, not last one (2925)
+ - don't dereference ERR_PTR() dentry in error handling path (3107)
+ - fix thread race in portals_debug_dumplog() (3122)
+ - create lprocfs device entries at setup instead of at attach (1519)
+ - common AST error handler, don't evict client on completion race (3145)
+ - zero nameidata in detach_mnt in 2.6 (3118)
+ - verify d_inode after revalidate_special is valid in 2.6 (3116)
+ - use lustre_put_super() to handle zconf unmounts in 2.6 (3064)
+ - initialize RPC timeout timer earlier for 2.6 (3219)
+ - don't dereference NULL reply buffer if mdc_close was never sent (2410)
+ - print nal/nid for unknown nid (3258)
+ - additional checks for oscc recovery before doing precreate (3284)
+ - fix ll_extent_lock() error return code for 64-bit systems (3043)
+ - don't crash in mdc_close for bad permissions on open (3285)
+ - zero i_rdev for non-device files (3147)
+ - clear page->private before handing to FS, better assertion (3119)
+ - tune the read pipeline (3236)
+ - fix incorrect decref of invalidated dentry (2350)
+ - provide read-ahead stats and refine rpc in flight stats (3328)
+ - don't hold journal transaction open across create RPC (3313)
+ - update atime on MDS at close time (3265)
+ - close LDAP connection when recovering to avoid server load (3315)
+ - update iopen-2.6 patch with fixes from 2399,2517,2904 (3301)
+ - don't leak open file on MDS after open resend (3325)
+ - serialize filter_precreate and filter_destroy_precreated (3329)
+ - loop device shouldn't call sync_dev() for nul device (3092)
+ - clear page cache after eviction (2766)
+ - resynchronize MDS->OST in background (2824)
+ - refuse to mount the same filesystem twice on same mountpoint (3394)
+ - allow llmount to create routes for mounting behind routers (3320)
+ - push lock cancellation to blocking thread for glimpse ASTs (3409)
+ - don't call osc_set_data_with_check() for TEST_LOCK matches (3159)
+ - fix rare problem with rename on htree directories (3417)
+ * miscellania
+ - allow default OST striping configuration per directory (1414)
+ - fix compilation for qswnal for 2.6 kernels (3125)
+ - increase maximum number of MDS request buffers for large systems
+ - change liblustreapi to be useful for external progs like lfsck (3098)
+ - increase local configuration timeout for slow disks (3353)
+ - allow configuring ldlm AST timeout - lustre.ldlm_timeout=<seconds>
+
+2004-03-22 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.2.1
+ * bug fixes
+ - fixes for glimpse AST timeouts / incorrectly 0-sized files (2818)
+ - don't overwrite extent policy data in reply if lock was blocked (2901)
+ - drop filter export grants atomically with removal from device (2663)
+ - del obd_self_export from work_list in class_disconnect_exports (2908)
+ - don't LBUG if MDS recovery times out during orphan cleanup (2530)
+ - swab reply message in mdc_close, other PPC fixes (2464)
+ - fix destroying of named logs (2325)
+ - overwrite old logs when running lconf --write_conf (2264)
+ - bump LLOG_CHUNKSIZE to 8k to allow for larger clusters (2306)
+ - fix race in target_handle_connect (2898)
+ - mds_reint_create() should take same inode create lock (2926)
+ - correct journal credits calculated for CANCEL_UNLINK_LOG (2931)
+ - don't close files for self_export to avoid uninitialized obd (2936)
+ - allow MDS with the same name as client node (2939)
+ - hold dentry reference for closed log files for unlink (2325)
+ - reserve space for all logs during transactions (2059)
+ - don't evict page beyond end of stripe extent (2925)
+ - don't oops on a deleted current working directory (2399)
+ - handle hard links to targets without a parent properly (2517)
+ - don't dereference NULL lock when racing during eviction (2867)
+ - don't grow lock extents when lots of conflicting locks (2919)
+
+2004-03-04 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.2.0
+ * bug fixes
+ - account for cache space usage on clients to avoid data loss (974)
+ - lfsck support in lustre kernel code (2349)
+ - reduce journal credits needed for BRW writes (2370)
+ - orphan handling to avoid losing space on client/server crashes
+ - ptlrpcd can be blocked, stopping ALL progress (2477)
+ - use lock value blocks to assist in proper KMS, faster stat (1021)
+ - takes i_sem instead of DLM locks internally on obdfilter (2720)
+ - recovery for initial connections (2355)
+ - fixes for mds_cleanup_orphans (1934)
+ - abort_recovery crashes MDS in b_eq (mds_unlink_orphan) (2584)
+ - block all file creations until orphan recovery completes (1901)
+ - client remove rq_connection from request struct (2423)
+ - conf-sanity test_5, proper cleanup in umount log not available (2640)
+ - recovery timer race (2670)
+ - mdc_close recovery bug (2532)
+ - ptlrpc cleanup bug (2710)
+ - mds timeout on local locks (2588)
+ - namespace lock held during RPCs (2431)
+ - handle interrupted sync write properly (2503)
+ - don't try to handle a message that hasn't been replied to (2699)
+ - client assert failure during cleanup after abort recovery (2701)
+ - leak mdc device after failed mount (2712)
+ - ptlrpc_check_set allows timedout requests to complete (2714)
+ - wait for inflight reqs when ptlrpcd finishes (2710)
+ - make sure unregistered services are removed from the srv_list
+ - reset bulk XID's when resending them (caught by 1138 test)
+ - unregister_bulk after timeout
+ - fix lconf error (2694)
+ - handle write after unfinished setstripe, stripe-only getstripe (2388)
+ - readahead locks pages, leaves pending causing memory pressure (2673)
+ - increase OST request buffers to 4096 on large machines (2729)
+ - fix up permission of existing directories in simple_mkdir (2661)
+ - init deleted item, add assertions ptlrpc_abort_inflight() (2725)
+ - don't assign transno to errored transactions (2742)
+ - don't delete objects on OST if given a bogus objid from MDS (2751)
+ - handle large client PAGE_SIZE readdir on small PAGE_SIZE MDS (2777)
+ - if rq_no_resend, then timeout request after recovery (2432)
+ - fix MDS llog_logid record size, 64-bit array alignment (2733)
+ - don't call usermode_helper from ptlrpcd, DEFAULT upcall (2773)
+ - put magic in mount.lustre data, check for bad/NULL mount data (2529)
+ - MDS recovery shouldn't delete objects that it has given out (2730)
+ - if enqueue arrives after completion, don't clobber LVB (2819)
+ - don't unlock pages twice when trigger_group_io returns error (2814)
+ - don't deref NULL rq_repmsg if ldlm_handle_enqueue failed (2822)
+ - don't write pages to disk if there was an error (1450)
+ - don't ping imports that have recovery disabled (2676)
+ - take buffered bytes into account when balancing socknal conn (2817)
+ - hold a DLM lock over readdir always, use truncate_inode_pages (2706)
+ - reconnect unlink llog connection after MDS reconnects to OST (2816)
+ - remove little-endian swabbing of llog records (1987)
+ - set/limit i_blksize to LL_MAX_BLKSIZE on client (2884)
+ - retry reposting request buffers if they fail (1191)
+ - grow extent at grant time to avoid granting a revoked lock (2809)
+ - lock revoke doesn't evict page if covered by a second lock (2765)
+ - disable VM readahead to avoid reading outside lock extents (2805)
+ * miscellania
+ - return LL_SUPER_MAGIC from statfs for the filesystem type (1972)
+ - updated kernel patches for hp-2.4.20 kernel (2681)
+
+2004-02-07 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.0.4
+ * kernel patches
+ - fix truncated write corruption (2366)
+ - fix for failed assertion in iopen_connect_dentry (1792,2517)
+ * bug fixes
+ - don't flag the ptlrpcd thread with PF_MEMALLOC (2636)
+ - ensure len(uuid) < 37 in lmc (1171)
+ - fix ia64 OOPS in llog_test (2255)
+ - zero end of page at obdfilter for partial page writes (2648)
+ - don't leave stale dentries around after renames (2428)
+ - fix timeouts when evicting a client with a single lock held (2642)
+ - set deadline for the initial HELLO message to drain (2634)
+ - print out dotted-quad IP addresses in the socknal (2302)
+ * miscellania
+ - additional debugging for MDS client eviction problem (2443)
+ - fix mkfsoptions support for osts (2603, 2604)
+
+2004-01-27 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.0.3
+ * kernel patches
+ - add series for the vanilla 2.6.0 kernel
+ - add series for the vanilla 2.4.24 kernel
+ - add series for a cray x86/64 UL kernel drop
+ - fix xattr patches for the vanilla 2.4.19 series
+ * bug fixes
+ - generate true UUIDs in lmc (1171)
+ - have portals stack dumping break in UML (2466)
+ - avoid bad dchild deref; avoid inum lock w/o creation (2362)
+ - allocate with _NOFS in ldlm to avoid deadlock (1933)
+ - wake callback waiting threads on client eviction (2460)
+ - Add --ptldebug and --subsystem to lmc (1719)
+ - update assertion to allow safe interrupt allocation
+ - set rq_no_resend for cancel requests (2432)
+ - recalculate ptlrpcd timeout after resend (2494)
+ - call vfs_rmdir when removing pending directories (2368)
+ - fix renaming a file to itself (2429)
+ - lmc creates a default one-stripe lov (2454)
+ - expand procfs space to handle large clusters (2326)
+ - increase UML stack to avoid overflow
+ - update lconf's list of debug and subsystem masks
+ - fix lfs find --obd (2510)
+ - /proc tunable for disabling filter read caching (2591)
+ - stop rpm packages from altering slapd.conf (2301)
+ - disable nagle in the socknal under 0conf (2578)
+ - choose mds inode size based on stripe count (2572)
+ - fix kernel-source rpm problems (2516)
+ * miscellania
+ - add --disable-doc to avoid pdf generation (2421)
+ - update documentation, tests, type-os, comments
+ - avoid format warnings on ia64
+ - remove the TOE NAL
+ - tiny code cleanups by removing unused fields
+
+2004-01-07 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.0.2
+ * bug fixes
+ - fix obvious semaphore misuse in as-yet-unused setattr path (2348)
+ - remove the most blatant lies from BUILDING file (2371)
+ - change default debug level to reasonable production setting
+ - reduce client side cache size to reduce cache flush time
+ - reduce max RPCs in flight to avoid unnecessary file fragmentation
+ - make TCP zerocopy and pinger support enabled by default (2476)
+ - sync writes completed after process exits caused crashes (2319)
+ - maintain correct mount count on the MDS (2356)
+ - backout 1557, because 2316 wasn't really fixed
+ - better file I/O statistics gathering in /proc
+ - don't take unnecessary, deadlock-inducing lock in readpage (2383)
+ - another kernel patch to fix zero-copy TCP function export
+ - don't take duplicate lock when processing re-sent getattr (2420)
+ - lctl uses obd_self_export instead of creating new conn (2353)
+ - MDS/OST recovery case which requires object creation asserted (2425)
+ - move lfs from /usr/sbin to /usr/bin in packages
+ - fix race between mds_client_add and mds_client_free (2417)
+ - use kmalloc instead of slabs in portals (2430)
+ - don't create duplicate records when a failover MDS is present (2442)
+ - remove unnecessary mount age check (2332)
+ - don't remove directory inodes from locks prematurely (2451)
+ - don't break if MDS service name is the same as hostname (2103)
+ - fix races in client write RPC generation when cache full (2482)
+
+2003-12-13 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.0.1
+ * bug fixes
+ - remove now-unused request->rq_obd (278)
+ - if an allocation fails, print out how much memory we've used (1933)
+ - use PORTAL_SLAB_ALLOC for structures, to get GFP_MEMALLOC (1933)
+ - add the "configurable stack size" patch to most series files (1256)
+ - ability to write large log records, for 100+ OST configs (2306)
+ - fix NULL deref when filter_prep fails (2314)
+ - fix operator precedence error in filter_sync
+ - dynamic allocation of socknal TX descriptors (2315)
+ - fix a missed case in the GFP_MEMALLOC patch, can cause deadlock (2310)
+ - fix gcc 2.96 compilation problem in xattr kernel patch (2294)
+ - ensure that CWARN messages in Portals always get to the syslog
+ - __init/__exit are not for prototype decls (ldlm_init/exit)
+ - x86-64 compile warning fixes
+ - fix gateway LMC keyword conflict (2318)
+ - fix MDS lock inversions in getattr/reint paths (1844)
+ - fix a rare lock re-ordering bug, which caused deadlock (2322)
+ - fix i_sem/journal inversion in fsfilt_ext3_write_record (2306)
+ - DLM race condition prevented some lock evictions (2328)
+ - ENOMEM detection and retry on socknal sends (2230)
+ - use GFP_NOFS throughout Lustre, to combat ENOMEM (2230)
+ - move osc_rpcd into ptlrpc, for use in MDC and others (2329)
+ - protect MDS inode fsdata with stronger locking; fixes assertion (2313)
+ - better error messages when a client is rejected during recovery (1505)
+ - avoid cancelling locks which were never granted, after failure (2330)
+ - fix i_sem/journal inversion in mds_client_add (2333)
+ - fix truncate/getattr lock cycle deadlock (2334)
+ - use rpcd to send close; allows resend after timeout, avoid leak (1897)
+ - fix two rare exit paths which could leak an l_lock() ref (2321)
+ - fencepost error in MDS/OST orphan recovery (2226)
+ - make log record alignment 8 bytes (1988)
+ - lstripe now fails when requested offset > ost_count (2237)
+ - ensure that all kernel series have a complete list.h (1607)
+ - fix crashes in special-file operations (2316)
+ - lctl create/brw OID mismatch, caused by obsolete filter loop (2339)
+ * miscellania
+ - allow configurable automake binary, for testing new versions
+ - small update to the lfs documentation
+
+2003-12-03 Cluster File Systems, Inc. <info@clusterfs.com>
+ * version 1.0.0
+ * fix negative export reference count in fsfilt_sync (2312)
+
+2003-12-01 Cluster File Systems, Inc. <info@clusterfs.com>
+ * release candidate 0.9.1
+ * bug fixes
+ - orphans are moved into the PENDING directory for possible recovery
+ - replayed opens now open by fid for orphan/rename safety (1042)
+ - last close of an orphan inode generates a transno (683)
+ - chdir() and mount() now pin the directory entry (1020)
+ - avoid CERROR in normal ll_setattr_raw() error case (1500)
+ - discard very old requests without processing them (1502)
+ - remove some common, well-understood CERRORs (1505)
+ - require O_DIRECT I/O to be page-sized to workaround IA64 crash (1609)
+ - clear "grant" flags in OST replies until OST grant code lands (1644)
+ - fix read performance by not clobbering i_blksize on client (1598)
+ - fix __ldlm_handle2lock oops by not dereferencing lock after PUT (1625)
+ - make LRU size a /proc tunable, clears locks when reduced (707)
+ - fix some lprocfs rot that prevented ptlbd from loading (1732)
+ - server locks take references on exports now (1558)
+ - build fixes for 2.4.20-rh trees (1663)
+ - return an error from lov_create if all OSCs are inactive (1751)
+ - fix import levels when a reconnect happens without a timeout (1597)
+ - exit early from mds_open if we get a lookup error (1749)
+ - partial page read at EOF wouldn't wait for disk before sending (1642)
+ - avoid NULL deref in obdfilter when reading page past EOF (1592)
+ - avoid LASSERT in ll_intent_lock if server failed very early (1090)
+ - fix LBUG in ll_it_open_error with rc = -2 (1861)
+ - write/truncate lock inversion (1639)
+ - Don't auto-load obdclass, portals modules during cleanup (1495)
+ - fix timestamps from jumping to "now" (1763)
+ - extra journal assertions (1648)
+ - add an extra multiunlink test (1771)
+ - fix read_record/write_record API (1776)
+ - fix leak of offset_extent, possible incorrect i_size later (1772)
+ - fix lasserts in mis-matched transnos during open-unlink testing (1541)
+ - Debugging for the kqswnal_get_idle_tx problems (1820)
+ - Allow recovery to be attempted multiple times (1536)
+ - Write out MDS last_rcvd file after it is first created (1600)
+ - Fix tx_descriptor leak in failed transmit situations (1827)
+ - ext3 journaling fixes for assertion failure after IO error (1871)
+ - class_export_put() on freed export after completion AST error (1896)
+ - Fix revalidate looping in VFS (1322)
+ - Don't access a freed export during MDS_REINT timeout (1521)
+ - Add open-unlink recovery support on the MDS (1673,1764)
+ - Return an error if no MDS data was read from last_rcvd (1946)
+ - Fix for lookup "." or ".." crash on error (1932,1931,1935)
+ - Don't setup a disk device that doesn't match exported UUID (317)
+ - Reduce bulk RPC timeout to avoid cascading client/OST failures (1845)
+ - avoid committing NULL handle in force close
+ - local.sh is now a one-stripe LOV configuration
+ - POSIX utime.4 -EPERM on FIFO not owned by user (56)
+ - fix ext3 htree duplicate directory entry corruption (1516)
+ - POSIX creat.13, fstat.1, open.18, stat.3 new file atime/mtime (2020)
+ - update to new LOV EA format (2097)
+ - interoperability for different PAGE_SIZE/wordsize (686,1821,1343,2042)
+
+2003-06-15 Phil Schwan <phil@clusterfs.com>
+ * version v0_7
+ * bug fixes
+ - imports and exports cleanup too early, need refcounts (349, 879, 1045)
+ - per-import/export recovery handling (958, 931, 959)
+ - multiple last-rcvd slots, for serving multiple FSes (949)
+ - connections are again shared between multiple imp/exports (963, 964)
+ - "umount -f" would hang if any requests needed to be sent (393, 978)
+ - avoid pinning large req buffer by copying for queued messages (989)
+ - add "uuid" to "lctl device" command to help upcalls (991)
+ - "open" RPCs with transnos would confuse recovery counters (1037)
+ - do proper endian conversion of all wire messages (288, 340, 891)
+ - remove OST bulk get LBUGs, fix ost_brw_write cleanup (1126)
+ - call waiting locks callback from LDLM recovery thread (1127, 1151)
+ - fix ptlrpc_connection leak in target_handle_connect (1174)
+ - fix import refcounting bug in OST and MDS cleanup (1134)
+ - if an invalid-at-open-time OSC returned before close(), LBUG (1150)
+ - fix very unlikely obd_types race condition (501)
+ - remove osc_open hack for echo_client (1187)
+ - we leaked exports/dlmimps for forcibly disconnected clients (1143)
+ - a failure in read_inode2 leads to deadlock (1139)
+ - cancel ack-locks as soon as transaction is committed (1072)
+ - fix major leaks and crashes in the bulk I/O path (937, 1057)
+ - make sure to commitrw after any preprw to avoid deadlock (1162)
+ - failing to execute a file in a lustre FS would lock inode (1203)
+ - small DEBUG_REQ fix to avoid dereferencing a NULL (1227)
+ - don't ASSERT while cleaning up an incompletely-setup obd (1248)
+ - obd_uuid2tgt would walk off the end of the list (1255)
+ - on IA64 the osc would give portals incorrect bulk size (1258)
+ - fix debug daemon ioctl interface; allows daemon on ia64 (1274)
+ - fix lock inversion caused by new llite matching code (1282)
+ - limit the number of dirty pages on a client to 10MB (1286)
+ - timed out locks were not being correctly cancelled (1289)
+ - fix O_DIRECT above 4GB on IA-32 (1292)
+ * major user-visible changes
+ - fail out/fail over policy now controlled by the upcall (993)
+ * protocol changes
+ - add OBD_PING to check server availability and failure (954)
+ - lustre messages are now sent in sending host order (288, 340, 891)
+ - add eadatalen to MDS getattr reply (340)
+ - OST read replies may contain second buffer, with per-page status (593)
+
+2003-03-11 Phil Schwan <phil@clusterfs.com>
+ * version v0_6
+ * bug fixes
+ - LDLM_DEBUG macro fix, for gcc 3.2 (850)
+ - failed open()s could cause deadlock; fixed (867, 869)
+ - stop cancelling OST locks when files are closed (481)
+ - overlapping XID spaces caused network corruption (851, 853)
+ - fix unsafe fsfilt counter arithmetic; change to atomic_t
+ - setattr_raw added, to do single-RPC, server-side setattrs
+ - lmc/lconf syntax change for OST UUIDs
+ - fix crashy race condition between ptlrpc_free_req and osc_close
+ - don't use request in mdc_enqueue if we hit a timeout (889)
+ - don't set the inode i_size for regular files from the MDS (896)
+ - handle out of order completion AST (842)
+ - don't LBUG if a lock request times out after receiving AST (913)
+ - avoid d_rehash race in ll_find_alias by rehashing inside dcache_lock
+ - if a bad lock AST arrives, send an error instead of dropping entirely
+ - return 0 from revalidate2 if ll_intent_lock returns -EINTR (912)
+ - fix leak in bulk IO when only partially completed (899, 900, 926)
+ - fix O_DIRECT for ia64 (55)
+ - (almost) eliminate Lustre-kernel-thread effects on load average (722)
+ - C-z after timeout could hang a process forever; fixed (977)
+ * Features
+ - client-side I/O cache (678, 924, 929, 941, 970)
+ * protocol changes
+ - READPAGE and SETATTRs which don't take server-side locks get
+ their own portal
+
+2003-02-11 Phil Schwan <phil@clusterfs.com>
+ * version v0_5_20
+ * bug fixes
+ - Fix ldlm_lock_match on the MDS to avoid matching remote locks (592)
+ - Fix fsfilt_extN_readpage() to read a full page of directory
+ entries, or fake the remainder if PAGE_SIZE != blocksize (500)
+ - Avoid extra mdc_getattr() in ll_intent_lock when possible (534, 604)
+ - Fix imbalanced LOV object allocation and out-of-bound access (469)
+ - Most intent operations were removed, in favour of a new RPC mode
+ that does a single RPC to the server and bypasses most of the VFS
+ - All LDLM resource ID arrays were removed in favour of ldlm_res_id
+ - Aggressively cancel local locks on DLM servers
+ - mds_reint_unlink sends EA to the client if it's the last nlink.
+ client uses that EA to unlink OST objects.
+ - mds_reint_{rename,unlink,link} were rewritten to take ordered locks
+ - recursive symlinks were fixed (439)
+ - fixed NULL deref in DEBUG_REQ
+ - filter_update_lastobjid no longer calls sync, which annoyed extN
+ - fixed multi-client small-writes to a single file problem (445)
+ - fixed mtime updates during file writes (607)
+ - fixed vector writes on obdfilter causing problems when ENOSPC (670)
+ - fixed bug in obd_brw_read/write() (under guise of testing 367)
+ - fixed Linux OST size reporting problem (444, 656)
+ - OST now updates object mtime with writes or setattr (607, 619)
+ - client verifies file size before zeroing page past EOF (445)
+ - OST now writes last allocated objid to disk with allocation (108)
+ - LOV on echo now works (409)
+ * protocol changes
+ - mds_reint_unlink sends a new buffer, with the EA included. this
+ buffer is only valid if body->valid & OBD_MD_FLEASIZE, which is only
+ set if a regular file was being unlinked, and it was the last link
+ - use PtlGet from the target for bulk writes (315)
+ - OST now updates object mtime with writes or setattr (607, 619)
+ - LDLM now has a grant-time callback to revalidate locked items, if
+ necessary (604)
+ - Many MDS operations were reorganized to combat race conditions
+ * other changes
+ - Merge b_intel branch (updated lprocfs code) - now at /proc/fs/lustre
+ - configure check to avoid gcc version 2.96 20000731-2.96-98 (606)
+
+2003-01-06 Andreas Dilger <adilger@clusterfs.com>
+ * version v0_5_19
* bug fixes
+ - Fully reactivate OST imports after reconnection (512, others)
+ - Make sure client sees our -ENOTCONN from mds_handle (513 - partial)
+ - More graceful error handling for truncating on dead OST (515)
+ - Don't error out unless we're actually accessing dead stripes (474)
+ - Fix garbage sizes when stripes are missing (410)
- LRU counters were broken, causing constant lock purge (433, 432)
- garbage on read from stripes with failed OSTs (441)
- mark OSCs as active before reconnecting during recovery (438)
- stop dereferencing request after dropping refcount (457)
- don't LASSERT(spin_is_locked) on non-SMP (455)
- fixes for many rename() bugs
+ - fstat didn't correctly synchronize attributes (399)
+ - server must handle lock cancellation during blocking AST prep (487)
+ - bulk descriptors were free()d too soon (511)
+ - fix paths in lconf, which would load incorrect modules (451, 507)
+ - fix confusing lconf 'host not found' error message (386)
+ - fix lock order deadlock on OST (O/R i_sem before journal ops, 478)
+ - fix race condition in mdc_blocking_ast() for inode access (526)
+ - fix lov_unpackmd() unpacking wrong number of stripes (537)
+ - fix lov_set_osc_active() marking wrong OSC inactive (440)
+ - fix bad lstripe lov_unpackmd() assertion (fix layering too) (527)
+ - fix multiple writes of stripe MD to MDS (358, maybe 519)
+ - fix lstripe in several ways (kernel side) (527)
+ - fix request leak in ldlm_cli_enqueue (262)
+ - incorrect OSC was marked inactive after OST failure
+ - call mds_fs_cleanup before unmounting filesystem (524)
+ - fix races between taking ns_lock and ldlm_lock_change_resource
+ - fix races updating LOV export open file list
+ - fix lov_enqueue error path, avoid decref-ing bad lock handle
+ - fix recovery NULL deref in ldlm_cli_cancel_unused
+ - fix some DLM races by using new hash table for lock handles (419)
+ - permit the client to specify desired inodes, at replay
+ - duplicate requests when we queue them for replay reintegration
+ - fix last_rcvd offset calculation
+ - sync after each recovered transaction, so we always make progress
+ - never (rather than always) ERESTART requests without transnos
+ - store the lov_desc in the MDS, so we don't depend on getlovinfo to
+ set it
+ - skip replay if the MDS says that the client is already connected
+ - don't check for a recovery-enabled export to match lctl's UUID
+ - don't INC_USE_COUNT for phantom exports
+ - don't crash when cleaning up phantom exports (567)
+ - don't double-finish or set replay data for errored mdc_open requests
+ - abort requests when they time out, so we don't get old replies
+ - send/receive replies for AST messages again
+ - if the client says that it doesn't have the lock, cancel it on the
+ server
+ - if we time out during I/O, don't try to cancel an in-use lock; instead
+ mark it as destroyed, and it will all work out when decref is called
+ - fix module use counts (22, 581)
+ * protocol changes
+ - ASTs now expect a reply (server cancels lock on error reply)
2002-12-02 Andreas Dilger <adilger@clusterfs.com>
* version v0_5_18
- fix dbench 2, extN refcount problem (170, 258, 356, 418)
- fix double-O_EXCL intent crash (424)
- avoid sending multiple lock CANCELs (352)
- * Features
+ * Features
- MDS can do multi-client recovery (modulo bugs in new code)
- * Documentation
+ * Documentation
- many updates, edits, cleanups
2002-11-18 Phil Schwan <phil@clusterfs.com>
- properly abstracted the echo client
- OSC locked 1 byte too many; fixed
- rewrote brw callback code:
- - fixed recovery bugs related to LOVs (306)
- - fixed too-many-pages-in-one-write crash (191)
- - fixed (again) crash in sync_io_timeout (214)
- - probably fixed callback-related race (385)
+ - fixed recovery bugs related to LOVs (306)
+ - fixed too-many-pages-in-one-write crash (191)
+ - fixed (again) crash in sync_io_timeout (214)
+ - probably fixed callback-related race (385)
* protocol change
- - Add capability to MDS protocol
+ - Add capability to MDS protocol
- LDLM cancellations and callbacks on different portals
2002-10-28 Andreas Dilger <adilger@clusterfs.com>
* small changes in the DLM wire protocol
2002-07-25 Peter J. Braam <braam@clusterfs.com>
- * version 0_5_1 with some initial stability,
- * locking on MD and file I/O.
+ * version 0_5_1 with some initial stability,
+ * locking on MD and file I/O.
* documentation updates
* several bug fixes since 0.5.0
* small changes in wire protocol
* move forward to latest Lustre kernel
2002-06-25 Peter Braam <braam@clusterfs.com>
- * release version v0_4_1. Hopefully stable on single node use.
+ * release version v0_4_1. Hopefully stable on single node use.