tbd Sun Microsystems, Inc.
* version 2.0.0
* Support for kernels:
- 2.6.16.60-0.27 (SLES 10),
- 2.6.18-92.1.10.el5 (RHEL 5),
+ 2.6.16.60-0.33 (SLES 10),
+ 2.6.18-92.1.22.el5 (RHEL 5),
2.6.22.14 vanilla (kernel.org).
* Client support for unpatched kernels:
- (see http://wiki.lustre.org/index.php?title=Patchless_Client)
- 2.6.16 - 2.6.21 vanilla (kernel.org)
+ (see http://wiki.lustre.org/index.php?title=Patchless_Client)
+ 2.6.16 - 2.6.21 vanilla (kernel.org)
* Recommended e2fsprogs version: 1.40.11-sun1
* Note that reiserfs quotas are disabled on SLES 10 in this kernel.
* RHEL 4 and RHEL 5/SLES 10 clients behaves differently on 'cd' to a
- removed cwd "./" (refer to Bugzilla 14399).
+ removed cwd "./" (refer to Bugzilla 14399).
* File join has been disabled in this release, refer to Bugzilla 16929.
+Severity : normal
+Frequency : normal
+Bugzilla : 12069
+Description: OST grants too much space to clients even when space is short.
+Details : The client will shrink its grant cache back to the OST if there is
+ no write activity for 6 mins (GRANT_SHRINK_INTERVAL), and the OST
+ will retrieve this grant cache if there is not enough available
+ space (left_space < total_clients * 32M).
+
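The two conditions in the entry for Bugzilla 12069 can be sketched as follows. This is a minimal illustration, not the Lustre implementation: the function names are hypothetical, and it assumes that "32M" means 32 MiB counted in bytes and that "over 6 mins" means strictly more than 360 seconds.

```python
# Hypothetical sketch of the grant-shrink conditions; names are illustrative.
GRANT_SHRINK_INTERVAL_SECS = 6 * 60          # "6 mins" from the entry
PER_CLIENT_RESERVE = 32 * 1024 * 1024        # assumption: 32M == 32 MiB in bytes


def client_should_shrink(idle_secs):
    """Client side: shrink the grant cache after a long period without writes."""
    return idle_secs > GRANT_SHRINK_INTERVAL_SECS


def ost_should_reclaim(left_space, total_clients):
    """OST side: reclaim grants when left_space < total_clients * 32M."""
    return left_space < total_clients * PER_CLIENT_RESERVE
```

With four clients the threshold in this sketch is 128 MiB, so an OST with 100 MiB left would reclaim grants and one with 200 MiB left would not.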
+Severity : normal
+Frequency : starting MDS on an uncleanly shut down MDS device
+Bugzilla : 16839
+Description: ll_sync thread stays waiting for mds<>ost recovery to finish
+Details : waiting for mds<>ost recovery to finish produces random bugs
+ due to a race between two ll_sync threads for one lov target. Send
+ the ACTIVATE event only if connect has really finished and the
+ import is in the FULL state.
+
+Severity : normal
+Frequency : rare, connect and disconnect target at same time
+Bugzilla : 17310
+Description: ASSERTION(atomic_read(&imp->imp_inflight) == 0)
+Details : don't call obd_disconnect under lov_lock. This is a long-running
+ operation and can block ptlrpcd, which answers the connect request.
+
+Severity : normal
+Frequency : rare
+Bugzilla : 18154
+Description: don't lose wakeup for imp_recovery_waitq
+Details : recover_import_no_retry or invalidate_import and import_close can
+ both sleep on imp_recovery_waitq, but we were sending only one
+ wakeup to the sleep queue.
+
+Severity : normal
+Frequency : always with long access acl
+Bugzilla : 17636
+Description: MDS can't pack reply with long ACL.
+Details : the MDS doesn't control the size of ACLs, but they are limited by
+ the reint/getattr reply buffer.
+
+Severity : enhancement
+Bugzilla : 18061
+Description: Update to SLES10 kernel-2.6.16.60-0.33.
+
+Severity : enhancement
+Bugzilla : 18060
+Description: Update to RHEL5 kernel-2.6.18-92.1.22.el5.
+
+Severity : normal
+Frequency : starting MDS on an uncleanly shut down MDS device
+Bugzilla : 18049
+Description: aborting recovery hangs on MDS
+Details : don't throttle destroy RPCs for the MDT.
+
+Severity : major
+Frequency : on remount
+Bugzilla : 18018
+Description: external journal device not working after the remount
+Details : clear dev_rdonly flag for external journal devices in
+ blkdev_put()
+
+Severity : minor
+Frequency : rare
+Bugzilla : 17802
+Description: shutdown vs evict race
+Details : client_disconnect_export vs connect request race.
+ If the client is evicted at this time, we start the invalidate
+ thread without a reference to the import, and the import can be
+ freed at the same time.
+
+Severity : normal
+Frequency : rare, need acl's on inode.
+Bugzilla : 16492
+Description: client can't handle OST addition correctly
+Details : if an OST was added after the client connected to the MDS, the
+ client can hit lnet_try_match_md ... too-big messages for widely
+ striped files. In this case the client needs to handle config
+ events about adding a lov target and update the client max EA
+ size at that event.
+Severity : enhancement
+Bugzilla : 15699
+Description: Changelogs
+Details : Changelogs are a lightweight mechanism to track filesystem
+ metadata and namespace changes. The changelog is recorded
+ permanently on the MDTs, and is periodically "consumed" / purged
+ when records are no longer needed.
+
+Severity : enhancement
+Bugzilla : 15957
+Description: compact fld format with extents
+Details : Store ranges of seq rather than every seq in FLD. The seq
+ controller updates FLD rather than clients. In the case of CMD,
+ mdt0 has the FLD; all other metadata servers act as non-persistent
+ proxies for FLD queries and cache fld entries in the fld cache.
+
+Severity : normal
+Frequency : rare
+Bugzilla : 16081
+Description: don't skip OST targets if they are assigned to a file
+Details : Drop slow OSCs if we can, but not for the requested start idx.
+ This means "if an OSC is slow and it is not the requested
+ start OST, then it can be skipped; otherwise skip it only
+ if it is inactive/recovering/out-of-space".
+
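The skip rule quoted in the entry for Bugzilla 16081 reduces to a small predicate. The sketch below is an illustration only: the function and parameter names are hypothetical, and "usable" is shorthand here for "not inactive, not recovering, and not out of space".

```python
def may_skip_osc(is_slow, is_requested_start, is_usable):
    """Return True if the allocator may skip this OSC.

    Hypothetical sketch of the rule in the changelog entry:
    - an unusable OSC (inactive/recovering/out-of-space) is always skipped;
    - a slow OSC is skipped unless it is the requested start OST.
    """
    if not is_usable:
        return True
    return is_slow and not is_requested_start
```

A slow OSC that is the requested start index is therefore kept, which is the behaviour the entry describes.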
+Severity : normal
+Bugzilla : 16080
+Description: more cleanup in mds_lov
+Details : don't send LOV EA under replay; we can't know its size at this
+ time. Don't allow clients to connect to the MDS before any OST is
+ connected, to avoid problems with LOV EA size and returning EIO to
+ the client.
+
+Severity : enhancement
+Bugzilla : 11826
+Description: Interoperability at server side (Disk interoperability)
+
+Severity : enhancement
+Bugzilla : 17201
+Description: Update to RHEL5 kernel-2.6.18-92.1.17.el5.
+
+Severity : enhancement
+Bugzilla : 17458
+Description: Update to SLES10 SP2 kernel-2.6.16.60-0.31.
+
+Severity : enhancement
+Bugzilla : 14166
+Description: New client IO stack (CLIO).
+
+Severity : enhancement
+Bugzilla : 15393
+Description: Commit on sharing. Eliminate inter-client dependencies between
+ uncommitted transactions by doing transaction commits.
+ Thereby clients may recover independently.
+
+Severity : normal
+Frequency : Create a symlink file with a very long name
+Bugzilla : 16578
+Description: ldlm_cancel_pack()) ASSERTION(max >= dlm->lock_count + count)
+Details : If there is no extra space in the request for early cancels,
+ ldlm_req_handles_avail() returns 0 instead of a negative value.
+
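The behaviour described for ldlm_req_handles_avail() in Bugzilla 16578 amounts to clamping at zero. The sketch below only illustrates that idea; the function name, its parameters, and the way free space is computed are assumptions and do not reflect the real C signature.

```python
def handles_avail(free_bytes, handle_size):
    """Number of extra lock handles that fit in the request buffer.

    Hypothetical sketch: free_bytes may be negative when the request has no
    room left, and the result is clamped so it is never below zero.
    """
    return max(0, free_bytes // handle_size)
```

Clamping keeps a later comparison such as `max >= lock_count + count` from being fed a negative capacity.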
+Severity : enhancement
+Bugzilla : 1819
+Description: Add /proc entry for import status
+Details : The mdc, osc, and mgc import directories now have
+ an import directory that contains useful import data for debugging
+ connection problems.
+
Severity : enhancement
Bugzilla : 15966
Description: Re-disable certain /proc logging
Details : Enable and disable client's offset_stats, extents_stats and
- extents_stats_per_process stats logging on the fly.
+ extents_stats_per_process stats logging on the fly.
Severity : major
Frequency : Only on FC kernels 2.6.22+
Bugzilla : 16643
Description: Generic /proc file permissions
Details : Set /Proc file permissions in a more generic way to enable non-
- root users operate on some /proc files.
+ root users operate on some /proc files.
Severity : major
Bugzilla : 16561
Description: Hitting mdc_commit_close() ASSERTION
Details : Properly handle request reference release in
- ll_release_openhandle().
+ ll_release_openhandle().
+
+Severity : major
+Bugzilla : 14840
+Description: quota recovery deadlock during mds failover
+Details : This patch includes att18982, att18236, att18237 in bz14840.
+ Solve the problems:
+ 1. fix osts hang when mds does failover with quotaon
+ 2. prevent watchdog storm when osts threads wait for the
+ recovery of mds
Severity : normal
Bugzilla : 15975
Severity : minor
Bugzilla : 16717
Description: LBUG when llog conf file is full
-Details : When llog bitmap is full, ENOSPC should be returned for plain
- log.
+Details : When llog bitmap is full, ENOSPC should be returned for plain log.
Severity : normal
Bugzilla : 16907
Frequency : on recovery
Description: I/O failures after umount during fail back
Details : if client reconnected to restarted server we need join to recovery
- instead of find server handler is changed and process self eviction
- with cancel all locks.
+ instead of find server handler is changed and process self
+ eviction with cancel all locks.
Severity : enhancement
Bugzilla : 16633
into CONN_USED_HASH and this prodice warning when put connection
again in unused hash.
+
Severity : enhancement
Bugzilla : 15899
Description: File striping can now be set to use an arbitrary pool of OSTs.
the connect flags are properly negotiated.
Severity : normal
+Frequency : often
+Bugzilla : 16125
+Description: quotas are not honored with O_DIRECT
+Details : all writes with the flag O_DIRECT will use grants which leads to
+ this problem. Now using OBD_BRW_SYNC to guard this.
+
+Severity : normal
+Bugzilla : 15058
+Description: add quota statistics
+Details : 1. sort out quota proc entries and proc code.
+ 2. add quota statistics
+
+Severity : enhancement
+Bugzilla : 13058
+Description: enable quota support for HEAD.
+
+Severity : normal
Bugzilla : 16006
Description: Properly propagate oinfo flags from lov to osc for statfs
Details : restore missing copy oi_flags to lov requests.
Severity : normal
Bugzilla : 15210
-Description: add recount protection for osc callbacks, so avoid panic on shutdown
+Description: add refcount protection for osc callbacks, avoid panic on shutdown
Severity : normal
Bugzilla : 12653
Severity : minor
Bugzilla : 15837
Description: oops in page fault handler
-Details : kernel page fault handler can return two special 'pages' in error case, don't
- try dereference NOPAGE_SIGBUS and NOPAGE_OMM.
+Details : kernel page fault handler can return two special 'pages' in error
+ case, don't try to dereference NOPAGE_SIGBUS and NOPAGE_OOM.
Severity : minor
Bugzilla : 15716
Description: timeout with invalidate import.
-Details : ptlrpcd_check call obd_zombie_impexp_cull and wait request which should be
- handled by ptlrpcd. This produce long age waiting and -ETIMEOUT
- ptlrpc_invalidate_import and as result LASSERT.
+Details : ptlrpcd_check calls obd_zombie_impexp_cull and waits for a request
+ which should be handled by ptlrpcd. This produces a long wait and
+ -ETIMEOUT in ptlrpc_invalidate_import and, as a result, an LASSERT.
Severity : enhancement
Bugzilla : 15741
Severity : major
Bugzilla : 14326
Description: Use old size assignment to avoid deadlock
-Details : This reverts the changes in bugs 2369 and bug 14138 that introduced
+Details : Reverts the changes in bugs 2369 and bug 14138 that introduced
the scheduling while holding a spinlock. We do not need locking
for size in ll_update_inode() because size is only updated from
the MDS for directories or files without objects, so there is no
Frequency : rare, on recovery
Description: read procfs can produce deadlock in some situation
Details : Holding lprocfs lock which send rpc can produce block for destroy
- obd objects and this also block reconnect with -EALREADY. This isn't
- fix all lprocfs bugs - but make it rare.
+ obd objects and this also blocks reconnect with -EALREADY. This
+ doesn't fix all lprocfs bugs, but makes them rare.
Severity : enhancement
Bugzilla : 15152
Frequency : occasional
Bugzilla : 13537
Description: Correctly check stale fid, not start epoch if ost not support SOM
-Details : open with flag O_CREATE need set old fid in op_fid3 because op_fid2
- overwrited with new generated fid, but mds can anwer with one of these
- two fids and both is not stale. setattr incorectly start epoch and
- assume will be called done_writeting, but without SOM done_writing
- never called.
+Details : open with flag O_CREAT needs to set the old fid in op_fid3 because
+ op_fid2 was overwritten with the newly generated fid, but the MDS
+ can answer with either of these two fids and neither is stale.
+ Setattr incorrectly started an epoch and assumed done_writing
+ would be called, but without SOM done_writing is never called.
Severity : major
Frequency : rare, depends on device drivers and load
Frequency : rare
Bugzilla : 13196
Description: Don't allow skipping OSTs if index has been specified.
-Details : Don't allow skipping OSTs if index has been specified, make locking
- in internal create lots better.
+Details : Don't allow skipping OSTs if index has been specified, make
+ locking in internal create lots better.
Severity : normal
Bugzilla : 12228
Severity : enhancement
Bugzilla : 10786
Description: omit set fsid for export NFS
-Details : fix set/restore device id for avoid EMFILE error and mark lustre fs
- as FS_REQUIRES_DEV for avoid problems with generate fsid.
+Details : fix set/restore of device id to avoid EMFILE error and mark lustre
+ fs as FS_REQUIRES_DEV to avoid problems with generating fsid.
Severity : normal
Bugzilla : 13304
Severity : normal
Bugzilla : 15950
Description: Hung threads in invalidate_inode_pages2_range
-Details : The direct IO path doesn't call check_rpcs to submit a new RPC once
- one is completed. As a result, some RPCs are stuck in the queue
- and are never sent.
+Details : The direct IO path doesn't call check_rpcs to submit a new RPC
+ once one is completed. As a result, some RPCs are stuck in the
+ queue and are never sent.
Severity : normal
Bugzilla : 14629
Severity : normal
Bugzilla : 16199
Description: don't always update ctime in ext3_xattr_set_handle()
-Details : Current xattr code updates the inode ctime in ext3_xattr_set_handle.
+Details : Current xattr code updates inode ctime in ext3_xattr_set_handle.
In some cases the ctime should not be updated, for example for
2.0->1.8 compatibility it is necessary to delete an xattr and it
should not update the ctime.
Severity : enhancement
Bugzilla : 14095
Description: Add lustre_start utility to start or stop multiple Lustre servers
- from a CSV file.
+ from a CSV file.
Severity : major
Bugzilla : 17024
Severity : normal
Bugzilla : 17026
-Description: (ptllnd_peer.c:557:kptllnd_peer_check_sends()) ASSERTION(!in_interrupt()) failed
-Details : fix stack overflow in the distributed lock manager by defering export
- eviction after a failed ast to the elt thread instead of handling
- it in the dlm interpret routine.
+Description: kptllnd_peer_check_sends()) ASSERTION(!in_interrupt()) failed
+Details : fix stack overflow in the distributed lock manager by deferring
+ export eviction after a failed AST to the elt thread instead of
+ handling it in the dlm interpret routine.
Severity : normal
Bugzilla : 16450
Severity : normal
Bugzilla : 16450
Description: Add lockdep support to dt_object_operations locking interface.
-Details : Augment ->do_{read,write}_lock() prototypes with a `role' parameter
- indicating lock ordering. Update mdd code to use new locking
- interface.
+Details : Augment ->do_{read,write}_lock() prototypes with a `role'
+ parameter indicating lock ordering. Update mdd code to use new
+ locking interface.
Severity : normal
Bugzilla : 16450
Severity : normal
Bugzilla : 16450
Description: Add lu_ref support to ldlm_lock
-Details : lu_ref support for ldlm_lock and ldlm_resource. See lu_ref patch.
- lu_ref fields ->l_reference and ->lr_reference are added to ldlm_lock
- and ldlm_resource. LDLM interface has to be changed, because code that
+Details : lu_ref support for ldlm_lock and ldlm_resource. See lu_ref patch.
+ lu_ref fields ->l_reference and ->lr_reference are added to ldlm_lock
+ and ldlm_resource. LDLM interface has to be changed, because code that
releases a reference on a lock, has to "know" what reference this is.
In the most frequent case
...
LDLM_LOCK_PUT(lock);
- no changes are required. When any other reference (received _not_ from
- ldlm_handle2lock()) is released, LDLM_LOCK_RELEASE() has to be called
+ no changes are required. When any other reference (received _not_ from
+ ldlm_handle2lock()) is released, LDLM_LOCK_RELEASE() has to be called
instead of LDLM_LOCK_PUT().
Arguably, changes are pervasive, and interface requires some discipline
- for proper use. On the other hand, it was very instrumental in finding
+ for proper use. On the other hand, it was very instrumental in finding
a few leaked lock references.
Severity : normal
Severity : normal
Bugzilla : 16450
Description: Add ldlm_weigh_callback().
-Details : Add new ->l_weigh_ast() call-back to ldlm_lock. It is called
+Details : Add new ->l_weigh_ast() call-back to ldlm_lock. It is called
by ldlm_cancel_shrink_policy() to estimate lock "value", instead of
hard-coded `number of pages' logic.
Severity : normal
Bugzilla : 16450
Description: Add start and stop methods to lu_device_type_operations.
-Details : Introduce two new methods in lu_device_type_operations, that are
- invoked when first instance of a given type is created and last one
+Details : Introduce two new methods in lu_device_type_operations, that are
+ invoked when first instance of a given type is created and last one
is destroyed respectively. This is need by CLIO.
Severity : normal
Bugzilla : 16450
Description: Introduce struct md_site and move meta-data specific parts of
struct lu_site here.
-Details : Move md-specific fields out of struct lu_site into special struct
+Details : Move md-specific fields out of struct lu_site into special struct
md_site, so that lu_site can be used on a client.
Severity : minor
Severity : normal
Bugzilla : 16450
Description: Add special type for ptlrpc_request interpret functions.
-Details : Add lu_env parameter to ->rq_interpreter call-back. NULL is passed
- there. Actual usage will be in CLIO.
+Details : Add lu_env parameter to ->rq_interpreter call-back. NULL is passed
+ there. Actual usage will be in CLIO.
Severity : normal
Bugzilla : 16450
Description: Use cdebug_show() in CDEBUG-style macros defined outside of libcfs.
Details : Use cdebug_show() in CDEBUG-style macros defined outside of libcfs.
+Severity : normal
+Bugzilla : 16450
+Description: Liblustre build fixes.
+Details : Liblustre build fixes.
+
+Severity : normal
+Bugzilla : 16450
+Description: libcfs: add cfs_{need,cond}_resched() interface.
+Details : libcfs: add cfs_{need,cond}_resched() definition and
+ implementations for Linux, NT, and liblustre.
+
+Severity : enhancement
+Bugzilla : 12800
+Description: More exported tunables for mballoc
+Details : Add support for tunable preallocation window and new tunables for
+ large/small requests
+
+Severity : normal
+Bugzilla : 16680
+Description: Detect corruption of block bitmap and checking for preallocations
+Details : Checks validity of on-disk block bitmap. Also it does better
+ checking of number of applied preallocations. When corruption is
+ found, it turns filesystem readonly to prevent further corruptions.
+
+Severity : normal
+Bugzilla : 17197
+Description: (rw.c:1323:ll_read_ahead_pages()) ASSERTION(page_idx > ria->ria_stoff) failed
+Details : Once the unmatched stride IO mode is detected, shrink the stride-ahead
+ window to 0. If it does hit a cache miss and the read pattern is
+ still stride-io mode, do not reset the stride window, but also do
+ not increase the stride window length in this case.
+
+Severity : normal
+Bugzilla : 16438
+Frequency : only for big-endian servers
+Description: Check if system is big-endian while mounting fs with extents feature
+Details : Mounting a filesystem with extents feature will fail on big-endian
+ systems since ext3-based ldiskfs is not supported on big-endian
+ systems. This can be overridden with the "bigendian_extents" mount option.
+
+Severity : enhancement
+Bugzilla : 12749
+Description: The root squash functionality
+Details : A security feature that prevents users from being able
+ to mount lustre on their desktop, run as root, and delete
+ all of the files in the filesystem. The goal is accomplished by
+ remapping the user id (UID) and group id (GID) of the root user to
+ a UID and GID specified by the system administrator via the Lustre
+ configuration management server (MGS). The functionality also
+ allows specifying sets of clients for which the remapping does
+ not apply.
+
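The remapping described for Bugzilla 12749 can be pictured with a short sketch. It is not the Lustre code: the function name, the parameters, and the use of a plain set of client identifiers for the exempt clients are assumptions made for illustration, as is treating "root user" as UID 0.

```python
def squash_ids(uid, gid, client_nid, squash_uid, squash_gid, exempt_nids):
    """Return the (uid, gid) the server would use for a request.

    Hypothetical sketch of root squash:
    - clients listed in exempt_nids are never remapped;
    - otherwise a request from root (uid 0) gets the configured squash ids;
    - non-root requests pass through unchanged.
    """
    if client_nid in exempt_nids:
        return uid, gid
    if uid == 0:
        return squash_uid, squash_gid
    return uid, gid
```

For example, with squash ids 99:99 and one exempt admin node, root on any other client is treated as 99:99 while root on the admin node stays 0:0.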
+Severity : normal
+Bugzilla : 16860
+Description: Excessive recovery window
+Details : With AT enabled, the recovery window can be excessively long (6000+
+ seconds). To address this problem, we no longer use
+ OBD_RECOVERY_FACTOR when extending the recovery window (the connect
+ timeout no longer depends on the service time, it is set to
+ INITIAL_CONNECT_TIMEOUT now) and clients report the old service
+ time via pb_service_time.
+
+Severity : normal
+Bugzilla : 16522
+Description: Watchdog triggered on MDS failover
+Details : enable OBD_CONNECT_MDT flag when connecting from the MDS so that
+ the OSTs know that the MDS "UUID" can be reused for the same export
+ from a different NID, so we do not need to wait for the export to be
+ evicted
+
+Severity : major
+Frequency : rare, only if using MMP with Linux RAID
+Bugzilla : 17895
+Description: MMP doesn't work with Linux RAID
+Details : While using HA for Lustre servers with Linux RAID, it is possible
+ that MMP will not detect multiple mounts. To make this work we
+ need to unplug the device queue in RAID when the MMP block is being
+ written. Also while reading the MMP block, we should read it from
+ disk and not the cached one.
+
+Severity : enhancement
+Bugzilla : 17187
+Description: open file using fid
+Details : A file can be opened using just its fid, like
+ <mntpt>/.lustre/fid/SEQ:OID:VER - this is needed for HSM and replication
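The path layout named in the entry for Bugzilla 17187 can be assembled as below. This is only a sketch: the helper name is hypothetical, and the three fid components are passed as ready-made strings because the entry does not say how SEQ, OID and VER are encoded.

```python
import posixpath


def fid_path(mntpt, seq, oid, ver):
    """Build <mntpt>/.lustre/fid/SEQ:OID:VER from pre-formatted components.

    Hypothetical helper; the textual encoding of seq, oid and ver is left to
    the caller because the changelog entry does not specify it.
    """
    return posixpath.join(mntpt, ".lustre", "fid", f"{seq}:{oid}:{ver}")
```

The resulting string could then be passed to an ordinary open call on a client where the filesystem is mounted at `mntpt`.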
--------------------------------------------------------------------------------
Frequency : during server recovery
Bugzilla : 11203
Description: MDS failing to send precreate requests due to OSCC_FLAG_RECOVERING
-Details : request with rq_no_resend flag not awake l_wait_event if they get a
- timeout.
+Details : requests with the rq_no_resend flag do not wake l_wait_event if
+ they get a timeout.
Severity : minor
Frequency : nfs export on patchless client