tbd Sun Microsystems, Inc.
- * version 1.8.0
+ * version 2.0.0
* Support for kernels:
- 2.6.16.54-0.2.5 (SLES 10),
- 2.6.18-53.1.14.el5 (RHEL 5),
+ 2.6.16.60-0.23 (SLES 10),
+ 2.6.18-92.1.6.el5 (RHEL 5),
2.6.22.14 vanilla (kernel.org).
* Client support for unpatched kernels:
(see http://wiki.lustre.org/index.php?title=Patchless_Client)
2.6.16 - 2.6.21 vanilla (kernel.org)
- * Recommended e2fsprogs version: 1.40.7-sun1
+ * Recommended e2fsprogs version: 1.40.11-sun1
* Note that reiserfs quotas are disabled on SLES 10 in this kernel.
* RHEL 4 and RHEL 5/SLES 10 clients behaves differently on 'cd' to a
removed cwd "./" (refer to Bugzilla 14399).
Severity : enhancement
+Bugzilla : 16091
+Description: configure's --enable-quota should check the
+ : kernel .config for CONFIG_QUOTA
+Details : configure is terminated if --enable-quota is passed but
+ : no quota support is in kernel
+
+Severity : normal
+Bugzilla : 13139
+Description: Remove portals compatibility
+Details : Remove portals compatibility, not interoperable with releases
+ before 1.4.6
+
+Severity : normal
+Bugzilla : 15576
+Description: Resolve device initialization race
+Details : Prevent proc handler from accessing devices added to the
+ obd_devs array but not yet initialized.
+
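The race described for bug 15576 can be illustrated with a minimal sketch. It assumes a device table whose slots become visible before setup completes; the names `demo_dev`, `demo_devs`, `set_up` and `count_usable_devs` are hypothetical and are not Lustre's actual structures. Real kernel code would also need a lock or memory barriers around these accesses.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_DEVS 8            /* hypothetical table size */

struct demo_dev {
    const char *name;
    bool set_up;                   /* true only once initialization is complete */
};

/* Hypothetical device table: a slot may be filled before the device is ready. */
struct demo_dev *demo_devs[DEMO_MAX_DEVS];

/* A reader (for example a proc handler) must skip empty slots and
 * devices that are registered but not yet initialized. */
size_t count_usable_devs(void)
{
    size_t usable = 0;

    for (size_t i = 0; i < DEMO_MAX_DEVS; i++) {
        const struct demo_dev *dev = demo_devs[i];

        if (dev == NULL || !dev->set_up)
            continue;              /* never touch a half-built device */
        usable++;
    }
    return usable;
}
```

The point of the sketch is only the guard in the loop: visibility in the table is not treated as proof that the device is ready.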
+Severity : enhancement
+Bugzilla : 15308
+Description: Update to SLES10 SP2 kernel-2.6.16.60-0.23.
+
+Severity : enhancement
+Bugzilla : 16190
+Description: Update to RHEL5 kernel-2.6.18-92.1.6.el5.
+
+Severity : normal
+Bugzilla : 12975
+Frequency : rare
+Description: Using wrong pointer in osc_brw_prep_request
+Details : Access to array[-1] can produce panic if kernel compiled with
+ CONFIG_PAGE_ALLOC enabled
+
+Severity : normal
+Bugzilla : 16037
+Description: Client runs out of low memory
+Details : Consider only lowmem when counting initial number of llap pages
+
+Severity : normal
+Bugzilla : 15625
+Description: *optional* service tags registration
+Details : If the "service tags" package is installed on a Lustre node,
+ a local-node service tag will be created when the filesystem is
+ mounted. See http://inventory.sun.com/ for more information
+ about the Service Tags asset management system.
+
+Severity : normal
+Bugzilla : 15825
+Description: Kernel BUG tries to release flock
+Details : Lustre does not destroy the flock lock before the last reference
+ goes away, so always drop flock locks when the client is evicted
+ and perform the unlock regardless of whether the MDS was reached.
+
+Severity : normal
+Bugzilla : 15210
+Description: add refcount protection for osc callbacks to avoid panic on shutdown
+
+Severity : normal
+Bugzilla : 12653
+Description: sanity test 65a fails if stripecount of -1 is set
+Details : handle -1 striping on filesystem in ll_dirstripe_verify
+
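As an illustration of the -1 striping case in bug 12653, the sketch below resolves a requested stripe count against a filesystem default that may itself be -1. It assumes the usual convention that -1 means "all OSTs" and 0 means "use the default"; the function name `resolve_stripe_count` and the fallback to a single stripe are hypothetical and are not taken from ll_dirstripe_verify.

```c
#include <stdint.h>

/* Hypothetical helper: turn a requested stripe count into an effective one.
 *   requested == 0  -> use the filesystem default
 *   value    == -1  -> stripe over all OSTs
 * The result is capped at the number of available OSTs. */
uint32_t resolve_stripe_count(int32_t requested, int32_t fs_default,
                              uint32_t ost_count)
{
    int32_t n = (requested == 0) ? fs_default : requested;

    if (n == -1)
        return ost_count;          /* also covers a filesystem default of -1 */
    if (n <= 0)
        n = 1;                     /* assumed fallback for this sketch */
    return (uint32_t)n > ost_count ? ost_count : (uint32_t)n;
}
```

A checker that compares a directory's stripe count with the filesystem default needs to compare the resolved values, since a raw -1 never equals the number of OSTs.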
+Severity : normal
+Bugzilla : 14742
+Frequency : rare
+Description: ASSERTION(CheckWriteback(page,cmd)) failed
+Details : incorrectly clearing the PG_Writeback bit in ll_ap_completion can
+ produce a false positive assertion.
+
+Severity : enhancement
+Bugzilla : 15865
+Description: Update to RHEL5 kernel-2.6.18-53.1.21.el5.
+
+Severity : major
+Bugzilla : 15924
+Description: do not process already freed flock
+Details : flock can be freed by another thread before it reaches
+ ldlm_flock_completion_ast.
+
+Severity : normal
+Bugzilla : 14480
+Description: LBUG during stress test
+Details : Need to properly lock accesses to the flock deadlock detection list.
+
+Severity : minor
+Bugzilla : 15837
+Description: oops in page fault handler
+Details : kernel page fault handler can return two special 'pages' in error case, don't
+ try to dereference NOPAGE_SIGBUS and NOPAGE_OOM.
+
+Severity : minor
+Bugzilla : 15716
+Description: timeout with invalidate import.
+Details : ptlrpcd_check calls obd_zombie_impexp_cull and waits for a request that
+ should be handled by ptlrpcd itself. This produces a long wait and
+ -ETIMEOUT in ptlrpc_invalidate_import and, as a result, an LASSERT.
+
+Severity : enhancement
+Bugzilla : 15741
+Description: Update to RHEL5 kernel-2.6.18-53.1.19.el5.
+
+Severity : major
+Bugzilla : 14134
+Description: enable MGS and MDT services start separately
+Details : add a 'nomgs' option in mount.lustre to allow starting an MDT with
+ a co-located MGS without starting the MGS, which complements
+ the 'nosvc' mount option.
+
+Severity : normal
+Bugzilla : 14835
+Frequency : after recovery
+Description: precreate too many objects after del orphan.
+Details : del orphan sets last_id == next_id in oscc, which triggers a growing
+ count of precreated objects. Set the LOW flag to skip increasing
+ the count of precreated objects.
+
+Severity : normal
+Bugzilla : 15139
+Frequency : rare, when clearing nid stats
+Description: ASSERTION(client_stat->nid_exp_ref_count == 0)
+Details : when clearing nid stats, a live entry is sometimes destroyed,
+ which produces a panic in free.
+
+Severity : major
+Bugzilla : 15575
+Description: Stack overflow during MDS log replay
+ ease stack pressure by using a thread dealing llog_process.
+
+Severity : normal
+Bugzilla : 15443
+Description: wait until IO has finished before starting new IO during lock cancel.
+Details : The VM protocol wants old IO finished before new IO starts, so
+ wait until PG_writeback is cleared before checking the dirty flag
+ and calling writepages in the lock cancel callback.
+
+Severity : enhancement
Bugzilla : 14929
Description: using special macro for print time and cleanup in includes.
Severity : normal
Bugzilla : 12888
-Description: mds_mfd_close() ASSERTION(rc == 0)
-Details : In mds_mfd_close(), we need protect inode's writecount change
- within its orphan write semaphore to prevent possible races.
+Description: mds_mfd_close() ASSERTION(rc == 0)
+Details : In mds_mfd_close(), we need protect inode's writecount change
+ within its orphan write semaphore to prevent possible races.
Severity : minor
Bugzilla : 14929
Bugzilla : 14949
Description: don't panic with use echo client
Details : echo client pass NULL as client nid pointer and this produce null
- pointer dereference.
+ pointer dereference.
Severity : normal
Bugzilla : 15278
Severity : normal
Bugzilla : 13380
-Description: fix for occasional failure case of -ENOSPC in recovery-small tests
-Details : Move the 'good_osts' check before the 'total_bavail' check. This
- will result in an -EAGAIN and in the exit call path we call
- alloc_rr() which will with increasing aggressiveness attempt to
+Description: fix for occasional failure case of -ENOSPC in recovery-small tests
+Details : Move the 'good_osts' check before the 'total_bavail' check. This
+ will result in an -EAGAIN and in the exit call path we call
+ alloc_rr() which will with increasing aggressiveness attempt to
acquire precreated objects on the minimum number of required OSCs.
Severity : major
Bugzilla : 14326
Description: Use old size assignment to avoid deadlock
Details : This reverts the changes in bugs 2369 and bug 14138 that introduced
- the scheduling while holding a spinlock. We do not need locking
- for size in ll_update_inode() because size is only updated from
- the MDS for directories or files without objects, so there is no
- other place to do the update, and concurrent access to such inodes
+ the scheduling while holding a spinlock. We do not need locking
+ for size in ll_update_inode() because size is only updated from
+ the MDS for directories or files without objects, so there is no
+ other place to do the update, and concurrent access to such inodes
are protected by the inode lock.
Severity : normal
Bugzilla : 14803
Description: Don't update lov_desc members until making sure they are valid
Details : When updating lov_desc members via proc fs, need fix their
- validities before doing the real update.
+ validities before doing the real update.
Severity : normal
Bugzilla : 15069
of the error messages complaining that MGS is not connected.
Severity : major
+Bugzilla : 15027
+Frequency : on network error
+Description: panic with double free of request on network error
+Details : mdc_finish_enqueue finishes the request if any network error
+ occurs, but this is correct only for synchronous enqueue. For async
+ enqueue (via ptlrpcd) this is incorrect, because ptlrpcd wants to
+ finish the request itself.
+
+Severity : enhancement
+Bugzilla : 11401
+Description: client-side metadata stat-ahead during readdir (directory readahead)
+Details : perform client-side metadata stat-ahead when the client detects
+ readdir and sequential stat of dir entries therein
+
+Severity : major
Frequency : on start mds
Bugzilla : 14884
Description: Implement get_info(last_id) in obdfilter.
Description: Oops in read and write path when failing to allocate lock.
Details : Check if lock allocation failed and return error back.
-Severity : normal
+Severity : normal
Bugzilla : 11679
Description: lstripe command fails for valid OST index
Details : The stripe offset is compared to lov->desc.ld_tgt_count
Description: lfs find on -1 stripe looping in lsm_lmm_verify_common()
Details : Avoid lov_verify_lmm_common() on directory with -1 stripe count.
-Severity : major
-Bugzilla : 12932
-Description: obd_health_check_timeout too short
-Details : set obd_health_check_timeout as 1.5x of obd_timeout
+Severity : enhancement
+Bugzilla : 3055
+Description: Adaptive timeouts
+Details : RPC timeouts adapt to changing server load and network
+ conditions to reduce resend attempts and improve recovery time.
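The idea behind adaptive timeouts can be sketched generically: remember recent service times and derive the next RPC timeout from the worst one plus a margin. This is only an illustration under stated assumptions, not Lustre's actual algorithm; the history length, the 25% + 5 second margin, the clamp limits and all names (`at_estimator`, `at_record`, `at_timeout`) are hypothetical.

```c
#include <stddef.h>

#define AT_SAMPLES 4               /* hypothetical history length */

struct at_estimator {
    unsigned int samples[AT_SAMPLES];  /* recent service times, in seconds */
    size_t next;                       /* ring-buffer write position */
    unsigned int min_timeout;          /* lower clamp, seconds */
    unsigned int max_timeout;          /* upper clamp, seconds */
};

/* Record one observed service time, overwriting the oldest sample. */
void at_record(struct at_estimator *at, unsigned int service_time)
{
    at->samples[at->next] = service_time;
    at->next = (at->next + 1) % AT_SAMPLES;
}

/* Timeout = worst recent service time + 25% + 5 s (assumed margin),
 * clamped to [min_timeout, max_timeout]. */
unsigned int at_timeout(const struct at_estimator *at)
{
    unsigned int worst = 0;
    unsigned int timeout;

    for (size_t i = 0; i < AT_SAMPLES; i++)
        if (at->samples[i] > worst)
            worst = at->samples[i];

    timeout = worst + worst / 4 + 5;
    if (timeout < at->min_timeout)
        timeout = at->min_timeout;
    if (timeout > at->max_timeout)
        timeout = at->max_timeout;
    return timeout;
}
```

With a sliding window like this, the timeout grows quickly when a server slows down and shrinks again once the slow samples age out, which is the behaviour the entry describes as reducing resends.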
Severity : normal
Bugzilla : 12192
Bugzilla : 13570
Description: To avoid grant space > available space when the disk is almost
full. Without this patch you might see the error "grant XXXX >
- available" or some LBUG about grant, when the disk is almost
+ available" or some LBUG about grant, when the disk is almost
full.
Details : In filter_check_grant, for non_grant cache write, we should
check the left space by if (*left > ungranted + bytes), instead
- of (*left > ungranted), because only we are sure the left space
+ of (*left > ungranted), because only we are sure the left space
is enough for another "bytes", then the ungrant space should be
increase. In client, we should update cl_avail_grant only there
is OBD_MD_FLGRANT in the reply.
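The changed condition can be shown in isolation. The sketch below is not the real filter_check_grant; `can_take_ungranted` and `account_write` are hypothetical helpers that only mirror the comparison quoted in the entry, and overflow of `ungranted + bytes` is ignored for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check for a non-granted cache write of `bytes`.
 * The older form, left > ungranted, could pass even when fewer than
 * `bytes` remained beyond the space already counted as ungranted. */
bool can_take_ungranted(uint64_t left, uint64_t ungranted, uint64_t bytes)
{
    return left > ungranted + bytes;
}

/* Increase the ungranted total only when the write is known to fit.
 * Returns true if the write was accounted. */
bool account_write(uint64_t left, uint64_t *ungranted, uint64_t bytes)
{
    if (!can_take_ungranted(left, *ungranted, bytes))
        return false;
    *ungranted += bytes;
    return true;
}
```

For example, with 100 units left and 90 already ungranted, a 20-unit write is refused by the new check although the old comparison (100 > 90) would have let it through.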
Bugzilla : 11270
Description: eliminate client locks in face of contention
Details : file contention detection and lockless i/o implementation
- for contended files.
+ for contended files.
Severity : normal
Bugzilla : 15212
Description: Reinitialize optind to 0 so that interactive lfs works in all cases
+Severity : critical
+Frequency : very rare, if additional xattrs are used on kernels >= 2.6.12
+Bugzilla : 15777
+Description: MDS may lose file striping (and hence file data) in some cases
+Details : If there are additional extended attributes stored on the MDS,
+ in particular ACLs, SELinux, or user attributes (if user_xattr
+ is specified for the client mount options) then there is a risk
+ of attribute loss. Additionally, the Lustre file striping
+ needs to be larger than default (e.g. striped over all OSTs),
+ and an additional attribute must be stored initially in the
+ inode and then increase in size enough to be moved to the
+ external attribute block (e.g. ACL growing in size) for file
+ data to be lost.
+
+Severity : normal
+Bugzilla : 15346
+Description: skiplist implementation simplification
+Details : skiplists are used to group compatible locks on the granted list.
+ This was implemented by tracking the first and last lock of each
+ lock group; the patch changes that to use doubly linked lists.
+
+Severity : normal
+Bugzilla : 15574
+Description: MDS LBUG: ASSERTION(!IS_ERR(dchild))
+Details : Change LASSERTs to client eviction (i.e. abort client's recovery)
+ because LASSERT on both the data supplied by a client, and the data
+ on disk is dangerous and incorrect.
+
+Severity : enhancement
+Bugzilla : 10718
+Description: Slow truncate/writes to huge files at high offsets.
+Details : Directly associate cached pages with the lock that protects them;
+ this allows us to quickly find which pages to write and remove
+ once a lock callback is received.
+
+Severity : normal
+Bugzilla : 15953
+Description: more ldlm soft lockups
+Details : In ldlm_resource_add_lock(), the call to ldlm_resource_dump()
+ starves other threads of the resource lock for a long time when
+ the waiting queue is long, so change the debug level from
+ D_OTHER to the less frequently used D_INFO.
+
+Severity : enhancement
+Bugzilla : 13128
+Description: add -gid, -group, -uid, -user options to lfs find
+
+Severity : normal
+Bugzilla : 15950
+Description: Hung threads in invalidate_inode_pages2_range
+Details : The direct IO path doesn't call check_rpcs to submit a new RPC once
+ one is completed. As a result, some RPCs are stuck in the queue
+ and are never sent.
+
+Severity : normal
+Bugzilla : 14629
+Description: filter threads hang waiting on journal commit
+Details : Clean up the filter group llog code so that the filter group llog
+ is only created in the MDS/OST syncing process.
+
+Severity : normal
+Bugzilla : 15684
+Description: Procfs and llog threads sometimes access a destroyed import.
+Details : Synchronize import destruction with procfs and llog threads using
+ the import refcount and semaphore.
+
+Severity : enhancement
+Bugzilla : 14975
+Description: openlock cache of b1_6 port to HEAD
+
+Severity : major
+Frequency : rare
+Bugzilla : 16226
+Description: kernel BUG at ldiskfs2_ext_new_extent_cb
+Details : If insertion of an extent fails, then discard the inode
+ preallocation and free data blocks else it can lead to duplicate
+ blocks.
+
+Severity : normal
+Bugzilla : 16199
+Description: don't always update ctime in ext3_xattr_set_handle()
+Details : Current xattr code updates the inode ctime in ext3_xattr_set_handle.
+ In some cases the ctime should not be updated, for example for
+ 2.0->1.8 compatibility it is necessary to delete an xattr and it
+ should not update the ctime.
+
+Severity : major
+Frequency : rare
+Bugzilla : 15713/16362
+Description: Assertion in iopen_connect_dentry in 1.6.3
+Details : looking up an inode via iopen with the wrong generation number can
+ populate the dcache with a disconnected dentry while the inode
+ number is in the process of being reallocated. This causes an
+ assertion failure in iopen since the inode's dentry list contains
+ both a connected and disconnected dentry.
+
+Severity : normal
+Bugzilla : 16496
+Description: assertion failure in ldlm_handle2lock()
+Details : fix a race between class_handle_unhash() and class_handle2object()
+ introduced in lustre 1.6.5 by bug 13622.
+
+Severity : minor
+Frequency : rare
+Bugzilla : 12755
+Description: Kernel BUG: sd_iostats_bump: unexpected disk index
+Details : remove the limit of 256 scsi disks in the sd_iostat patch
+
+Severity : minor
+Frequency : rare
+Bugzilla : 16494
+Description: oops in sd_iostats_seq_show()
+Details : unloading/reloading the scsi low level driver triggers a kernel
+ bug when trying to access the sd iostat file.
+
+Severity : major
+Frequency : rare
+Bugzilla : 16404
+Description: Kernel panics during QLogic driver reload
+Details : REQ_BLOCK_PC requests are not handled properly in the sd iostat
+ patch, causing memory corruption.
+
+Severity : minor
+Frequency : rare
+Bugzilla : 16140
+Description: journal_dev option does not work in b1_6
+Details : pass mount option during pre-mount.
+
--------------------------------------------------------------------------------
2007-08-10 Cluster File Systems, Inc. <info@clusterfs.com>