X-Git-Url: https://git.whamcloud.com/?p=fs%2Flustre-release.git;a=blobdiff_plain;f=lustre%2FChangeLog;h=f6ed4e6a1137d70a1f052f26364d6326fbb69b9b;hp=56f7464cd78aebd0d5d115dd819c3c5da56f7dc8;hb=1377b92eb0ed2e6100b15e812109b783bfb7ff45;hpb=a1f3c20e14448b2056d306ac8b8bda127aedcfcd diff --git a/lustre/ChangeLog b/lustre/ChangeLog index 56f7464..f6ed4e6 100644 --- a/lustre/ChangeLog +++ b/lustre/ChangeLog @@ -1,22 +1,190 @@ tbd Sun Microsystems, Inc. * version 2.0.0 * Support for kernels: - 2.6.16.54-0.2.5 (SLES 10), - 2.6.18-53.1.21.el5 (RHEL 5), + 2.6.16.60-0.27 (SLES 10), + 2.6.18-92.1.10.el5 (RHEL 5), 2.6.22.14 vanilla (kernel.org). * Client support for unpatched kernels: (see http://wiki.lustre.org/index.php?title=Patchless_Client) 2.6.16 - 2.6.21 vanilla (kernel.org) - * Recommended e2fsprogs version: 1.40.7-sun3 + * Recommended e2fsprogs version: 1.40.11-sun1 * Note that reiserfs quotas are disabled on SLES 10 in this kernel. * RHEL 4 and RHEL 5/SLES 10 clients behaves differently on 'cd' to a removed cwd "./" (refer to Bugzilla 14399). + * File join has been disabled in this release, refer to Bugzilla 16929. + +Severity : enhancement +Bugzilla : 1819 +Description: Add /proc entry for import status +Details : The mdc, osc, and mgc import directories now have + an import directory that contains useful import data for debugging + connection problems. + +Severity : enhancement +Bugzilla : 15966 +Description: Re-disable certain /proc logging +Details : Enable and disable client's offset_stats, extents_stats and + extents_stats_per_process stats logging on the fly. + +Severity : major +Frequency : Only on FC kernels 2.6.22+ +Bugzilla : 16303 +Description: oops in statahead +Details : Do not drop reference count for the dentry from VFS when lookup, + VFS will do that by itself. 
+ +Severity : enhancement +Bugzilla : 16643 +Description: Generic /proc file permissions +Details : Set /Proc file permissions in a more generic way to enable non- + root users operate on some /proc files. + +Severity : major +Bugzilla : 16561 +Description: Hitting mdc_commit_close() ASSERTION +Details : Properly handle request reference release in + ll_release_openhandle(). + +Severity : normal +Bugzilla : 15975 +Frequency : only patchless client +Description: add workaround for race between add/remove dentry from hash + +Severity : enhancement +Bugzilla : 16845 +Description: Allow OST glimpses to return PW locks + +Severity : minor +Bugzilla : 16717 +Description: LBUG when llog conf file is full +Details : When llog bitmap is full, ENOSPC should be returned for plain + log. + +Severity : normal +Bugzilla : 16907 +Description: Prevent import from entering FULL state when server in recovery + +Severity : major +Bugzilla : 16750 +Description: service mount cannot take device name with ":" +Details : Only when device name contains ":/" will mount treat it as + client mount. + +Severity : normal +Bugzilla : 15927 +Frequency : rare +Description: replace ptlrpcd with the statahead thread to interpret the async + statahead RPC callback + +Severity : normal +Bugzilla : 16611 +Frequency : on recovery +Description: I/O failures after umount during fail back +Details : if client reconnected to restarted server we need join to recovery + instead of find server handler is changed and process self eviction + with cancel all locks. + +Severity : enhancement +Bugzilla : 16633 +Description: Update to RHEL5 kernel-2.6.18-92.1.10.el5. + +Severity : enhancement +Bugzilla : 16547 +Description: Update to SLES10 SP2 kernel-2.6.16.60-0.27. + +Severity : enhancement +Bugzilla : 16566 +Description: Upcall on Lustre log has been dumped +Details : Allow for a user mode script to be called once a Lustre log has + been dumped. 
It passes the filename of the dumped log to the + script; the location of the script can be specified via + /proc/sys/lnet/debug_log_upcall. + +Severity : minor +Bugzilla : 16583 +Frequency : rare +Description: avoid idr_remove called for id which is not allocated. +Details : Move the s_dev assignment for clustered nfs to end of initialization, + to avoid problems with error handling. + +Severity : minor +Bugzilla : 16109 +Frequency : rare +Description: avoid Already found the key in hash [CONN_UNUSED_HASH] messages +Details : When a connection is reused it is not moved from CONN_UNUSED_HASH + into CONN_USED_HASH, and this produces a warning when the connection + is put in the unused hash again. + +Severity : enhancement +Bugzilla : 15899 +Description: File striping can now be set to use an arbitrary pool of OSTs. + +Severity : enhancement +Bugzilla : 16573 +Description: Export bytes_read/bytes_write count on OSC/OST. + +Severity : normal +Bugzilla : 16237 +Description: Early reply size mismatch, MGC loses connection +Details : Apply the MGS_CONNECT_SUPPORTED mask at reconnect time so + the connect flags are properly negotiated. + +Severity : normal +Bugzilla : 16006 +Description: Properly propagate oinfo flags from lov to osc for statfs +Details : restore the missing copy of oi_flags to lov requests.
+ +Severity : enhancement +Bugzilla : 16581 +Description: Add man pages for llobdstat(8), llstat(8), plot-llstat(8), + : l_getgroups(8), lst(8), routerstat(8) +Details : included man pages for llobdstat(8), llstat(8), + : plot-llstat(8), l_getgroups(8), lst(8), routerstat(8) + +Severity : enhancement +Bugzilla : 16091 +Description: configure's --enable-quota should check the + : kernel .config for CONFIG_QUOTA +Details : configure is terminated if --enable-quota is passed but + : no quota support is in kernel + +Severity : normal +Bugzilla : 13139 +Description: Remove portals compatibility +Details : Remove portals compatibility; not interoperable with releases + before 1.4.6 + +Severity : normal +Bugzilla : 15576 +Description: Resolve device initialization race +Details : Prevent proc handler from accessing devices added to the + obd_devs array but not yet initialized. + +Severity : enhancement +Bugzilla : 15308 +Description: Update to SLES10 SP2 kernel-2.6.16.60-0.23. + +Severity : enhancement +Bugzilla : 16190 +Description: Update to RHEL5 kernel-2.6.18-92.1.6.el5. + +Severity : normal +Bugzilla : 12975 +Frequency : rare +Description: Using wrong pointer in osc_brw_prep_request +Details : Access to array[-1] can produce a panic if the kernel is compiled with + CONFIG_PAGE_ALLOC enabled + +Severity : normal +Bugzilla : 16037 +Description: Client runs out of low memory +Details : Consider only lowmem when counting initial number of llap pages Severity : normal Bugzilla : 15625 Description: *optional* service tags registration Details : if the "service tags" package is installed on a Lustre node - When the filesystem is mounted, a local-node service tag will + When the filesystem is mounted, a local-node service tag will be created. See http://inventory.sun.com/ for more information about the Service Tags asset management system.
@@ -24,7 +192,7 @@ Severity : normal Bugzilla : 15825 Description: Kernel BUG tries to release flock Details : Lustre does not destroy flock lock before last reference goes - away. So always drop flock locks when client is evicted and + away. So always drop flock locks when client is evicted and perform unlock regardless of successfulness of speaking to MDS. Severity : normal @@ -592,7 +760,7 @@ Frequency : rare Description: Oops in read and write path when failing to allocate lock. Details : Check if lock allocation failed and return error back. -Severity : normal +Severity : normal Bugzilla : 11679 Description: lstripe command fails for valid OST index Details : The stripe offset is compared to lov->desc.ld_tgt_count @@ -885,11 +1053,11 @@ Severity : normal Bugzilla : 13570 Description: To avoid grant space > avaible space when the disk is almost full. Without this patch you might see the error "grant XXXX > - available" or some LBUG about grant, when the disk is almost + available" or some LBUG about grant, when the disk is almost full. Details : In filter_check_grant, for non_grant cache write, we should check the left space by if (*left > ungranted + bytes), instead - of (*left > ungranted), because only we are sure the left space + of (*left > ungranted), because only we are sure the left space is enough for another "bytes", then the ungrant space should be increase. In client, we should update cl_avail_grant only there is OBD_MD_FLGRANT in the reply. @@ -938,7 +1106,7 @@ Details : Console messages can now be disabled via lnet.printk. Severity : normal Bugzilla : 14614 -Description: User code with malformed file open parameter crashes client node +Description: User code with malformed file open parameter crashes client node Details : Before packing join_file req, all the related reference should be checked carefully in case some malformed flags cause fake join_file req on client. 
@@ -1069,7 +1237,7 @@ Severity : normal Bugzilla : 14257 Description: LASSERT on MDS when client holding flock lock dies Details : ldlm pool logic depends on number of granted locks equal to - number of released locks which is not true for flock locks, so + number of released locks which is not true for flock locks, so just exclude such locks from consideration. Severity : normal @@ -1089,21 +1257,21 @@ Severity : enhancement Bugzilla : 11089 Description: organize the server-side client stats on per-nid basis Details : Change the structure of stats under obdfilter and mds to - New structure: - +- exports - +- nid#1 - | + stats - | + uuids - +- nid#2... - +- clear - The "uuid"s file would list the uuids of _active_ exports. - And the clear entry is to clear all stats and stale nids. + New structure: + +- exports + +- nid#1 + | + stats + | + uuids + +- nid#2... + +- clear + The "uuid"s file would list the uuids of _active_ exports. + And the clear entry is to clear all stats and stale nids. Severity : enhancement Bugzilla : 11270 Description: eliminate client locks in face of contention Details : file contention detection and lockless i/o implementation - for contended files. + for contended files. Severity : normal Bugzilla : 15212 @@ -1127,15 +1295,15 @@ Severity : normal Bugzilla : 15346 Description: skiplist implementation simplification Details : skiplists are used to group compatible locks on granted list - that was implemented as tracking first and last lock of each lock group - the patch changes that to using doubly linked lists + that was implemented as tracking first and last lock of each + lock group the patch changes that to using doubly linked lists Severity : normal Bugzilla : 15574 Description: MDS LBUG: ASSERTION(!IS_ERR(dchild)) Details : Change LASSERTs to client eviction (i.e. abort client's recovery) - because LASSERT on both the data supplied by a client, and the data - on disk is dangerous and incorrect. 
+ because LASSERT on both the data supplied by a client, and the + data on disk is dangerous and incorrect. Severity : enhancement Bugzilla : 10718 @@ -1164,14 +1332,14 @@ Details : The direct IO path doesn't call check_rpcs to submit a new RPC once and are never sent. Severity : normal -Bugzilla : 14629 -Description: filter threads hang on waiting journal commit +Bugzilla : 14629 +Description: filter threads hang on waiting journal commit Details : Cleanup filter group llog code, then only filter group llog will be only created in the MDS/OST syncing process. Severity : normal -Bugzilla : 15684 -Description: Procfs and llog threads access destroyed import sometimes. +Bugzilla : 15684 +Description: Procfs and llog threads access destroyed import sometimes. Details : Sync the import destroyed process with procfs and llog threads by the import refcount and semaphore. @@ -1179,6 +1347,390 @@ Severity : enhancement Bugzilla : 14975 Description: openlock cache of b1_6 port to HEAD +Severity : major +Frequency : rare +Bugzilla : 16226 +Description: kernel BUG at ldiskfs2_ext_new_extent_cb +Details : If insertion of an extent fails, then discard the inode + preallocation and free data blocks else it can lead to duplicate + blocks. + +Severity : normal +Bugzilla : 16199 +Description: don't always update ctime in ext3_xattr_set_handle() +Details : Current xattr code updates the inode ctime in ext3_xattr_set_handle. + In some cases the ctime should not be updated, for example for + 2.0->1.8 compatibility it is necessary to delete an xattr and it + should not update the ctime. + +Severity : major +Frequency : rare +Bugzilla : 15713/16362 +Description: Assertion in iopen_connect_dentry in 1.6.3 +Details : looking up an inode via iopen with the wrong generation number can + populate the dcache with a disconnected dentry while the inode + number is in the process of being reallocated.
This causes an + assertion failure in iopen since the inode's dentry list contains + both a connected and disconnected dentry. + +Severity : normal +Bugzilla : 16496 +Description: assertion failure in ldlm_handle2lock() +Details : fix a race between class_handle_unhash() and class_handle2object() + introduced in lustre 1.6.5 by bug 13622. + +Severity : minor +Frequency : rare +Bugzilla : 12755 +Description: Kernel BUG: sd_iostats_bump: unexpected disk index +Details : remove the limit of 256 scsi disks in the sd_iostat patch + +Severity : minor +Frequency : rare +Bugzilla : 16494 +Description: oops in sd_iostats_seq_show() +Details : unloading/reloading the scsi low level driver triggers a kernel + bug when trying to access the sd iostat file. + +Severity : major +Frequency : rare +Bugzilla : 16404 +Description: Kernel panics during QLogic driver reload +Details : REQ_BLOCK_PC requests are not handled properly in the sd iostat + patch, causing memory corruption. + +Severity : minor +Frequency : rare +Bugzilla : 16140 +Description: journal_dev option does not work in b1_6 +Details : pass mount option during pre-mount. + +Severity : enhancement +Bugzilla : 10555 +Description: Add a FIEMAP (FIle Extent MAP) ioctl +Details : FIEMAP ioctl will allow an application to efficiently fetch the + extent information of a file. It can be used to map logical blocks + in a file to physical blocks in the block device. + +Severity : normal +Bugzilla : 15198 +Description: LDLM soft lockups - improvement +Details : It is possible to send the lock handle along with each read + or write request because the client is already doing a lock match + itself so there isn't any reason the OST should have to re-do that + search. + +Severity : normal +Frequency : only on Cray X2 +Bugzilla : 16813 +Description: X2 build failures +Details : fix build failures on Cray X2.
+ +Severity : normal +Bugzilla : 2066 +Description: xid & resent requests +Details : Initialize RPC XID from clock at startup (randomly if clock is + bad). + +Severity : enhancement +Bugzilla : 14095 +Description: Add lustre_start utility to start or stop multiple Lustre servers + from a CSV file. + +Severity : major +Bugzilla : 17024 +Description: Lustre GPF in {:ptlrpc:ptlrpc_server_free_request+373} +Details : In case of memory pressure, list_del() can be called twice on + req->rq_history_list, causing a kernel oops. + +Severity : normal +Bugzilla : 17026 +Description: (ptllnd_peer.c:557:kptllnd_peer_check_sends()) ASSERTION(!in_interrupt()) failed +Details : fix stack overflow in the distributed lock manager by defering export + eviction after a failed ast to the elt thread instead of handling + it in the dlm interpret routine. + +Severity : normal +Bugzilla : 16450 +Description: Convert some comments to new format. +Details : Update documenting comments to match doxygen conventions. + +Severity : normal +Bugzilla : 16450 +Description: Grammar fixes. +Details : A couple of trivial spelling fixes. + +Severity : normal +Bugzilla : 16450 +Description: OSD_COUNTERS-mandatory +Details : Make previously optional ->oti_{w,r}_locks sanity checks mandatory + to simplify the code. + +Severity : normal +Bugzilla : 16450 +Description: simplify cmm_device freeing logic. +Details : Call cmm_device_free() in the failure path of cmm_device_alloc(). + +Severity : normal +Bugzilla : 16450 +Description: Add lockdep support to dt_object_operations locking interface. +Details : Augment ->do_{read,write}_lock() prototypes with a `role' parameter + indicating lock ordering. Update mdd code to use new locking + interface. + +Severity : normal +Bugzilla : 16450 +Description: Introduce failloc constants for lockless IO tests. +Details : Add two new failloc constants to test lockless IO. Only one of + them in implemented---another is checked in yet to be landed + core CLIO code. 
+ +Severity : normal +Bugzilla : 16450 +Description: Add lockdep support for inode mutex. +Details : Introduce and use new LOCK_INODE_MUTEX_PARENT() macro to be used + in the situations where Lustre has to lock more than one inode + mutex at a time. + +Severity : normal +Bugzilla : 16450 +Description: Add optional invariants checking support. +Details : Add new LINVRNT() macro, optional on new --enable-invariants + configure switch. This macro is to be used for consistency and + sanity checks that are too expensive to be left in `production' + mode. + +Severity : minor +Bugzilla : 16450 +Description: Zap lock->l_granted_mode with explicit LCK_MINMODE. +Details : Use LCK_MINMODE rather than 0 to reset lock->l_granted_mode to + its initial state. + +Severity : normal +Bugzilla : 16450 +Description: Add lockdep support for ldlm_lock and ldlm_resource. +Details : Use spin_lock_nested() in (the only) situation where more than + one ldlm_lock is locked simultaneously. Also, fix possible + dead-lock in ldlm_lock_change_resource() by enforcing particular + lock ordering. + +Severity : normal +Bugzilla : 16450 +Description: Use struct ldlm_callback_suite in ldlm_lock_create(). +Details : Instead of specifying each ldlm_lock call-back through separate + parameter, wrap them into struct ldlm_callback_suite. + +Severity : normal +Bugzilla : 16450 +Description: Kill join_lru obd method and its callers. +Details : CLIO uses lock weighting policy to keep locks over mmapped regions + in memory---a requirement implemented through ->o_join_lru() obd + method in HEAD. Remove this method and its users. + +Severity : normal +Bugzilla : 16450 +Description: Add asynchronous ldlm ENQUEUE completion handler. +Details : CLIO posts ENQUEUE requests asynchronously through ptlrpcd---a + case that stock ldlm_completion_ast() cannot handle as it waits + until lock is granted. Introduce new ldlm_completion_ast_async() + for this. Also comment ldlm_completion_ast(). 
+ +Severity : normal +Bugzilla : 16450 +Description: ldlm_error <-> errno conversion. +Details : Add functions to map (rather arbitrary) between LDLM error codes + and standard errno values. CLIO needs this to prevent LDLM specific + constants from escaping ldlm and osc. + +Severity : minor +Bugzilla : 16450 +Description: Kill unused ldlm_handle2lock_ns() function. +Details : Kill unused ldlm_handle2lock_ns() function. + +Severity : normal +Bugzilla : 16450 +Description: Add lu_ref support to ldlm_lock +Details : lu_ref support for ldlm_lock and ldlm_resource. See lu_ref patch. + lu_ref fields ->l_reference and ->lr_reference are added to ldlm_lock + and ldlm_resource. LDLM interface has to be changed, because code that + releases a reference on a lock, has to "know" what reference this is. + In the most frequent case + + lock = ldlm_handle2lock(handle); + ... + LDLM_LOCK_PUT(lock); + + no changes are required. When any other reference (received _not_ from + ldlm_handle2lock()) is released, LDLM_LOCK_RELEASE() has to be called + instead of LDLM_LOCK_PUT(). + + Arguably, changes are pervasive, and interface requires some discipline + for proper use. On the other hand, it was very instrumental in finding + a few leaked lock references. + +Severity : normal +Bugzilla : 16450 +Description: Add ldlm_lock_addref_try(). +Details : Introduce ldlm_lock_addref_try() function (used by CLIO) that + attempts to addref a lock that might be being canceled + concurrently. + +Severity : normal +Bugzilla : 16450 +Description: Add ldlm_weigh_callback(). +Details : Add new ->l_weigh_ast() call-back to ldlm_lock. It is called + by ldlm_cancel_shrink_policy() to estimate lock "value", instead of + hard-coded `number of pages' logic. + +Severity : normal +Bugzilla : 16450 +Description: Add lockdep annotations to llog code. +Details : Use appropriately tagged _nested() locking calls in the places + where llog takes more than one ->lgh_lock lock. 
+ +Severity : minor +Bugzilla : 16450 +Description: Add loi_kms_set(). +Details : Wrap kms updates into a helper function. + +Severity : minor +Bugzilla : 16450 +Description: Constify instances of struct lsm_operations. +Details : Constify instances of struct lsm_operations. + +Severity : normal +Bugzilla : 16450 +Description: lu_conf support. +Details : On a server, a file system object is uniquely identified + by a fid, which is sufficient to locate and load all object + state (inode). On a client, on the other hand, more data are + necessary instantiate an object. Change lu_object_find() and + friends to take additional `lu_conf' argument describing object. + Typically this includes layout information. + +Severity : normal +Bugzilla : 16450 +Description: lu_context fixes. +Details : Introduce new lu_context functions that are needed on the client + side, where some system threads (ptlrpcd) are shared by multiple + modules, and so cannot be stopped during module shutdown. + +Severity : normal +Bugzilla : 16450 +Description: Add start and stop methods to lu_device_type_operations. +Details : Introduce two new methods in lu_device_type_operations, that are + invoked when first instance of a given type is created and last one + is destroyed respectively. This is need by CLIO. + +Severity : normal +Bugzilla : 16450 +Description: Add lu_ref support to struct lu_device. +Details : Add lu_ref support to lu_object and lu_device. lu_ref is used to + track leaked references. + +Severity : normal +Bugzilla : 16450 +Description: Introduce lu_kmem_descr. +Details : lu_kmem_descr and its companion interface allow to create + and destroy a number of kmem caches at once. + +Severity : normal +Bugzilla : 16450 +Description: Fix lu_object finalization race. +Details : Fix a race between lu_object_find() finding an object and its + concurrent finalization. This race is (most likely) not possible + on the server, but might happen on the client. 
+ +Severity : normal +Bugzilla : 16450 +Description: Introduce lu_ref interface. +Details : lu_ref is a debugging module allowing to track references to + a given object. It is quite cpu expensive, and has to be + explicitly enabled with --enable-lu_ref. See usage description + within the patch. + +Severity : minor +Bugzilla : 16450 +Description: Factor lu_site procfs stats into a separate function. +Details : Separate lu_site stats printing code into a separate function + that can be reused on a client. + +Severity : minor +Bugzilla : 16450 +Description: Constify instances of struct {lu,dt,md}_device_operations. +Details : Constify instances of struct {lu,dt,md}_device_operations. + +Severity : normal +Bugzilla : 16450 +Description: Introduce struct md_site and move meta-data specific parts of + struct lu_site here. +Details : Move md-specific fields out of struct lu_site into special struct + md_site, so that lu_site can be used on a client. + +Severity : minor +Bugzilla : 16450 +Description: Kill mdd_lov_destroy(). +Details : Remove unused mdd code. + +Severity : minor +Bugzilla : 16450 +Description: Add st_block checking to multistat.c. +Details : Add st_block checking to multistat.c. + +Severity : normal +Bugzilla : 16450 +Description: Add lu_ref support to struct obd_device. +Details : Add lu_ref tracking to obd_device. + +Severity : minor +Bugzilla : 16450 +Description: Kill obd_set_fail_loc(). +Details : Remove unused code. + +Severity : normal +Bugzilla : 16450 +Description: Add special type for ptlrpc_request interpret functions. +Details : Add lu_env parameter to ->rq_interpreter call-back. NULL is passed + there. Actual usage will be in CLIO. + +Severity : normal +Bugzilla : 16450 +Description: Replace RW_LOCK_UNLOCKED() macro with rwlock_init(). +Details : Replace RW_LOCK_UNLOCKED() with rwlock_init() as the former + doesn't work with lockdep. + +Severity : normal +Bugzilla : 16450 +Description: Add rwv.c test program. 
+Details : New testing program exercising readv(2) and writev(2) (Qian). + +Severity : normal +Bugzilla : 16450 +Description: Add sendfile.c test program. +Details : New testing program exercising sendfile(2) (Jay). + +Severity : minor +Bugzilla : 16450 +Description: Ratelimit a message that can be very frequent. +Details : Ratelimit a memory allocation failure message that can + be too chatty. + +Severity : minor +Bugzilla : 16450 +Description: Use cdebug_show() in CDEBUG-style macros defined outside of libcfs. +Details : Use cdebug_show() in CDEBUG-style macros defined outside of libcfs. + +Severity : normal +Bugzilla : 16450 +Description: Liblustre build fixes. +Details : Liblustre build fixes. + +Severity : normal +Bugzilla : 16450 +Description: libcfs: add cfs_{need,cond}_resched() interface. +Details : libcfs: add cfs_{need,cond}_resched() definition and + implementations for Linux, NT, and liblustre. + -------------------------------------------------------------------------------- 2007-08-10 Cluster File Systems, Inc. @@ -1209,7 +1761,7 @@ Severity : minor Frequency : at statup only Bugzilla : 12860 Description: mds_lov_synchronize race leads to various problems -Details : simultaneous MDT->OST connections at startup can cause the +Details : simultaneous MDT->OST connections at startup can cause the sync to abort, leaving the OSC in a bad state. Severity : enhancement @@ -1378,7 +1930,7 @@ Details : When osc reconnect ost, OST(filter) should check whether it to update the client grant space info. Severity : normal -Frequency : when client reconnect to OST +Frequency : when client reconnect to OST Bugzilla : 11662 Description: Grant Leak when osc do resend and replay bulk write Details : When osc reconnect to OST, OST(filter)should clear grant info of @@ -1387,33 +1939,33 @@ Details : When osc reconnect to OST, OST(filter)should clear grant info of these of resend/replay write req. 
Severity : normal -Frequency : rare +Frequency : rare Bugzilla : 11662 Description: Grant space more than avaiable left space sometimes. Details : When then OST is about to be full, if two bulk writing from different clients came to OST. Accord the avaliable space of the OST, the first req should be permitted, and the second one - should be denied by ENOSPC. But if the seconde arrived before + should be denied by ENOSPC. But if the seconde arrived before the first one is commited. The OST might wrongly permit second writing, which will cause grant space > avaiable space. Severity : normal -Frequency : when client is evicted +Frequency : when client is evicted Bugzilla : 12371 Description: Grant might be wrongly erased when osc is evicted by OST -Details : when the import is evicted by server, it will fork another - thread ptlrpc_invalidate_import_thread to invalidate the - import, where the grant will be set to 0. While the original - thread will update the grant it got when connecting. So if - the former happened latter, the grant will be wrongly errased +Details : when the import is evicted by server, it will fork another + thread ptlrpc_invalidate_import_thread to invalidate the + import, where the grant will be set to 0. While the original + thread will update the grant it got when connecting. So if + the former happened latter, the grant will be wrongly errased because of this race. Severity : normal -Frequency : rare +Frequency : rare Bugzilla : 12401 -Description: Checking Stale with correct fid -Details : ll_revalidate_it should uses de_inode instead of op_data.fid2 - to check whether it is stale, because sometimes, we want the +Description: Checking Stale with correct fid +Details : ll_revalidate_it should uses de_inode instead of op_data.fid2 + to check whether it is stale, because sometimes, we want the enqueue happened anyway, and op_data.fid2 will not be initialized. 
Severity : enhancement @@ -1429,29 +1981,29 @@ Details : size of struct ll_inode_info is to big for union inode.u and this can be cause of random memory corruption. Severity : normal -Frequency : rare +Frequency : rare Bugzilla : 10818 Description: Memory leak in recovery Details : Lov_mds_md was not free in an error handler in mds_create_object. - It should also check obd_fail before fsfilt_start, otherwise if + It should also check obd_fail before fsfilt_start, otherwise if fsfilt_start return -EROFS,(failover mds during mds recovery). - then the req will return with repmsg->transno = 0 and rc = EROFS. + then the req will return with repmsg->transno = 0 and rc = EROFS. and we met hit the assert LASSERT(req->rq_reqmsg->transno == - req->rq_repmsg->transno) in ptlrpc_replay_interpret. Fcc should + req->rq_repmsg->transno) in ptlrpc_replay_interpret. Fcc should be freed no matter whether fsfilt_commit success or not. Severity : minor Frequency : only with huge count clients Bugzilla : 11817 -Description: Prevents from taking the superblock lock in llap_from_page for +Description: Prevents from taking the superblock lock in llap_from_page for a soon died page. -Details : using LL_ORIGIN_REMOVEPAGE origin flag instead of LL_ORIGIN_UNKNOW - for llap_from_page call in ll_removepage prevents from taking the +Details : using LL_ORIGIN_REMOVEPAGE origin flag instead of LL_ORIGIN_UNKNOW + for llap_from_page call in ll_removepage prevents from taking the superblock lock for a soon died page. Severity : normal Frequency : rare -Bugzilla : 11935 +Bugzilla : 11935 Description: Not check open intent error before release open handle Details : in some rare cases, the open intent error is not checked before release open handle, which may cause @@ -1460,9 +2012,9 @@ Details : in some rare cases, the open intent error is not checked before Severity : normal Frequency : rare -Bugzilla : 12556 -Description: Set cat log bitmap only after create log success. 
-Details : in some rare cases, the cat log bitmap is set too early. and it +Bugzilla : 12556 +Description: Set cat log bitmap only after create log success. +Details : in some rare cases, the cat log bitmap is set too early. and it should be set only after create log success. Severity : major @@ -1479,11 +2031,11 @@ Details : Insert cond_resched to give other threads a chance to use some CPU Severity : normal Frequency : rare -Bugzilla : 12086 -Description: the cat log was not initialized in recovery +Bugzilla : 12086 +Description: the cat log was not initialized in recovery Details : When mds(mgs) do recovery, the tgt_count might be zero, so the unlink log on mds will not be initialized until mds post - recovery. And also in mds post recovery, the unlink log will + recovery. And also in mds post recovery, the unlink log will initialization will be done asynchronausly, so there will be race between add unlink log and unlink log initialization. @@ -1504,7 +2056,7 @@ Details : imp_lock should be held while iterating over imp_sending_list for Severity : normal Bugzilla : 12689 Description: replay-single.sh test 52 fails -Details : A lock's skiplist need to be cleanup when it being unlinked +Details : A lock's skiplist need to be cleanup when it being unlinked from its resource list. Severity : normal @@ -1525,15 +2077,22 @@ Severity : enhancement Bugzilla : 4900 Description: Async OSC create to avoid the blocking unnecessarily. Details : If a OST has no remain object, system will block on the creating - when need to create a new object on this OST. Now, ways use - pre-created objects when available, instead of blocking on an - empty osc while others are not empty. If we must block, we block - for the shortest possible period of time. + when need to create a new object on this OST. Now, ways use + pre-created objects when available, instead of blocking on an + empty osc while others are not empty. If we must block, we block + for the shortest possible period of time. 
+ +Severity : major +Bugzilla : 11710 +Description: improve handling recoverable errors +Details : if request processig with error which can be recoverable on server + request should be resend, otherwise page released from cache and + marked as error. Severity : enhancement Bugzilla : 12702 Description: refine locking for avoid write wrong info into lov_objid file -Details : fix possible races with add new target and write/update data in +Details : fix possible races with add new target and write/update data in lov_objid file. -------------------------------------------------------------------------------- @@ -1567,14 +2126,12 @@ Details : The __iget() symbol export is missing. To avoid the need for special upgrade step is needed. Please read the user documentation before upgrading any part of a live system. * WIRE PROTOCOL CHANGE from previous 1.6 beta versions. This - version will not interoperate with 1.6 betas before beta5 (1.5.95). + version will not interoperate with 1.6 betas before beta5 (1.5.95). * WARNING: Lustre configuration and startup changes are required with this release. See https://mail.clusterfs.com/wikis/lustre/MountConf for details. * bug fixes - - Severity : enhancement Bugzilla : 8007 Description: MountConf @@ -1606,7 +2163,7 @@ Bugzilla : 9862 Description: optimized stripe assignment Details : stripe assignments are now made based on ost space available, ost previous usage, and OSS previous usage, in order to try - to optimize storage space and networking resources. + to optimize storage space and networking resources. Severity : enhancement Bugzilla : 4226 @@ -1643,16 +2200,16 @@ Severity : enhancement Bugzilla : 22484 Description: client read/write statistics Details : Add client read/write call usage stats for performance - analysis of user processes. + analysis of user processes. /proc/fs/lustre/llite/*/offset_stats shows non-sequential file access. extents_stats shows chunk size distribution. 
extents_stats_per_process show chunk size distribution per - user process. + user process. Severity : enhancement Bugzilla : 22486 Description: mds statistics -Details : Add detailed mds operations statistics in +Details : Add detailed mds operations statistics in /proc/fs/lustre/mds/*/stats. Severity : minor @@ -3892,7 +4449,7 @@ Severity : Minor Frequency : Rare Bugzilla : 11248 Description: merge and cleanup kernel patches. -Details : +Details : -----------------------------------------------------------------------------