X-Git-Url: https://git.whamcloud.com/?p=fs%2Flustre-release.git;a=blobdiff_plain;f=lustre%2FChangeLog;h=48d2398bfd486031315da15c8d545fd058e41c67;hp=c10a3e1805fec3439f60343f108ca89f35a3dfea;hb=f44a8c360d5cec35eb27b82cdb502aca37a08bd7;hpb=c24b504cb2d8f0b34a31cf4365a04d3cb5e247cb diff --git a/lustre/ChangeLog b/lustre/ChangeLog index c10a3e1..48d2398 100644 --- a/lustre/ChangeLog +++ b/lustre/ChangeLog @@ -1,23 +1,137 @@ tbd Sun Microsystems, Inc. * version 2.0.0 * Support for kernels: - 2.6.16.60-0.27 (SLES 10), - 2.6.18-92.1.10.el5 (RHEL 5), + 2.6.16.60-0.33 (SLES 10), + 2.6.18-92.1.22.el5 (RHEL 5), 2.6.22.14 vanilla (kernel.org). * Client support for unpatched kernels: - (see http://wiki.lustre.org/index.php?title=Patchless_Client) - 2.6.16 - 2.6.21 vanilla (kernel.org) + (see http://wiki.lustre.org/index.php?title=Patchless_Client) + 2.6.16 - 2.6.21 vanilla (kernel.org) * Recommended e2fsprogs version: 1.40.11-sun1 * Note that reiserfs quotas are disabled on SLES 10 in this kernel. * RHEL 4 and RHEL 5/SLES 10 clients behaves differently on 'cd' to a - removed cwd "./" (refer to Bugzilla 14399). + removed cwd "./" (refer to Bugzilla 14399). * File join has been disabled in this release, refer to Bugzilla 16929. Severity : enhancement +Bugzilla : 18061 +Description: Update to SLES10 kernel-2.6.16.60-0.33. + +Severity : enhancement +Bugzilla : 18060 +Description: Update to RHEL5 kernel-2.6.18-92.1.22.el5. + +Severity : normal +Frequency : start MDS on uncleanly shutdowned MDS device +Bugzilla : 18049 +Description: aborting recovery hang on MDS +Details : don't throttle destroy RPCs for the MDT.
+ +Severity : major +Frequency : on remount +Bugzilla : 18018 +Description: external journal device not working after the remount +Details : clear dev_rdonly flag for external journal devices in + blkdev_put() + +Severity : normal +Frequency : race on file read and write +Bugzilla : 16417 +Description: Lustre doesn't delete files +Details : Clients drop lock reference and release openhandle when they find + stale inode. + +Severity : minor +Frequency : rare +Bugzilla : 17802 +Description: shutdown vs evict race +Details : client_disconnect_export vs connect request race. + if client will evicted at this time - we start invalidate + thread without reference to import and import can be freed + at same time. + +Severity : normal +Frequency : rare, need acl's on inode. +Bugzilla : 16492 +Description: client can't handle ost additional correctly +Details : if ost was added after client connected to mds client can have + hit lnet_try_match_md ... to big messages to wide striped files. + in this case need teach client to handle config events about add + lov target and update client max ea size at that event. + +Severity : enhancement +Bugzilla : 15699 +Description: Changelogs +Details : Changelogs are a lightweight mechanism to track filesystem + metadata and namespace changes. The changelog is recorded + permanently on the MDTs, and is periodically "consumed" / purged + when records are no longer needed. + +Severity : enhancement +Bugzilla : 15957 +Description: compact fld format with extents +Details : Store range of seq rather than every seq in FLD. Seq + controller update FLD rather than clients. In Case of CMD, mdt0 + has FLD, all other metadata server act as non persistent proxy + for FLD queries and cache fld entries in fld cache. + +Severity : normal +Frequency : rare +Bugzilla : 16081 +Description: don't skip ost target if they assigned to file +Details : Drop slow OSCs if we can, but not for requested start idx.
+ This means "if OSC is slow and it is not the requested + start OST, then it can be skipped, otherwise skip it only + if it is inactive/recovering/out-of-space". + +Severity : normal +Bugzilla : 16080 +Description: more cleanup in mds_lov +Details : not send LOV EA under replay, we can't know about their size at this + time. Don't allow client connect to mds before any ost connected, + for avoid problems with LOV EA size and returning EIO to client. + +Severity : enhancement +Bugzilla : 11826 +Description: Interoperability at server side (Disk interoperability) + +Severity : enhancement +Bugzilla : 17201 +Description: Update to RHEL5 kernel-2.6.18-92.1.17.el5. + +Severity : enhancement +Bugzilla : 17458 +Description: Update to SLES10 SP2 kernel-2.6.16.60-0.31. + +Severity : enhancement +Bugzilla : 14166 +Description: New client IO stack (CLIO). + +Severity : enhancement +Bugzilla : 15393 +Description: Commit on sharing. Eliminate inter-client dependencies between + uncommitted transactions by doing transaction commits. + Thereby clients may recover independently. + +Severity : normal +Frequency : Create a symlink file with a very long name +Bugzilla : 16578 +Description: ldlm_cancel_pack()) ASSERTION(max >= dlm->lock_count + count) +Details : If there is no extra space in the request for early cancels, + ldlm_req_handles_avail() returns 0 instead of a negative value. + +Severity : enhancement +Bugzilla : 1819 +Description: Add /proc entry for import status +Details : The mdc, osc, and mgc import directories now have + an import directory that contains useful import data for debugging + connection problems. + +Severity : enhancement Bugzilla : 15966 Description: Re-disable certain /proc logging Details : Enable and disable client's offset_stats, extents_stats and - extents_stats_per_process stats logging on the fly. + extents_stats_per_process stats logging on the fly.
Severity : major Frequency : Only on FC kernels 2.6.22+ @@ -30,13 +144,22 @@ Severity : enhancement Bugzilla : 16643 Description: Generic /proc file permissions Details : Set /Proc file permissions in a more generic way to enable non- - root users operate on some /proc files. + root users operate on some /proc files. Severity : major Bugzilla : 16561 Description: Hitting mdc_commit_close() ASSERTION Details : Properly handle request reference release in - ll_release_openhandle(). + ll_release_openhandle(). + +Severity : major +Bugzilla : 14840 +Description: quota recovery deadlock during mds failover +Details : This patch includes att18982, att18236, att18237 in bz14840. + Solve the problems: + 1. fix osts hang when mds does failover with quotaon + 2. prevent watchdog storm when osts threads wait for the + recovery of mds Severity : normal Bugzilla : 15975 @@ -50,8 +173,7 @@ Description: Allow OST glimpses to return PW locks Severity : minor Bugzilla : 16717 Description: LBUG when llog conf file is full -Details : When llog bitmap is full, ENOSPC should be returned for plain - log. +Details : When llog bitmap is full, ENOSPC should be returned for plain log. Severity : normal Bugzilla : 16907 @@ -74,8 +196,8 @@ Bugzilla : 16611 Frequency : on recovery Description: I/O failures after umount during fail back Details : if client reconnected to restarted server we need join to recovery - instead of find server handler is changed and process self eviction - with cancel all locks. + instead of find server handler is changed and process self + eviction with cancel all locks. Severity : enhancement Bugzilla : 16633 @@ -108,6 +230,7 @@ Details : When connection is reused this not moved from CONN_UNUSED_HASH into CONN_USED_HASH and this prodice warning when put connection again in unused hash. + Severity : enhancement Bugzilla : 15899 Description: File striping can now be set to use an arbitrary pool of OSTs.
@@ -123,6 +246,23 @@ Details : Apply the MGS_CONNECT_SUPPORTED mask at reconnect time so the connect flags are properly negotiated. Severity : normal +Frequency : often +Bugzilla : 16125 +Description: quotas are not honored with O_DIRECT +Details : all writes with the flag O_DIRECT will use grants which leads to + this problem. Now using OBD_BRW_SYNC to guard this. + +Severity : normal +Bugzilla : 15058 +Description: add quota statistics +Details : 1. sort out quota proc entries and proc code. + 2. add quota statistics + +Severity : enhancement +Bugzilla : 13058 +Description: enable quota support for HEAD. + +Severity : normal Bugzilla : 16006 Description: Properly propagate oinfo flags from lov to osc for statfs Details : restore missing copy oi_flags to lov requests. @@ -190,7 +330,7 @@ Details : Lustre does not destroy flock lock before last reference goes Severity : normal Bugzilla : 15210 -Description: add recount protection for osc callbacks, so avoid panic on shutdown +Description: add refcount protection for osc callbacks, avoid panic on shutdown Severity : normal Bugzilla : 12653 @@ -222,15 +362,15 @@ Details : Need properly lock accesses the flock deadlock detection list. Severity : minor Bugzilla : 15837 Description: oops in page fault handler -Details : kernel page fault handler can return two special 'pages' in error case, don't - try dereference NOPAGE_SIGBUS and NOPAGE_OMM. +Details : kernel page fault handler can return two special 'pages' in error + case, don't try dereference NOPAGE_SIGBUS and NOPAGE_OMM. Severity : minor Bugzilla : 15716 Description: timeout with invalidate import. -Details : ptlrpcd_check call obd_zombie_impexp_cull and wait request which should be - handled by ptlrpcd. This produce long age waiting and -ETIMEOUT - ptlrpc_invalidate_import and as result LASSERT. +Details : ptlrpcd_check call obd_zombie_impexp_cull and wait request which + should be handled by ptlrpcd. 
This produce long age waiting and + -ETIMEOUT ptlrpc_invalidate_import and as result LASSERT. Severity : enhancement Bugzilla : 15741 @@ -317,7 +457,7 @@ Details : Move the 'good_osts' check before the 'total_bavail' check. This Severity : major Bugzilla : 14326 Description: Use old size assignment to avoid deadlock -Details : This reverts the changes in bugs 2369 and bug 14138 that introduced +Details : Reverts the changes in bugs 2369 and bug 14138 that introduced the scheduling while holding a spinlock. We do not need locking for size in ll_update_inode() because size is only updated from the MDS for directories or files without objects, so there is no @@ -379,8 +519,8 @@ Bugzilla : 14533 Frequency : rare, on recovery Description: read procfs can produce deadlock in some situation Details : Holding lprocfs lock which send rpc can produce block for destroy - obd objects and this also block reconnect with -EALREADY. This isn't - fix all lprocfs bugs - but make it rare. + obd objects and this also block reconnect with -EALREADY. This + isn't fix all lprocfs bugs - but make it rare. Severity : enhancement Bugzilla : 15152 @@ -423,11 +563,11 @@ Severity : normal Frequency : occasional Bugzilla : 13537 Description: Correctly check stale fid, not start epoch if ost not support SOM -Details : open with flag O_CREATE need set old fid in op_fid3 because op_fid2 - overwrited with new generated fid, but mds can anwer with one of these - two fids and both is not stale. setattr incorectly start epoch and - assume will be called done_writeting, but without SOM done_writing - never called. +Details : open with flag O_CREATE need set old fid in op_fid3 because + op_fid2 was overwritten with new generated fid, but mds can answer + with one of these two fids and both is not stale. Setattr + incorrectly started an epoch and assume will be called + done_writing, but without SOM done_writing ever being called. 
Severity : major Frequency : rare, depends on device drivers and load @@ -480,8 +620,8 @@ Severity : minor Frequency : rare Bugzilla : 13196 Description: Don't allow skipping OSTs if index has been specified. -Details : Don't allow skipping OSTs if index has been specified, make locking - in internal create lots better. +Details : Don't allow skipping OSTs if index has been specified, make + locking in internal create lots better. Severity : normal Bugzilla : 12228 @@ -958,8 +1098,8 @@ Details : Modify the target file & which_kernel. Severity : enhancement Bugzilla : 10786 Description: omit set fsid for export NFS -Details : fix set/restore device id for avoid EMFILE error and mark lustre fs - as FS_REQUIRES_DEV for avoid problems with generate fsid. +Details : fix set/restore device id for avoid EMFILE error and mark lustre + fs as FS_REQUIRES_DEV for avoid problems with generate fsid. Severity : normal Bugzilla : 13304 @@ -1320,9 +1460,9 @@ Description: add -gid, -group, -uid, -user options to lfs find Severity : normal Bugzilla : 15950 Description: Hung threads in invalidate_inode_pages2_range -Details : The direct IO path doesn't call check_rpcs to submit a new RPC once - one is completed. As a result, some RPCs are stuck in the queue - and are never sent. +Details : The direct IO path doesn't call check_rpcs to submit a new RPC + once one is completed. As a result, some RPCs are stuck in the + queue and are never sent. Severity : normal Bugzilla : 14629 @@ -1351,7 +1491,7 @@ Details : If insertion of an extent fails, then discard the inode Severity : normal Bugzilla : 16199 Description: don't always update ctime in ext3_xattr_set_handle() -Details : Current xattr code updates the inode ctime in ext3_xattr_set_handle. +Details : Current xattr code updates inode ctime in ext3_xattr_set_handle. In some cases the ctime should not be updated, for example for 2.0->1.8 compatibility it is necessary to delete an xattr and it should not update the ctime. 
@@ -1428,7 +1568,7 @@ Details : Initialize RPC XID from clock at startup (randomly if clock is Severity : enhancement Bugzilla : 14095 Description: Add lustre_start utility to start or stop multiple Lustre servers - from a CSV file. + from a CSV file. Severity : major Bugzilla : 17024 @@ -1438,10 +1578,10 @@ Details : In case of memory pressure, list_del() can be called twice on Severity : normal Bugzilla : 17026 -Description: (ptllnd_peer.c:557:kptllnd_peer_check_sends()) ASSERTION(!in_interrupt()) failed -Details : fix stack overflow in the distributed lock manager by defering export - eviction after a failed ast to the elt thread instead of handling - it in the dlm interpret routine. +Description: kptllnd_peer_check_sends()) ASSERTION(!in_interrupt()) failed +Details : fix stack overflow in the distributed lock manager by defering + export eviction after a failed AST to the elt thread instead of + handling it in the dlm interpret routine. Severity : normal Bugzilla : 16450 @@ -1467,9 +1607,9 @@ Details : Call cmm_device_free() in the failure path of cmm_device_alloc(). Severity : normal Bugzilla : 16450 Description: Add lockdep support to dt_object_operations locking interface. -Details : Augment ->do_{read,write}_lock() prototypes with a `role' parameter - indicating lock ordering. Update mdd code to use new locking - interface. +Details : Augment ->do_{read,write}_lock() prototypes with a `role' + parameter indicating lock ordering. Update mdd code to use new + locking interface. Severity : normal Bugzilla : 16450 @@ -1543,9 +1683,9 @@ Details : Kill unused ldlm_handle2lock_ns() function. Severity : normal Bugzilla : 16450 Description: Add lu_ref support to ldlm_lock -Details : lu_ref support for ldlm_lock and ldlm_resource. See lu_ref patch. - lu_ref fields ->l_reference and ->lr_reference are added to ldlm_lock - and ldlm_resource. LDLM interface has to be changed, because code that +Details : lu_ref support for ldlm_lock and ldlm_resource. See lu_ref patch. 
+ lu_ref fields ->l_reference and ->lr_reference are added to ldlm_lock + and ldlm_resource. LDLM interface has to be changed, because code that releases a reference on a lock, has to "know" what reference this is. In the most frequent case @@ -1553,12 +1693,12 @@ Details : lu_ref support for ldlm_lock and ldlm_resource. See lu_ref patch. ... LDLM_LOCK_PUT(lock); - no changes are required. When any other reference (received _not_ from - ldlm_handle2lock()) is released, LDLM_LOCK_RELEASE() has to be called + no changes are required. When any other reference (received _not_ from + ldlm_handle2lock()) is released, LDLM_LOCK_RELEASE() has to be called instead of LDLM_LOCK_PUT(). Arguably, changes are pervasive, and interface requires some discipline - for proper use. On the other hand, it was very instrumental in finding + for proper use. On the other hand, it was very instrumental in finding a few leaked lock references. Severity : normal @@ -1571,7 +1711,7 @@ Details : Introduce ldlm_lock_addref_try() function (used by CLIO) that Severity : normal Bugzilla : 16450 Description: Add ldlm_weigh_callback(). -Details : Add new ->l_weigh_ast() call-back to ldlm_lock. It is called +Details : Add new ->l_weigh_ast() call-back to ldlm_lock. It is called by ldlm_cancel_shrink_policy() to estimate lock "value", instead of hard-coded `number of pages' logic. @@ -1611,8 +1751,8 @@ Details : Introduce new lu_context functions that are needed on the client Severity : normal Bugzilla : 16450 Description: Add start and stop methods to lu_device_type_operations. -Details : Introduce two new methods in lu_device_type_operations, that are - invoked when first instance of a given type is created and last one +Details : Introduce two new methods in lu_device_type_operations, that are + invoked when first instance of a given type is created and last one is destroyed respectively. This is need by CLIO. 
Severity : normal @@ -1657,7 +1797,7 @@ Severity : normal Bugzilla : 16450 Description: Introduce struct md_site and move meta-data specific parts of struct lu_site here. -Details : Move md-specific fields out of struct lu_site into special struct +Details : Move md-specific fields out of struct lu_site into special struct md_site, so that lu_site can be used on a client. Severity : minor @@ -1683,8 +1823,8 @@ Details : Remove unused code. Severity : normal Bugzilla : 16450 Description: Add special type for ptlrpc_request interpret functions. -Details : Add lu_env parameter to ->rq_interpreter call-back. NULL is passed - there. Actual usage will be in CLIO. +Details : Add lu_env parameter to ->rq_interpreter call-back. NULL is passed + there. Actual usage will be in CLIO. Severity : normal Bugzilla : 16450 @@ -1697,6 +1837,92 @@ Bugzilla : 16450 Description: Add rwv.c test program. Details : New testing program exercising readv(2) and writev(2) (Qian). +Severity : normal +Bugzilla : 16450 +Description: Add sendfile.c test program. +Details : New testing program exercising sendfile(2) (Jay). + +Severity : minor +Bugzilla : 16450 +Description: Ratelimit a message that can be very frequent. +Details : Ratelimit a memory allocation failure message that can + be too chatty. + +Severity : minor +Bugzilla : 16450 +Description: Use cdebug_show() in CDEBUG-style macros defined outside of libcfs. +Details : Use cdebug_show() in CDEBUG-style macros defined outside of libcfs. + +Severity : normal +Bugzilla : 16450 +Description: Liblustre build fixes. +Details : Liblustre build fixes. + +Severity : normal +Bugzilla : 16450 +Description: libcfs: add cfs_{need,cond}_resched() interface. +Details : libcfs: add cfs_{need,cond}_resched() definition and + implementations for Linux, NT, and liblustre. 
+ +Severity : enhancement +Bugzilla : 12800 +Description: More exported tunables for mballoc +Details : Add support for tunable preallocation window and new tunables for + large/small requests + +Severity : normal +Bugzilla : 16680 +Description: Detect corruption of block bitmap and checking for preallocations +Details : Checks validity of on-disk block bitmap. Also it does better + checking of number of applied preallocations. When corruption is + found, it turns filesystem readonly to prevent further corruptions. + +Severity : normal +Bugzilla : 17197 +Description: (rw.c:1323:ll_read_ahead_pages()) ASSERTION(page_idx > ria->ria_stoff) failed +Details : Once the unmatched stride IO mode is detected, shrink the stride-ahead + window to 0. If it does hit cache miss, and read-pattern is still + stride-io mode, does not reset the stride window, but also does not + increase the stride window length in this case. + +Severity : normal +Bugzilla : 16438 +Frequency : only for big-endian servers +Description: Check if system is big-endian while mounting fs with extents feature +Details : Mounting a filesystem with extents feature will fail on big-endian + systems since ext3-based ldiskfs is not supported on big-endian + systems. This can be overridden with "bigendian_extents" mount option. + +Severity : enhancement +Bugzilla : 12749 +Description: The root squash functionality +Details : A security feature, which is to prevent users from being able + to mount lustre on their desktop, run as root, and delete + all of the files in the filesystem. The goal is accomplished by + remapping user id (UID) and group id (GID) of the root user to + a UID and GID specified by the system administrator via Lustre + configuration management server (MGS). The functionality also + allows to specify sets of clients for which the remapping does + not apply.
+ +Severity : normal +Bugzilla : 16860 +Description: Excessive recovery window +Details : With AT enabled, the recovery window can be excessively long (6000+ + seconds). To address this problem, we no longer use + OBD_RECOVERY_FACTOR when extending the recovery window (the connect + timeout no longer depends on the service time, it is set to + INITIAL_CONNECT_TIMEOUT now) and clients report the old service + time via pb_service_time. + +Severity : normal +Bugzilla : 16522 +Description: Watchdog triggered on MDS failover +Details : enable OBD_CONNECT_MDT flag when connecting from the MDS so that + the OSTs know that the MDS "UUID" can be reused for the same export + from a different NID, so we do not need to wait for the export to be + evicted + -------------------------------------------------------------------------------- 2007-08-10 Cluster File Systems, Inc. @@ -1814,8 +2040,8 @@ Severity : normal Frequency : during server recovery Bugzilla : 11203 Description: MDS failing to send precreate requests due to OSCC_FLAG_RECOVERING -Details : request with rq_no_resend flag not awake l_wait_event if they get a - timeout. +Details : request with rq_no_resend flag not awake l_wait_event if they get + a timeout. Severity : minor Frequency : nfs export on patchless client