X-Git-Url: https://git.whamcloud.com/?a=blobdiff_plain;f=lustre%2FChangeLog;h=6daa824652c013899e29d4e3e0cfc6d299b05edf;hb=42a660527279112258b35ec840b59d3f4ad9420b;hp=7bae3294361f1b59d1ac93079d0a174d1c6626bc;hpb=b06bf709e37b1c57119e4c18950687bd1132e3da;p=fs%2Flustre-release.git

diff --git a/lustre/ChangeLog b/lustre/ChangeLog
index 7bae329..6daa824 100644
--- a/lustre/ChangeLog
+++ b/lustre/ChangeLog
@@ -2,16 +2,100 @@ tbd Sun Microsystems, Inc.
  * version 1.8.0
  * Support for kernels:
  2.6.9-67.0.4.EL (RHEL 4),
- 2.6.16.54-0.2.3 (SLES 10),
- 2.6.18-53.1.6.el5 (RHEL 5).
+ 2.6.16.54-0.2.5 (SLES 10),
+ 2.6.18-53.1.14.el5 (RHEL 5),
+ 2.6.22.14 vanilla (kernel.org).
  * Client support for unpatched kernels:
  (see http://wiki.lustre.org/index.php?title=Patchless_Client)
  2.6.16 - 2.6.21 vanilla (kernel.org)
- * Recommended e2fsprogs version: 1.40.4-cfs1
+ * Recommended e2fsprogs version: 1.40.7-sun1
  * Note that reiserfs quotas are disabled on SLES 10 in this kernel.
  * RHEL 4 and RHEL 5/SLES 10 clients behaves differently on 'cd' to a
  removed cwd "./" (refer to Bugzilla 14399).
 
+Severity : normal
+Bugzilla : 12652
+Description: Add FMODE_EXEC file flag for SLES10 SP1 kernel.
+
+Severity : enhancement
+Bugzilla : 13397
+Description: Update to support 2.6.22.14 vanilla kernel.
+
+Severity : normal
+Bugzilla : 14533
+Frequency : rare, on recovery
+Description: read procfs can produce deadlock in some situation
+Details : Holding lprocfs lock which send rpc can produce block for destroy
+ obd objects and this also block reconnect with -EALREADY. This isn't
+ fix all lprocfs bugs - but make it rare.
+
+Severity : enhancement
+Bugzilla : 15152
+Description: Update kernel to RHEL5 2.6.18-53.1.14.el5.
+
+Severity : major
+Frequency : frequent on X2 node
+Bugzilla : 15010
+Description: mdc_set_open_replay_data LBUG
+Details : Set replay data for requests that are eligible for replay.
+
+Severity : normal
+Bugzilla : 14321
+Description: lustre_mgs: operation 101 on unconnected MGS
+Details : When MGC is disconnected from MGS long enough, MGS will evict the
+ MGC, and late on MGC cannot successfully connect to MGS and a lot
+ of the error messages complaining that MGS is not connected.
+
+Severity : major
+Frequency : on start mds
+Bugzilla : 14884
+Description: Implement get_info(last_id) in obdfilter.
+
+Severity : normal
+Frequency : occasional
+Bugzilla : 13537
+Description: Correctly check stale fid, not start epoch if ost not support SOM
+Details : open with flag O_CREATE need set old fid in op_fid3 because op_fid2
+ overwrited with new generated fid, but mds can anwer with one of these
+ two fids and both is not stale. setattr incorectly start epoch and
+ assume will be called done_writeting, but without SOM done_writing
+ never called.
+
+Severity : major
+Frequency : rare, depends on device drivers and load
+Bugzilla : 14529
+Description: MDS or OSS nodes crash due to stack overflow
+Details : Code changes in 1.8.0 increased the stack usage of some functions.
+ In some cases, in conjunction with device drivers that use a lot
+ of stack the MDS (or possibly OSS) service threads could overflow
+ the stack. One change which was identified to consume additional
+ stack has been reworked to avoid the extra stack usage.
+
+Severity : normal
+Frequency : occasional
+Bugzilla : 13730
+Description: Do not fail import if osc_interpret_create gets -EAGAIN
+Details : If osc_interpret_create got -EAGAIN it immediately exits and
+ wakeup oscc_waitq. After wakeup oscc_wait_for_objects call
+ oscc_has_objects and see OSC has no objests and call
+ oscc_internal_create to resend create request.
+
+Severity : enhancement
+Bugzilla : 14858
+Description: Update to SLES10 SP1 latest kernel-2.6.16.54-0.2.5.
+
+Severity : enhancement
+Bugzilla : 14876
+Description: Update to RHEL5 latest kernel-2.6.18-53.1.13.el5.
+
+Severity : normal
+Frequency : very rare
+Bugzilla : 3462
+Description: Fix replay if there is an un-replied request and open
+Details : In some cases, older replay request will revert the
+ mcd->mcd_last_xid on MDS which is used to record the client's
+ latest sent request.
+
 Severity : enhancement
 Bugzilla : 14720
 Description: Update to RHEL5 latest kernel-2.6.18-53.1.6.el5.
@@ -29,35 +113,35 @@ Frequency : rare
 Bugzilla : 13196
 Description: Don't allow skipping OSTs if index has been specified.
 Details : Don't allow skipping OSTs if index has been specified, make locking
- in internal create lots better.
+ in internal create lots better.
 
 Severity : normal
 Bugzilla : 12228
 Description: LBUG in ptlrpc_check_set() bad phase ebc0de00
-Details : access to bitfield in structure is always rounded to long
- and this produce problem with not atomic change any bit.
+Details : access to bitfield in structure is always rounded to long
+ and this produce problem with not atomic change any bit.
 
 Severity : normal
 Bugzilla : 13647
 Description: Lustre make rpms failed.
 Details : Remove ldiskfs spec file to avoids rpmbuild be confused when
- builds Lustre rpms from tarball.
+ builds Lustre rpms from tarball.
 
 Severity : normal
 Frequency : rare on shutdown ost
 Bugzilla : 14608
 Description: If llog cancel was not send before clean_exports phase, this can
- produce deadlock in llog code.
+ produce deadlock in llog code.
 Details : If llog thread has last reference to obd and call class_import_put
- this produce deadlock because llog_cleanup_commit_master wait when
- last llog_commit_thread exited, but this never success because was
+ this produce deadlock because llog_cleanup_commit_master wait when
+ last llog_commit_thread exited, but this never success because was
  called from llog_commit_thread.
 
 Severity : normal
 Bugzilla : 9977
 Description: allow userland application know is lost one of stripes.
 Details : fill lvb_blocks with error code on ost and return it to
- application if error flag found.
+ application if error flag found.
 
 Severity : normal
 Bugzilla : 14607
@@ -72,16 +156,16 @@ Severity : normal
 Bugzilla : 13375
 Descriptoin: make lov_create() will not stuck in obd_statfs_rqset()
 Details : If an OST is down the MDS will hang indefinitely in
- obd_statfs_rqset() waiting for the statfs data. While for
+ obd_statfs_rqset() waiting for the statfs data. While for
  MDS QOS usage of statfs, it should not stuck in waiting.
 
 Severity : enhancement
 Bugzilla : 11842
 Description: remote_acl support
 Details : Support ACL-based permission check for remote user.
- Support setfacl/getfacl for remote user with the utils
- "lfs {l,r}{s,g}etfacl" which follow the same parameter format as
- the system "{s,g}etfacl" utils.
+ Support setfacl/getfacl for remote user with the utils
+ "lfs {l,r}{s,g}etfacl" which follow the same parameter format as
+ the system "{s,g}etfacl" utils.
 
 Severity : enhancement
 Bugzilla : 14288
@@ -110,7 +194,7 @@ Frequency : rare, at shutdown
 Description: access already free / zero obd_namespace.
 Details : if client_disconnect_export was called without force flag set,
  and exist connect request in flight, this can produce access to
- NULL pointer (or already free pointer) when connect_interpret
+ NULL pointer (or already free pointer) when connect_interpret
  store ocd flags in obd_namespace.
 
 Severity : minor
@@ -128,7 +212,7 @@ Details : Make lustre randomly failed allocating memory for testing purpose.
 Severity : enhancement
 Bugzilla : 12702
 Description: lost problems with lov objid file
-Details : Fixes some scability and access to not inited memory problems
+Details : Fixes some scability and access to not inited memory problems
  in work with lov objdid file.
 
 Severity : major
@@ -170,18 +254,18 @@ Description: Update to RHEL4 latest kernel.
 Severity : enhancement
 Bugzilla : 13690
 Description: Build SLES10 patchless client fails
-Details : The configure was broken by run ./configure with
+Details : The configure was broken by run ./configure with
  --with-linux-obj=.... argument for patchless client. When the
  configure use --with-linux-obj, the LINUXINCLUDE= -Iinclude
- can't search header adequately. Use absolute path such as
- -I($LINUX)/include instead.
+ can't search header adequately. Use absolute path such as
+ -I($LINUX)/include instead.
 
 Severity : normal
 Bugzilla : 13888
 Description: interrupt oig_wait produce painc on resend.
 Details : brw_redo_request can be used for resend requests from ptlrpcd and
  private set, and this produce situation when rq_ptlrpcd_data not
- copyed to new allocated request and triggered LBUG on assert
+ copyed to new allocated request and triggered LBUG on assert
  req->rq_ptlrpcd_data != NULL. But this member used only for wakeup
  ptlrpcd set if request is changed and can be safety changed to use
  rq_set directly.
@@ -206,10 +290,10 @@ Details : This causes SLES 10 clients to behave as patchless clients
 Severity : enhancement
 Bugzilla : 2262
 Description: self-adjustable client's lru lists
-Details : use adaptive algorithm for managing client cached locks lru
+Details : use adaptive algorithm for managing client cached locks lru
  lists according to current server load, other client's work
- pattern, memory activities, etc. Both, server and client
- side namespaces provide number of proc tunables for controlling
+ pattern, memory activities, etc. Both, server and client
+ side namespaces provide number of proc tunables for controlling
  things
 
 Severity : enhancement
@@ -331,7 +415,7 @@ Details : set obd_health_check_timeout as 1.5x of obd_timeout
 Severity : normal
 Bugzilla : 12192
 Description: llapi_file_create() does not allow some changes
-Details : add llapi_file_open() that allows specifying the mode and
+Details : add llapi_file_open() that allows specifying the mode and
  open flags, and also returns an open file handle.
 
 Severity : normal
@@ -342,9 +426,9 @@ Details : Remove mnt_lustre_list in vfs_intent-2.6-rhel4.patch.
 Severity : normal
 Bugzilla : 10657
 Description: Add journal checksum support.(Kernel part)
-Details : The journal checksum feature adds two new flags i.e
- JBD2_FEATURE_INCOMPAT_ASYNC_COMMIT and
- JBD2_FEATURE_COMPAT_CHECKSUM. JBD2_FEATURE_CHECKSUM flag
+Details : The journal checksum feature adds two new flags i.e
+ JBD2_FEATURE_INCOMPAT_ASYNC_COMMIT and
+ JBD2_FEATURE_COMPAT_CHECKSUM. JBD2_FEATURE_CHECKSUM flag
  indicates that the commit block contains the checksum for
  the blocks described by the descriptor blocks. Now commit
  record can be sent to disk without waiting for descriptor
@@ -364,7 +448,7 @@ Details : execute lfs setstripe on client
 Severity : major
 Bugzilla : 12223
 Description: mds_obd_create error creating tmp object
-Details : When the user sets quota on root, llog will be affected and can't
+Details : When the user sets quota on root, llog will be affected and can't
  create files and write files.
 
 Severity : normal
@@ -372,7 +456,7 @@ Frequency : Always on ia64 patchless client, and possibly others.
 Bugzilla : 12826
 Description: Add EXPORT_SYMBOL check for node_to_cpumask symbol.
 Details : This allows the patchless client to be loaded on architectures
- without this export.
+ without this export.
 
 Severity : normal
 Bugzilla : 13039
@@ -450,7 +534,7 @@ Details : When generating the bio request for lustre file writes the
 
 Severity : normal
 Bugzilla : 11230
-Description: Tune the kernel for good SCSI performance.
+Description: Tune the kernel for good SCSI performance.
 Details : Set the value of /sys/block/{dev}/queue/max_sectors_kb
  to the value of /sys/block/{dev}/queue/max_hw_sectors_kb
  in mount_lustre.
@@ -492,8 +576,8 @@ Frequency : only on ppc
 Bugzilla : 12234
 Description: /proc/fs/lustre/devices broken on ppc
 Details : The patch as applied to 1.6.2 doesn't look correct for all arches.
- We should make sure the type of 'index' is loff_t and then cast
- explicitly as needed below. Do not assign an explicitly cast
+ We should make sure the type of 'index' is loff_t and then cast
+ explicitly as needed below. Do not assign an explicitly cast
  loff_t to an int.
 
 Severity : normal
@@ -512,14 +596,14 @@ Severity : normal
 Bugzilla : 13304
 Frequency : Always, for kernels after 2.6.16
 Description: Fix warning idr_remove called for id=.. which is not allocated.
-Details : Last kernels save old s_dev before kill super and not allow
+Details : Last kernels save old s_dev before kill super and not allow
  to restore from callback - restore it before call kill_anon_super.
 
 Severity : minor
 Bugzilla : 12948
 Description: buffer overruns could theoretically occur
 Details : llapi_semantic_traverse() modifies the "path" argument by
- appending values to the end of the origin string, and a
+ appending values to the end of the origin string, and a
  overrun may occur. Adding buffer overrun check in liblustreapi.
 
 Severity : normal
@@ -552,12 +636,12 @@ Severity : critical
 Bugzilla : 13751
 Description: Kernel patches update for RHEL5 2.6.18-8.1.14.el5.
 Details : Modify target file & which_patch.
- A flaw was found in the IA32 system call emulation provided
- on AMD64 and Intel 64 platforms. An improperly validated 64-bit
- value could be stored in the %RAX register, which could trigger an
- out-of-bounds system call table access. An untrusted local user
- could exploit this flaw to run code in the kernel
- (ie a root privilege escalation). (CVE-2007-4573).
+ A flaw was found in the IA32 system call emulation provided
+ on AMD64 and Intel 64 platforms. An improperly validated 64-bit
+ value could be stored in the %RAX register, which could trigger an
+ out-of-bounds system call table access. An untrusted local user
+ could exploit this flaw to run code in the kernel
+ (ie a root privilege escalation). (CVE-2007-4573).
 
 Severity : major
 Bugzilla : 13093
@@ -571,7 +655,7 @@ Bugzilla : 13454
 Description: Add jbd statistics patch for RHEL5 and 2.6.18-vanilla
 
 Severity : minor
-Bugzilla : 13732
+Bugzilla : 13732
 Description: change order of libsysio includes
 Details : '#include sysio.h' should always come before '#include xtio.h'
 
@@ -674,6 +758,119 @@ Details : A lot of unlink operations with concurrent I/O can lead to a
  max_rpcs_in_flight per OSC and LDLM_FL_DISCARD_DATA blocking
  callbacks are processed in priority.
 
+Severity : normal
+Bugzilla : 13829
+Description: enable ACLs on MDS by default
+Details : ACLs must be enabled on MDS by default.
+
+Severity : normal
+Frequency : PPC/PPC64 only
+Bugzilla : 14845
+Description: conflicts between asm-ppc64/types.h and lustre_types.h
+Details : fix duplicated definitions between asm-ppc64/types.h and
+ lustre_types.h on PPC.
+
+Severity : normal
+Frequency : PPC/PPC64 only
+Bugzilla : 14844
+Description: asm-ppc/segment.h does not exist
+Details : fix compile issue on PPC.
+
+Severity : normal
+Bugzilla : 14864
+Description: better handle error messages in extents code
+
+Severity : normal
+Frequency : RHEL4 only
+Bugzilla : 14618
+Description: mkfs is very slow on IA64/RHEL4
+Details : A performance regression has been discovered in the MPT Fusion
+ driver between versions 3.02.73rh and 3.02.99.00rh. As a
+ consequence, we have downgraded the MPT Fusion driver in the RHEL4
+ kernel from 3.02.99.00 to 3.02.73 until this problem is fixed.
+
+Severity : enhancement
+Bugzilla : 14729
+Description: SNMP support enhancement
+Details : Adding total number of sampled request for an MDS node in snmp
+ support.
+
+Severity : enhancement
+Bugzilla : 14748
+Description: Optimize ldlm waiting list processing for PR extent locks
+Details : When processing waiting list for read extent lock and meeting read
+ lock that is same or wider to it that is not contended, skip
+ processing rest of the list and immediatelly return current
+ status of conflictness, since we are guaranteed there are no
+ conflicting locks in the rest of the list.
+
+Severity : normal
+Bugzilla : 14774
+Description: Time out and refuse to reconnect
+Details : When the failover node is the primary node, it is possible
+ to have two identical connections in imp_conn_list. We must
+ compare not conn's pointers but NIDs, otherwise we can defeat
+ connection throttling.
+
+Severity : normal
+Bugzilla : 13821
+Description: port llog fixes from b1_6 into HEAD
+Details : Port llog reference couting and some llog cleanups from b1_6
+ (bug 10800) into HEAD, for protect from panic and access to already
+ free llog structures.
+
+Severity : normal
+Bugzilla : 14483
+Description: Detect stride IO mode in read-ahead
+Details : When a client does stride read, read-ahead should detect that and
+ read-ahead pages according to the detected stride pattern.
+
+Severity : normal
+Bugzilla : 13805
+Description: data checksumming impacts single node performance
+Details : add support for several checksum algorithm. Currently, only CRC32
+ and Adler-32 are supported. The checksum type can be changed on
+ the fly via /proc/fs/lustre/osc/*/checksum_type.
+
+Severity : normal
+Bugzilla : 14648
+Description: use adler32 for page checksums
+Details : when available, use the Adler-32 algorithm instead of CRC32 for
+ page checksums.
+
+Severity : normal
+Bugzilla : 15033
+Description: build for x2 fails
+Details : fix compile issue on Cray systems.
+
+Severity : normal
+Bugzilla : 14379
+Description: Properly match for duplicate locks
+Details : Due to different lock order from skiplists code, we need to
+ traverse entire list for now
+
+Severity : normal
+Frequency : only on PPC/SLES10
+Bugzilla : 14855
+Description: "BITS_PER_LONG is not 32 or 64" in linux/idr.h
+Details : On SLES10/PPC, fs.h includes idr.h which requires BITS_PER_LONG to
+ be defined. Add a hack in mkfs_lustre.c to work around this compile
+ issue.
+
+Severity : normal
+Bugzilla : 14257
+Description: LASSERT on MDS when client holding flock lock dies
+Details : ldlm pool logic depends on number of granted locks equal to
+ number of released locks which is not true for flock locks, so
+ just exclude such locks from consideration.
+
+Severity : normal
+Bugzilla : 15188
+Description: MDS deadlock with many ll_sync_lov threads and I/O stalled
+Details : Use fsfilt_sync() for both the whole filesystem sync and
+ individual file sync to eliminate dangerous inode locking
+ with I_LOCK that can lead to a deadlock.
+
 --------------------------------------------------------------------------------
 2007-08-10 Cluster File Systems, Inc.