diff --git a/lustre/ChangeLog b/lustre/ChangeLog

index 8ef4590..2897524 100644
--- a/lustre/ChangeLog
+++ b/lustre/ChangeLog
@@ -8,15 +8,15 @@ tbd         Cluster File Systems, Inc. <info@clusterfs.com>
          this release.  See https://mail.clusterfs.com/wikis/lustre/MountConf
          for details.
         * Support for kernels:
-        2.6.9-42.0.3EL (RHEL 4)
-        2.6.5-7.276 (SLES 9)
+        2.6.9-42.0.8EL (RHEL 4)
+        2.6.5-7.283 (SLES 9)
          2.4.21-47.0.1.EL (RHEL 3)
          2.6.12.6 vanilla (kernel.org)
          2.6.16.21-0.8 (SLES10)
         * Client support for unpatched kernels:
          (see https://mail.clusterfs.com/wikis/lustre/PatchlessClient)
          2.6.16 - 2.6.19 vanilla (kernel.org)
-        2.6.9-42.0.3EL (RHEL 4)
+        2.6.9-42.0.8EL (RHEL 4)
         * Recommended e2fsprogs version: 1.39.cfs2-0
         * bug fixes
  
@@ -81,9 +81,9 @@ Details    : Changes the blocksize for regular files to be 2x RPC size,
  Severity   : enhancement
  Bugzilla   : 9293
  Description: Multiple MD RPCs in flight.
-Details    : Further unserialise some read-only MDS RPCs - learn about intents.
-            To avoid overly-overloading MDS, introduce a limit on number of
-            MDS RPCs in flight for a single client and add /proc controls
+Details    : Further unserialise some read-only MDT RPCs - learn about intents.
+            To avoid overly-overloading MDT, introduce a limit on number of
+            MDT RPCs in flight for a single client and add /proc controls
              to adjust this limit.
  
  Severity   : enhancement
@@ -104,8 +104,8 @@ Details    : Add ldlm and operations statistics for each client in
         
  Severity   : enhancement
  Bugzilla   : 22486
-Description: mds statistics
-Details    : Add detailed mds operations statistics in
+Description: improved MDT statistics
+Details    : Add detailed MDT operations statistics in
              /proc/fs/lustre/mds/*/stats
         
  Severity   : enhancement
@@ -113,7 +113,7 @@ Bugzilla   : 10968
  Description: VFS operations stats
  Details    : Add client VFS call stats, trackable by pid, ppid, or gid
              /proc/fs/lustre/llite/*/vfs_ops_stats
-            /proc/fs/lustre/llite/*/track_[pid|ppid|gid]
+            /proc/fs/lustre/llite/*/vfs_track_[pid|ppid|gid]
  
  Severity   : minor
  Frequency  : always
@@ -153,24 +153,24 @@ Details    : The new msg_v2 system had some failures in mixed-endian
  Severity   : enhancement
  Bugzilla   : 11229
  Description: Easy OST removal
-Details           : OSTs can be permanently deactivated with e.g. 'lctl
+Details    : OSTs can be permanently deactivated with e.g. 'lctl
              conf_param lustre-OST0001.osc.active=0'    
  
  Severity   : enhancement
  Bugzilla   : 11335
  Description: MGS proc entries
-Details           : Added basic proc entries for the MGS showing what filesystems
+Details    : Added basic proc entries for the MGS showing what filesystems
              are served.
  
  Severity   : enhancement
  Bugzilla   : 10998
  Description: provide MGS failover
-Details           : Added config lock reacquisition after MGS server failover. 
+Details    : Added config lock reacquisition after MGS server failover. 
         
  Severity   : enhancement
  Bugzilla   : 11461
  Description: add Linux 2.4 support
-Details           : Added support for RHEL 2.4.21 kernel for 1.6 servers and clients
+Details    : Added support for RHEL 2.4.21 kernel for 1.6 servers and clients
  
  Severity   : normal
  Bugzilla   : 11330
@@ -198,33 +198,6 @@ Details    : Grouping plain/inodebits in granted list by their request modes
              and bits policy, thus improving the performance of search through
              the granted list.
  
-Severity   : major          
-Frequency  : only if OST filesystem is corrupted
-Bugzilla   : 9829
-Description: client incorrectly hits assertion in ptlrpc_replay_req()
-Details    : for a short time RPCs with bulk IO are in the replay list,
-            but replay of bulk IOs is unimplemented.  If the OST filesystem
-            is corrupted due to disk cache incoherency and then replay is
-            started it is possible to trip an assertion.  Avoid putting
-            committed RPCs into the replay list at all to avoid this issue.
-
-Severity   : major
-Frequency  : liblustre (e.g. catamount) on a large cluster with >= 8 OSTs
-             per OSS
-Bugzilla   : 11684
-Description: System hang on startup
-Details    : This bug allowed the liblustre (e.g. catamount) client to
-             return to the app before handling all startup RPCs.  This
-            could leave the node unresponsive to lustre network traffic
-            and manifested as a server ptllnd timeout.
-
-Severity   : enhancement
-Bugzilla   : 11667
-Description: Add "/proc/sys/lustre/debug_peer_on_timeout"
-             (liblustre envirable: LIBLUSTRE_DEBUG_PEER_ON_TIMEOUT)
-            boolean to control whether to print peer debug info when a
-            client's RPC times out.
-
  Severity   : minor
  Frequency  : only for kernels with patches from Lustre below 1.4.3  
  Bugzilla   : 11248
@@ -239,17 +212,65 @@ Details    : During a commanded failover stop, we set the disk device
              read-only while the server shuts down. We now also set any
              external journal device read-only at the same time. 
         
+Severity   : minor
+Frequency  : when upgrading from 1.4 while trying to change parameters 
+Bugzilla   : 11692
+Description: The wrong (new) MDC name was used when setting parameters for
+            upgraded MDT's.  Also allows changing of OSC (and MDC)
+            parameters if --writeconf is specified at tunefs upgrade time.
+
+Severity   : major
+Frequency  : when setting specific ost indicies
+Bugzilla   : 11149
+Description: QOS code breaks on skipped indicies
+Details    : Add checks for missing OST indicies in the QOS code, so OSTs
+            created with --index need not be sequential.
+
+Severity   : normal
+Frequency  : always
+Bugzilla   : 3244
+Description: Addition of EXT3_FEATURE_RO_COMPAT_DIR_NLINKS flag for
+            > 32000 subdirectories
+Details    : Add EXT3_FEATURE_RO_COMPAT_DIR_NLINK flag to 
+            EXT3_FEATURE_RO_COMPAT_SUPP. This flag will be set whenever
+            subdirectory count crosses 32000. This will aid e2fsck to
+            correctly handle more than 32000 subdirectories.
+            
+Severity   : normal
+Frequency  : always
+Bugzilla   : 11090
+Description: versioning check is incomplete
+Details    : Checking the version difference of client vs. server, report
+             error if the gap is too big.
+
  ------------------------------------------------------------------------------
  
  TBD         Cluster File Systems, Inc. <info@clusterfs.com>
         * version 1.4.10
         * Support for kernels:
-        2.6.9-42.0.3EL (RHEL 4)
+        2.6.16.21-0.8 (SLES10)
+        2.6.9-42.0.8EL (RHEL 4)
          2.6.5-7.276 (SLES 9)
          2.4.21-47.0.1.EL (RHEL 3)
          2.6.12.6 vanilla (kernel.org)
         * Recommended e2fsprogs version: 1.39.cfs2-0
  
+Severity   : major
+Frequency  : liblustre (e.g. catamount) on a large cluster with >= 8 OSTs/OSS
+Bugzilla   : 11684
+Description: System hang on startup
+Details    : This bug allowed the liblustre (e.g. catamount) client to
+            return to the app before handling all startup RPCs.  This
+            could leave the node unresponsive to lustre network traffic
+            and manifested as a server ptllnd timeout.
+
+Severity   : enhancement
+Bugzilla   : 11667
+Description: Add "/proc/sys/lustre/debug_peer_on_timeout"
+            (liblustre envirable: LIBLUSTRE_DEBUG_PEER_ON_TIMEOUT)
+            boolean to control whether to print peer debug info when a
+            client's RPC times out.
+
  Severity   : normal
  Frequency  : always
  Bugzilla   : 10214
@@ -277,7 +298,7 @@ Frequency  : Only for files larger than 4GB on 32-bit clients.
  Bugzilla   : 11237
  Description: improperly doing page alignment of locks
  Details    : Modify lustre core code to use CFS_PAGE_* defines instead of 
-            PAGE_*. Make CFS_PAGE_MASK 64bit long.
+            PAGE_*. Make CFS_PAGE_MASK a 64-bit mask.
  
  Severity   : normal
  Frequency  : rarely
@@ -294,6 +315,16 @@ Description: Crash on NFS re-export node
  Details    : under very unusual load conditions an assertion is hit in
              ll_intent_file_open()
  
+Severity   : major          
+Frequency  : only if OST filesystem is corrupted
+Bugzilla   : 9829
+Description: client incorrectly hits assertion in ptlrpc_replay_req()
+Details    : for a short time RPCs with bulk IO are in the replay list,
+            but replay of bulk IOs is unimplemented.  If the OST filesystem
+            is corrupted due to disk cache incoherency and then replay is
+            started it is possible to trip an assertion.  Avoid putting
+            committed RPCs into the replay list at all to avoid this issue.
+
  Severity   : normal
  Frequency  : always
  Bugzilla   : 10901
@@ -303,29 +334,30 @@ Details    : Large single O_DIRECT read and write calls can fail to allocate
              allocation failure the allocation is retried with a smaller
              buffer and broken into smaller requests.
  
-Severity   : normal
-Frequency  : always
-Bugzilla   : 3244
-Description: Addition of EXT3_FEATURE_RO_COMPAT_DIR_NLINKS flag for
-            > 32000 subdirectories
-Details    : Add EXT3_FEATURE_RO_COMPAT_DIR_NLINK flag to 
-             EXT3_FEATURE_RO_COMPAT_SUPP. This flag will be set whenever
-             subdirectory count crosses 32000. This will aid e2fsck to
-             correctly handle more than 32000 subdirectories.
-
  ------------------------------------------------------------------------------
  
-TBD         Cluster File Systems, Inc. <info@clusterfs.com>
+2006-02-09  Cluster File Systems, Inc. <info@clusterfs.com>
         * version 1.4.9
         * Support for kernels:
+        2.6.16.21-0.8 (SLES10)
          2.6.9-42.0.3EL (RHEL 4)
          2.6.5-7.276 (SLES 9)
-        2.4.21-40.0.1.EL (RHEL 3)
+        2.4.21-47.0.1.EL (RHEL 3)
          2.6.12.6 vanilla (kernel.org)
         * bug fixes
  
+       * The backwards-compatible /proc/sys/portals symlink has been removed
+        in this release.  Before upgrading, please ensure that you change
+        any configuration scripts or /etc/sysctl.conf files that access
+        /proc/sys/portals/* or sysctl portals.* to use the corresponding
+        entry in /proc/sys/lnet or sysctl lnet.*.  This change can be made
+        in advance of the upgrade on any system running Lustre 1.4.6 or
+        newer, since /proc/sys/lnet was added in that version.
+       * Note that reiserfs quotas are temporarily disabled on SLES 10 in this
+        kernel.
+
  Severity   : critical
-Frequency  : rare
+Frequency  : MDS failover only, very rarely
  Bugzilla   : 11125
  Description: "went back in time" messages on mds failover
  Details    : The greatest transno may be lost when the current operation
@@ -393,11 +425,10 @@ Frequency  : MDS failover only, very rarely
  Bugzilla   : 11277
  Description: clients may get ASSERTION(granted_lock != NULL)
  Details    : When request was taking a long time, and a client was resending
-            a getattr by name lock request. The were multiple lock
-            requests with the same client lock handle and
-            mds_getattr_name->fixup_handle_for_resent_request found one
-            of the lock handles but later failed with
-            ASSERTION(granted_lock != NULL).
+            a getattr by name lock request. The were multiple lock requests
+            with the same client lock handle and
+            mds_getattr_name->fixup_handle_for_resent_request found one of the
+            lock handles but later failed with ASSERTION(granted_lock != NULL).
  
  Severity   : major
  Frequency  : rare
@@ -428,14 +459,15 @@ Frequency  : NFS re-export or patchless client
  Bugzilla   : 10796
  Description: Various nfs/patchless fixes.
  Details    : fixes reuse disconected alias for lookup process - this fixes
-            warning "find_exported_dentry: npd != pd", fix permission
-            error with open files at nfs.
+            warning "find_exported_dentry: npd != pd",
+            fix permission error with open files at nfs.
+            fix apply umaks when do revalidate.
  
  Severity   : normal
  Frequency  : occasional
  Bugzilla   : 11191
  Description: Crash on NFS re-export node
-Details    : call clear_page on wrong pointer triggered oops in 
+Details    : calling clear_page() on the wrong pointer triggered oops in 
              generic_mapping_read().
  
  Severity   : normal
@@ -454,19 +486,19 @@ Details    : If only a small amount of IO is done to the RAID device before
  
  Severity   : major
  Frequency  : depends on arch, kernel and compiler version, always on sles10
-             kernel and x86_64
+            kernel and x86_64
  Bugzilla   : 11562
  Description: recursive or deep enough symlinks cause stack overflow
  Details    : getting rid of large stack-allocated variable in
-             __vfs_follow_link
+            __vfs_follow_link
  
  Severity   : minor
  Frequency  : depends on hardware
  Bugzilla   : 11540
  Description: lustre write performance loss in the SLES10 kernel
  Details    : the performance loss is caused by using of write barriers in the
-             ext3 code. The SLES10 kernel turns barrier support on by
-             default. The fix is to undo that change for ldiskfs.
+            ext3 code. The SLES10 kernel turns barrier support on by
+            default. The fix is to undo that change for ldiskfs.
  
  ------------------------------------------------------------------------------
  
@@ -495,6 +527,13 @@ Details    : When reading per-device statfs data from /proc, in the
              {kbytes,files}_{total,free,avail} files, it may appear
              as zero or be out of date.
  
+Severity   : minor
+Frequency  : systems with MD RAID1 external journal devices
+Bugzilla   : 10832
+Description: lconf's call to blkid is confused by RAID1 journal devices
+Details    : Use the "blkid -l" flag to locate the MD RAID device instead
+            of returning all block devices that match the journal UUID.
+
  Severity   : normal
  Frequency  : always, for aggregate stripe size over 4GB
  Bugzilla   : 10725
@@ -526,7 +565,6 @@ Details    : With filesystems mounted using the "extents" option (2.6 kernels)
              the truncated size.  No file data is lost.
  
  Severity   : enhancement
-Frequency  : liblustre only    
  Bugzilla   : 10452
  Description: Allow recovery/failover for liblustre clients.
  Details    : liblustre clients were unaware of failover configurations until
@@ -581,7 +619,7 @@ Details    : Re-validate root's dentry in ll_lookup_it to avoid having it
              invalid by the follow_mount time.
  
  Severity   : minor
-Frequency  : rare
+Frequency  : liblustre clients only
  Bugzilla   : 10883
  Description: Race in 'instant cancel' lock handling could lead to such locks
              never to be granted in case of SMP MDS