Whamcloud - gitweb
LUDOC-263 lfsck: review and update LFSCK documentation. 66/12966/5
author    Richard Henwood <richard.henwood@intel.com>
          Fri, 5 Dec 2014 20:20:37 +0000 (14:20 -0600)
committer Richard Henwood <richard.henwood@intel.com>
          Wed, 11 Feb 2015 20:34:02 +0000 (20:34 +0000)
LFSCK is functionally complete for release 2.7. Update the
documentation to reflect this. Wrap files for readability.

Change-Id: I16462cc2a27b14e4652e19378642af4c306a165b
Signed-off-by: Richard Henwood <richard.henwood@intel.com>
Reviewed-on: http://review.whamcloud.com/12966
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
BackupAndRestore.xml
Glossary.xml
TroubleShootingRecovery.xml
UnderstandingLustre.xml
UpgradingLustre.xml

index 83202c3..c8a955f 100644
-<?xml version='1.0' encoding='UTF-8'?>
-<chapter xmlns="http://docbook.org/ns/docbook" xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US" xml:id="backupandrestore">
-  <title xml:id="backupandrestore.title">Backing Up and Restoring a File System</title>
-  <para>This chapter describes how to backup and restore at the file system-level, device-level and
-    file-level in a Lustre file system. Each backup approach is described in the the following
-    sections:</para>
+<?xml version='1.0' encoding='utf-8'?>
+<chapter xmlns="http://docbook.org/ns/docbook"
+xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US"
+xml:id="backupandrestore">
+  <title xml:id="backupandrestore.title">Backing Up and Restoring a File
+  System</title>
+  <para>This chapter describes how to back up and restore at the file
+  system-level, device-level and file-level in a Lustre file system. Each
+  backup approach is described in the following sections:</para>
   <itemizedlist>
     <listitem>
-      <para><xref linkend="dbdoclet.50438207_56395"/></para>
+      <para>
+        <xref linkend="dbdoclet.50438207_56395" />
+      </para>
     </listitem>
     <listitem>
-      <para><xref linkend="dbdoclet.50438207_71633"/></para>
+      <para>
+        <xref linkend="dbdoclet.50438207_71633" />
+      </para>
     </listitem>
     <listitem>
-      <para><xref linkend="dbdoclet.50438207_21638"/></para>
+      <para>
+        <xref linkend="dbdoclet.50438207_21638" />
+      </para>
     </listitem>
     <listitem>
-      <para><xref linkend="dbdoclet.50438207_22325"/></para>
+      <para>
+        <xref linkend="dbdoclet.50438207_22325" />
+      </para>
     </listitem>
     <listitem>
-      <para><xref linkend="dbdoclet.50438207_31553"/></para>
+      <para>
+        <xref linkend="dbdoclet.50438207_31553" />
+      </para>
     </listitem>
   </itemizedlist>
   <section xml:id="dbdoclet.50438207_56395">
-      <title>
-          <indexterm><primary>backup</primary></indexterm>
-          <indexterm><primary>restoring</primary><see>backup</see></indexterm>
-          <indexterm><primary>LVM</primary><see>backup</see></indexterm>
-          <indexterm><primary>rsync</primary><see>backup</see></indexterm>
-          Backing up a File System</title>
-    <para>Backing up a complete file system gives you full control over the files to back up, and
-      allows restoration of individual files as needed. File system-level backups are also the
-      easiest to integrate into existing backup solutions.</para>
-    <para>File system backups are performed from a Lustre client (or many clients working parallel in different directories) rather than on individual server nodes; this is no different than backing up any other file system.</para>
-    <para>However, due to the large size of most Lustre file systems, it is not always possible to get a complete backup. We recommend that you back up subsets of a file system. This includes subdirectories of the entire file system, filesets for a single user, files incremented by date, and so on.</para>
+    <title>
+    <indexterm>
+      <primary>backup</primary>
+    </indexterm>
+    <indexterm>
+      <primary>restoring</primary>
+      <see>backup</see>
+    </indexterm>
+    <indexterm>
+      <primary>LVM</primary>
+      <see>backup</see>
+    </indexterm>
+    <indexterm>
+      <primary>rsync</primary>
+      <see>backup</see>
+    </indexterm>Backing Up a File System</title>
+    <para>Backing up a complete file system gives you full control over the
+    files to back up, and allows restoration of individual files as needed.
+    File system-level backups are also the easiest to integrate into existing
+    backup solutions.</para>
+    <para>File system backups are performed from a Lustre client (or many
+    clients working in parallel in different directories) rather than on
+    individual server nodes; this is no different from backing up any other
+    file system.</para>
+    <para>However, due to the large size of most Lustre file systems, it is not
+    always possible to get a complete backup. We recommend that you back up
+    subsets of a file system. This includes subdirectories of the entire file
+    system, filesets for a single user, files incremented by date, and so
+    on.</para>
     <note>
-      <para>In order to allow the file system namespace to scale for future applications, Lustre
-        software release 2.x internally uses a 128-bit file identifier for all files. To interface
-        with user applications, the Lustre software presents 64-bit inode numbers for the
-          <literal>stat()</literal>, <literal>fstat()</literal>, and <literal>readdir()</literal>
-        system calls on 64-bit applications, and 32-bit inode numbers to 32-bit applications.</para>
-      <para>Some 32-bit applications accessing Lustre file systems (on both 32-bit and 64-bit CPUs)
-        may experience problems with the <literal>stat()</literal>, <literal>fstat()</literal>
-          or<literal> readdir()</literal> system calls under certain circumstances, though the
-        Lustre client should return 32-bit inode numbers to these applications.</para>
-      <para>In particular, if the Lustre file system is exported from a 64-bit client via NFS to a
-        32-bit client, the Linux NFS server will export 64-bit inode numbers to applications running
-        on the NFS client. If the 32-bit applications are not compiled with Large File Support
-        (LFS), then they return <literal>EOVERFLOW</literal> errors when accessing the Lustre files.
-        To avoid this problem, Linux NFS clients can use the kernel command-line option
-          &quot;<literal>nfs.enable_ino64=0</literal>&quot; in order to force the NFS client to
-        export 32-bit inode numbers to the client.</para>
-      <para><emphasis role="bold">Workaround</emphasis>: We very strongly recommend that backups using <literal>tar(1)</literal> and other utilities that depend on the inode number to uniquely identify an inode to be run on 64-bit clients. The 128-bit Lustre file identifiers cannot be uniquely mapped to a 32-bit inode number, and as a result these utilities may operate incorrectly on 32-bit clients.</para>
+      <para>In order to allow the file system namespace to scale for future
+      applications, Lustre software release 2.x internally uses a 128-bit file
+      identifier for all files. To interface with user applications, the Lustre
+      software presents 64-bit inode numbers for the 
+      <literal>stat()</literal>, 
+      <literal>fstat()</literal>, and 
+      <literal>readdir()</literal> system calls on 64-bit applications, and
+      32-bit inode numbers to 32-bit applications.</para>
+      <para>Some 32-bit applications accessing Lustre file systems (on both
+      32-bit and 64-bit CPUs) may experience problems with the 
+      <literal>stat()</literal>, 
+      <literal>fstat()</literal> or
+      <literal>readdir()</literal> system calls under certain circumstances,
+      though the Lustre client should return 32-bit inode numbers to these
+      applications.</para>
+      <para>In particular, if the Lustre file system is exported from a 64-bit
+      client via NFS to a 32-bit client, the Linux NFS server will export
+      64-bit inode numbers to applications running on the NFS client. If the
+      32-bit applications are not compiled with Large File Support (LFS), then
+      they return 
+      <literal>EOVERFLOW</literal> errors when accessing the Lustre files. To
+      avoid this problem, Linux NFS clients can use the kernel command-line
+      option "<literal>nfs.enable_ino64=0</literal>" to force the NFS client
+      to present 32-bit inode numbers to applications.</para>
+      <para>
+      <emphasis role="bold">Workaround</emphasis>: We very strongly recommend
+      that backups using 
+      <literal>tar(1)</literal> and other utilities that depend on the inode
+      number to uniquely identify an inode be run on 64-bit clients. The
+      128-bit Lustre file identifiers cannot be uniquely mapped to a 32-bit
+      inode number, and as a result these utilities may operate incorrectly on
+      32-bit clients.</para>
     </note>
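As an illustration of the NFS-client workaround above (the bootloader file and the surrounding option values are assumptions, not from the manual), the kernel command-line option would typically be set via the bootloader configuration on the 32-bit NFS client:

```
# /etc/default/grub on the NFS client (illustrative values):
GRUB_CMDLINE_LINUX="rhgb quiet nfs.enable_ino64=0"
# Regenerate the bootloader configuration and reboot for the option to
# take effect.
```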
     <section remap="h3">
-      <title><indexterm><primary>backup</primary><secondary>rsync</secondary></indexterm>Lustre_rsync</title>
-      <para>The <literal>lustre_rsync </literal>feature keeps the entire file system in sync on a backup by replicating the file system&apos;s changes to a second file system (the second file system need not be a Lustre file system, but it must be sufficiently large). <literal>lustre_rsync </literal>uses Lustre changelogs to efficiently synchronize the file systems without having to scan (directory walk) the Lustre file system. This efficiency is critically important for large file systems, and distinguishes the Lustre <literal>lustre_rsync</literal> feature from other replication/backup solutions.</para>
+      <title>
+      <indexterm>
+        <primary>backup</primary>
+        <secondary>rsync</secondary>
+      </indexterm>Lustre_rsync</title>
+      <para>The 
+      <literal>lustre_rsync</literal> feature keeps the entire file system in
+      sync on a backup by replicating the file system's changes to a second
+      file system (the second file system need not be a Lustre file system, but
+      it must be sufficiently large). 
+      <literal>lustre_rsync</literal> uses Lustre changelogs to efficiently
+      synchronize the file systems without having to scan (directory walk) the
+      Lustre file system. This efficiency is critically important for large
+      file systems, and distinguishes the Lustre 
+      <literal>lustre_rsync</literal> feature from other replication/backup
+      solutions.</para>
       <section remap="h4">
-          <title><indexterm><primary>backup</primary><secondary>rsync</secondary><tertiary>using</tertiary></indexterm>Using Lustre_rsync</title>
-        <para>The <literal>lustre_rsync</literal> feature works by periodically running <literal>lustre_rsync</literal>, a userspace program used to synchronize changes in the Lustre file system onto the target file system. The <literal>lustre_rsync</literal> utility keeps a status file, which enables it to be safely interrupted and restarted without losing synchronization between the file systems.</para>
-        <para>The first time that <literal>lustre_rsync</literal> is run, the user must specify a set of parameters for the program to use. These parameters are described in the following table and in <xref linkend="dbdoclet.50438219_63667"/>. On subsequent runs, these parameters are stored in the the status file, and only the name of the status file needs to be passed to <literal>lustre_rsync</literal>.</para>
-        <para>Before using <literal>lustre_rsync</literal>:</para>
+        <title>
+        <indexterm>
+          <primary>backup</primary>
+          <secondary>rsync</secondary>
+          <tertiary>using</tertiary>
+        </indexterm>Using Lustre_rsync</title>
+        <para>The 
+        <literal>lustre_rsync</literal> feature works by periodically running 
+        <literal>lustre_rsync</literal>, a userspace program used to
+        synchronize changes in the Lustre file system onto the target file
+        system. The 
+        <literal>lustre_rsync</literal> utility keeps a status file, which
+        enables it to be safely interrupted and restarted without losing
+        synchronization between the file systems.</para>
+        <para>The first time that 
+        <literal>lustre_rsync</literal> is run, the user must specify a set of
+        parameters for the program to use. These parameters are described in
+        the following table and in 
+        <xref linkend="dbdoclet.50438219_63667" />. On subsequent runs, these
+        parameters are stored in the status file, and only the name of the
+        status file needs to be passed to 
+        <literal>lustre_rsync</literal>.</para>
+        <para>Before using 
+        <literal>lustre_rsync</literal>:</para>
         <itemizedlist>
           <listitem>
-            <para>Register the changelog user. For details, see the <xref linkend="systemconfigurationutilities"/> (<literal>changelog_register</literal>) parameter in the <xref linkend="systemconfigurationutilities"/> (<literal>lctl</literal>).</para>
+            <para>Register the changelog user. For details, see the 
+            <xref linkend="systemconfigurationutilities" /> (
+            <literal>changelog_register</literal>) parameter in the 
+            <xref linkend="systemconfigurationutilities" /> (
+            <literal>lctl</literal>).</para>
           </listitem>
         </itemizedlist>
         <para>- AND -</para>
         <itemizedlist>
           <listitem>
-            <para>Verify that the Lustre file system (source) and the replica file system (target) are identical <emphasis>before</emphasis> registering the changelog user. If the file systems are discrepant, use a utility, e.g. regular <literal>rsync</literal> (not <literal>lustre_rsync</literal>), to make them identical.</para>
+            <para>Verify that the Lustre file system (source) and the replica
+            file system (target) are identical 
+            <emphasis>before</emphasis> registering the changelog user. If
+            the file systems differ, use a utility such as regular 
+            <literal>rsync</literal> (not 
+            <literal>lustre_rsync</literal>) to make them identical.</para>
           </listitem>
         </itemizedlist>
-        <para>The <literal>lustre_rsync</literal> utility uses the following parameters:</para>
+        <para>The 
+        <literal>lustre_rsync</literal> utility uses the following
+        parameters:</para>
         <informaltable frame="all">
           <tgroup cols="2">
-            <colspec colname="c1" colwidth="3*"/>
-            <colspec colname="c2" colwidth="10*"/>
+            <colspec colname="c1" colwidth="3*" />
+            <colspec colname="c2" colwidth="10*" />
             <thead>
               <row>
                 <entry>
-                  <para><emphasis role="bold">Parameter</emphasis></para>
+                  <para>
+                    <emphasis role="bold">Parameter</emphasis>
+                  </para>
                 </entry>
                 <entry>
-                  <para><emphasis role="bold">Description</emphasis></para>
+                  <para>
+                    <emphasis role="bold">Description</emphasis>
+                  </para>
                 </entry>
               </row>
             </thead>
             <tbody>
               <row>
                 <entry>
-                  <para> <literal>--source=<replaceable>src</replaceable></literal></para>
+                  <para>
+                    <literal>--source=<replaceable>src</replaceable></literal>
+                  </para>
                 </entry>
                 <entry>
-                  <para>The path to the root of the Lustre file system (source) which will be synchronized. This is a mandatory option if a valid status log created during a previous synchronization operation (<literal>--statuslog</literal>) is not specified.</para>
+                  <para>The path to the root of the Lustre file system (source)
+                  which will be synchronized. This is a mandatory option if a
+                  valid status log created during a previous synchronization
+                  operation (
+                  <literal>--statuslog</literal>) is not specified.</para>
                 </entry>
               </row>
               <row>
                 <entry>
-                  <para> <literal>--target=<replaceable>tgt</replaceable></literal></para>
+                  <para>
+                    <literal>--target=<replaceable>tgt</replaceable></literal>
+                  </para>
                 </entry>
                 <entry>
-                  <para>The path to the root where the source file system will be synchronized (target). This is a mandatory option if the status log created during a previous synchronization operation (<literal>--statuslog</literal>) is not specified. This option can be repeated if multiple synchronization targets are desired.</para>
+                  <para>The path to the root where the source file system will
+                  be synchronized (target). This is a mandatory option if the
+                  status log created during a previous synchronization
+                  operation (
+                  <literal>--statuslog</literal>) is not specified. This option
+                  can be repeated if multiple synchronization targets are
+                  desired.</para>
                 </entry>
               </row>
               <row>
                 <entry>
-                  <para> <literal>--mdt=<replaceable>mdt</replaceable></literal></para>
+                  <para>
+                    <literal>--mdt=<replaceable>mdt</replaceable></literal>
+                  </para>
                 </entry>
                 <entry>
-                  <para>The metadata device to be synchronized. A changelog user must be registered for this device. This is a mandatory option if a valid status log created during a previous synchronization operation (<literal>--statuslog</literal>) is not specified.</para>
+                  <para>The metadata device to be synchronized. A changelog
+                  user must be registered for this device. This is a mandatory
+                  option if a valid status log created during a previous
+                  synchronization operation (
+                  <literal>--statuslog</literal>) is not specified.</para>
                 </entry>
               </row>
               <row>
                 <entry>
-                  <para> <literal>--user=<replaceable>userid</replaceable></literal></para>
+                  <para>
+                    <literal>--user=<replaceable>userid</replaceable></literal>
+                  </para>
                 </entry>
                 <entry>
-                  <para>The changelog user ID for the specified MDT. To use <literal>lustre_rsync</literal>, the changelog user must be registered. For details, see the <literal>changelog_register</literal> parameter in <xref linkend="systemconfigurationutilities"/> (<literal>lctl</literal>). This is a mandatory option if a valid status log created during a previous synchronization operation (<literal>--statuslog</literal>) is not specified.</para>
+                  <para>The changelog user ID for the specified MDT. To use 
+                  <literal>lustre_rsync</literal>, the changelog user must be
+                  registered. For details, see the 
+                  <literal>changelog_register</literal> parameter in 
+                  <xref linkend="systemconfigurationutilities" /> (
+                  <literal>lctl</literal>). This is a mandatory option if a
+                  valid status log created during a previous synchronization
+                  operation (
+                  <literal>--statuslog</literal>) is not specified.</para>
                 </entry>
               </row>
               <row>
                 <entry>
-                  <para> <literal>--statuslog=<replaceable>log</replaceable></literal></para>
+                  <para>
+                    <literal>--statuslog=<replaceable>log</replaceable></literal>
+                  </para>
                 </entry>
                 <entry>
-                  <para>A log file to which synchronization status is saved. When the <literal>lustre_rsync</literal> utility starts, if the status log from a previous synchronization operation is specified, then the state is read from the log and otherwise mandatory <literal>--source</literal>, <literal>--target</literal> and <literal>--mdt</literal> options can be skipped. Specifying the <literal>--source</literal>, <literal>--target</literal> and/or <literal>--mdt</literal> options, in addition to the <literal>--statuslog</literal> option, causes the specified parameters in the status log to be overridden. Command line options take precedence over options in the status log.</para>
+                  <para>A log file to which synchronization status is saved.
+                  When the 
+                  <literal>lustre_rsync</literal> utility starts, if the status
+                  log from a previous synchronization operation is specified,
+                  the state is read from the log, and the otherwise-mandatory 
+                  <literal>--source</literal>, 
+                  <literal>--target</literal> and 
+                  <literal>--mdt</literal> options can be skipped. Specifying
+                  the 
+                  <literal>--source</literal>, 
+                  <literal>--target</literal> and/or 
+                  <literal>--mdt</literal> options in addition to the 
+                  <literal>--statuslog</literal> option causes those
+                  parameters in the status log to be overridden. Command line
+                  options take precedence over options in the status
+                  log.</para>
                 </entry>
               </row>
               <row>
                 <entry>
-                  <literal> --xattr <replaceable>yes|no</replaceable> </literal>
+                  <literal>--xattr <replaceable>yes|no</replaceable></literal>
                 </entry>
                 <entry>
-                  <para>Specifies whether extended attributes (<literal>xattrs</literal>) are synchronized or not. The default is to synchronize extended attributes.</para>
-                  <para><note>
-                      <para>Disabling xattrs causes Lustre striping information not to be synchronized.</para>
-                    </note></para>
+                  <para>Specifies whether extended attributes (
+                  <literal>xattrs</literal>) are synchronized or not. The
+                  default is to synchronize extended attributes.</para>
+                  <para>
+                    <note>
+                      <para>Disabling xattrs causes Lustre striping information
+                      not to be synchronized.</para>
+                    </note>
+                  </para>
                 </entry>
               </row>
               <row>
                 <entry>
-                  <para> <literal>--verbose</literal></para>
+                  <para>
+                    <literal>--verbose</literal>
+                  </para>
                 </entry>
                 <entry>
                   <para>Produces verbose output.</para>
                 </entry>
               </row>
               <row>
                 <entry>
-                  <para> <literal>--dry-run</literal></para>
+                  <para>
+                    <literal>--dry-run</literal>
+                  </para>
                 </entry>
                 <entry>
-                  <para>Shows the output of <literal>lustre_rsync</literal> commands (<literal>copy</literal>, <literal>mkdir</literal>, etc.) on the target file system without actually executing them.</para>
+                  <para>Shows the output of 
+                  <literal>lustre_rsync</literal> commands (
+                  <literal>copy</literal>, 
+                  <literal>mkdir</literal>, etc.) on the target file system
+                  without actually executing them.</para>
                 </entry>
               </row>
               <row>
                 <entry>
-                  <para> <literal>--abort-on-err</literal></para>
+                  <para>
+                    <literal>--abort-on-err</literal>
+                  </para>
                 </entry>
                 <entry>
-                  <para>Stops processing the <literal>lustre_rsync</literal> operation if an error occurs. The default is to continue the operation.</para>
+                  <para>Stops processing the 
+                  <literal>lustre_rsync</literal> operation if an error occurs.
+                  The default is to continue the operation.</para>
                 </entry>
               </row>
             </tbody>
         </informaltable>
       </section>
       <section remap="h4">
-          <title><indexterm><primary>backup</primary><secondary>rsync</secondary><tertiary>examples</tertiary></indexterm><literal>lustre_rsync</literal> Examples</title>
-        <para>Sample <literal>lustre_rsync</literal> commands are listed below.</para>
-        <para>Register a changelog user for an MDT (e.g. <literal>testfs-MDT0000</literal>).</para>
+        <title>
+        <indexterm>
+          <primary>backup</primary>
+          <secondary>rsync</secondary>
+          <tertiary>examples</tertiary>
+        </indexterm>
+        <literal>lustre_rsync</literal> Examples</title>
+        <para>Sample 
+        <literal>lustre_rsync</literal> commands are listed below.</para>
+        <para>Register a changelog user for an MDT (e.g. 
+        <literal>testfs-MDT0000</literal>).</para>
         <screen># lctl --device testfs-MDT0000 changelog_register testfs-MDT0000
-Registered changelog userid &apos;cl1&apos;</screen>
-        <para>Synchronize a Lustre file system (<literal>/mnt/lustre</literal>) to a target file system (<literal>/mnt/target</literal>).</para>
+Registered changelog userid 'cl1'</screen>
+        <para>Synchronize a Lustre file system (
+        <literal>/mnt/lustre</literal>) to a target file system (
+        <literal>/mnt/target</literal>).</para>
         <screen>$ lustre_rsync --source=/mnt/lustre --target=/mnt/target \
            --mdt=testfs-MDT0000 --user=cl1 --statuslog sync.log  --verbose 
 Lustre filesystem: testfs 
@@ -185,7 +361,10 @@ Starting changelog record: 0
 Errors: 0 
 lustre_rsync took 1 seconds 
 Changelog records consumed: 22</screen>
-        <para>After the file system undergoes changes, synchronize the changes onto the target file system. Only the <literal>statuslog</literal> name needs to be specified, as it has all the parameters passed earlier.</para>
+        <para>After the file system undergoes changes, synchronize the changes
+        onto the target file system. Only the 
+        <literal>statuslog</literal> name needs to be specified, as it has all
+        the parameters passed earlier.</para>
         <screen>$ lustre_rsync --statuslog sync.log --verbose 
 Replicating Lustre filesystem: testfs 
 MDT device: testfs-MDT0000 
@@ -197,7 +376,10 @@ Starting changelog record: 22
 Errors: 0 
 lustre_rsync took 2 seconds 
 Changelog records consumed: 42</screen>
-        <para>To synchronize a Lustre file system (<literal>/mnt/lustre</literal>) to two target file systems (<literal>/mnt/target1</literal> and <literal>/mnt/target2</literal>).</para>
+        <para>To synchronize a Lustre file system (
+        <literal>/mnt/lustre</literal>) to two target file systems (
+        <literal>/mnt/target1</literal> and 
+        <literal>/mnt/target2</literal>), run:</para>
         <screen>$ lustre_rsync --source=/mnt/lustre --target=/mnt/target1 \
            --target=/mnt/target2 --mdt=testfs-MDT0000 --user=cl1  \
            --statuslog sync.log</screen>
@@ -205,117 +387,210 @@ Changelog records consumed: 42</screen>
     </section>
   </section>
   <section xml:id="dbdoclet.50438207_71633">
-      <title><indexterm><primary>backup</primary><secondary>MDS/OST device level</secondary></indexterm>Backing Up and Restoring an MDS or OST (Device Level)</title>
-    <para>In some cases, it is useful to do a full device-level backup of an individual device (MDT or OST), before replacing hardware, performing maintenance, etc. Doing full device-level backups ensures that all of the data and configuration files is preserved in the original state and is the easiest method of doing a backup. For the MDT file system, it may also be the fastest way to perform the backup and restore, since it can do large streaming read and write operations at the maximum bandwidth of the underlying devices.</para>
+    <title>
+    <indexterm>
+      <primary>backup</primary>
+      <secondary>MDS/OST device level</secondary>
+    </indexterm>Backing Up and Restoring an MDS or OST (Device Level)</title>
+    <para>In some cases, it is useful to do a full device-level backup of an
+    individual device (MDT or OST) before replacing hardware, performing
+    maintenance, etc. Doing full device-level backups ensures that all of the
+    data and configuration files are preserved in the original state and is
+    the easiest method of doing a backup. For the MDT file system, it may also
+    be the fastest way to perform the backup and restore, since it can do
+    large streaming read and write operations at the maximum bandwidth of the
+    underlying devices.</para>
     <note>
-      <para>Keeping an updated full backup of the MDT is especially important because a permanent failure of the MDT file system renders the much larger amount of data in all the OSTs largely inaccessible and unusable.</para>
+      <para>Keeping an updated full backup of the MDT is especially important
+      because a permanent failure of the MDT file system renders the much
+      larger amount of data in all the OSTs largely inaccessible and
+      unusable.</para>
     </note>
     <warning condition='l23'>
-        <para>In Lustre software release 2.0 through 2.2, the only successful way to backup and
-        restore an MDT is to do a device-level backup as is described in this section. File-level
-        restore of an MDT is not possible before Lustre software release 2.3, as the Object Index
-        (OI) file cannot be rebuilt after restore without the OI Scrub functionality. <emphasis
-          role="bold">Since Lustre software release 2.3</emphasis>, Object Index files are
-        automatically rebuilt at first mount after a restore is detected (see <link
-          xl:href="http://jira.hpdd.intel.com/browse/LU-957">LU-957</link>), and file-level backup
-        is supported (see <xref linkend="dbdoclet.50438207_21638"/>).</para>
+      <para>In Lustre software releases 2.0 through 2.2, the only successful
+      way to back up and restore an MDT is to do a device-level backup as
+      described in this section. File-level restore of an MDT is not possible
+      before Lustre software release 2.3, as the Object Index (OI) file cannot
+      be rebuilt after restore without the OI Scrub functionality. 
+      <emphasis role="bold">Since Lustre software release 2.3</emphasis>,
+      Object Index files are automatically rebuilt at first mount after a
+      restore is detected (see 
+      <link xl:href="http://jira.hpdd.intel.com/browse/LU-957">LU-957</link>),
+      and file-level backup is supported (see 
+      <xref linkend="dbdoclet.50438207_21638" />).</para>
     </warning>
-    <para>If hardware replacement is the reason for the backup or if a spare storage device is available, it is possible to do a raw copy of the MDT or OST from one block device to the other, as long as the new device is at least as large as the original device. To do this, run:</para>
+    <para>If hardware replacement is the reason for the backup or if a spare
+    storage device is available, it is possible to do a raw copy of the MDT or
+    OST from one block device to the other, as long as the new device is at
+    least as large as the original device. To do this, run:</para>
     <screen>dd if=/dev/{original} of=/dev/{newdev} bs=1M</screen>
-    <para>If hardware errors cause read problems on the original device, use the command below to allow as much data as possible to be read from the original device while skipping sections of the disk with errors:</para>
+    <para>If hardware errors cause read problems on the original device, use
+    the command below to allow as much data as possible to be read from the
+    original device while skipping sections of the disk with errors:</para>
     <screen>dd if=/dev/{original} of=/dev/{newdev} bs=4k conv=sync,noerror /
       count={original size in 4kB blocks}</screen>
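The `count={original size in 4kB blocks}` placeholder must be computed from the device size. A minimal sketch of that arithmetic (assuming `blockdev` from util-linux is available for querying a live device; the 200 GB figure below is purely illustrative):

```shell
# Compute the dd 'count' argument (number of 4kB blocks) from a device
# size in bytes, rounding up so a final partial block is included.
blocks_4k() {
  local bytes=$1
  echo $(( (bytes + 4095) / 4096 ))
}

# On a live system the byte count would come from something like:
#   bytes=$(blockdev --getsize64 /dev/{original})
# Here an illustrative 200 GB device is used instead.
bytes=$((200 * 1024 * 1024 * 1024))
blocks_4k "$bytes"
```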
-    <para>Even in the face of hardware errors, the <literal>ldiskfs</literal>
-    file system is very robust and it may be possible to recover the file
-    system data after running <literal>e2fsck -fy /dev/{newdev}</literal> on
-    the new device, along with <literal>ll_recover_lost_found_objs</literal>
-    for OST devices.</para>
+    <para>Even in the face of hardware errors, the 
+    <literal>ldiskfs</literal> file system is very robust and it may be possible
+    to recover the file system data after running 
+    <literal>e2fsck -fy /dev/{newdev}</literal> on the new device, along with 
+    <literal>ll_recover_lost_found_objs</literal> for OST devices.</para>
     <para condition="l26">With Lustre software version 2.6 and later, there is
-    no longer a need to run <literal>ll_recover_lost_found_objs</literal> on
-    the OSTs, since the <literal>LFSCK</literal> scanning will automatically
-    move objects from <literal>lost+found</literal> back into its correct
-    location on the OST after directory corruption.</para>
+    no longer a need to run 
+    <literal>ll_recover_lost_found_objs</literal> on the OSTs, since the 
+    <literal>LFSCK</literal> scanning will automatically move objects from 
+    <literal>lost+found</literal> back into their correct locations on the OST
+    after directory corruption.</para>
   </section>
   <section xml:id="dbdoclet.50438207_21638">
-      <title><indexterm><primary>backup</primary><secondary>OST file system</secondary></indexterm><indexterm><primary>backup</primary><secondary>MDT file system</secondary></indexterm>Making a File-Level Backup of an OST or MDT File System</title>
-    <para>This procedure provides an alternative to backup or migrate the data of an OST or MDT at the file level. At the file-level, unused space is omitted from the backed up and the process may be completed quicker with smaller total backup size. Backing up a single OST device is not necessarily the best way to perform backups of the Lustre file system, since the files stored in the backup are not usable without metadata stored on the MDT and additional file stripes that may be on other OSTs. However, it is the preferred method for migration of OST devices, especially when it is desirable to reformat the underlying file system with different configuration options or to reduce fragmentation.</para>
+    <title>
+    <indexterm>
+      <primary>backup</primary>
+      <secondary>OST file system</secondary>
+    </indexterm>
+    <indexterm>
+      <primary>backup</primary>
+      <secondary>MDT file system</secondary>
+    </indexterm>Making a File-Level Backup of an OST or MDT File System</title>
+    <para>This procedure provides an alternative way to back up or migrate
+    the data of an OST or MDT at the file level. At the file level, unused
+    space is omitted from the backup and the process may complete more quickly
+    with a smaller total backup size. Backing up a single OST device is not
+    necessarily the best way to perform backups of the Lustre file system,
+    since the files stored in the backup are not usable without metadata stored
+    on the MDT and additional file stripes that may be on other OSTs. However,
+    it is the preferred method for migration of OST devices, especially when it
+    is desirable to reformat the underlying file system with different
+    configuration options or to reduce fragmentation.</para>
     <note>
-        <para>Prior to Lustre software release 2.3, the only successful way to perform an MDT backup
-        and restore is to do a device-level backup as is described in <xref
-          linkend="dbdoclet.50438207_71633"/>. The ability to do MDT file-level backups is not
-        available for Lustre software release 2.0 through 2.2, because restoration of the Object
-        Index (OI) file does not return the MDT to a functioning state. <emphasis role="bold">Since
-          Lustre software release 2.3</emphasis>, Object Index files are automatically rebuilt at
-        first mount after a restore is detected (see <link
-          xl:href="http://jira.hpdd.intel.com/browse/LU-957">LU-957</link>), so file-level MDT
-        restore is supported.</para>
+      <para>Prior to Lustre software release 2.3, the only successful way to
+      perform an MDT backup and restore is to do a device-level backup as is
+      described in 
+      <xref linkend="dbdoclet.50438207_71633" />. The ability to do MDT
+      file-level backups is not available for Lustre software release 2.0
+      through 2.2, because restoration of the Object Index (OI) file does not
+      return the MDT to a functioning state. 
+      <emphasis role="bold">Since Lustre software release 2.3</emphasis>,
+      Object Index files are automatically rebuilt at first mount after a
+      restore is detected (see 
+      <link xl:href="http://jira.hpdd.intel.com/browse/LU-957">LU-957</link>),
+      so file-level MDT restore is supported.</para>
     </note>
-    <para>For Lustre software release 2.3 and newer with MDT file-level backup support, substitute
-        <literal>mdt</literal> for <literal>ost</literal> in the instructions below.</para>
+    <para>For Lustre software release 2.3 and newer with MDT file-level backup
+    support, substitute 
+    <literal>mdt</literal> for 
+    <literal>ost</literal> in the instructions below.</para>
     <orderedlist>
       <listitem>
-        <para><emphasis role="bold">Make a mountpoint for the file system.</emphasis></para>
+        <para>
+          <emphasis role="bold">Make a mountpoint for the file
+          system.</emphasis>
+        </para>
         <screen>[oss]# mkdir -p /mnt/ost</screen>
       </listitem>
       <listitem>
-        <para><emphasis role="bold">Mount the file system.</emphasis></para>
+        <para>
+          <emphasis role="bold">Mount the file system.</emphasis>
+        </para>
         <screen>[oss]# mount -t ldiskfs /dev/<emphasis>{ostdev}</emphasis> /mnt/ost</screen>
       </listitem>
       <listitem>
-        <para><emphasis role="bold">Change to the mountpoint being backed up.</emphasis></para>
+        <para>
+          <emphasis role="bold">Change to the mountpoint being backed
+          up.</emphasis>
+        </para>
         <screen>[oss]# cd /mnt/ost</screen>
       </listitem>
       <listitem>
-        <para><emphasis role="bold">Back up the extended attributes.</emphasis></para>
-        <screen>[oss]# getfattr -R -d -m &apos;.*&apos; -e hex -P . &gt; ea-$(date +%Y%m%d).bak</screen>
+        <para>
+          <emphasis role="bold">Back up the extended attributes.</emphasis>
+        </para>
+        <screen>[oss]# getfattr -R -d -m '.*' -e hex -P . &gt; ea-$(date +%Y%m%d).bak</screen>
         <note>
-          <para>If the <literal>tar(1)</literal> command supports the <literal>--xattr</literal> option, the <literal>getfattr</literal> step may be unnecessary as long as tar does a backup of the <literal>trusted.*</literal> attributes. However, completing this step is not harmful and can serve as an added safety measure.</para>
+          <para>If the 
+          <literal>tar(1)</literal> command supports the 
+          <literal>--xattr</literal> option, the 
+          <literal>getfattr</literal> step may be unnecessary as long as tar
+          does a backup of the 
+          <literal>trusted.*</literal> attributes. However, completing this step
+          is not harmful and can serve as an added safety measure.</para>
         </note>
         <note>
-          <para>In most distributions, the <literal>getfattr</literal> command is part of the <literal>attr</literal> package. If the <literal>getfattr</literal> command returns errors like <literal>Operation not supported</literal>, then the kernel does not correctly support EAs. Stop and use a different backup method.</para>
+          <para>In most distributions, the 
+          <literal>getfattr</literal> command is part of the 
+          <literal>attr</literal> package. If the 
+          <literal>getfattr</literal> command returns errors like 
+          <literal>Operation not supported</literal>, then the kernel does not
+          correctly support EAs. Stop and use a different backup method.</para>
         </note>
       </listitem>
       <listitem>
-        <para><emphasis role="bold">Verify that the <literal>ea-$date.bak</literal> file has properly backed up the EA data on the OST.</emphasis></para>
-        <para>Without this attribute data, the restore process may be missing extra data that can be very useful in case of later file system corruption. Look at this file with more or a text editor. Each object file should have a corresponding item similar to this:</para>
+        <para>
+          <emphasis role="bold">Verify that the 
+          <literal>ea-$date.bak</literal> file has properly backed up the EA
+          data on the OST.</emphasis>
+        </para>
+        <para>Without this attribute data, the restore process may be missing
+        extra data that can be very useful in case of later file system
+        corruption. Look at this file with 
+        <literal>more</literal> or a text editor. Each object
+        file should have a corresponding item similar to this:</para>
         <screen>[oss]# file: O/0/d0/100992
 trusted.fid= \
 0x0d822200000000004a8a73e500000000808a0100000000000000000000000000</screen>
       </listitem>
       <listitem>
-        <para><emphasis role="bold">Back up all file system data.</emphasis></para>
+        <para>
+          <emphasis role="bold">Back up all file system data.</emphasis>
+        </para>
         <screen>[oss]# tar czvf {backup file}.tgz [--xattrs] --sparse .</screen>
         <note>
-            <para>The tar <literal>--sparse</literal> option is vital for backing up an MDT. In
-            order to have <literal>--sparse</literal> behave correctly, and complete the backup of
-            and MDT in finite time, the version of tar must be specified. Correctly functioning
-            versions of tar include the Lustre software enhanced version of tar at <link
-              xmlns:xlink="http://www.w3.org/1999/xlink"
-              xlink:href="https://wiki.hpdd.intel.com/display/PUB/Lustre+Tools#LustreTools-lustre-tar"
-            />, the tar from a Red Hat Enterprise Linux distribution (version 6.3 or more recent)
-            and the GNU tar version 1.25 or more recent.</para>
+          <para>The tar 
+          <literal>--sparse</literal> option is vital for backing up an MDT. In
+          order to have 
+          <literal>--sparse</literal> behave correctly, and to complete the
+          backup of an MDT in finite time, an appropriate version of tar must
+          be used.
+          Correctly functioning versions of tar include the Lustre software
+          enhanced version of tar at 
+          <link xmlns:xlink="http://www.w3.org/1999/xlink"
+          xlink:href="https://wiki.hpdd.intel.com/display/PUB/Lustre+Tools#LustreTools-lustre-tar" />,
+          the tar from a Red Hat Enterprise Linux distribution (version 6.3 or
+          more recent) and the GNU tar version 1.25 or more recent.</para>
         </note>
         <warning>
-            <para>The tar <literal>--xattrs</literal> option is only available
-           in GNU tar distributions from Red Hat or Intel.</para>
+          <para>The tar 
+          <literal>--xattrs</literal> option is only available in GNU tar
+          distributions from Red Hat or Intel.</para>
         </warning>
       </listitem>
       <listitem>
-        <para><emphasis role="bold">Change directory out of the file system.</emphasis></para>
+        <para>
+          <emphasis role="bold">Change directory out of the file
+          system.</emphasis>
+        </para>
         <screen>[oss]# cd -</screen>
       </listitem>
       <listitem>
-        <para><emphasis role="bold">Unmount the file system.</emphasis></para>
+        <para>
+          <emphasis role="bold">Unmount the file system.</emphasis>
+        </para>
         <screen>[oss]# umount /mnt/ost</screen>
         <note>
-          <para>When restoring an OST backup on a different node as part of an OST migration, you also have to change server NIDs and use the <literal>--writeconf</literal> command to re-generate the configuration logs. See <xref linkend="lustremaintenance"/> (Changing a Server NID).</para>
+          <para>When restoring an OST backup on a different node as part of an
+          OST migration, you also have to change server NIDs and use the 
+          <literal>--writeconf</literal> command to re-generate the
+          configuration logs. See 
+          <xref linkend="lustremaintenance" /> (Changing a Server NID).</para>
         </note>
       </listitem>
     </orderedlist>
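The backup steps above can be condensed into a script. The sketch below is a dry run that only prints the commands (the device, mountpoint, and archive names are illustrative); replacing the `run` helper with real execution, and adding the appropriate tar/getfattr options for your environment, is left to the administrator:

```shell
# Dry-run sketch of the file-level OST/MDT backup procedure above.
# The 'run' helper echoes each command instead of executing it.
run() { echo "$@"; }

backup_target() {
  local dev=$1 mnt=$2 archive=$3
  run mkdir -p "$mnt"
  run mount -t ldiskfs "$dev" "$mnt"
  run cd "$mnt"
  # Back up extended attributes, then the file data itself
  run "getfattr -R -d -m '.*' -e hex -P . > ea-backup.bak"
  run tar czvf "$archive" --xattrs --sparse .
  run cd -
  run umount "$mnt"
}

backup_target /dev/sdb /mnt/ost /tmp/ost-backup.tgz
```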
   </section>
   <section xml:id="dbdoclet.50438207_22325">
-    <title><indexterm><primary>backup</primary><secondary>restoring file system backup</secondary></indexterm>Restoring a File-Level Backup</title>
-    <para>To restore data from a file-level backup, you need to format the device, restore the file data and then restore the EA data.</para>
+    <title>
+    <indexterm>
+      <primary>backup</primary>
+      <secondary>restoring file system backup</secondary>
+    </indexterm>Restoring a File-Level Backup</title>
+    <para>To restore data from a file-level backup, you need to format the
+    device, restore the file data and then restore the EA data.</para>
     <orderedlist>
       <listitem>
         <para>Format the new device.</para>
@@ -341,12 +616,14 @@ trusted.fid= \
         <para>Restore the file system extended attributes.</para>
         <screen>[oss]# setfattr --restore=ea-${date}.bak</screen>
         <note>
-            <para>If <literal>--xattrs</literal> option is supported by tar and specified in the step above, this step is redundant.</para>
+          <para>If 
+          <literal>--xattrs</literal> option is supported by tar and specified
+          in the step above, this step is redundant.</para>
         </note>
       </listitem>
       <listitem>
         <para>Verify that the extended attributes were restored.</para>
-        <screen>[oss]# getfattr -d -m &quot;.*&quot; -e hex O/0/d0/100992 trusted.fid= \
+        <screen>[oss]# getfattr -d -m ".*" -e hex O/0/d0/100992 trusted.fid= \
 0x0d822200000000004a8a73e500000000808a0100000000000000000000000000</screen>
       </listitem>
       <listitem>
@@ -358,41 +635,77 @@ trusted.fid= \
         <screen>[oss]# umount /mnt/ost</screen>
       </listitem>
     </orderedlist>
-    <para>If the file system was used between the time the backup was made and when it was restored, then the online <literal>LFSCK</literal> tool (part of Lustre code) will automatically be run to ensure the file system is coherent. If all of the device file systems were backed up at the same time after the entire Lustre file system was stopped, this is not necessary. In either case, the file system should be immediately usable even if <literal>LFSCK</literal> is not run, though there may be I/O errors reading from files that are present on the MDT but not the OSTs, and files that were created after the MDT backup will not be accessible/visible.  See <xref linkend="dbdoclet.lfsckadmin"/> for details on using LFSCK.</para>
+    <para condition='l23'>If the file system was used between the time the
+    backup was made and when it was restored, then the online 
+    <literal>LFSCK</literal> tool (part of the Lustre software since release
+    2.3) will automatically be run to ensure the file system is coherent. If
+    all of the device file systems were backed up at the same time after the
+    entire Lustre file system was stopped, this step is unnecessary. In either
+    case, the file system will be immediately usable, although there may be
+    I/O errors reading from files that are present on the MDT but not the
+    OSTs, and files that were created after the MDT backup will not be
+    accessible or visible. See 
+    <xref linkend="dbdoclet.lfsckadmin" /> for details on using LFSCK.</para>
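The restore sequence can likewise be sketched as a dry run. The `run` helper prints rather than executes, and the `mkfs.lustre` options shown are placeholders, since the real format options depend on the target being rebuilt:

```shell
# Dry-run sketch of the file-level restore procedure: format, restore
# file data, restore extended attributes, then unmount.
run() { echo "$@"; }

restore_target() {
  local dev=$1 mnt=$2 archive=$3 ea_file=$4
  run mkfs.lustre --ost "$dev"       # placeholder: use the real format options
  run mount -t ldiskfs "$dev" "$mnt"
  run cd "$mnt"
  run tar xzvpf "$archive" --xattrs --sparse
  run setfattr --restore="$ea_file"  # redundant if tar restored the xattrs
  run cd -
  run umount "$mnt"
}

restore_target /dev/sdc /mnt/ost /tmp/ost-backup.tgz ea-backup.bak
```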
   </section>
   <section xml:id="dbdoclet.50438207_31553">
-    <title><indexterm>
-        <primary>backup</primary>
-        <secondary>using LVM</secondary>
-      </indexterm>Using LVM Snapshots with the Lustre File System</title>
-    <para>If you want to perform disk-based backups (because, for example, access to the backup system needs to be as fast as to the primary Lustre file system), you can use the Linux LVM snapshot tool to maintain multiple, incremental file system backups.</para>
-    <para>Because LVM snapshots cost CPU cycles as new files are written, taking snapshots of the main Lustre file system will probably result in unacceptable performance losses. You should create a new, backup Lustre file system and periodically (e.g., nightly) back up new/changed files to it. Periodic snapshots can be taken of this backup file system to create a series of &quot;full&quot; backups.</para>
+    <title>
+    <indexterm>
+      <primary>backup</primary>
+      <secondary>using LVM</secondary>
+    </indexterm>Using LVM Snapshots with the Lustre File System</title>
+    <para>If you want to perform disk-based backups (because, for example,
+    access to the backup system needs to be as fast as to the primary Lustre
+    file system), you can use the Linux LVM snapshot tool to maintain multiple,
+    incremental file system backups.</para>
+    <para>Because LVM snapshots cost CPU cycles as new files are written,
+    taking snapshots of the main Lustre file system will probably result in
+    unacceptable performance losses. You should create a new, backup Lustre
+    file system and periodically (e.g., nightly) back up new/changed files to
+    it. Periodic snapshots can be taken of this backup file system to create a
+    series of "full" backups.</para>
     <note>
-      <para>Creating an LVM snapshot is not as reliable as making a separate backup, because the LVM snapshot shares the same disks as the primary MDT device, and depends on the primary MDT device for much of its data. If the primary MDT device becomes corrupted, this may result in the snapshot being corrupted.</para>
+      <para>Creating an LVM snapshot is not as reliable as making a separate
+      backup, because the LVM snapshot shares the same disks as the primary MDT
+      device, and depends on the primary MDT device for much of its data. If
+      the primary MDT device becomes corrupted, this may result in the snapshot
+      being corrupted.</para>
     </note>
     <section remap="h3">
-        <title><indexterm><primary>backup</primary><secondary>using LVM</secondary><tertiary>creating</tertiary></indexterm>Creating an LVM-based Backup File System</title>
-      <para>Use this procedure to create a backup Lustre file system for use with the LVM snapshot mechanism.</para>
+      <title>
+      <indexterm>
+        <primary>backup</primary>
+        <secondary>using LVM</secondary>
+        <tertiary>creating</tertiary>
+      </indexterm>Creating an LVM-based Backup File System</title>
+      <para>Use this procedure to create a backup Lustre file system for use
+      with the LVM snapshot mechanism.</para>
       <orderedlist>
         <listitem>
           <para>Create LVM volumes for the MDT and OSTs.</para>
-          <para>Create LVM devices for your MDT and OST targets. Make sure not to use the entire disk for the targets; save some room for the snapshots. The snapshots start out as 0 size, but grow as you make changes to the current file system. If you expect to change 20% of the file system between backups, the most recent snapshot will be 20% of the target size, the next older one will be 40%, etc. Here is an example:</para>
+          <para>Create LVM devices for your MDT and OST targets. Make sure not
+          to use the entire disk for the targets; save some room for the
+          snapshots. The snapshots start out as 0 size, but grow as you make
+          changes to the current file system. If you expect to change 20% of
+          the file system between backups, the most recent snapshot will be 20%
+          of the target size, the next older one will be 40%, etc. Here is an
+          example:</para>
           <screen>cfs21:~# pvcreate /dev/sda1
-   Physical volume &quot;/dev/sda1&quot; successfully created
+   Physical volume "/dev/sda1" successfully created
 cfs21:~# vgcreate vgmain /dev/sda1
-   Volume group &quot;vgmain&quot; successfully created
+   Volume group "vgmain" successfully created
 cfs21:~# lvcreate -L200G -nMDT0 vgmain
-   Logical volume &quot;MDT0&quot; created
+   Logical volume "MDT0" created
 cfs21:~# lvcreate -L200G -nOST0 vgmain
-   Logical volume &quot;OST0&quot; created
+   Logical volume "OST0" created
 cfs21:~# lvscan
-   ACTIVE                  &apos;/dev/vgmain/MDT0&apos; [200.00 GB] inherit
-   ACTIVE                  &apos;/dev/vgmain/OST0&apos; [200.00 GB] inherit</screen>
+   ACTIVE                  '/dev/vgmain/MDT0' [200.00 GB] inherit
+   ACTIVE                  '/dev/vgmain/OST0' [200.00 GB] inherit</screen>
         </listitem>
         <listitem>
           <para>Format the LVM volumes as Lustre targets.</para>
-          <para>In this example, the backup file system is called <literal>main</literal> and
-            designates the current, most up-to-date backup.</para>
+          <para>In this example, the backup file system is called 
+          <literal>main</literal> and designates the current, most up-to-date
+          backup.</para>
           <screen>cfs21:~# mkfs.lustre --fsname=main --mdt --index=0 /dev/vgmain/MDT0
  No management node specified, adding MGS to this MDT.
     Permanent disk data:
@@ -413,7 +726,8 @@ checking for existing Lustre data
  mkfs_cmd = mkfs.ext2 -j -b 4096 -L main-MDT0000  -i 4096 -I 512 -q
   -O dir_index -F /dev/vgmain/MDT0
  Writing CONFIGS/mountdata
-cfs21:~# mkfs.lustre --mgsnode=cfs21 --fsname=main --ost --index=0 /dev/vgmain/OST0
+cfs21:~# mkfs.lustre --mgsnode=cfs21 --fsname=main --ost --index=0
+/dev/vgmain/OST0
     Permanent disk data:
  Target:     main-OST0000
  Index:      0
@@ -435,13 +749,20 @@ checking for existing Lustre data
  Writing CONFIGS/mountdata
 cfs21:~# mount -t lustre /dev/vgmain/MDT0 /mnt/mdt
 cfs21:~# mount -t lustre /dev/vgmain/OST0 /mnt/ost
-cfs21:~# mount -t lustre cfs21:/main /mnt/main</screen>
+cfs21:~# mount -t lustre cfs21:/main /mnt/main
+</screen>
         </listitem>
       </orderedlist>
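The snapshot sizing guidance in step 1 (a snapshot grows by roughly the daily change rate for each day of age) can be sketched as simple arithmetic; the 20% figure is the example rate from the text:

```shell
# Estimated snapshot size, as a percentage of the target size, for a
# snapshot that is 'age_days' old given a fixed daily change rate.
snapshot_pct() {
  local age_days=$1 daily_change_pct=$2
  echo $(( age_days * daily_change_pct ))
}

snapshot_pct 1 20   # most recent snapshot: 20% of the target size
snapshot_pct 2 20   # two-day-old snapshot: 40%
```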
     </section>
     <section remap="h3">
-        <title><indexterm><primary>backup</primary><secondary>new/changed files</secondary></indexterm>Backing up New/Changed Files to the Backup File System</title>
-      <para>At periodic intervals e.g., nightly, back up new and changed files to the LVM-based backup file system.</para>
+      <title>
+      <indexterm>
+        <primary>backup</primary>
+        <secondary>new/changed files</secondary>
+      </indexterm>Backing up New/Changed Files to the Backup File
+      System</title>
+      <para>At periodic intervals, e.g., nightly, back up new and changed files
+      to the LVM-based backup file system.</para>
       <screen>cfs21:~# cp /etc/passwd /mnt/main 
  
 cfs21:~# cp /etc/fstab /mnt/main 
@@ -450,29 +771,60 @@ cfs21:~# ls /mnt/main
 fstab  passwd</screen>
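A minimal sketch of such a periodic copy, using coreutils `cp` against throwaway directories; in practice the source would be the live Lustre mountpoint and the destination the backup file system's mountpoint, and a tool such as rsync would typically be preferred:

```shell
# Copy new and changed files from the live file system into the backup
# file system; -a preserves attributes, -u skips files that are already
# up to date at the destination.
sync_backup() {
  local src=$1 dst=$2
  mkdir -p "$dst"
  cp -a -u "$src"/. "$dst"/
}

# Illustrative run against temporary directories:
src=$(mktemp -d); dst=$(mktemp -d)
echo example > "$src/passwd"
sync_backup "$src" "$dst"
ls "$dst"
```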
     </section>
     <section remap="h3">
-        <title><indexterm><primary>backup</primary><secondary>using LVM</secondary><tertiary>creating snapshots</tertiary></indexterm>Creating Snapshot Volumes</title>
-      <para>Whenever you want to make a &quot;checkpoint&quot; of the main Lustre file system, create LVM snapshots of all target MDT and OSTs in the LVM-based backup file system. You must decide the maximum size of a snapshot ahead of time, although you can dynamically change this later. The size of a daily snapshot is dependent on the amount of data changed daily in the main Lustre file system. It is likely that a two-day old snapshot will be twice as big as a one-day old snapshot.</para>
-      <para>You can create as many snapshots as you have room for in the volume group. If necessary, you can dynamically add disks to the volume group.</para>
-      <para>The snapshots of the target MDT and OSTs should be taken at the same point in time. Make sure that the cronjob updating the backup file system is not running, since that is the only thing writing to the disks. Here is an example:</para>
+      <title>
+      <indexterm>
+        <primary>backup</primary>
+        <secondary>using LVM</secondary>
+        <tertiary>creating snapshots</tertiary>
+      </indexterm>Creating Snapshot Volumes</title>
+      <para>Whenever you want to make a "checkpoint" of the main Lustre file
+      system, create LVM snapshots of all target MDT and OSTs in the LVM-based
+      backup file system. You must decide the maximum size of a snapshot ahead
+      of time, although you can dynamically change this later. The size of a
+      daily snapshot is dependent on the amount of data changed daily in the
+      main Lustre file system. It is likely that a two-day old snapshot will be
+      twice as big as a one-day old snapshot.</para>
+      <para>You can create as many snapshots as you have room for in the volume
+      group. If necessary, you can dynamically add disks to the volume
+      group.</para>
+      <para>The snapshots of the target MDT and OSTs should be taken at the
+      same point in time. Make sure that the cronjob updating the backup file
+      system is not running, since that is the only thing writing to the disks.
+      Here is an example:</para>
       <screen>cfs21:~# modprobe dm-snapshot
 cfs21:~# lvcreate -L50M -s -n MDT0.b1 /dev/vgmain/MDT0
    Rounding up size to full physical extent 52.00 MB
-   Logical volume &quot;MDT0.b1&quot; created
+   Logical volume "MDT0.b1" created
 cfs21:~# lvcreate -L50M -s -n OST0.b1 /dev/vgmain/OST0
    Rounding up size to full physical extent 52.00 MB
-   Logical volume &quot;OST0.b1&quot; created</screen>
-      <para>After the snapshots are taken, you can continue to back up new/changed files to &quot;main&quot;. The snapshots will not contain the new files.</para>
+   Logical volume "OST0.b1" created
+</screen>
+      <para>After the snapshots are taken, you can continue to back up
+      new/changed files to "main". The snapshots will not contain the new
+      files.</para>
       <screen>cfs21:~# cp /etc/termcap /mnt/main
 cfs21:~# ls /mnt/main
-fstab  passwd  termcap</screen>
+fstab  passwd  termcap
+</screen>
     </section>
     <section remap="h3">
-        <title><indexterm><primary>backup</primary><secondary>using LVM</secondary><tertiary>restoring</tertiary></indexterm>Restoring the File System From a Snapshot</title>
-      <para>Use this procedure to restore the file system from an LVM snapshot.</para>
+      <title>
+      <indexterm>
+        <primary>backup</primary>
+        <secondary>using LVM</secondary>
+        <tertiary>restoring</tertiary>
+      </indexterm>Restoring the File System From a Snapshot</title>
+      <para>Use this procedure to restore the file system from an LVM
+      snapshot.</para>
       <orderedlist>
         <listitem>
           <para>Rename the LVM snapshot.</para>
-          <para>Rename the file system snapshot from &quot;main&quot; to &quot;back&quot; so you can mount it without unmounting &quot;main&quot;. This is recommended, but not required. Use the <literal>--reformat</literal> flag to <literal>tunefs.lustre</literal> to force the name change. For example:</para>
+          <para>Rename the file system snapshot from "main" to "back" so you
+          can mount it without unmounting "main". This is recommended, but not
+          required. Use the 
+          <literal>--reformat</literal> flag to 
+          <literal>tunefs.lustre</literal> to force the name change. For
+          example:</para>
           <screen>cfs21:~# tunefs.lustre --reformat --fsname=back --writeconf /dev/vgmain/MDT0.b1
  checking for existing Lustre data
  found Lustre data
@@ -518,9 +870,10 @@ Permanent disk data:
               (OST writeconf )
  Persistent mount opts: errors=remount-ro,extents,mballoc
  Parameters: mgsnode=192.168.0.21@tcp
-Writing CONFIGS/mountdata</screen>
-        <para>When renaming a file system, we must also erase the last_rcvd file from the
-            snapshots</para>
+Writing CONFIGS/mountdata
+</screen>
+          <para>When renaming a file system, we must also erase the 
+          <literal>last_rcvd</literal> file from the snapshots:</para>
           <screen>cfs21:~# mount -t ldiskfs /dev/vgmain/MDT0.b1 /mnt/mdtback
 cfs21:~# rm /mnt/mdtback/last_rcvd
 cfs21:~# umount /mnt/mdtback
@@ -529,29 +882,45 @@ cfs21:~# rm /mnt/ostback/last_rcvd
 cfs21:~# umount /mnt/ostback</screen>
         </listitem>
         <listitem>
-          <para>Mount the file system from the LVM snapshot.  For example:</para>
+          <para>Mount the file system from the LVM snapshot. For
+          example:</para>
           <screen>cfs21:~# mount -t lustre /dev/vgmain/MDT0.b1 /mnt/mdtback
 cfs21:~# mount -t lustre /dev/vgmain/OST0.b1 /mnt/ostback
 cfs21:~# mount -t lustre cfs21:/back /mnt/back</screen>
         </listitem>
         <listitem>
-          <para>Note the old directory contents, as of the snapshot time.  For example:</para>
+          <para>Note the old directory contents, as of the snapshot time. For
+          example:</para>
           <screen>cfs21:~/cfs/b1_5/lustre/utils# ls /mnt/back
-fstab  passwds</screen>
+fstab  passwds
+</screen>
         </listitem>
       </orderedlist>
     </section>
     <section remap="h3">
-        <title><indexterm><primary>backup</primary><secondary>using LVM</secondary><tertiary>deleting</tertiary></indexterm>Deleting Old Snapshots</title>
-      <para>To reclaim disk space, you can erase old snapshots as your backup policy dictates. Run:</para>
+      <title>
+      <indexterm>
+        <primary>backup</primary>
+        <secondary>using LVM</secondary>
+        <tertiary>deleting</tertiary>
+      </indexterm>Deleting Old Snapshots</title>
+      <para>To reclaim disk space, you can erase old snapshots as your backup
+      policy dictates. Run:</para>
       <screen>lvremove /dev/vgmain/MDT0.b1</screen>
     </section>
     <section remap="h3">
-      <title><indexterm><primary>backup</primary><secondary>using LVM</secondary><tertiary>resizing</tertiary></indexterm>Changing Snapshot Volume Size</title>
-      <para>You can also extend or shrink snapshot volumes if you find your daily deltas are smaller or larger than expected. Run:</para>
+      <title>
+      <indexterm>
+        <primary>backup</primary>
+        <secondary>using LVM</secondary>
+        <tertiary>resizing</tertiary>
+      </indexterm>Changing Snapshot Volume Size</title>
+      <para>You can also extend or shrink snapshot volumes if you find your
+      daily deltas are smaller or larger than expected. Run:</para>
       <screen>lvextend -L10G /dev/vgmain/MDT0.b1</screen>
       <note>
-        <para>Extending snapshots seems to be broken in older LVM. It is working in LVM v2.02.01.</para>
+        <para>Extending snapshots is known to be broken in older versions of
+        LVM. It works correctly in LVM v2.02.01.</para>
       </note>
     </section>
   </section>
index a8c2be7..84416da 100644 (file)
       </glossdef>
     </glossentry>
     <glossentry xml:id="lfsck">
-      <glossterm>lfsck
+      <glossterm>LFSCK
         </glossterm>
       <glossdef>
         <para>Lustre file system check. A distributed version of a disk file system checker.
index 987228b..3c24798 100644 (file)
-<?xml version='1.0' encoding='UTF-8'?><chapter xmlns="http://docbook.org/ns/docbook" xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US" xml:id="troubleshootingrecovery">
-    <title xml:id="troubleshootingrecovery.title">Troubleshooting Recovery</title>
-    <para>This chapter describes what to do if something goes wrong during recovery. It describes:</para>
-    <itemizedlist>
-        <listitem>
-            <para><xref linkend="dbdoclet.50438225_71141"/></para>
-        </listitem>
-        <listitem>
-            <para><xref linkend="dbdoclet.50438225_37365"/></para>
-        </listitem>
-        <listitem>
-            <para><xref linkend="dbdoclet.50438225_12316"/></para>
-        </listitem>
-        <listitem>
-            <para><xref linkend="dbdoclet.lfsckadmin"/></para>
-        </listitem>
-    </itemizedlist>
-    <section xml:id="dbdoclet.50438225_71141">
-        <title><indexterm><primary>recovery</primary><secondary>corruption of backing ldiskfs file system</secondary></indexterm>Recovering from Errors or Corruption on a Backing ldiskfs File System</title>
-        <para>When an OSS, MDS, or MGS server crash occurs, it is not necessary to run e2fsck on the
-            file system. <literal>ldiskfs</literal> journaling ensures that the file system remains
-            consistent over a system crash. The backing file systems are never accessed directly
-            from the client, so client crashes are not relevant for server file system
-            consistency.</para>
-        <para>The only time it is REQUIRED that <literal>e2fsck</literal> be run on a device is when an event causes problems that ldiskfs journaling is unable to handle, such as a hardware device failure or I/O error. If the ldiskfs kernel code detects corruption on the disk, it mounts the file system as read-only to prevent further corruption, but still allows read access to the device. This appears as error &quot;-30&quot; (<literal>EROFS</literal>) in the syslogs on the server, e.g.:</para>
-        <screen>Dec 29 14:11:32 mookie kernel: LDISKFS-fs error (device sdz):
+<?xml version='1.0' encoding='utf-8'?>
+<chapter xmlns="http://docbook.org/ns/docbook"
+xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US"
+xml:id="troubleshootingrecovery">
+  <title xml:id="troubleshootingrecovery.title">Troubleshooting
+  Recovery</title>
+  <para>This chapter describes what to do if something goes wrong during
+  recovery. It describes:</para>
+  <itemizedlist>
+    <listitem>
+      <para>
+        <xref linkend="dbdoclet.50438225_71141" />
+      </para>
+    </listitem>
+    <listitem>
+      <para>
+        <xref linkend="dbdoclet.50438225_37365" />
+      </para>
+    </listitem>
+    <listitem>
+      <para>
+        <xref linkend="dbdoclet.50438225_12316" />
+      </para>
+    </listitem>
+    <listitem>
+      <para>
+        <xref linkend="dbdoclet.lfsckadmin" />
+      </para>
+    </listitem>
+  </itemizedlist>
+  <section xml:id="dbdoclet.50438225_71141">
+    <title>
+    <indexterm>
+      <primary>recovery</primary>
+      <secondary>corruption of backing ldiskfs file system</secondary>
+    </indexterm>Recovering from Errors or Corruption on a Backing ldiskfs File
+    System</title>
+    <para>When an OSS, MDS, or MGS server crash occurs, it is not necessary to
+    run e2fsck on the file system.
+    <literal>ldiskfs</literal> journaling ensures that the file system remains
+    consistent over a system crash. The backing file systems are never accessed
+    directly from the client, so client crashes are not relevant for server
+    file system consistency.</para>
+    <para>The only time it is REQUIRED that
+    <literal>e2fsck</literal> be run on a device is when an event causes
+    problems that ldiskfs journaling is unable to handle, such as a hardware
+    device failure or I/O error. If the ldiskfs kernel code detects corruption
+    on the disk, it mounts the file system as read-only to prevent further
+    corruption, but still allows read access to the device. This appears as
+    error "-30" (
+    <literal>EROFS</literal>) in the syslogs on the server, e.g.:</para>
+    <screen>Dec 29 14:11:32 mookie kernel: LDISKFS-fs error (device sdz):
             ldiskfs_lookup: unlinked inode 5384166 in dir #145170469
-Dec 29 14:11:32 mookie kernel: Remounting filesystem read-only</screen>
-        <para>In such a situation, it is normally required that e2fsck only be run on the bad device before placing the device back into service.</para>
-        <para>In the vast majority of cases, the Lustre software can cope with any inconsistencies
-            found on the disk and between other devices in the file system.</para>
-        <note>
-            <para>The offline LFSCK tool included with e2fsprogs is rarely required for Lustre file
-                system operation.</para>
-        </note>
-        <para>For problem analysis, it is strongly recommended that <literal>e2fsck</literal> be run under a logger, like script, to record all of the output and changes that are made to the file system in case this information is needed later.</para>
-        <para>If time permits, it is also a good idea to first run <literal>e2fsck</literal> in non-fixing mode (-n option) to assess the type and extent of damage to the file system. The drawback is that in this mode, <literal>e2fsck</literal> does not recover the file system journal, so there may appear to be file system corruption when none really exists.</para>
-        <para>To address concern about whether corruption is real or only due to the journal not
-            being replayed, you can briefly mount and unmount the <literal>ldiskfs</literal> file
-            system directly on the node with the Lustre file system stopped, using a command similar
-            to:</para>
-        <screen>mount -t ldiskfs /dev/{ostdev} /mnt/ost; umount /mnt/ost</screen>
-        <para>This causes the journal to be recovered.</para>
-        <para>The <literal>e2fsck</literal> utility works well when fixing file system corruption
-            (better than similar file system recovery tools and a primary reason why
-                <literal>ldiskfs</literal> was chosen over other file systems). However, it is often
-            useful to identify the type of damage that has occurred so an <literal>ldiskfs</literal>
-            expert can make intelligent decisions about what needs fixing, in place of
-                <literal>e2fsck</literal>.</para>
-        <screen>root# {stop lustre services for this device, if running}
+Dec 29 14:11:32 mookie kernel: Remounting filesystem read-only</screen>
+    <para>In such a situation, it is normally required that e2fsck only be run
+    on the bad device before placing the device back into service.</para>
+    <para>In the vast majority of cases, the Lustre software can cope with any
+    inconsistencies found on the disk and between other devices in the file
+    system.</para>
+    <note>
+      <para>The legacy offline-LFSCK tool included with e2fsprogs is rarely
+      required for Lustre file system operation. Offline-LFSCK is not to be
+      confused with the LFSCK tool, which is part of the Lustre software and
+      provides online consistency checking.</para>
+    </note>
+    <para>For problem analysis, it is strongly recommended that
+    <literal>e2fsck</literal> be run under a logger, like script, to record all
+    of the output and changes that are made to the file system in case this
+    information is needed later.</para>
+    <para>If time permits, it is also a good idea to first run
+    <literal>e2fsck</literal> in non-fixing mode (-n option) to assess the type
+    and extent of damage to the file system. The drawback is that in this mode,
+    <literal>e2fsck</literal> does not recover the file system journal, so there
+    may appear to be file system corruption when none really exists.</para>
+    <para>To address concern about whether corruption is real or only due to
+    the journal not being replayed, you can briefly mount and unmount the
+    <literal>ldiskfs</literal> file system directly on the node with the Lustre
+    file system stopped, using a command similar to:</para>
+    <screen>mount -t ldiskfs /dev/{ostdev} /mnt/ost; umount /mnt/ost</screen>
+    <para>This causes the journal to be recovered.</para>
+    <para>The
+    <literal>e2fsck</literal> utility works well when fixing file system
+    corruption (better than similar file system recovery tools and a primary
+    reason why
+    <literal>ldiskfs</literal> was chosen over other file systems). However, it
+    is often useful to identify the type of damage that has occurred so an
+    <literal>ldiskfs</literal> expert can make intelligent decisions about what
+    needs fixing, in place of
+    <literal>e2fsck</literal>.</para>
+    <screen>root# {stop lustre services for this device, if running}
 root# script /tmp/e2fsck.sda
 Script started, file is /tmp/e2fsck.sda
 root# mount -t ldiskfs /dev/sda /mnt/ost
 root# umount /mnt/ost
-root# e2fsck -fn /dev/sda   # don&apos;t fix file system, just check for corruption
+root# e2fsck -fn /dev/sda   # don't fix file system, just check for corruption
 :
 [e2fsck output]
 :
-root# e2fsck -fp /dev/sda   # fix errors with prudent answers (usually <literal>yes</literal>)
-        </screen>
+root# e2fsck -fp /dev/sda   # fix errors with prudent answers (usually <literal>yes</literal>)</screen>
+  </section>
+  <section xml:id="dbdoclet.50438225_37365">
+    <title>
+    <indexterm>
+      <primary>recovery</primary>
+      <secondary>corruption of Lustre file system</secondary>
+    </indexterm>Recovering from Corruption in the Lustre File System</title>
+    <para>In cases where an ldiskfs MDT or OST becomes corrupt, you need to run
+    e2fsck to correct the local filesystem consistency, then use
+    <literal>LFSCK</literal> to run a distributed check on the file system to
+    resolve any inconsistencies between the MDTs and OSTs, or among MDTs.</para>
+    <orderedlist>
+      <listitem>
+        <para>Stop the Lustre file system.</para>
+      </listitem>
+      <listitem>
+        <para>Run
+        <literal>e2fsck -f</literal> on the individual MDT/OST that had
+        problems to fix any local file system damage.</para>
+        <para>We recommend running
+        <literal>e2fsck</literal> under script, to create a log of changes made
+        to the file system in case it is needed later. After
+        <literal>e2fsck</literal> is run, bring up the file system, if
+        necessary, to reduce the outage window.</para>
+      </listitem>
+    </orderedlist>
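+    <para>For example, assuming a file system named
+    <replaceable>testfs</replaceable>, a distributed check of all devices
+    could then be started from the MDS with a command similar to the
+    following (the options shown require a Lustre release that supports them;
+    see
+    <xref linkend="dbdoclet.lfsckadmin" />):</para>
+    <screen>mds# lctl lfsck_start -M testfs-MDT0000 -A -t namespace,layout</screen>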
+    <section xml:id="dbdoclet.50438225_13916">
+      <title>
+      <indexterm>
+        <primary>recovery</primary>
+        <secondary>orphaned objects</secondary>
+      </indexterm>Working with Orphaned Objects</title>
+      <para>The simplest problem to resolve is that of orphaned objects. When
+      the LFSCK layout check is run, these objects are linked to new files and
+      put into
+      <literal>.lustre/lost+found/MDT<replaceable>xxxx</replaceable></literal>
+      in the Lustre file system (where
+      <replaceable>xxxx</replaceable> is the index of the MDT on which the
+      orphan was found), where they can be examined and saved or deleted as
+      necessary.</para>
+      <para condition='l27'>With Lustre version 2.7 and later, LFSCK will
+       identify and process orphan objects found on MDTs as well.</para>
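+      <para>For example, assuming the file system is mounted on a client at
+      <literal>/mnt/lustre</literal>, orphan objects recovered on MDT0000 can
+      be listed with:</para>
+      <screen>client# ls -l /mnt/lustre/.lustre/lost+found/MDT0000</screen>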
     </section>
-    <section xml:id="dbdoclet.50438225_37365">
-        <title><indexterm><primary>recovery</primary><secondary>corruption of Lustre file system</secondary></indexterm>Recovering from Corruption in the Lustre File System</title>
-        <para>In cases where an ldiskfs MDT or OST becomes corrupt, you need to run e2fsck to correct the local filesystem consistency, then use <literal>LFSCK</literal> to run a distributed check on the file system to resolve any inconsistencies between the MDTs and OSTs.</para>
-        <orderedlist>
-            <listitem>
-                <para>Stop the Lustre file system.</para>
-            </listitem>
-            <listitem>
-                <para>Run <literal>e2fsck -f</literal> on the individual MDS / OST that had problems to fix any local file system damage.</para>
-                <para>We recommend running <literal>e2fsck</literal> under script, to create a log of changes made to the file system in case it is needed later. After <literal>e2fsck</literal> is run, bring up the file system, if necessary, to reduce the outage window.</para>
-            </listitem>
-        </orderedlist>
-        <section xml:id="dbdoclet.50438225_13916">
-            <title><indexterm><primary>recovery</primary><secondary>orphaned objects</secondary></indexterm>Working with Orphaned Objects</title>
-            <para>The easiest problem to resolve is that of orphaned objects. When the LFSCK layout check is run, these objects are linked to new files and put into <literal>.lustre/lost+found</literal> in the Lustre file system, where they can be examined and saved or deleted as necessary.</para>
-        </section>
-    </section>
-    <section xml:id="dbdoclet.50438225_12316">
-        <title><indexterm><primary>recovery</primary><secondary>unavailable OST</secondary></indexterm>Recovering from an Unavailable OST</title>
-        <para>One problem encountered in a Lustre file system environment is
-            when an OST becomes unavailable due to a network partition, OSS node crash, etc. When
-            this happens, the OST&apos;s clients pause and wait for the OST to become available
-            again, either on the primary OSS or a failover OSS. When the OST comes back online, the
-            Lustre file system starts a recovery process to enable clients to reconnect to the OST.
-            Lustre servers put a limit on the time they will wait in recovery for clients to
-            reconnect.</para>
-        <para>During recovery, clients reconnect and replay their requests serially, in the same order they were done originally. Until a client receives a confirmation that a given transaction has been written to stable storage, the client holds on to the transaction, in case it needs to be replayed. Periodically, a progress message prints to the log, stating how_many/expected clients have reconnected. If the recovery is aborted, this log shows how many clients managed to reconnect. When all clients have completed recovery, or if the recovery timeout is reached, the recovery period ends and the OST resumes normal request processing.</para>
-        <para>If some clients fail to replay their requests during the recovery period, this will not stop the recovery from completing. You may have a situation where the OST recovers, but some clients are not able to participate in recovery (e.g. network problems or client failure), so they are evicted and their requests are not replayed. This would result in any operations on the evicted clients failing, including in-progress writes, which would cause cached writes to be lost. This is a normal outcome; the recovery cannot wait indefinitely, or the file system would be hung any time a client failed. The lost transactions are an unfortunate result of the recovery process.</para>
-        <note>
-           <para>The failure of client recovery does not indicate or lead to
-           filesystem corruption.  This is a normal event that is handled by
-           the MDT and OST, and should not result in any inconsistencies
-           between servers.</para>
-        </note>
-        <note>
-            <para>The version-based recovery (VBR) feature enables a failed client to be &apos;&apos;skipped&apos;&apos;, so remaining clients can replay their requests, resulting in a more successful recovery from a downed OST. For more information about the VBR feature, see <xref linkend="lustrerecovery"/>(Version-based Recovery).</para>
-        </note>
-    </section>
-    <section xml:id="dbdoclet.lfsckadmin" condition='l23'>
-        <title><indexterm><primary>recovery</primary><secondary>oiscrub</secondary></indexterm><indexterm><primary>recovery</primary><secondary>lfsck</secondary></indexterm>Checking the file system with LFSCK</title>
-        <para>LFSCK is an administrative tool introduced in Lustre software release 2.3 for checking
-            and repair of the attributes specific to a mounted Lustre file system. It is similar in
-            concept to an offline fsck repair tool for a local filesystem,
-            but LFSCK is implemented to run as part of the Lustre file system while the file
-            system is mounted and in use. This allows consistency of checking and repair by the
-            Lustre software without unnecessary downtime, and can be run on the largest Lustre file
-            systems.</para>
-        <para>In Lustre software release 2.3, LFSCK can verify and repair the Object Index (OI)
-            table that is used internally to map Lustre File Identifiers (FIDs) to MDT internal
-            inode numbers, through a process called OI Scrub. An OI Scrub is required after
-            restoring from a file-level MDT backup (<xref linkend="dbdoclet.50438207_71633"/>), or
-            in case the OI table is otherwise corrupted. Later phases of LFSCK will add further
-            checks to the Lustre distributed file system state.</para>
-        <para condition='l24'>In Lustre software release 2.4, LFSCK namespace scanning can verify and repair the directory FID-in-Dirent and LinkEA consistency.
-</para>
-        <para condition='l26'>In Lustre software release 2.6, LFSCK layout scanning can verify and repair MDT-OST file layout inconsistency. File layout inconsistencies between MDT-objects and OST-objects that are checked and corrected include dangling reference, unreferenced OST-objects, mismatched references and multiple references.
-</para>
-       <para>Control and monitoring of LFSCK is through LFSCK and the <literal>/proc</literal> file system
-            interfaces. LFSCK supports three types of interface: switch interface, status
-            interface and adjustment interface. These interfaces are detailed below.</para>
+  </section>
+  <section xml:id="dbdoclet.50438225_12316">
+    <title>
+    <indexterm>
+      <primary>recovery</primary>
+      <secondary>unavailable OST</secondary>
+    </indexterm>Recovering from an Unavailable OST</title>
+    <para>One problem encountered in a Lustre file system environment is when
+    an OST becomes unavailable due to a network partition, OSS node crash, etc.
+    When this happens, the OST's clients pause and wait for the OST to become
+    available again, either on the primary OSS or a failover OSS. When the OST
+    comes back online, the Lustre file system starts a recovery process to
+    enable clients to reconnect to the OST. Lustre servers put a limit on the
+    time they will wait in recovery for clients to reconnect.</para>
+    <para>During recovery, clients reconnect and replay their requests
+    serially, in the same order they were done originally. Until a client
+    receives a confirmation that a given transaction has been written to stable
+    storage, the client holds on to the transaction, in case it needs to be
+    replayed. Periodically, a progress message prints to the log, stating
+    how_many/expected clients have reconnected. If the recovery is aborted,
+    this log shows how many clients managed to reconnect. When all clients have
+    completed recovery, or if the recovery timeout is reached, the recovery
+    period ends and the OST resumes normal request processing.</para>
+    <para>If some clients fail to replay their requests during the recovery
+    period, this will not stop the recovery from completing. You may have a
+    situation where the OST recovers, but some clients are not able to
+    participate in recovery (e.g. network problems or client failure), so they
+    are evicted and their requests are not replayed. This would result in any
+    operations on the evicted clients failing, including in-progress writes,
+    which would cause cached writes to be lost. This is a normal outcome; the
+    recovery cannot wait indefinitely, or the file system would be hung any
+    time a client failed. The lost transactions are an unfortunate result of
+    the recovery process.</para>
+    <note>
+      <para>The failure of client recovery does not indicate or lead to
+      filesystem corruption. This is a normal event that is handled by the MDT
+      and OST, and should not result in any inconsistencies between
+      servers.</para>
+    </note>
+    <note>
+      <para>The version-based recovery (VBR) feature enables a failed client to
+      be 'skipped', so remaining clients can replay their requests, resulting
+      in a more successful recovery from a downed OST. For more information
+      about the VBR feature, see
+      <xref linkend="lustrerecovery" /> (Version-based Recovery).</para>
+    </note>
+  </section>
+  <section xml:id="dbdoclet.lfsckadmin" condition='l23'>
+    <title>
+    <indexterm>
+      <primary>recovery</primary>
+      <secondary>oiscrub</secondary>
+    </indexterm>
+    <indexterm>
+      <primary>recovery</primary>
+      <secondary>LFSCK</secondary>
+    </indexterm>Checking the file system with LFSCK</title>
+    <para condition='l23'>LFSCK is an administrative tool introduced in Lustre
+    software release 2.3 for checking and repairing the attributes specific to
+    a mounted Lustre file system. It is similar in concept to an offline fsck
+    repair tool for a local filesystem, but LFSCK is implemented to run as
+    part of the Lustre file system while the file system is mounted and in
+    use. This allows consistency checking and repair by the Lustre software
+    without unnecessary downtime, and can be run on the largest Lustre file
+    systems with negligible disruption to normal operations.</para>
+    <para condition='l23'>Since Lustre software release 2.3, LFSCK can verify
+    and repair the Object Index (OI) table that is used internally to map
+    Lustre File Identifiers (FIDs) to MDT internal ldiskfs inode numbers. An
+    OI Scrub traverses the OI table and makes corrections where necessary. An
+    OI Scrub is required after restoring from a file-level MDT backup (
+    <xref linkend="dbdoclet.50438207_71633" />), or in case the OI table is
+    otherwise corrupted. Later phases of LFSCK will add further checks to the
+    Lustre distributed file system state.</para>
+    <para condition='l24'>In Lustre software release 2.4, LFSCK namespace
+    scanning can verify and repair the directory FID-in-Dirent and LinkEA
+    consistency.</para>
+    <para condition='l26'>In Lustre software release 2.6, LFSCK layout scanning
+    can verify and repair MDT-OST file layout inconsistencies. File layout
+    inconsistencies between MDT-objects and OST-objects that are checked and
+    corrected include dangling reference, unreferenced OST-objects, mismatched
+    references and multiple references.</para>
+    <para condition='l27'>In Lustre software release 2.7, LFSCK layout
+    scanning is enhanced to verify and repair inconsistencies between
+    multiple MDTs.</para>
+    <para>LFSCK is controlled and monitored through
+    <literal>lctl</literal> commands and the
+    <literal>/proc</literal> file system interfaces. LFSCK supports three
+    types of interface: switch interface, status interface, and adjustment
+    interface. These interfaces are detailed below.</para>
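+    <para>For example, assuming a file system named
+    <replaceable>testfs</replaceable>, the current status of the namespace
+    LFSCK component on MDT0000 could be read with a command similar to:</para>
+    <screen>mds# lctl get_param -n mdd.testfs-MDT0000.lfsck_namespace</screen>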
     <section>
-        <title>LFSCK switch interface</title>
+      <title>LFSCK switch interface</title>
+      <section>
+        <title>Manually Starting LFSCK</title>
+        <section>
+          <title>Description</title>
+          <para>LFSCK can be started after the MDT is mounted using the
+          <literal>lctl lfsck_start</literal> command.</para>
+        </section>
         <section>
-            <title>Manually Starting LFSCK</title>
-            <section>
-                <title>Description</title>
-               <para>LFSCK can be started after the MDT is mounted using the <literal>lctl lfsck_start</literal> command.</para>
-            </section>
-            <section>
-                <title>Usage</title>
-                <screen>lctl lfsck_start -M | --device <replaceable>[MDT,OST]_device</replaceable> \
+          <title>Usage</title>
+          <screen>lctl lfsck_start -M | --device <replaceable>[MDT,OST]_device</replaceable> \
                     [-A | --all] \
-                    [-c | --create_ostobj <replaceable>[on | off]</replaceable>] \
-                    [-C | --create_mdtobj <replaceable>[on | off]</replaceable>] \
-                    [-e | --error <replaceable>{continue | abort}</replaceable>] \
+                    [-c | --create_ostobj <replaceable>[on | off]</replaceable>] \
+                    [-C | --create_mdtobj <replaceable>[on | off]</replaceable>] \
+                    [-e | --error <replaceable>{continue | abort}</replaceable>] \
                     [-h | --help] \
-                    [-n | --dryrun <replaceable>[on | off]</replaceable>] \
+                    [-n | --dryrun <replaceable>[on | off]</replaceable>] \
                     [-o | --orphan] \
                     [-r | --reset] \
-                    [-s | --speed <replaceable>ops_per_sec_limit</replaceable>] \
-                    [-t | --type <replaceable>lfsck_type[,lfsck_type...]</replaceable>] \
-                    [-w | --window_size <replaceable>size</replaceable>]
-                </screen>
-            </section>
-            <section>
-                <title>Options</title>
-                <para>The various <literal>lfsck_start</literal> options are listed and described below. For a complete list of available options, type <literal>lctl lfsck_start -h</literal>.</para>
-                <informaltable frame="all">
-                    <tgroup cols="2">
-                        <colspec colname="c1" colwidth="3*"/>
-                        <colspec colname="c2" colwidth="7*"/>
-                        <thead>
-                            <row>
-                                <entry>
-                                    <para><emphasis role="bold">Option</emphasis></para>
-                                </entry>
-                                <entry>
-                                    <para><emphasis role="bold">Description</emphasis></para>
-                                </entry>
-                            </row>
-                        </thead>
-                        <tbody>
-                            <row>
-                                <entry>
-                                    <para><literal>-M | --device</literal> </para>
-                                </entry>
-                                <entry>
-                                    <para>The MDT or OST device to start LFSCK/scrub on.</para>
-                                </entry>
-                            </row>
-                            <row>
-                                <entry>
-                                    <para><literal>-A | --all</literal> </para>
-                                </entry>
-                                <entry>
-                                    <para condition='l26'>Start LFSCK on all devices via a single lctl command. This applies to both layout and namespace consistency checking and repair.</para>
-                                </entry>
-                            </row>
-                            <row>
-                                <entry>
-                                    <para><literal>-c | --create_ostobj</literal> </para>
-                                </entry>
-                                <entry>
-                                       <para condition='l26'>Create the lost OST-object for dangling LOV EA, <literal>off</literal> (default) or <literal>on</literal>. If not specified, then the default behaviour is to keep the dangling LOV EA there without creating the lost OST-object.</para>
-                                </entry>
-                            </row>
-                            <row>
-                                <entry>
-                                    <para><literal>-C | --create_mdtobj</literal> </para>
-                                </entry>
-                                <entry>
-                                       <para condition='l27'>Create the lost MDT-object for dangling name entry, <literal>off</literal> (default) or <literal>on</literal>. If not specified, then the default behaviour is to keep the dangling name entry there without creating the lost MDT-object.</para>
-                                </entry>
-                            </row>
-                            <row>
-                                <entry>
-                                    <para><literal>-e | --error</literal> </para>
-                                </entry>
-                                <entry>
-                                    <para>Error handle, <literal>continue</literal> (default) or <literal>abort</literal>. Specify whether the LFSCK will stop or not if fail to repair something. If it is not specified, the saved value (when resuming from checkpoint) will be used if present. This option cannot be changed if LFSCK is running.</para>
-                                </entry>
-                            </row>
-                            <row>
-                                <entry>
-                                    <para><literal>-h | --help</literal> </para>
-                                </entry>
-                                <entry>
-                                    <para>Operating help information.</para>
-                                </entry>
-                            </row>
-                            <row>
-                                <entry>
-                                    <para><literal>-n | --dryrun</literal> </para>
-                                </entry>
-                                <entry>
-                                    <para>Perform a trial without making any changes. <literal>off</literal> (default) or <literal>on</literal>.</para>
-                                </entry>
-                            </row>
-                            <row>
-                                <entry>
-                                    <para><literal>-o | --orphan</literal> </para>
-                                </entry>
-                                <entry>
-                                    <para condition='l26'>Repair orphan OST-objects for layout LFSCK.</para>
-                                </entry>
-                            </row>
-                            <row>
-                                <entry>
-                                    <para><literal>-r | --reset</literal> </para>
-                                </entry>
-                                <entry>
-                                    <para>Reset the start position for the object iteration to the beginning for the specified MDT. By default the iterator will resume scanning from the last checkpoint (saved periodically by LFSCK) provided it is available.</para>
-                                </entry>
-                            </row>
-                            <row>
-                                <entry>
-                                    <para><literal>-s | --speed</literal> </para>
-                                </entry>
-                                <entry>
-                                    <para>Set the upper speed limit of LFSCK processing in objects per second. If it is not specified, the saved value (when resuming from checkpoint) or default value of 0 (0 = run as fast as possible) is used. Speed can be adjusted while LFSCK is running with the adjustment interface.</para>
-                                </entry>
-                            </row>
-                            <row>
-                                <entry>
-                                    <para><literal>-t | --type</literal> </para>
-                                </entry>
-                                <entry>
-                                    <para>The type of checking/repairing that should be performed. The new LFSCK framework provides a single interface for a variety of system consistency checking/repairing operations including:</para>
-<para>Without a specified option, the LFSCK component(s) which ran last time and did not finish or the component(s) corresponding to some known system inconsistency, will be started. Anytime the LFSCK is triggered, the OI scrub will run automatically, so there is no need to specify OI_scrub.</para>
-<para condition='l24'><literal>namespace</literal>: check and repair FID-in-Dirent and LinkEA consistency. Lustre-2.7 enhances namespace consistency verification under DNE mode.</para>
-<para condition='l26'><literal>layout</literal>: check and repair MDT-OST inconsistency.</para>
-                                </entry>
-                            </row>
-                            <row>
-                                <entry>
-                                    <para><literal>-w | --window_size</literal> </para>
-                                </entry>
-                                <entry>
-                                       <para condition='l26'>The window size for the async request pipeline. The LFSCK async request pipeline's input/output may have quite different processing speeds, and there may be too many requests in the pipeline as to cause abnormal memory/network pressure. If not specified, then the default window size for the async request pipeline is 1024.</para>
-                                </entry>
-                            </row>
-                        </tbody>
-                    </tgroup>
-                </informaltable>
-            </section>
+                    [-s | --speed
+<replaceable>ops_per_sec_limit</replaceable>] \
+                    [-t | --type
+<replaceable>lfsck_type[,lfsck_type...]</replaceable>] \
+                    [-w | --window_size
+<replaceable>size</replaceable>]</screen>
+        </section>
+        <section>
+          <title>Options</title>
+          <para>The various
+          <literal>lfsck_start</literal> options are listed and described below.
+          For a complete list of available options, type
+          <literal>lctl lfsck_start -h</literal>.</para>
+          <informaltable frame="all">
+            <tgroup cols="2">
+              <colspec colname="c1" colwidth="3*" />
+              <colspec colname="c2" colwidth="7*" />
+              <thead>
+                <row>
+                  <entry>
+                    <para>
+                      <emphasis role="bold">Option</emphasis>
+                    </para>
+                  </entry>
+                  <entry>
+                    <para>
+                      <emphasis role="bold">Description</emphasis>
+                    </para>
+                  </entry>
+                </row>
+              </thead>
+              <tbody>
+                <row>
+                  <entry>
+                    <para>
+                      <literal>-M | --device</literal>
+                    </para>
+                  </entry>
+                  <entry>
+                    <para>The MDT or OST device to start LFSCK on.</para>
+                  </entry>
+                </row>
+                <row>
+                  <entry>
+                    <para>
+                      <literal>-A | --all</literal>
+                    </para>
+                  </entry>
+                  <entry>
+                    <para condition='l26'>Start LFSCK on all devices.
+                    This applies to both layout and
+                    namespace consistency checking and repair.</para>
+                  </entry>
+                </row>
+                <row>
+                  <entry>
+                    <para>
+                      <literal>-c | --create_ostobj</literal>
+                    </para>
+                  </entry>
+                  <entry>
+                    <para condition='l26'>Create the lost OST-object for a
+                    dangling LOV EA:
+                    <literal>off</literal> (default) or
+                    <literal>on</literal>. If not specified, the default
+                    behaviour is to keep the dangling LOV EA without
+                    creating the lost OST-object.</para>
+                  </entry>
+                </row>
+                <row>
+                  <entry>
+                    <para>
+                      <literal>-C | --create_mdtobj</literal>
+                    </para>
+                  </entry>
+                  <entry>
+                    <para condition='l27'>Create the lost MDT-object for a
+                    dangling name entry:
+                    <literal>off</literal> (default) or
+                    <literal>on</literal>. If not specified, the default
+                    behaviour is to keep the dangling name entry without
+                    creating the lost MDT-object.</para>
+                  </entry>
+                </row>
+                <row>
+                  <entry>
+                    <para>
+                      <literal>-e | --error</literal>
+                    </para>
+                  </entry>
+                  <entry>
+                    <para>Error handling:
+                    <literal>continue</literal> (default) or
+                    <literal>abort</literal>. Specify whether LFSCK will
+                    stop if it fails to repair something. If it is not
+                    specified, the saved value (when resuming from a
+                    checkpoint) will be used if present. This option cannot
+                    be changed while LFSCK is running.</para>
+                  </entry>
+                </row>
+                <row>
+                  <entry>
+                    <para>
+                      <literal>-h | --help</literal>
+                    </para>
+                  </entry>
+                  <entry>
+                    <para>Display help information.</para>
+                  </entry>
+                </row>
+                <row>
+                  <entry>
+                    <para>
+                      <literal>-n | --dryrun</literal>
+                    </para>
+                  </entry>
+                  <entry>
+                    <para>Perform a trial run without making any changes:
+                    <literal>off</literal> (default) or
+                    <literal>on</literal>.</para>
+                  </entry>
+                </row>
+                <row>
+                  <entry>
+                    <para>
+                      <literal>-o | --orphan</literal>
+                    </para>
+                  </entry>
+                  <entry>
+                    <para condition='l26'>Repair orphan OST-objects for layout
+                    LFSCK.</para>
+                  </entry>
+                </row>
+                <row>
+                  <entry>
+                    <para>
+                      <literal>-r | --reset</literal>
+                    </para>
+                  </entry>
+                  <entry>
+                    <para>Reset the start position for the object iteration to
+                    the beginning for the specified MDT. By default the
+                    iterator will resume scanning from the last checkpoint
+                    (saved periodically by LFSCK) provided it is
+                    available.</para>
+                  </entry>
+                </row>
+                <row>
+                  <entry>
+                    <para>
+                      <literal>-s | --speed</literal>
+                    </para>
+                  </entry>
+                  <entry>
+                    <para>Set the upper speed limit of LFSCK processing in
+                    objects per second. If it is not specified, the saved value
+                    (when resuming from checkpoint) or default value of 0 (0 =
+                    run as fast as possible) is used. Speed can be adjusted
+                    while LFSCK is running with the adjustment
+                    interface.</para>
+                  </entry>
+                </row>
+                <row>
+                  <entry>
+                    <para>
+                      <literal>-t | --type</literal>
+                    </para>
+                  </entry>
+                  <entry>
+                    <para>The type of checking/repairing that should be
+                    performed. The new LFSCK framework provides a single
+                    interface for a variety of system consistency
+                    checking/repairing operations including:</para>
+                    <para>If no type is specified, the LFSCK component(s)
+                    that ran last time and did not finish, or the
+                    component(s) corresponding to some known system
+                    inconsistency, will be started. Any time LFSCK is
+                    triggered, the OI scrub will run automatically, so
+                    there is no need to specify OI_scrub in that case.</para>
+                    <para condition='l24'>
+                    <literal>namespace</literal>: check and repair
+                    FID-in-Dirent and LinkEA consistency.</para>
+                    <para condition='l27'>Lustre software release 2.7
+                    enhances namespace consistency verification under DNE
+                    mode.</para>
+                    <para condition='l26'>
+                    <literal>layout</literal>: check and repair MDT-OST
+                    inconsistency.</para>
+                  </entry>
+                </row>
+                <row>
+                  <entry>
+                    <para>
+                      <literal>-w | --window_size</literal>
+                    </para>
+                  </entry>
+                  <entry>
+                    <para condition='l26'>The window size for the async
+                    request pipeline. The LFSCK async request pipeline's
+                    input/output may have quite different processing
+                    speeds, and too many requests in the pipeline can cause
+                    abnormal memory/network pressure. If not specified, the
+                    default window size for the async request pipeline is
+                    1024.</para>
+                  </entry>
+                </row>
+              </tbody>
+            </tgroup>
+          </informaltable>
+        </section>
+      </section>
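As a worked illustration of the options above, the following commands could be run on the MDS node; the file system name <literal>testfs</literal> and the device names are hypothetical, and the option semantics are those documented in the table.

```shell
# Hypothetical examples (run on the MDS node of a file system named "testfs"):

# Start a layout consistency check/repair on MDT0000, limited to
# 1000 objects per second, resetting any previous checkpoint:
lctl lfsck_start -M testfs-MDT0000 -t layout -s 1000 -r

# Dry-run a namespace check on all devices without making changes:
lctl lfsck_start -A -t namespace -n on
```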
+      <section>
+        <title>Manually Stopping LFSCK</title>
+        <section>
+          <title>Description</title>
+          <para>To stop LFSCK when the MDT is mounted, use the
+          <literal>lctl lfsck_stop</literal> command.</para>
         </section>
         <section>
-            <title>Manually Stopping LFSCK</title>
-            <section>
-                <title>Description</title>
-               <para>To stop LFSCK when the MDT is mounted, use the <literal>lctl lfsck_stop</literal> command.</para>
-            </section>
-            <section>
-                <title>Usage</title>
-                <screen>lctl lfsck_stop -M | --device <replaceable>[MDT,OST]_device</replaceable> \
+          <title>Usage</title>
+          <screen>lctl lfsck_stop -M | --device
+<replaceable>[MDT,OST]_device</replaceable> \
                     [-A | --all] \
-                    [-h | --help]
-                </screen>
-            </section>
-            <section>
-                <title>Options</title>
-                <para>The various <literal>lfsck_stop</literal> options are listed and described below. For a complete list of available options, type <literal>lctl lfsck_stop -h</literal>.</para>
-                <informaltable frame="all">
-                    <tgroup cols="2">
-                        <colspec colname="c1" colwidth="3*"/>
-                        <colspec colname="c2" colwidth="7*"/>
-                        <thead>
-                            <row>
-                                <entry>
-                                    <para><emphasis role="bold">Option</emphasis></para>
-                                </entry>
-                                <entry>
-                                    <para><emphasis role="bold">Description</emphasis></para>
-                                </entry>
-                            </row>
-                        </thead>
-                        <tbody>
-                            <row>
-                                <entry>
-                                    <para><literal>-M | --device</literal> </para>
-                                </entry>
-                                <entry>
-                                    <para>The MDT or OST device to stop LFSCK/scrub on.</para>
-                                </entry>
-                            </row>
-                            <row>
-                                <entry>
-                                    <para><literal>-A | --all</literal> </para>
-                                </entry>
-                                <entry>
-                                    <para>Stop LFSCK on all devices.</para>
-                                </entry>
-                            </row>
-                            <row>
-                                <entry>
-                                    <para><literal>-h | --help</literal> </para>
-                                </entry>
-                                <entry>
-                                    <para>Operating help information.</para>
-                                </entry>
-                            </row>
-                        </tbody>
-                    </tgroup>
-                </informaltable>
-            </section>
+                    [-h | --help]</screen>
         </section>
+        <section>
+          <title>Options</title>
+          <para>The various
+          <literal>lfsck_stop</literal> options are listed and described below.
+          For a complete list of available options, type
+          <literal>lctl lfsck_stop -h</literal>.</para>
+          <informaltable frame="all">
+            <tgroup cols="2">
+              <colspec colname="c1" colwidth="3*" />
+              <colspec colname="c2" colwidth="7*" />
+              <thead>
+                <row>
+                  <entry>
+                    <para>
+                      <emphasis role="bold">Option</emphasis>
+                    </para>
+                  </entry>
+                  <entry>
+                    <para>
+                      <emphasis role="bold">Description</emphasis>
+                    </para>
+                  </entry>
+                </row>
+              </thead>
+              <tbody>
+                <row>
+                  <entry>
+                    <para>
+                      <literal>-M | --device</literal>
+                    </para>
+                  </entry>
+                  <entry>
+                    <para>The MDT or OST device to stop LFSCK on.</para>
+                  </entry>
+                </row>
+                <row>
+                  <entry>
+                    <para>
+                      <literal>-A | --all</literal>
+                    </para>
+                  </entry>
+                  <entry>
+                    <para>Stop LFSCK on all devices.</para>
+                  </entry>
+                </row>
+                <row>
+                  <entry>
+                    <para>
+                      <literal>-h | --help</literal>
+                    </para>
+                  </entry>
+                  <entry>
+                    <para>Display help information.</para>
+                  </entry>
+                </row>
+              </tbody>
+            </tgroup>
+          </informaltable>
+        </section>
+      </section>
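A minimal stop invocation, again assuming a hypothetical file system named <literal>testfs</literal>:

```shell
# Hypothetical example: stop a running LFSCK on MDT0000 of "testfs":
lctl lfsck_stop -M testfs-MDT0000

# Or stop LFSCK on all devices:
lctl lfsck_stop -A
```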
     </section>
     <section>
-        <title>LFSCK status interface</title>
+      <title>LFSCK status interface</title>
+      <section>
+        <title>LFSCK status of OI Scrub via
+        <literal>procfs</literal></title>
+        <section>
+          <title>Description</title>
+          <para>For each LFSCK component there is a dedicated procfs
+          interface to trace the corresponding LFSCK component status. For
+          OI Scrub, the interface is the OSD layer procfs interface, named
+          <literal>oi_scrub</literal>. To display OI Scrub status, the
+          standard
+          <literal>lctl get_param</literal> command is used as shown in the
+          usage below.</para>
+        </section>
+        <section>
+          <title>Usage</title>
+          <screen>lctl get_param -n osd-ldiskfs.<replaceable>FSNAME</replaceable>-[<replaceable>MDT_device|OST_device</replaceable>].oi_scrub</screen>
+        </section>
+        <section>
+          <title>Output</title>
+          <informaltable frame="all">
+            <tgroup cols="2">
+              <colspec colname="c1" colwidth="3*" />
+              <colspec colname="c2" colwidth="7*" />
+              <thead>
+                <row>
+                  <entry>
+                    <para>
+                      <emphasis role="bold">Information</emphasis>
+                    </para>
+                  </entry>
+                  <entry>
+                    <para>
+                      <emphasis role="bold">Detail</emphasis>
+                    </para>
+                  </entry>
+                </row>
+              </thead>
+              <tbody>
+                <row>
+                  <entry>
+                    <para>General Information</para>
+                  </entry>
+                  <entry>
+                    <itemizedlist>
+                      <listitem>
+                        <para>Name: OI_scrub.</para>
+                      </listitem>
+                      <listitem>
+                        <para>OI scrub magic id (an identifier unique to OI
+                        scrub).</para>
+                      </listitem>
+                      <listitem>
+                        <para>OI files count.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Status: one of
+                        <literal>init</literal>,
+                        <literal>scanning</literal>,
+                        <literal>completed</literal>,
+                        <literal>failed</literal>,
+                        <literal>stopped</literal>,
+                        <literal>paused</literal>, or
+                        <literal>crashed</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Flags, including:
+                        <literal>recreated</literal> (OI file(s) removed or
+                        recreated),
+                        <literal>inconsistent</literal> (restored from
+                        file-level backup),
+                        <literal>auto</literal> (triggered by a non-UI
+                        mechanism), and
+                        <literal>upgrade</literal> (from Lustre software
+                        release 1.8 IGIF format).</para>
+                      </listitem>
+                      <listitem>
+                        <para>Parameters: OI scrub parameters, like
+                        <literal>failout</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Time Since Last Completed.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Time Since Latest Start.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Time Since Last Checkpoint.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Latest Start Position: the position the
+                        latest scrub started from.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Last Checkpoint Position.</para>
+                      </listitem>
+                      <listitem>
+                        <para>First Failure Position: the position of the
+                        first object to be repaired.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Current Position.</para>
+                      </listitem>
+                    </itemizedlist>
+                  </entry>
+                </row>
+                <row>
+                  <entry>
+                    <para>Statistics</para>
+                  </entry>
+                  <entry>
+                    <itemizedlist>
+                      <listitem>
+                        <para>
+                        <literal>Checked</literal> total number of objects
+                        scanned.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Updated</literal> total number of objects
+                        repaired.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Failed</literal> total number of objects that
+                        failed to be repaired.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>No Scrub</literal> total number of objects
+                        marked
+                        <literal>LDISKFS_STATE_LUSTRE_NOSCRUB</literal>
+                        and skipped.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>IGIF</literal> total number of IGIF
+                        objects scanned.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Prior Updated</literal> the number of
+                        objects repaired after being triggered by parallel
+                        RPCs.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Success Count</literal> total number of
+                        completed OI_scrub runs on the device.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Run Time</literal> how long the scrub has
+                        run, tallied from the time scanning started on the
+                        specified MDT device, excluding paused/failure time
+                        between checkpoints.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Average Speed</literal> calculated by dividing
+                        <literal>Checked</literal> by
+                        <literal>run_time</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Real-Time Speed</literal> the speed since
+                        the last checkpoint, if the OI_scrub is
+                        running.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Scanned</literal> total number of objects under
+                        /lost+found that have been scanned.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Repaired</literal> total number of objects
+                        under /lost+found that have been recovered.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Failed</literal> total number of objects
+                        under /lost+found that failed to be scanned or
+                        recovered.</para>
+                      </listitem>
+                    </itemizedlist>
+                  </entry>
+                </row>
+              </tbody>
+            </tgroup>
+          </informaltable>
+        </section>
+      </section>
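The <literal>oi_scrub</literal> output described above is plain <literal>key: value</literal> text, so individual fields can be pulled out in a monitoring script. The sketch below uses a canned sample fragment in place of real <literal>lctl</literal> output, and a hypothetical <literal>testfs-MDT0000</literal> device name in the comment:

```shell
# Sketch: extract single fields from oi_scrub-style "key: value" output.
# The sample text below is illustrative, not real lctl output; on a live
# system it would come from, e.g.:
#   lctl get_param -n osd-ldiskfs.testfs-MDT0000.oi_scrub
sample='name: OI_scrub
status: completed
checked: 1048576'

# Split each line on "colon plus optional spaces" and print the value
# whose key matches.
status=$(printf '%s\n' "$sample" | awk -F': *' '$1 == "status" {print $2}')
checked=$(printf '%s\n' "$sample" | awk -F': *' '$1 == "checked" {print $2}')
echo "$status $checked"
```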
+      <section condition='l24'>
+        <title>LFSCK status of namespace via
+        <literal>procfs</literal></title>
+        <section>
+          <title>Description</title>
+          <para>The
+          <literal>namespace</literal> component is responsible for checks
+          described in <xref linkend="dbdoclet.lfsckadmin" />. The
+          <literal>procfs</literal> interface for this component is in the
+          MDD layer, named
+          <literal>lfsck_namespace</literal>. To show the status of this
+          component,
+          <literal>lctl get_param</literal> should be used as described in the
+          usage below.</para>
+        </section>
         <section>
-            <title>LFSCK status of OI Scrub via <literal>procfs</literal></title>
-            <section>
-                <title>Description</title>
-                <para>For each LFSCK component there is a dedicated procfs interface to trace the corresponding LFSCK component status. For OI Scrub, the interface is the OSD layer procfs interface, named <literal>oi_scrub</literal>. To display OI Scrub status, the standard <literal>lctl get_param</literal> command is used as shown in the usage below.</para>
-            </section>
-            <section >
-                <title>Usage</title>
-                <screen>lctl get_param -n osd-ldiskfs.<replaceable>FSNAME</replaceable>-<replaceable>MDT_device</replaceable>.oi_scrub
-                </screen>
-            </section>
-            <section>
-                <title>Output</title>
-                <informaltable frame="all">
-                    <tgroup cols="2">
-                        <colspec colname="c1" colwidth="3*"/>
-                        <colspec colname="c2" colwidth="7*"/>
-                        <thead>
-                            <row>
-                                <entry>
-                                    <para><emphasis role="bold">Information</emphasis></para>
-                                </entry>
-                                <entry>
-                                    <para><emphasis role="bold">Detail</emphasis></para>
-                                </entry>
-                            </row>
-                        </thead>
-                        <tbody>
-                            <row>
-                                <entry>
-                                    <para>General Information</para>
-                                </entry>
-                                <entry>
-                                    <itemizedlist>
-                                        <listitem><para>Name: OI_scrub.</para></listitem>
-                                        <listitem><para>OI scrub magic id (an identifier unique to OI scrub).</para></listitem>
-                                        <listitem><para>OI files count.</para></listitem>
-                                        <listitem><para>Status: one of the status - <literal>init</literal>, <literal>scanning</literal>, <literal>completed</literal>, <literal>failed</literal>, <literal>stopped</literal>, <literal>paused</literal>, or <literal>crashed</literal>.</para></listitem>
-                                        <listitem><para>Flags: including - <literal>recreated</literal> (OI file(s) is/are removed/recreated),
-                                                  <literal>inconsistent</literal> (restored from
-                                                  file-level backup), <literal>auto</literal>
-                                                  (triggered by non-UI mechanism), and
-                                                  <literal>upgrade</literal> (from Lustre software
-                                                  release 1.8 IGIF format.)</para></listitem>
-                                        <listitem><para>Parameters: OI scrub parameters, like <literal>failout</literal>.</para></listitem>
-                                        <listitem><para>Time Since Last Completed.</para></listitem>
-                                        <listitem><para>Time Since Latest Start.</para></listitem>
-                                        <listitem><para>Time Since Last Checkpoint.</para></listitem>
-                                        <listitem><para>Latest Start Position: the position for the latest scrub started from.</para></listitem>
-                                        <listitem><para>Last Checkpoint Position.</para></listitem>
-                                        <listitem><para>First Failure Position: the position for the first object to be repaired.</para></listitem>
-                                        <listitem><para>Current Position.</para></listitem>
-                                    </itemizedlist>
-                                </entry>
-                            </row>
-                            <row>
-                                <entry>
-                                    <para>Statistics</para>
-                                </entry>
-                                <entry>
-                                    <itemizedlist>
-                                        <listitem><para><literal>Checked</literal> total number of objects scanned.</para></listitem>
-                                        <listitem><para><literal>Updated</literal> total number of objects repaired.</para></listitem>
-                                        <listitem><para><literal>Failed</literal> total number of objects that failed to be repaired.</para></listitem>
-                                        <listitem><para><literal>No Scrub</literal> total number of objects marked <literal>LDISKFS_STATE_LUSTRE_NOSCRUB and skipped</literal>.</para></listitem>
-                                        <listitem><para><literal>IGIF</literal> total number of objects IGIF scanned.</para></listitem>
-                                        <listitem><para><literal>Prior Updated</literal> how many objects have been repaired which are triggered by parallel RPC.</para></listitem>
-                                        <listitem><para><literal>Success Count</literal> total number of completed OI_scrub runs on the device.</para></listitem>
-                                        <listitem><para><literal>Run Time</literal> how long the scrub has run, tally from the time of scanning from the beginning of the specified MDT device, not include the paused/failure time among checkpoints.</para></listitem>
-                                        <listitem><para><literal>Average Speed</literal> calculated by dividing <literal>Checked</literal> by <literal>run_time</literal>.</para></listitem>
-                                        <listitem><para><literal>Real-Time Speed</literal> the speed since last checkpoint if the OI_scrub is running.</para></listitem>
-                                        <listitem><para><literal>Scanned</literal> total number of objects under /lost+found that have been scanned.</para></listitem>
-                                        <listitem><para><literal>Repaired</literal> total number of objects under /lost+found that have been recovered.</para></listitem>
-                                        <listitem><para><literal>Failed</literal> total number of objects under /lost+found failed to be scanned or failed to be recovered.</para></listitem>
-                                    </itemizedlist>
-                                </entry>
-                            </row>
-                        </tbody>
-                    </tgroup>
-                </informaltable>
-            </section>
+          <title>Usage</title>
+          <screen>lctl get_param -n mdd.<replaceable>FSNAME</replaceable>-<replaceable>MDT_device</replaceable>.lfsck_namespace</screen>
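+          <para>For example, to show the namespace LFSCK status for a
+          hypothetical file system named
+          <literal>testfs</literal> on its first MDT:</para>
+          <screen>mds# lctl get_param -n mdd.testfs-MDT0000.lfsck_namespace</screen>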
         </section>
-        <section condition='l24'>
-            <title>LFSCK status of namespace via <literal>procfs</literal></title>
-            <section>
-                <title>Description</title>
-                <para>The <literal>namespace</literal> component is responsible for checking and repairing FID-in-Dirent and LinkEA consistency. The <literal>procfs</literal> interface for this component is in the MDD layer, named <literal>lfsck_namespace</literal>. To show the status of this component, <literal>lctl get_param</literal> should be used as described in the usage below.</para>
-            </section>
-            <section >
-                <title>Usage</title>
-                <screen>lctl get_param -n mdd.<replaceable>FSNAME</replaceable>-<replaceable>MDT_device</replaceable>.lfsck_namespace
-                </screen>
-            </section>
-            <section>
-                <title>Output</title>
-                <informaltable frame="all">
-                    <tgroup cols="2">
-                        <colspec colname="c1" colwidth="3*"/>
-                        <colspec colname="c2" colwidth="7*"/>
-                        <thead>
-                            <row>
-                                <entry>
-                                    <para><emphasis role="bold">Information</emphasis></para>
-                                </entry>
-                                <entry>
-                                    <para><emphasis role="bold">Detail</emphasis></para>
-                                </entry>
-                            </row>
-                        </thead>
-                        <tbody>
-                            <row>
-                                <entry>
-                                    <para>General Information</para>
-                                </entry>
-                                <entry>
-                                    <itemizedlist>
-                                        <listitem><para>Name: <literal>lfsck_namespace</literal></para></listitem>
-                                        <listitem><para>LFSCK namespace magic.</para></listitem>
-                                        <listitem><para>LFSCK namespace version..</para></listitem>
-                                        <listitem><para>Status: one of the status - <literal>init</literal>, <literal>scanning-phase1</literal>, <literal>scanning-phase2</literal>, <literal>completed</literal>, <literal>failed</literal>, <literal>stopped</literal>, <literal>paused</literal>, <literal>partial</literal>, <literal>co-failed</literal>, <literal>co-stopped</literal> or <literal>co-paused</literal>.</para></listitem>
-                                        <listitem><para>Flags: including - <literal>scanned-once</literal> (the first cycle scanning has been
-                                                  completed), <literal>inconsistent</literal> (one
-                                                  or more inconsistent FID-in-Dirent or LinkEA
-                                                  entries that have been discovered),
-                                                  <literal>upgrade</literal> (from Lustre software
-                                                  release 1.8 IGIF format.)</para></listitem>
-                                        <listitem><para>Parameters: including <literal>dryrun</literal>, <literal>all_targets</literal>, <literal>failout</literal>, <literal>broadcast</literal>, <literal>orphan</literal>, <literal>create_ostobj</literal>and<literal>create_mdtobj</literal>.</para></listitem>
-                                        <listitem><para>Time Since Last Completed.</para></listitem>
-                                        <listitem><para>Time Since Latest Start.</para></listitem>
-                                        <listitem><para>Time Since Last Checkpoint.</para></listitem>
-                                        <listitem><para>Latest Start Position: the position the checking began most recently.</para></listitem>
-                                        <listitem><para>Last Checkpoint Position.</para></listitem>
-                                        <listitem><para>First Failure Position: the position for the first object to be repaired.</para></listitem>
-                                        <listitem><para>Current Position.</para></listitem>
-                                    </itemizedlist>
-                                </entry>
-                            </row>
-                            <row>
-                                <entry>
-                                    <para>Statistics</para>
-                                </entry>
-                                <entry>
-                                    <itemizedlist>
-                                        <listitem><para><literal>Checked Phase1</literal> total number of objects scanned during <literal>scanning-phase1</literal>.</para></listitem>
-                                        <listitem><para><literal>Checked Phase2</literal> total number of objects scanned during <literal>scanning-phase2</literal>.</para></listitem>
-                                        <listitem><para><literal>Updated Phase1</literal> total number of objects repaired during <literal>scanning-phase1</literal>.</para></listitem>
-                                        <listitem><para><literal>Updated Phase2</literal> total number of objects repaired during <literal>scanning-phase2</literal>.</para></listitem>
-                                        <listitem><para><literal>Failed Phase1</literal> total number of objets that failed to be repaired during <literal>scanning-phase1</literal>.</para></listitem>
-                                        <listitem><para><literal>Failed Phase2</literal> total number of objets that failed to be repaired during <literal>scanning-phase2</literal>.</para></listitem>
-                                        <listitem><para><literal>directories</literal> total number of directories scanned.</para></listitem>
-                                        <listitem><para><literal>multiple_linked_checked</literal> total number of multiple-linked objects that have been scanned.</para></listitem>
-                                        <listitem><para><literal>dirent_repaired</literal> total number of FID-in-dirent entries that have been repaired.</para></listitem>
-                                        <listitem><para><literal>linkea_repaired</literal> total number of linkEA entries that have been repaired.</para></listitem>
-                                        <listitem><para><literal>unknown_inconsistency</literal> total number of undefined inconsistencies found in scanning-phase2.</para></listitem>
-                                        <listitem><para><literal>unmatched_pairs_repaired</literal> total number of unmatched pairs that have been repaired.</para></listitem>
-                                        <listitem><para><literal>dangling_repaired</literal> total number of dangling name entries that have been found/repaired.</para></listitem>
-                                        <listitem><para><literal>multi_referenced_repaired</literal> total number of multiple referenced name entries that have been found/repaired.</para></listitem>
-                                        <listitem><para><literal>bad_file_type_repaired</literal> total number of name entries with bad file type that have been repaired.</para></listitem>
-                                        <listitem><para><literal>lost_dirent_repaired</literal> total number of lost name entries that have been re-inserted.</para></listitem>
-                                        <listitem><para><literal>striped_dirs_scanned</literal> total number of striped directories (master) that have been scanned.</para></listitem>
-                                        <listitem><para><literal>striped_dirs_repaired</literal> total number of striped directories (master) that have been repaired.</para></listitem>
-                                        <listitem><para><literal>striped_dirs_failed</literal> total number of striped directories (master) that have failed to be verified.</para></listitem>
-                                        <listitem><para><literal>striped_dirs_disabled</literal> total number of striped directories (master) that have been disabled.</para></listitem>
-                                        <listitem><para><literal>striped_dirs_skipped</literal> total number of striped directories (master) that have been skipped (for shards verification) because of lost master LMV EA.</para></listitem>
-                                        <listitem><para><literal>striped_shards_scanned</literal> total number of striped directory shards (slave) that have been scanned.</para></listitem>
-                                        <listitem><para><literal>striped_shards_repaired</literal> total number of striped directory shards (slave) that have been repaired.</para></listitem>
-                                        <listitem><para><literal>striped_shards_failed</literal> total number of striped directory shards (slave) that have failed to be verified.</para></listitem>
-                                        <listitem><para><literal>striped_shards_skipped</literal> total number of striped directory shards (slave) that have been skipped (for name hash verification) because LFSCK does not know whether the slave LMV EA is valid or not.</para></listitem>
-                                        <listitem><para><literal>name_hash_repaired</literal> total number of name entries under striped directory with bad name hash that have been repaired.</para></listitem>
-                                        <listitem><para><literal>nlinks_repaired</literal> total number of objects with nlink fixed.</para></listitem>
-                                        <listitem><para><literal>mul_linked_repaired</literal> total number of multiple-linked objects that have been repaired.</para></listitem>
-                                        <listitem><para><literal>local_lost_found_scanned</literal> total number of objects under /lost+found that have been scanned.</para></listitem>
-                                        <listitem><para><literal>local_lost_found_moved</literal> total number of objects under /lost+found that have been moved to namespace visible directory.</para></listitem>
-                                        <listitem><para><literal>local_lost_found_skipped</literal> total number of objects under /lost+found that have been skipped.</para></listitem>
-                                        <listitem><para><literal>local_lost_found_failed</literal> total number of objects under /lost+found that have failed to be processed.</para></listitem>
-                                        <listitem><para><literal>Success Count</literal> the total number of completed LFSCK runs on the device.</para></listitem>
-                                        <listitem><para><literal>Run Time Phase1</literal> the duration of the LFSCK run during <literal>scanning-phase1</literal>. Excluding the time spent paused between checkpoints.</para></listitem>
-                                        <listitem><para><literal>Run Time Phase2</literal> the duration of the LFSCK run during <literal>scanning-phase2</literal>. Excluding the time spent paused between checkpoints.</para></listitem>
-                                        <listitem><para><literal>Average Speed Phase1</literal> calculated by dividing <literal>checked_phase1</literal> by <literal>run_time_phase1</literal>.</para></listitem>
-                                        <listitem><para><literal>Average Speed Phase2</literal> calculated by dividing <literal>checked_phase2</literal> by <literal>run_time_phase1</literal>.</para></listitem>
-                                        <listitem><para><literal>Real-Time Speed Phase1</literal> the speed since the last checkpoint if the LFSCK is running <literal>scanning-phase1</literal>.</para></listitem>
-                                        <listitem><para><literal>Real-Time Speed Phase2</literal> the speed since the last checkpoint if the LFSCK is running <literal>scanning-phase2</literal>.</para></listitem>
-                                    </itemizedlist>
-                                </entry>
-                            </row>
-                        </tbody>
-                    </tgroup>
-                </informaltable>
-            </section>
+        <section>
+          <title>Output</title>
+          <informaltable frame="all">
+            <tgroup cols="2">
+              <colspec colname="c1" colwidth="3*" />
+              <colspec colname="c2" colwidth="7*" />
+              <thead>
+                <row>
+                  <entry>
+                    <para>
+                      <emphasis role="bold">Information</emphasis>
+                    </para>
+                  </entry>
+                  <entry>
+                    <para>
+                      <emphasis role="bold">Detail</emphasis>
+                    </para>
+                  </entry>
+                </row>
+              </thead>
+              <tbody>
+                <row>
+                  <entry>
+                    <para>General Information</para>
+                  </entry>
+                  <entry>
+                    <itemizedlist>
+                      <listitem>
+                        <para>Name:
+                        <literal>lfsck_namespace</literal></para>
+                      </listitem>
+                      <listitem>
+                        <para>LFSCK namespace magic.</para>
+                      </listitem>
+                      <listitem>
+                        <para>LFSCK namespace version.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Status: one of
+                        <literal>init</literal>,
+                        <literal>scanning-phase1</literal>,
+                        <literal>scanning-phase2</literal>,
+                        <literal>completed</literal>,
+                        <literal>failed</literal>,
+                        <literal>stopped</literal>,
+                        <literal>paused</literal>,
+                        <literal>partial</literal>,
+                        <literal>co-failed</literal>,
+                        <literal>co-stopped</literal> or
+                        <literal>co-paused</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Flags: including
+                        <literal>scanned-once</literal> (the first cycle of
+                        scanning has been completed),
+                        <literal>inconsistent</literal> (one or more
+                        inconsistent FID-in-Dirent or LinkEA entries have been
+                        discovered) and
+                        <literal>upgrade</literal> (upgraded from the Lustre
+                        software release 1.8 IGIF format).</para>
+                      </listitem>
+                      <listitem>
+                        <para>Parameters: including
+                        <literal>dryrun</literal>,
+                        <literal>all_targets</literal>,
+                        <literal>failout</literal>,
+                        <literal>broadcast</literal>,
+                        <literal>orphan</literal>,
+                        <literal>create_ostobj</literal> and
+                        <literal>create_mdtobj</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Time Since Last Completed.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Time Since Latest Start.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Time Since Last Checkpoint.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Latest Start Position: the position the checking
+                        began most recently.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Last Checkpoint Position.</para>
+                      </listitem>
+                      <listitem>
+                        <para>First Failure Position: the position for the
+                        first object to be repaired.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Current Position.</para>
+                      </listitem>
+                    </itemizedlist>
+                  </entry>
+                </row>
+                <row>
+                  <entry>
+                    <para>Statistics</para>
+                  </entry>
+                  <entry>
+                    <itemizedlist>
+                      <listitem>
+                        <para>
+                        <literal>Checked Phase1</literal> total number of
+                        objects scanned during
+                        <literal>scanning-phase1</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Checked Phase2</literal> total number of
+                        objects scanned during
+                        <literal>scanning-phase2</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Updated Phase1</literal> total number of
+                        objects repaired during
+                        <literal>scanning-phase1</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Updated Phase2</literal> total number of
+                        objects repaired during
+                        <literal>scanning-phase2</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Failed Phase1</literal> total number of objects
+                        that failed to be repaired during
+                        <literal>scanning-phase1</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Failed Phase2</literal> total number of objects
+                        that failed to be repaired during
+                        <literal>scanning-phase2</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>directories</literal> total number of
+                        directories scanned.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>multiple_linked_checked</literal> total number
+                        of multiple-linked objects that have been
+                        scanned.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>dirent_repaired</literal> total number of
+                        FID-in-dirent entries that have been repaired.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>linkea_repaired</literal> total number of
+                        linkEA entries that have been repaired.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>unknown_inconsistency</literal> total number of
+                        undefined inconsistencies found in
+                        <literal>scanning-phase2</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>unmatched_pairs_repaired</literal> total number
+                        of unmatched pairs that have been repaired.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>dangling_repaired</literal> total number of
+                        dangling name entries that have been
+                        found/repaired.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>multi_referenced_repaired</literal> total
+                        number of multiple referenced name entries that have
+                        been found/repaired.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>bad_file_type_repaired</literal> total number
+                        of name entries with bad file type that have been
+                        repaired.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>lost_dirent_repaired</literal> total number of
+                        lost name entries that have been re-inserted.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>striped_dirs_scanned</literal> total number of
+                        striped directories (master) that have been
+                        scanned.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>striped_dirs_repaired</literal> total number of
+                        striped directories (master) that have been
+                        repaired.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>striped_dirs_failed</literal> total number of
+                        striped directories (master) that have failed to be
+                        verified.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>striped_dirs_disabled</literal> total number of
+                        striped directories (master) that have been
+                        disabled.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>striped_dirs_skipped</literal> total number of
+                        striped directories (master) that have been skipped
+                        (for shards verification) because of lost master LMV
+                        EA.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>striped_shards_scanned</literal> total number
+                        of striped directory shards (slave) that have been
+                        scanned.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>striped_shards_repaired</literal> total number
+                        of striped directory shards (slave) that have been
+                        repaired.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>striped_shards_failed</literal> total number of
+                        striped directory shards (slave) that have failed to be
+                        verified.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>striped_shards_skipped</literal> total number
+                        of striped directory shards (slave) that have been
+                        skipped (for name hash verification) because LFSCK does
+                        not know whether the slave LMV EA is valid or
+                        not.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>name_hash_repaired</literal> total number of
+                        name entries under striped directory with bad name hash
+                        that have been repaired.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>nlinks_repaired</literal> total number of
+                        objects with nlink fixed.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>mul_linked_repaired</literal> total number of
+                        multiple-linked objects that have been repaired.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>local_lost_found_scanned</literal> total number
+                        of objects under /lost+found that have been
+                        scanned.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>local_lost_found_moved</literal> total number
+                        of objects under /lost+found that have been moved to a
+                        namespace-visible directory.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>local_lost_found_skipped</literal> total number
+                        of objects under /lost+found that have been
+                        skipped.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>local_lost_found_failed</literal> total number
+                        of objects under /lost+found that have failed to be
+                        processed.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Success Count</literal> the total number of
+                        completed LFSCK runs on the device.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Run Time Phase1</literal> the duration of the
+                        LFSCK run during
+                        <literal>scanning-phase1</literal>, excluding the time
+                        spent paused between checkpoints.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Run Time Phase2</literal> the duration of the
+                        LFSCK run during
+                        <literal>scanning-phase2</literal>, excluding the time
+                        spent paused between checkpoints.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Average Speed Phase1</literal> calculated by
+                        dividing
+                        <literal>checked_phase1</literal> by
+                        <literal>run_time_phase1</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Average Speed Phase2</literal> calculated by
+                        dividing
+                        <literal>checked_phase2</literal> by
+                        <literal>run_time_phase2</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Real-Time Speed Phase1</literal> the speed
+                        since the last checkpoint if the LFSCK is running
+                        <literal>scanning-phase1</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Real-Time Speed Phase2</literal> the speed
+                        since the last checkpoint if the LFSCK is running
+                        <literal>scanning-phase2</literal>.</para>
+                      </listitem>
+                    </itemizedlist>
+                  </entry>
+                </row>
+              </tbody>
+            </tgroup>
+          </informaltable>
         </section>
-        <section condition='l26'>
-            <title>LFSCK status of layout via <literal>procfs</literal></title>
-            <section>
-                <title>Description</title>
-                <para>The <literal>layout</literal> component is responsible for checking and repairing MDT-OST inconsistency. The <literal>procfs</literal> interface for this component is in the MDD layer, named <literal>lfsck_layout</literal>, and in the OBD layer, named <literal>lfsck_layout</literal>. To show the status of this component <literal>lctl get_param</literal> should be used as described in the usage below.</para>
-            </section>
-            <section >
-                <title>Usage</title>
-                <screen>lctl get_param -n mdd.<replaceable>FSNAME</replaceable>-<replaceable>MDT_device</replaceable>.lfsck_layout
-lctl get_param -n obdfilter.<replaceable>FSNAME</replaceable>-<replaceable>OST_device</replaceable>.lfsck_layout
-                </screen>
-            </section>
-            <section>
-                <title>Output</title>
-                <informaltable frame="all">
-                    <tgroup cols="2">
-                        <colspec colname="c1" colwidth="3*"/>
-                        <colspec colname="c2" colwidth="7*"/>
-                        <thead>
-                            <row>
-                                <entry>
-                                    <para><emphasis role="bold">Information</emphasis></para>
-                                </entry>
-                                <entry>
-                                    <para><emphasis role="bold">Detail</emphasis></para>
-                                </entry>
-                            </row>
-                        </thead>
-                        <tbody>
-                            <row>
-                                <entry>
-                                    <para>General Information</para>
-                                </entry>
-                                <entry>
-                                    <itemizedlist>
-                                        <listitem><para>Name: <literal>lfsck_layout</literal></para></listitem>
-                                        <listitem><para>LFSCK namespace magic.</para></listitem>
-                                        <listitem><para>LFSCK namespace version..</para></listitem>
-                                        <listitem><para>Status: one of the status - <literal>init</literal>, <literal>scanning-phase1</literal>, <literal>scanning-phase2</literal>, <literal>completed</literal>, <literal>failed</literal>, <literal>stopped</literal>, <literal>paused</literal>, <literal>crashed</literal>, <literal>partial</literal>, <literal>co-failed</literal>, <literal>co-stopped</literal>, or <literal>co-paused</literal>.</para></listitem>
-                                        <listitem><para>Flags: including - <literal>scanned-once</literal> (the first cycle scanning has been
-                                                  completed), <literal>inconsistent</literal> (one
-                                                  or more MDT-OST inconsistencies
-                                                  have been discovered),
-                                                  <literal>incomplete</literal> (some MDT or OST did not participate in the LFSCK or failed to finish the LFSCK) or <literal>crashed_lastid</literal> (the lastid files on the OST crashed and needs to be rebuilt).</para></listitem>
-                                        <listitem><para>Parameters: including <literal>dryrun</literal>, <literal>all_targets</literal> and <literal>failout</literal>.</para></listitem>
-                                        <listitem><para>Time Since Last Completed.</para></listitem>
-                                        <listitem><para>Time Since Latest Start.</para></listitem>
-                                        <listitem><para>Time Since Last Checkpoint.</para></listitem>
-                                        <listitem><para>Latest Start Position: the position the checking began most recently.</para></listitem>
-                                        <listitem><para>Last Checkpoint Position.</para></listitem>
-                                        <listitem><para>First Failure Position: the position for the first object to be repaired.</para></listitem>
-                                        <listitem><para>Current Position.</para></listitem>
-                                    </itemizedlist>
-                                </entry>
-                            </row>
-                            <row>
-                                <entry>
-                                    <para>Statistics</para>
-                                </entry>
-                                <entry>
-                                    <itemizedlist>
-                                        <listitem><para><literal>Success Count:</literal> the total number of completed LFSCK runs on the device.</para></listitem>
-                                        <listitem><para><literal>Repaired Dangling:</literal> total number of MDT-objects with dangling reference have been repaired in the scanning-phase1.</para></listitem>
-                                        <listitem><para><literal>Repaired Unmatched Pairs</literal> total number of unmatched MDT and OST-object paris have been repaired in the scanning-phase1</para></listitem>
-                                        <listitem><para><literal>Repaired Multiple Referenced</literal> total number of OST-objects with multiple reference have been repaired in the scanning-phase1.</para></listitem>
-                                        <listitem><para><literal>Repaired Orphan</literal> total number of orphan OST-objects have been repaired in the scanning-phase2.</para></listitem>
-                                        <listitem><para><literal>Repaired Inconsistent Owner</literal> total number.of OST-objects with incorrect owner information have been repaired in the scanning-phase1.</para></listitem>
-                                        <listitem><para><literal>Repaired Others</literal> total number of.other inconsistency repaired in the scanning phases. </para></listitem>
-                                        <listitem><para><literal>Skipped</literal> Number of skipped objects.</para></listitem>
-                                        <listitem><para><literal>Failed Phase1</literal> total number of objects that failed to be repaired during <literal>scanning-phase1</literal>.</para></listitem>
-                                        <listitem><para><literal>Failed Phase2</literal> total number of objects that failed to be repaired during <literal>scanning-phase2</literal>.</para></listitem>
-                                        <listitem><para><literal>Checked Phase1</literal> total number of objects scanned during <literal>scanning-phase1</literal>.</para></listitem>
-                                        <listitem><para><literal>Checked Phase2</literal> total number of objects scanned during <literal>scanning-phase2</literal>.</para></listitem>
-                                        <listitem><para><literal>Run Time Phase1</literal> the duration of the LFSCK run during <literal>scanning-phase1</literal>. Excluding the time spent paused between checkpoints.</para></listitem>
-                                        <listitem><para><literal>Run Time Phase2</literal> the duration of the LFSCK run during <literal>scanning-phase2</literal>. Excluding the time spent paused between checkpoints.</para></listitem>
-                                        <listitem><para><literal>Average Speed Phase1</literal> calculated by dividing <literal>checked_phase1</literal> by <literal>run_time_phase1</literal>.</para></listitem>
-                                        <listitem><para><literal>Average Speed Phase2</literal> calculated by dividing <literal>checked_phase2</literal> by <literal>run_time_phase1</literal>.</para></listitem>
-                                        <listitem><para><literal>Real-Time Speed Phase1</literal> the speed since the last checkpoint if the LFSCK is running <literal>scanning-phase1</literal>.</para></listitem>
-                                        <listitem><para><literal>Real-Time Speed Phase2</literal> the speed since the last checkpoint if the LFSCK is running <literal>scanning-phase2</literal>.</para></listitem>
-                                    </itemizedlist>
-                                </entry>
-                            </row>
-                        </tbody>
-                    </tgroup>
-                </informaltable>
-            </section>
+      </section>
+      <section condition='l26'>
+        <title>LFSCK status of layout via
+        <literal>procfs</literal></title>
+        <section>
+          <title>Description</title>
+          <para>The
+          <literal>layout</literal> component is responsible for checking and
+          repairing MDT-OST inconsistencies. The
+          <literal>procfs</literal> interface for this component is named
+          <literal>lfsck_layout</literal> in both the MDD layer and the OBD
+          layer. To show the status of this component, use
+          <literal>lctl get_param</literal> as described in the usage
+          below.</para>
+        </section>
+        <section>
+          <title>Usage</title>
+          <screen>lctl get_param -n mdd.<replaceable>FSNAME</replaceable>-<replaceable>MDT_device</replaceable>.lfsck_layout
+lctl get_param -n obdfilter.<replaceable>FSNAME</replaceable>-<replaceable>OST_device</replaceable>.lfsck_layout</screen>
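+          <para>For example, to show the layout LFSCK status on the first MDT
+          of a file system named
+          <literal>testfs</literal> (the file system and device names here are
+          illustrative):</para>
+          <screen>lctl get_param -n mdd.testfs-MDT0000.lfsck_layout</screen>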
         </section>
+        <section>
+          <title>Output</title>
+          <informaltable frame="all">
+            <tgroup cols="2">
+              <colspec colname="c1" colwidth="3*" />
+              <colspec colname="c2" colwidth="7*" />
+              <thead>
+                <row>
+                  <entry>
+                    <para>
+                      <emphasis role="bold">Information</emphasis>
+                    </para>
+                  </entry>
+                  <entry>
+                    <para>
+                      <emphasis role="bold">Detail</emphasis>
+                    </para>
+                  </entry>
+                </row>
+              </thead>
+              <tbody>
+                <row>
+                  <entry>
+                    <para>General Information</para>
+                  </entry>
+                  <entry>
+                    <itemizedlist>
+                      <listitem>
+                        <para>Name:
+                        <literal>lfsck_layout</literal></para>
+                      </listitem>
+                      <listitem>
+                        <para>LFSCK layout magic.</para>
+                      </listitem>
+                      <listitem>
+                        <para>LFSCK layout version.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Status: one of the following -
+                        <literal>init</literal>,
+                        <literal>scanning-phase1</literal>,
+                        <literal>scanning-phase2</literal>,
+                        <literal>completed</literal>,
+                        <literal>failed</literal>,
+                        <literal>stopped</literal>,
+                        <literal>paused</literal>,
+                        <literal>crashed</literal>,
+                        <literal>partial</literal>,
+                        <literal>co-failed</literal>,
+                        <literal>co-stopped</literal>, or
+                        <literal>co-paused</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Flags: including -
+                        <literal>scanned-once</literal> (the first cycle of
+                        scanning has been completed),
+                        <literal>inconsistent</literal> (one or more MDT-OST
+                        inconsistencies have been discovered),
+                        <literal>incomplete</literal> (some MDT or OST did not
+                        participate in the LFSCK or failed to finish the
+                        LFSCK), or
+                        <literal>crashed_lastid</literal> (the lastid files on
+                        the OST have crashed and need to be rebuilt).</para>
+                      </listitem>
+                      <listitem>
+                        <para>Parameters: including
+                        <literal>dryrun</literal>,
+                        <literal>all_targets</literal> and
+                        <literal>failout</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Time Since Last Completed.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Time Since Latest Start.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Time Since Last Checkpoint.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Latest Start Position: the position the checking
+                        began most recently.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Last Checkpoint Position.</para>
+                      </listitem>
+                      <listitem>
+                        <para>First Failure Position: the position for the
+                        first object to be repaired.</para>
+                      </listitem>
+                      <listitem>
+                        <para>Current Position.</para>
+                      </listitem>
+                    </itemizedlist>
+                  </entry>
+                </row>
+                <row>
+                  <entry>
+                    <para>Statistics</para>
+                  </entry>
+                  <entry>
+                    <itemizedlist>
+                      <listitem>
+                        <para>
+                        <literal>Success Count:</literal> the total number of
+                        completed LFSCK runs on the device.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Repaired Dangling:</literal> the total number
+                        of MDT-objects with dangling references that have been
+                        repaired during
+                        <literal>scanning-phase1</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Repaired Unmatched Pairs:</literal> the total
+                        number of unmatched MDT and OST-object pairs that have
+                        been repaired during
+                        <literal>scanning-phase1</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Repaired Multiple Referenced:</literal> the
+                        total number of OST-objects with multiple references
+                        that have been repaired during
+                        <literal>scanning-phase1</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Repaired Orphan:</literal> the total number of
+                        orphan OST-objects that have been repaired during
+                        <literal>scanning-phase2</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Repaired Inconsistent Owner:</literal> the
+                        total number of OST-objects with incorrect owner
+                        information that have been repaired during
+                        <literal>scanning-phase1</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Repaired Others:</literal> the total number of
+                        other inconsistencies repaired during the scanning
+                        phases.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Skipped:</literal> the number of skipped
+                        objects.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Failed Phase1:</literal> the total number of
+                        objects that failed to be repaired during
+                        <literal>scanning-phase1</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Failed Phase2:</literal> the total number of
+                        objects that failed to be repaired during
+                        <literal>scanning-phase2</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Checked Phase1:</literal> the total number of
+                        objects scanned during
+                        <literal>scanning-phase1</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Checked Phase2:</literal> the total number of
+                        objects scanned during
+                        <literal>scanning-phase2</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Run Time Phase1:</literal> the duration of the
+                        LFSCK run during
+                        <literal>scanning-phase1</literal>, excluding the time
+                        spent paused between checkpoints.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Run Time Phase2:</literal> the duration of the
+                        LFSCK run during
+                        <literal>scanning-phase2</literal>, excluding the time
+                        spent paused between checkpoints.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Average Speed Phase1:</literal> calculated by
+                        dividing
+                        <literal>checked_phase1</literal> by
+                        <literal>run_time_phase1</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Average Speed Phase2:</literal> calculated by
+                        dividing
+                        <literal>checked_phase2</literal> by
+                        <literal>run_time_phase2</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Real-Time Speed Phase1:</literal> the speed
+                        since the last checkpoint while the LFSCK is running
+                        <literal>scanning-phase1</literal>.</para>
+                      </listitem>
+                      <listitem>
+                        <para>
+                        <literal>Real-Time Speed Phase2:</literal> the speed
+                        since the last checkpoint while the LFSCK is running
+                        <literal>scanning-phase2</literal>.</para>
+                      </listitem>
+                    </itemizedlist>
+                  </entry>
+                </row>
+              </tbody>
+            </tgroup>
+          </informaltable>
+        </section>
+      </section>
     </section>
     <section>
-        <title>LFSCK adjustment interface</title>
-        <section condition='l26'>
-            <title>Rate control</title>
-            <section>
-                <title>Description</title>
-               <para>The LFSCK upper speed limit can be changed using <literal>lctl set_param</literal> as shown in the usage below.</para>
-            </section>
-            <section>
-                <title>Usage</title>
-                <screen>lctl set_param mdd.${FSNAME}-${MDT_device}.lfsck_speed_limit=<replaceable>N</replaceable>
-lctl set_param obdfilter.${FSNAME}-${OST_device}.lfsck_speed_limit=<replaceable>N</replaceable></screen>
-            </section>
-            <section>
-                <title>Values</title>
-                <informaltable frame="all">
-                    <tgroup cols="2">
-                        <colspec colname="c1" colwidth="3*"/>
-                        <colspec colname="c2" colwidth="7*"/>
-                        <tbody>
-                            <row>
-                                <entry>
-                                    <para>0</para>
-                                </entry>
-                                <entry>
-                                    <para>No speed limit (run at maximum speed.)</para>
-                                </entry>
-                            </row>
-                            <row>
-                                <entry>
-                                    <para>positive integer</para>
-                                </entry>
-                                <entry>
-                                    <para>Maximum number of objects to scan per second.</para>
-                                </entry>
-                            </row>
-                        </tbody>
-                    </tgroup>
-                </informaltable>
-            </section>
+      <title>LFSCK adjustment interface</title>
+      <section condition='l26'>
+        <title>Rate control</title>
+        <section>
+          <title>Description</title>
+          <para>The LFSCK upper speed limit can be changed using
+          <literal>lctl set_param</literal> as shown in the usage below.</para>
+        </section>
+        <section>
+          <title>Usage</title>
+          <screen>lctl set_param mdd.${FSNAME}-${MDT_device}.lfsck_speed_limit=<replaceable>N</replaceable>
+lctl set_param obdfilter.${FSNAME}-${OST_device}.lfsck_speed_limit=<replaceable>N</replaceable></screen>
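+          <para>For example, to limit LFSCK scanning on the first MDT of a
+          file system named
+          <literal>testfs</literal> (an illustrative name) to 1000 objects per
+          second:</para>
+          <screen>lctl set_param mdd.testfs-MDT0000.lfsck_speed_limit=1000</screen>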
         </section>
-        <section xml:id="dbdoclet.lfsck_auto_scrub">
-            <title>Auto scrub</title>
-            <section>
-                <title>Description</title>
-               <para>The <literal>auto_scrub</literal> parameter controls whether OI scrub will be triggered when an inconsistency is detected during OI lookup. It can be set as described in the usage and values sections below.</para>
-               <para>There is also a <literal>noscrub</literal> mount option (see <xref linkend="dbdoclet.50438219_12635"/>) which can be used to disable automatic OI scrub upon detection of a file-level backup at mount time. If the <literal>noscrub</literal> mount option is specified, <literal>auto_scrub</literal> will also be disabled, so OI scrub will not be triggered when an OI inconsistency is detected. Auto scrub can be renabled after the mount using the command shown in the usage. Manually starting LFSCK after mounting provides finer control over the starting conditions.</para>
-            </section>
-            <section>
-                <title>Usage</title>
-               <screen>lctl set_param osd_ldiskfs.${FSNAME}-${MDT_device}.auto_scrub=<replaceable>N</replaceable>
-                </screen>
-               <para>where <replaceable>N</replaceable> is an integer as described below.</para>
-            </section>
-            <section>
-                <title>Values</title>
-                <informaltable frame="all">
-                    <tgroup cols="2">
-                        <colspec colname="c1" colwidth="3*"/>
-                        <colspec colname="c2" colwidth="7*"/>
-                        <tbody>
-                            <row>
-                                <entry>
-                                    <para>0</para>
-                                </entry>
-                                <entry>
-                                    <para>Do not start OI Scrub automatically.</para>
-                                </entry>
-                            </row>
-                            <row>
-                                <entry>
-                                    <para>positive integer</para>
-                                </entry>
-                                <entry>
-                                    <para>Automatically start OI Scrub if inconsistency is detected during OI lookup.</para>
-                                </entry>
-                            </row>
-                        </tbody>
-                    </tgroup>
-                </informaltable>
-            </section>
+        <section>
+          <title>Values</title>
+          <informaltable frame="all">
+            <tgroup cols="2">
+              <colspec colname="c1" colwidth="3*" />
+              <colspec colname="c2" colwidth="7*" />
+              <tbody>
+                <row>
+                  <entry>
+                    <para>0</para>
+                  </entry>
+                  <entry>
+                    <para>No speed limit (run at maximum speed).</para>
+                  </entry>
+                </row>
+                <row>
+                  <entry>
+                    <para>positive integer</para>
+                  </entry>
+                  <entry>
+                    <para>Maximum number of objects to scan per second.</para>
+                  </entry>
+                </row>
+              </tbody>
+            </tgroup>
+          </informaltable>
         </section>
+      </section>
+      <section xml:id="dbdoclet.lfsck_auto_scrub">
+        <title>Auto scrub</title>
+        <section>
+          <title>Description</title>
+          <para>The
+          <literal>auto_scrub</literal> parameter controls whether OI scrub will
+          be triggered when an inconsistency is detected during OI lookup. It
+          can be set as described in the usage and values sections
+          below.</para>
+          <para>There is also a
+          <literal>noscrub</literal> mount option (see
+          <xref linkend="dbdoclet.50438219_12635" />) which can be used to
+          disable automatic OI scrub upon detection of a file-level backup at
+          mount time. If the
+          <literal>noscrub</literal> mount option is specified,
+          <literal>auto_scrub</literal> will also be disabled, so OI scrub will
+          not be triggered when an OI inconsistency is detected. Auto scrub
+          can be re-enabled after mounting using the command shown in the
+          usage section below.
+          Manually starting LFSCK after mounting provides finer control over
+          the starting conditions.</para>
+        </section>
+        <section>
+          <title>Usage</title>
+          <screen>lctl set_param osd_ldiskfs.${FSNAME}-${MDT_device}.auto_scrub=<replaceable>N</replaceable></screen>
+          <para>where
+          <replaceable>N</replaceable> is an integer as described below.</para>
+          <note condition='l25'><para>Lustre software release 2.5 and later
+          supports the <literal>-P</literal> option, which makes the
+          <literal>set_param</literal> setting permanent.</para></note>
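+          <para>For example, to disable automatic OI scrub on the first MDT of
+          a file system named
+          <literal>testfs</literal> (an illustrative name):</para>
+          <screen>lctl set_param osd_ldiskfs.testfs-MDT0000.auto_scrub=0</screen>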
+        </section>
+        <section>
+          <title>Values</title>
+          <informaltable frame="all">
+            <tgroup cols="2">
+              <colspec colname="c1" colwidth="3*" />
+              <colspec colname="c2" colwidth="7*" />
+              <tbody>
+                <row>
+                  <entry>
+                    <para>0</para>
+                  </entry>
+                  <entry>
+                    <para>Do not start OI Scrub automatically.</para>
+                  </entry>
+                </row>
+                <row>
+                  <entry>
+                    <para>positive integer</para>
+                  </entry>
+                  <entry>
+                    <para>Automatically start OI Scrub if inconsistency is
+                    detected during OI lookup.</para>
+                  </entry>
+                </row>
+              </tbody>
+            </tgroup>
+          </informaltable>
+        </section>
+      </section>
     </section>
-    </section>
+  </section>
 </chapter>
index 28bca1e..517ba23 100644 (file)
-<?xml version='1.0' encoding='UTF-8'?>
-<chapter xmlns="http://docbook.org/ns/docbook" xmlns:xl="http://www.w3.org/1999/xlink" version="5.0"
-  xml:lang="en-US" xml:id="understandinglustre">
-  <title xml:id="understandinglustre.title">Understanding  Lustre Architecture</title>
-  <para>This chapter describes the Lustre architecture and features of the Lustre file system. It
-    includes the following sections:</para>
+<?xml version='1.0' encoding='utf-8'?>
+<chapter xmlns="http://docbook.org/ns/docbook"
+xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US"
+xml:id="understandinglustre">
+  <title xml:id="understandinglustre.title">Understanding Lustre
+  Architecture</title>
+  <para>This chapter describes the Lustre architecture and features of the
+  Lustre file system. It includes the following sections:</para>
   <itemizedlist>
     <listitem>
       <para>
-        <xref linkend="understandinglustre.whatislustre"/>
+        <xref linkend="understandinglustre.whatislustre" />
       </para>
     </listitem>
     <listitem>
       <para>
-        <xref linkend="understandinglustre.components"/>
+        <xref linkend="understandinglustre.components" />
       </para>
     </listitem>
     <listitem>
       <para>
-        <xref linkend="understandinglustre.storageio"/>
+        <xref linkend="understandinglustre.storageio" />
       </para>
     </listitem>
   </itemizedlist>
   <section xml:id="understandinglustre.whatislustre">
-    <title><indexterm>
-        <primary>Lustre</primary>
-      </indexterm>What a Lustre File System Is (and What It Isn&apos;t)</title>
-    <para>The Lustre architecture is a storage architecture for clusters. The central component of
-      the Lustre architecture is the Lustre file system, which is supported on the Linux operating
-      system and provides a POSIX<superscript>*</superscript> standard-compliant UNIX file system
-      interface.</para>
-    <para>The Lustre storage architecture is used for many different kinds of clusters. It is best
-      known for powering many of the largest high-performance computing (HPC) clusters worldwide,
-      with tens of thousands of client systems, petabytes (PB) of storage and hundreds of gigabytes
-      per second (GB/sec) of I/O throughput. Many HPC sites use a Lustre file system as a site-wide
-      global file system, serving dozens of clusters.</para>
-    <para>The ability of a Lustre file system to scale capacity and performance for any need reduces
-      the need to deploy many separate file systems, such as one for each compute cluster. Storage
-      management is simplified by avoiding the need to copy data between compute clusters. In
-      addition to aggregating storage capacity of many servers, the I/O throughput is also
-      aggregated and scales with additional servers. Moreover, throughput and/or capacity can be
-      easily increased by adding servers dynamically.</para>
-    <para>While a Lustre file system can function in many work environments, it is not necessarily
-      the best choice for all applications. It is best suited for uses that exceed the capacity that
-      a single server can provide, though in some use cases, a Lustre file system can perform better
-      with a single server than other file systems due to its strong locking and data
-      coherency.</para>
+    <title>
+    <indexterm>
+      <primary>Lustre</primary>
+    </indexterm>What a Lustre File System Is (and What It Isn't)</title>
+    <para>The Lustre architecture is a storage architecture for clusters. The
+    central component of the Lustre architecture is the Lustre file system,
+    which is supported on the Linux operating system and provides a
+    POSIX<superscript>*</superscript> standard-compliant UNIX file system
+    interface.</para>
+    <para>The Lustre storage architecture is used for many different kinds of
+    clusters. It is best known for powering many of the largest
+    high-performance computing (HPC) clusters worldwide, with tens of thousands
+    of client systems, petabytes (PB) of storage and hundreds of gigabytes per
+    second (GB/sec) of I/O throughput. Many HPC sites use a Lustre file system
+    as a site-wide global file system, serving dozens of clusters.</para>
+    <para>The ability of a Lustre file system to scale capacity and performance
+    for any need reduces the need to deploy many separate file systems, such as
+    one for each compute cluster. Storage management is simplified by avoiding
+    the need to copy data between compute clusters. In addition to aggregating
+    storage capacity of many servers, the I/O throughput is also aggregated and
+    scales with additional servers. Moreover, throughput and/or capacity can be
+    easily increased by adding servers dynamically.</para>
+    <para>While a Lustre file system can function in many work environments, it
+    is not necessarily the best choice for all applications. It is best suited
+    for uses that exceed the capacity that a single server can provide, though
+    in some use cases, a Lustre file system can perform better with a single
+    server than other file systems due to its strong locking and data
+    coherency.</para>
     <para>A Lustre file system is currently not particularly well suited for
-      &quot;peer-to-peer&quot; usage models where clients and servers are running on the same node,
-      each sharing a small amount of storage, due to the lack of data replication at the Lustre
-      software level. In such uses, if one client/server fails, then the data stored on that node
-      will not be accessible until the node is restarted.</para>
+    "peer-to-peer" usage models where clients and servers are running on the
+    same node, each sharing a small amount of storage, due to the lack of data
+    replication at the Lustre software level. In such uses, if one
+    client/server fails, then the data stored on that node will not be
+    accessible until the node is restarted.</para>
     <section remap="h3">
-      <title><indexterm>
-          <primary>Lustre</primary>
-          <secondary>features</secondary>
-        </indexterm>Lustre Features</title>
-      <para>Lustre file systems run on a variety of vendor&apos;s kernels. For more details, see the
-        Lustre Test Matrix <xref xmlns:xlink="http://www.w3.org/1999/xlink"
-          linkend="dbdoclet.50438261_99193"/>.</para>
-      <para>A Lustre installation can be scaled up or down with respect to the number of client
-        nodes, disk storage and bandwidth. Scalability and performance are dependent on available
-        disk and network bandwidth and the processing power of the servers in the system. A Lustre
-        file system can be deployed in a wide variety of configurations that can be scaled well
-        beyond the size and performance observed in production systems to date.</para>
-      <para><xref linkend="understandinglustre.tab1"/> shows the practical range of scalability and
-        performance characteristics of a Lustre file system and some test results in production
-        systems.</para>
+      <title>
+      <indexterm>
+        <primary>Lustre</primary>
+        <secondary>features</secondary>
+      </indexterm>Lustre Features</title>
+      <para>Lustre file systems run on a variety of vendors' kernels. For more
+      details, see the Lustre Test Matrix
+      <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+      linkend="dbdoclet.50438261_99193" />.</para>
+      <para>A Lustre installation can be scaled up or down with respect to the
+      number of client nodes, disk storage and bandwidth. Scalability and
+      performance are dependent on available disk and network bandwidth and the
+      processing power of the servers in the system. A Lustre file system can
+      be deployed in a wide variety of configurations that can be scaled well
+      beyond the size and performance observed in production systems to
+      date.</para>
+      <para>
+      <xref linkend="understandinglustre.tab1" /> shows the practical range of
+      scalability and performance characteristics of a Lustre file system and
+      some test results in production systems.</para>
       <table frame="all">
-        <title xml:id="understandinglustre.tab1">Lustre File System Scalability and
-          Performance</title>
+        <title xml:id="understandinglustre.tab1">Lustre File System Scalability
+        and Performance</title>
         <tgroup cols="3">
-          <colspec colname="c1" colwidth="1*"/>
-          <colspec colname="c2" colwidth="2*"/>
-          <colspec colname="c3" colwidth="3*"/>
+          <colspec colname="c1" colwidth="1*" />
+          <colspec colname="c2" colwidth="2*" />
+          <colspec colname="c3" colwidth="3*" />
           <thead>
             <row>
               <entry>
-                <para><emphasis role="bold">Feature</emphasis></para>
+                <para>
+                  <emphasis role="bold">Feature</emphasis>
+                </para>
               </entry>
               <entry>
-                <para><emphasis role="bold">Current Practical Range</emphasis></para>
+                <para>
+                  <emphasis role="bold">Current Practical Range</emphasis>
+                </para>
               </entry>
               <entry>
-                <para><emphasis role="bold">Tested in Production</emphasis></para>
+                <para>
+                  <emphasis role="bold">Tested in Production</emphasis>
+                </para>
               </entry>
             </row>
           </thead>
             <row>
               <entry>
                 <para>
-                  <emphasis role="bold">Client Scalability</emphasis></para>
+                  <emphasis role="bold">Client Scalability</emphasis>
+                </para>
               </entry>
               <entry>
-                <para> 100-100000</para>
+                <para>100-100000</para>
               </entry>
               <entry>
-                <para> 50000+ clients, many in the 10000 to 20000 range</para>
+                <para>50000+ clients, many in the 10000 to 20000 range</para>
               </entry>
             </row>
             <row>
               <entry>
-                <para><emphasis role="bold">Client Performance</emphasis></para>
+                <para>
+                  <emphasis role="bold">Client Performance</emphasis>
+                </para>
               </entry>
               <entry>
                 <para>
-                  <emphasis>Single client: </emphasis></para>
+                  <emphasis>Single client:</emphasis>
+                </para>
                 <para>I/O 90% of network bandwidth</para>
-                <para><emphasis>Aggregate:</emphasis></para>
+                <para>
+                  <emphasis>Aggregate:</emphasis>
+                </para>
                 <para>2.5 TB/sec I/O</para>
               </entry>
               <entry>
                 <para>
-                  <emphasis>Single client: </emphasis></para>
+                  <emphasis>Single client:</emphasis>
+                </para>
                 <para>2 GB/sec I/O, 1000 metadata ops/sec</para>
-                <para><emphasis>Aggregate:</emphasis></para>
-                <para>240 GB/sec I/O </para>
+                <para>
+                  <emphasis>Aggregate:</emphasis>
+                </para>
+                <para>240 GB/sec I/O</para>
               </entry>
             </row>
             <row>
               <entry>
                 <para>
-                  <emphasis role="bold">OSS Scalability</emphasis></para>
+                  <emphasis role="bold">OSS Scalability</emphasis>
+                </para>
               </entry>
               <entry>
                 <para>
-                  <emphasis>Single OSS:</emphasis></para>
+                  <emphasis>Single OSS:</emphasis>
+                </para>
                 <para>1-32 OSTs per OSS,</para>
                 <para>128TB per OST</para>
                 <para>
-                  <emphasis>OSS count:</emphasis></para>
+                  <emphasis>OSS count:</emphasis>
+                </para>
                 <para>500 OSSs, with up to 4000 OSTs</para>
               </entry>
               <entry>
                 <para>
-                  <emphasis>Single OSS:</emphasis></para>
+                  <emphasis>Single OSS:</emphasis>
+                </para>
                 <para>8 OSTs per OSS,</para>
                 <para>16TB per OST</para>
                 <para>
-                  <emphasis>OSS count:</emphasis></para>
+                  <emphasis>OSS count:</emphasis>
+                </para>
                 <para>450 OSSs with 1000 4TB OSTs</para>
                 <para>192 OSSs with 1344 8TB OSTs</para>
               </entry>
             <row>
               <entry>
                 <para>
-                  <emphasis role="bold">OSS Performance</emphasis></para>
+                  <emphasis role="bold">OSS Performance</emphasis>
+                </para>
               </entry>
               <entry>
                 <para>
-                  <emphasis>Single OSS:</emphasis></para>
-                <para> 5 GB/sec</para>
+                  <emphasis>Single OSS:</emphasis>
+                </para>
+                <para>5 GB/sec</para>
                 <para>
-                  <emphasis>Aggregate:</emphasis></para>
-                <para> 2.5 TB/sec</para>
+                  <emphasis>Aggregate:</emphasis>
+                </para>
+                <para>2.5 TB/sec</para>
               </entry>
               <entry>
                 <para>
-                  <emphasis>Single OSS:</emphasis></para>
-                <para> 2.0+ GB/sec</para>
+                  <emphasis>Single OSS:</emphasis>
+                </para>
+                <para>2.0+ GB/sec</para>
                 <para>
-                  <emphasis>Aggregate:</emphasis></para>
-                <para> 240 GB/sec</para>
+                  <emphasis>Aggregate:</emphasis>
+                </para>
+                <para>240 GB/sec</para>
               </entry>
             </row>
             <row>
               <entry>
                 <para>
-                  <emphasis role="bold">MDS Scalability</emphasis></para>
+                  <emphasis role="bold">MDS Scalability</emphasis>
+                </para>
               </entry>
               <entry>
                 <para>
-                  <emphasis>Single MDT:</emphasis></para>
-                <para> 4 billion files (ldiskfs), 256 trillion files (ZFS)</para>
+                  <emphasis>Single MDT:</emphasis>
+                </para>
+                <para>4 billion files (ldiskfs), 256 trillion files
+                (ZFS)</para>
                 <para>
-                  <emphasis>MDS count:</emphasis></para>
-                <para> 1 primary + 1 backup</para>
-                <para condition="l24">Up to 4096 MDTs and up to 4096 MDSs</para>
+                  <emphasis>MDS count:</emphasis>
+                </para>
+                <para>1 primary + 1 backup</para>
+                <para condition="l24">Up to 4096 MDTs and up to 4096
+                MDSs</para>
               </entry>
               <entry>
                 <para>
-                  <emphasis>Single MDT:</emphasis></para>
-                <para> 1 billion files</para>
+                  <emphasis>Single MDT:</emphasis>
+                </para>
+                <para>1 billion files</para>
                 <para>
-                  <emphasis>MDS count:</emphasis></para>
-                <para> 1 primary + 1 backup</para>
+                  <emphasis>MDS count:</emphasis>
+                </para>
+                <para>1 primary + 1 backup</para>
               </entry>
             </row>
             <row>
               <entry>
                 <para>
-                  <emphasis role="bold">MDS Performance</emphasis></para>
+                  <emphasis role="bold">MDS Performance</emphasis>
+                </para>
               </entry>
               <entry>
-                <para> 35000/s create operations,</para>
-                <para> 100000/s metadata stat operations</para>
+                <para>35000/s create operations,</para>
+                <para>100000/s metadata stat operations</para>
               </entry>
               <entry>
-                <para> 15000/s create operations,</para>
-                <para> 35000/s metadata stat operations</para>
+                <para>15000/s create operations,</para>
+                <para>35000/s metadata stat operations</para>
               </entry>
             </row>
             <row>
               <entry>
                 <para>
-                  <emphasis role="bold">File system Scalability</emphasis></para>
+                  <emphasis role="bold">File system Scalability</emphasis>
+                </para>
               </entry>
               <entry>
                 <para>
-                  <emphasis>Single File:</emphasis></para>
+                  <emphasis>Single File:</emphasis>
+                </para>
                 <para>2.5 PB max file size</para>
                 <para>
-                  <emphasis>Aggregate:</emphasis></para>
+                  <emphasis>Aggregate:</emphasis>
+                </para>
                 <para>512 PB space, 4 billion files</para>
               </entry>
               <entry>
                 <para>
-                  <emphasis>Single File:</emphasis></para>
+                  <emphasis>Single File:</emphasis>
+                </para>
                 <para>multi-TB max file size</para>
                 <para>
-                  <emphasis>Aggregate:</emphasis></para>
+                  <emphasis>Aggregate:</emphasis>
+                </para>
                 <para>55 PB space, 1 billion files</para>
               </entry>
             </row>
       <para>Other Lustre software features are:</para>
       <itemizedlist>
         <listitem>
-          <para><emphasis role="bold">Performance-enhanced ext4 file system:</emphasis> The Lustre
-            file system uses an improved version of the ext4 journaling file system to store data
-            and metadata. This version, called <emphasis role="italic">
-              <literal>ldiskfs</literal></emphasis>, has been enhanced to improve performance and
-            provide additional functionality needed by the Lustre file system.</para>
+          <para>
+          <emphasis role="bold">Performance-enhanced ext4 file
+          system:</emphasis> The Lustre file system uses an improved version of
+          the ext4 journaling file system to store data and metadata. This
+          version, called
+          <emphasis role="italic">
+            <literal>ldiskfs</literal>
+          </emphasis>, has been enhanced to improve performance and provide
+          additional functionality needed by the Lustre file system.</para>
         </listitem>
         <listitem>
-         <para condition="l24">With the Lustre software release 2.4 and later, it is also possible to use ZFS as the backing filesystem for Lustre for the MDT, OST, and MGS storage.  This allows Lustre to leverage the scalability and data integrity features of ZFS for individual storage targets.</para>
+          <para condition="l24">With the Lustre software release 2.4 and
+          later, it is also possible to use ZFS as the backing file system for
+          the MDT, OST, and MGS storage. This allows Lustre to leverage the
+          scalability and data integrity features of ZFS for individual
+          storage targets.</para>
         </listitem>
         <listitem>
-          <para><emphasis role="bold">POSIX standard compliance:</emphasis> The full POSIX test
-            suite passes in an identical manner to a local ext4 file system, with limited exceptions
-            on Lustre clients. In a cluster, most operations are atomic so that clients never see
-            stale data or metadata. The Lustre software supports mmap() file I/O.</para>
+          <para>
+          <emphasis role="bold">POSIX standard compliance:</emphasis> The full
+          POSIX test suite passes in an identical manner to a local ext4 file
+          system, with limited exceptions on Lustre clients. In a cluster, most
+          operations are atomic so that clients never see stale data or
+          metadata. The Lustre software supports mmap() file I/O.</para>
         </listitem>
         <listitem>
-          <para><emphasis role="bold">High-performance heterogeneous networking:</emphasis> The
-            Lustre software supports a variety of high performance, low latency networks and permits
-            Remote Direct Memory Access (RDMA) for InfiniBand<superscript>*</superscript> (utilizing
-            OpenFabrics Enterprise Distribution (OFED<superscript>*</superscript>) and other
-            advanced networks for fast and efficient network transport. Multiple RDMA networks can
-            be bridged using Lustre routing for maximum performance. The Lustre software also
-            includes integrated network diagnostics.</para>
+          <para>
+          <emphasis role="bold">High-performance heterogeneous
+          networking:</emphasis> The Lustre software supports a variety of
+          high-performance, low-latency networks and permits Remote Direct
+          Memory Access (RDMA) for InfiniBand<superscript>*</superscript>
+          (utilizing OpenFabrics Enterprise Distribution
+          (OFED<superscript>*</superscript>)) and other advanced networks for
+          fast and efficient network transport. Multiple RDMA networks can be
+          bridged using Lustre routing for maximum performance. The Lustre
+          software also includes integrated network diagnostics.</para>
         </listitem>
         <listitem>
-          <para><emphasis role="bold">High-availability:</emphasis> The Lustre file system supports
-            active/active failover using shared storage partitions for OSS targets (OSTs). Lustre
-            software release 2.3 and earlier releases offer active/passive failover using a shared
-            storage partition for the MDS target (MDT).  The Lustre file system can work with a variety of high
-            availability (HA) managers to allow automated failover and has no single point of failure (NSPF).
-           This allows application transparent recovery.  Multiple mount protection (MMP) provides integrated protection from
-            errors in highly-available systems that would otherwise cause file system
-            corruption.</para>
+          <para>
+          <emphasis role="bold">High-availability:</emphasis> The Lustre file
+          system supports active/active failover using shared storage
+          partitions for OSS targets (OSTs). Lustre software release 2.3 and
+          earlier releases offer active/passive failover using a shared storage
+          partition for the MDS target (MDT). The Lustre file system can work
+          with a variety of high availability (HA) managers to allow automated
+          failover and has no single point of failure (NSPF). This allows
+          application transparent recovery. Multiple mount protection (MMP)
+          provides integrated protection from errors in highly-available
+          systems that would otherwise cause file system corruption.</para>
         </listitem>
         <listitem>
           <para condition="l24">With Lustre software release 2.4 or later
-           servers and clients it is possible to configure active/active
-           failover of multiple MDTs.  This allows scaling the metadata
-           performance of Lustre filesystems with the addition of MDT storage
-           devices and MDS nodes.</para>
+          servers and clients, it is possible to configure active/active
+          failover of multiple MDTs. This allows the metadata performance of
+          Lustre file systems to be scaled by adding MDT storage devices and
+          MDS nodes.</para>
         </listitem>
         <listitem>
-          <para><emphasis role="bold">Security:</emphasis> By default TCP connections are only
-            allowed from privileged ports. UNIX group membership is verified on the MDS.</para>
+          <para>
+          <emphasis role="bold">Security:</emphasis> By default, TCP connections
+          are only allowed from privileged ports. UNIX group membership is
+          verified on the MDS.</para>
         </listitem>
         <listitem>
-          <para><emphasis role="bold">Access control list (ACL), extended attributes:</emphasis> the
-            Lustre security model follows that of a UNIX file system, enhanced with POSIX ACLs.
-            Noteworthy additional features include root squash.</para>
+          <para>
+          <emphasis role="bold">Access control list (ACL), extended
+          attributes:</emphasis> The Lustre security model follows that of a
+          UNIX file system, enhanced with POSIX ACLs. Noteworthy additional
+          features include root squash.</para>
         </listitem>
         <listitem>
-          <para><emphasis role="bold">Interoperability:</emphasis> The Lustre file system runs on a
-            variety of CPU architectures and mixed-endian clusters and is interoperable between
-            successive major Lustre software releases.</para>
+          <para>
+          <emphasis role="bold">Interoperability:</emphasis> The Lustre file
+          system runs on a variety of CPU architectures and mixed-endian
+          clusters and is interoperable between successive major Lustre
+          software releases.</para>
         </listitem>
         <listitem>
-          <para><emphasis role="bold">Object-based architecture:</emphasis> Clients are isolated
-            from the on-disk file structure enabling upgrading of the storage architecture without
-            affecting the client.</para>
+          <para>
+          <emphasis role="bold">Object-based architecture:</emphasis> Clients
+          are isolated from the on-disk file structure, enabling the storage
+          architecture to be upgraded without affecting the client.</para>
         </listitem>
         <listitem>
-          <para><emphasis role="bold">Byte-granular file and fine-grained metadata
-              locking:</emphasis> Many clients can read and modify the same file or directory
-            concurrently. The Lustre distributed lock manager (LDLM) ensures that files are coherent
-            between all clients and servers in the file system. The MDT LDLM manages locks on inode
-            permissions and pathnames. Each OST has its own LDLM for locks on file stripes stored
-            thereon, which scales the locking performance as the file system grows.</para>
+          <para>
+          <emphasis role="bold">Byte-granular file and fine-grained metadata
+          locking:</emphasis> Many clients can read and modify the same file or
+          directory concurrently. The Lustre distributed lock manager (LDLM)
+          ensures that files are coherent between all clients and servers in
+          the file system. The MDT LDLM manages locks on inode permissions and
+          pathnames. Each OST has its own LDLM for locks on file stripes stored
+          thereon, which scales the locking performance as the file system
+          grows.</para>
         </listitem>
         <listitem>
-          <para><emphasis role="bold">Quotas:</emphasis> User and group quotas are available for a
-            Lustre file system.</para>
+          <para>
+          <emphasis role="bold">Quotas:</emphasis> User and group quotas are
+          available for a Lustre file system.</para>
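+          <para>For example, a user's quota usage and limits can be displayed
+          with the <literal>lfs</literal> utility (the user name and mount
+          point below are illustrative):</para>
+          <screen>client$ lfs quota -u bob /mnt/lustre</screen>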
         </listitem>
         <listitem>
-          <para><emphasis role="bold">Capacity growth:</emphasis> The size of a Lustre file system
-            and aggregate cluster bandwidth can be increased without interruption by adding a new
-            OSS with OSTs to the cluster.</para>
+          <para>
+          <emphasis role="bold">Capacity growth:</emphasis> The size of a Lustre
+          file system and aggregate cluster bandwidth can be increased without
+          interruption by adding a new OSS with OSTs to the cluster.</para>
         </listitem>
         <listitem>
-          <para><emphasis role="bold">Controlled striping:</emphasis> The layout of files across
-            OSTs can be configured on a per file, per directory, or per file system basis. This
-            allows file I/O to be tuned to specific application requirements within a single file
-            system. The Lustre file system uses RAID-0 striping and balances space usage across
-            OSTs.</para>
+          <para>
+          <emphasis role="bold">Controlled striping:</emphasis> The layout of
+          files across OSTs can be configured on a per file, per directory, or
+          per file system basis. This allows file I/O to be tuned to specific
+          application requirements within a single file system. The Lustre file
+          system uses RAID-0 striping and balances space usage across
+          OSTs.</para>
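+          <para>For example, the stripe layout of a directory can be set and
+          inspected with the <literal>lfs</literal> utility (the stripe count
+          and mount point below are illustrative):</para>
+          <screen>client$ lfs setstripe -c 4 /mnt/lustre/dir
+client$ lfs getstripe /mnt/lustre/dir</screen>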
         </listitem>
         <listitem>
-          <para><emphasis role="bold">Network data integrity protection:</emphasis> A checksum of
-            all data sent from the client to the OSS protects against corruption during data
-            transfer.</para>
+          <para>
+          <emphasis role="bold">Network data integrity protection:</emphasis> A
+          checksum of all data sent from the client to the OSS protects against
+          corruption during data transfer.</para>
         </listitem>
         <listitem>
-          <para><emphasis role="bold">MPI I/O:</emphasis> The Lustre architecture has a dedicated
-            MPI ADIO layer that optimizes parallel I/O to match the underlying file system
-            architecture.</para>
+          <para>
+          <emphasis role="bold">MPI I/O:</emphasis> The Lustre architecture has
+          a dedicated MPI ADIO layer that optimizes parallel I/O to match the
+          underlying file system architecture.</para>
         </listitem>
         <listitem>
-          <para><emphasis role="bold">NFS and CIFS export:</emphasis> Lustre files can be
-            re-exported using NFS (via Linux knfsd) or CIFS (via Samba) enabling them to be shared
-            with non-Linux clients, such as Microsoft<superscript>*</superscript>
-              Windows<superscript>*</superscript> and Apple<superscript>*</superscript> Mac OS
-              X<superscript>*</superscript>.</para>
+          <para>
+          <emphasis role="bold">NFS and CIFS export:</emphasis> Lustre files
+          can be re-exported using NFS (via Linux knfsd) or CIFS (via Samba),
+          enabling them to be shared with non-Linux clients, such as
+          Microsoft<superscript>*</superscript>
+          Windows<superscript>*</superscript> and
+          Apple<superscript>*</superscript> Mac OS
+          X<superscript>*</superscript>.</para>
         </listitem>
         <listitem>
-          <para><emphasis role="bold">Disaster recovery tool:</emphasis> The Lustre file system
-            provides an online distributed file system check (LFSCK) that can restore consistency between
-            storage components in case of a major file system error. A Lustre file system can
-            operate even in the presence of file system inconsistencies, and LFSCK can run while the filesystem is in use, so LFSCK is not required to complete
-            before returning the file system to production.</para>
+          <para>
+          <emphasis role="bold">Disaster recovery tool:</emphasis> The Lustre
+          file system provides an online distributed file system check (LFSCK)
+          that can restore consistency between storage components in case of a
+          major file system error. A Lustre file system can operate even in the
+          presence of file system inconsistencies, and LFSCK can run while the
+          filesystem is in use, so LFSCK is not required to complete before
+          returning the file system to production.</para>
         </listitem>
         <listitem>
-          <para><emphasis role="bold">Performance monitoring:</emphasis> The Lustre file system
-            offers a variety of mechanisms to examine performance and tuning.</para>
+          <para>
+          <emphasis role="bold">Performance monitoring:</emphasis> The Lustre
+          file system offers a variety of mechanisms for examining and tuning
+          performance.</para>
         </listitem>
         <listitem>
-          <para><emphasis role="bold">Open source:</emphasis> The Lustre software is licensed under
-            the GPL 2.0 license for use with the Linux operating system.</para>
+          <para>
+          <emphasis role="bold">Open source:</emphasis> The Lustre software is
+          licensed under the GPL 2.0 license for use with the Linux operating
+          system.</para>
         </listitem>
       </itemizedlist>
     </section>
   </section>
   <section xml:id="understandinglustre.components">
-    <title><indexterm>
-        <primary>Lustre</primary>
-        <secondary>components</secondary>
-      </indexterm>Lustre Components</title>
-    <para>An installation of the Lustre software includes a management server (MGS) and one or more
-      Lustre file systems interconnected with Lustre networking (LNET).</para>
-    <para>A basic configuration of Lustre file system components is shown in <xref
-        linkend="understandinglustre.fig.cluster"/>.</para>
+    <title>
+    <indexterm>
+      <primary>Lustre</primary>
+      <secondary>components</secondary>
+    </indexterm>Lustre Components</title>
+    <para>An installation of the Lustre software includes a management server
+    (MGS) and one or more Lustre file systems interconnected with Lustre
+    networking (LNET).</para>
+    <para>A basic configuration of Lustre file system components is shown in
+    <xref linkend="understandinglustre.fig.cluster" />.</para>
     <figure>
-      <title xml:id="understandinglustre.fig.cluster">Lustre file system components in a basic
-        cluster </title>
+      <title xml:id="understandinglustre.fig.cluster">Lustre file system
+      components in a basic cluster</title>
       <mediaobject>
         <imageobject>
-          <imagedata scalefit="1" width="100%" fileref="./figures/Basic_Cluster.png"/>
+          <imagedata scalefit="1" width="100%"
+          fileref="./figures/Basic_Cluster.png" />
         </imageobject>
         <textobject>
-          <phrase> Lustre file system components in a basic cluster </phrase>
+          <phrase>Lustre file system components in a basic cluster</phrase>
         </textobject>
       </mediaobject>
     </figure>
     <section remap="h3">
-      <title><indexterm>
-          <primary>Lustre</primary>
-          <secondary>MGS</secondary>
-        </indexterm>Management Server (MGS)</title>
-      <para>The MGS stores configuration information for all the Lustre file systems in a cluster
-        and provides this information to other Lustre components. Each Lustre target contacts the
-        MGS to provide information, and Lustre clients contact the MGS to retrieve
-        information.</para>
-      <para>It is preferable that the MGS have its own storage space so that it can be managed
-        independently. However, the MGS can be co-located and share storage space with an MDS as
-        shown in <xref linkend="understandinglustre.fig.cluster"/>.</para>
+      <title>
+      <indexterm>
+        <primary>Lustre</primary>
+        <secondary>MGS</secondary>
+      </indexterm>Management Server (MGS)</title>
+      <para>The MGS stores configuration information for all the Lustre file
+      systems in a cluster and provides this information to other Lustre
+      components. Each Lustre target contacts the MGS to provide information,
+      and Lustre clients contact the MGS to retrieve information.</para>
+      <para>It is preferable that the MGS have its own storage space so that it
+      can be managed independently. However, the MGS can be co-located and
+      share storage space with an MDS as shown in
+      <xref linkend="understandinglustre.fig.cluster" />.</para>
     </section>
     <section remap="h3">
       <title>Lustre File System Components</title>
-      <para>Each Lustre file system consists of the following components:</para>
+      <para>Each Lustre file system consists of the following
+      components:</para>
       <itemizedlist>
         <listitem>
-          <para><emphasis role="bold">Metadata Server (MDS)</emphasis> - The MDS makes metadata
-            stored in one or more MDTs available to Lustre clients. Each MDS manages the names and
-            directories in the Lustre file system(s) and provides network request handling for one
-            or more local MDTs.</para>
+          <para>
+          <emphasis role="bold">Metadata Server (MDS)</emphasis> - The MDS makes
+          metadata stored in one or more MDTs available to Lustre clients. Each
+          MDS manages the names and directories in the Lustre file system(s)
+          and provides network request handling for one or more local
+          MDTs.</para>
         </listitem>
         <listitem>
-          <para><emphasis role="bold">Metadata Target (MDT</emphasis> ) - For Lustre software
-            release 2.3 and earlier, each file system has one MDT. The MDT stores metadata (such as
-            filenames, directories, permissions and file layout) on storage attached to an MDS. Each
-            file system has one MDT. An MDT on a shared storage target can be available to multiple
-            MDSs, although only one can access it at a time. If an active MDS fails, a standby MDS
-            can serve the MDT and make it available to clients. This is referred to as MDS
-            failover.</para>
-          <para condition="l24">Since Lustre software release 2.4, multiple MDTs are supported. Each
-            file system has at least one MDT. An MDT on a shared storage target can be available via
-            multiple MDSs, although only one MDS can export the MDT to the clients at one time. Two
-            MDS machines share storage for two or more MDTs. After the failure of one MDS, the
-            remaining MDS begins serving the MDT(s) of the failed MDS.</para>
+          <para>
+          <emphasis role="bold">Metadata Target (MDT)</emphasis> - For Lustre
+          software release 2.3 and earlier, each file system has one MDT. The
+          MDT stores metadata (such as filenames, directories, permissions and
+          file layout) on storage attached to an MDS. An MDT on a shared
+          storage target can be available to multiple
+          MDSs, although only one can access it at a time. If an active MDS
+          fails, a standby MDS can serve the MDT and make it available to
+          clients. This is referred to as MDS failover.</para>
+          <para condition="l24">Since Lustre software release 2.4, multiple
+          MDTs are supported. Each file system has at least one MDT. An MDT on
+          a shared storage target can be available via multiple MDSs, although
+          only one MDS can export the MDT to the clients at one time. Two MDS
+          machines share storage for two or more MDTs. After the failure of one
+          MDS, the remaining MDS begins serving the MDT(s) of the failed
+          MDS.</para>
         </listitem>
         <listitem>
-          <para><emphasis role="bold">Object Storage Servers (OSS)</emphasis> : The OSS provides
-            file I/O service and network request handling for one or more local OSTs. Typically, an
-            OSS serves between two and eight OSTs, up to 16 TB each. A typical configuration is an
-            MDT on a dedicated node, two or more OSTs on each OSS node, and a client on each of a
-            large number of compute nodes.</para>
+          <para>
+          <emphasis role="bold">Object Storage Servers (OSS)</emphasis>: The
+          OSS provides file I/O service and network request handling for one or
+          more local OSTs. Typically, an OSS serves between two and eight OSTs,
+          up to 16 TB each. A typical configuration is an MDT on a dedicated
+          node, two or more OSTs on each OSS node, and a client on each of a
+          large number of compute nodes.</para>
         </listitem>
         <listitem>
-          <para><emphasis role="bold">Object Storage Target (OST)</emphasis> : User file data is
-            stored in one or more objects, each object on a separate OST in a Lustre file system.
-            The number of objects per file is configurable by the user and can be tuned to optimize
-            performance for a given workload.</para>
+          <para>
+          <emphasis role="bold">Object Storage Target (OST)</emphasis>: User
+          file data is stored in one or more objects, each object on a separate
+          OST in a Lustre file system. The number of objects per file is
+          configurable by the user and can be tuned to optimize performance for
+          a given workload.</para>
         </listitem>
         <listitem>
-          <para><emphasis role="bold">Lustre clients</emphasis> : Lustre clients are computational,
-            visualization or desktop nodes that are running Lustre client software, allowing them to
-            mount the Lustre file system.</para>
+          <para>
+          <emphasis role="bold">Lustre clients</emphasis>: Lustre clients are
+          computational, visualization or desktop nodes that are running Lustre
+          client software, allowing them to mount the Lustre file
+          system.</para>
         </listitem>
       </itemizedlist>
-      <para>The Lustre client software provides an interface between the Linux virtual file system
-        and the Lustre servers. The client software includes a management client (MGC), a metadata
-        client (MDC), and multiple object storage clients (OSCs), one corresponding to each OST in
-        the file system.</para>
-      <para>A logical object volume (LOV) aggregates the OSCs to provide transparent access across
-        all the OSTs. Thus, a client with the Lustre file system mounted sees a single, coherent,
-        synchronized namespace. Several clients can write to different parts of the same file
-        simultaneously, while, at the same time, other clients can read from the file.</para>
-      <para><xref linkend="understandinglustre.tab.storagerequire"/> provides the requirements for
-        attached storage for each Lustre file system component and describes desirable
-        characteristics of the hardware used.</para>
+      <para>The Lustre client software provides an interface between the Linux
+      virtual file system and the Lustre servers. The client software includes
+      a management client (MGC), a metadata client (MDC), and multiple object
+      storage clients (OSCs), one corresponding to each OST in the file
+      system.</para>
+      <para>A logical object volume (LOV) aggregates the OSCs to provide
+      transparent access across all the OSTs. Thus, a client with the Lustre
+      file system mounted sees a single, coherent, synchronized namespace.
+      Several clients can write to different parts of the same file
+      simultaneously, while, at the same time, other clients can read from the
+      file.</para>
+      <para>
+      <xref linkend="understandinglustre.tab.storagerequire" /> provides the
+      requirements for attached storage for each Lustre file system component
+      and describes desirable characteristics of the hardware used.</para>
       <table frame="all">
-        <title xml:id="understandinglustre.tab.storagerequire"><indexterm>
-            <primary>Lustre</primary>
-            <secondary>requirements</secondary>
-          </indexterm>Storage and hardware requirements for Lustre file system components</title>
+        <title xml:id="understandinglustre.tab.storagerequire">
+        <indexterm>
+          <primary>Lustre</primary>
+          <secondary>requirements</secondary>
+        </indexterm>Storage and hardware requirements for Lustre file system
+        components</title>
         <tgroup cols="3">
-          <colspec colname="c1" colwidth="1*"/>
-          <colspec colname="c2" colwidth="3*"/>
-          <colspec colname="c3" colwidth="3*"/>
+          <colspec colname="c1" colwidth="1*" />
+          <colspec colname="c2" colwidth="3*" />
+          <colspec colname="c3" colwidth="3*" />
           <thead>
             <row>
               <entry>
-                <para><emphasis role="bold"/></para>
+                <para>
+                  <emphasis role="bold" />
+                </para>
               </entry>
               <entry>
-                <para><emphasis role="bold">Required attached storage</emphasis></para>
+                <para>
+                  <emphasis role="bold">Required attached storage</emphasis>
+                </para>
               </entry>
               <entry>
-                <para><emphasis role="bold">Desirable hardware characteristics</emphasis></para>
+                <para>
+                  <emphasis role="bold">Desirable hardware
+                  characteristics</emphasis>
+                </para>
               </entry>
             </row>
           </thead>
             <row>
               <entry>
                 <para>
-                  <emphasis role="bold">MDSs</emphasis></para>
+                  <emphasis role="bold">MDSs</emphasis>
+                </para>
               </entry>
               <entry>
-                <para> 1-2% of file system capacity</para>
+                <para>1-2% of file system capacity</para>
               </entry>
               <entry>
-                <para> Adequate CPU power, plenty of memory, fast disk storage.</para>
+                <para>Adequate CPU power, plenty of memory, fast disk
+                storage.</para>
               </entry>
             </row>
             <row>
               <entry>
                 <para>
-                  <emphasis role="bold">OSSs</emphasis></para>
+                  <emphasis role="bold">OSSs</emphasis>
+                </para>
               </entry>
               <entry>
-                <para> 1-16 TB per OST, 1-8 OSTs per OSS</para>
+                <para>1-16 TB per OST, 1-8 OSTs per OSS</para>
               </entry>
               <entry>
-                <para> Good bus bandwidth. Recommended that storage be balanced evenly across
-                  OSSs.</para>
+                <para>Good bus bandwidth. Recommended that storage be balanced
+                evenly across OSSs.</para>
               </entry>
             </row>
             <row>
               <entry>
                 <para>
-                  <emphasis role="bold">Clients</emphasis></para>
+                  <emphasis role="bold">Clients</emphasis>
+                </para>
               </entry>
               <entry>
-                <para> None</para>
+                <para>None</para>
               </entry>
               <entry>
-                <para> Low latency, high bandwidth network.</para>
+                <para>Low latency, high bandwidth network.</para>
               </entry>
             </row>
           </tbody>
         </tgroup>
       </table>
-      <para>For additional hardware requirements and considerations, see <xref
-          linkend="settinguplustresystem"/>.</para>
+      <para>For additional hardware requirements and considerations, see
+      <xref linkend="settinguplustresystem" />.</para>
     </section>
     <section remap="h3">
-      <title><indexterm>
-          <primary>Lustre</primary>
-          <secondary>LNET</secondary>
-        </indexterm>Lustre Networking (LNET)</title>
-      <para>Lustre Networking (LNET) is a custom networking API that provides the communication
-        infrastructure that handles metadata and file I/O data for the Lustre file system servers
-        and clients. For more information about LNET, see <xref
-          linkend="understandinglustrenetworking"/>.</para>
+      <title>
+      <indexterm>
+        <primary>Lustre</primary>
+        <secondary>LNET</secondary>
+      </indexterm>Lustre Networking (LNET)</title>
+      <para>Lustre Networking (LNET) is a custom networking API that provides
+      the communication infrastructure that handles metadata and file I/O data
+      for the Lustre file system servers and clients. For more information
+      about LNET, see
+      <xref linkend="understandinglustrenetworking" />.</para>
     </section>
     <section remap="h3">
-      <title><indexterm>
-          <primary>Lustre</primary>
-          <secondary>cluster</secondary>
-        </indexterm>Lustre Cluster</title>
-      <para>At scale, a Lustre file system cluster can include hundreds of OSSs and thousands of
-        clients (see <xref linkend="understandinglustre.fig.lustrescale"/>). More than one type of
-        network can be used in a Lustre cluster. Shared storage between OSSs enables failover
-        capability. For more details about OSS failover, see <xref linkend="understandingfailover"
-        />.</para>
+      <title>
+      <indexterm>
+        <primary>Lustre</primary>
+        <secondary>cluster</secondary>
+      </indexterm>Lustre Cluster</title>
+      <para>At scale, a Lustre file system cluster can include hundreds of OSSs
+      and thousands of clients (see
+      <xref linkend="understandinglustre.fig.lustrescale" />). More than one
+      type of network can be used in a Lustre cluster. Shared storage between
+      OSSs enables failover capability. For more details about OSS failover,
+      see
+      <xref linkend="understandingfailover" />.</para>
       <figure>
-        <title xml:id="understandinglustre.fig.lustrescale"><indexterm>
-            <primary>Lustre</primary>
-            <secondary>at scale</secondary>
-          </indexterm>Lustre cluster at scale</title>
+        <title xml:id="understandinglustre.fig.lustrescale">
+        <indexterm>
+          <primary>Lustre</primary>
+          <secondary>at scale</secondary>
+        </indexterm>Lustre cluster at scale</title>
         <mediaobject>
           <imageobject>
-            <imagedata scalefit="1" width="100%" fileref="./figures/Scaled_Cluster.png"/>
+            <imagedata scalefit="1" width="100%"
+            fileref="./figures/Scaled_Cluster.png" />
           </imageobject>
           <textobject>
-            <phrase> Lustre file system cluster at scale </phrase>
+            <phrase>Lustre file system cluster at scale</phrase>
           </textobject>
         </mediaobject>
       </figure>
     </section>
   </section>
   <section xml:id="understandinglustre.storageio">
-    <title><indexterm>
-        <primary>Lustre</primary>
-        <secondary>storage</secondary>
-      </indexterm>
-      <indexterm>
-        <primary>Lustre</primary>
-        <secondary>I/O</secondary>
-      </indexterm> Lustre File System Storage and I/O</title>
-    <para>In Lustre software release 2.0, Lustre file identifiers (FIDs) were introduced to replace
-      UNIX inode numbers for identifying files or objects. A FID is a 128-bit identifier that
-      contains a unique 64-bit sequence number, a 32-bit object ID (OID), and a 32-bit version
-      number. The sequence number is unique across all Lustre targets in a file system (OSTs and
-      MDTs). This change enabled future support for multiple MDTs (introduced in Lustre software
-      release 2.3) and ZFS (introduced in Lustre software release 2.4).</para>
-    <para>Also introduced in release 2.0 is a feature call <emphasis role="italic"
-        >FID-in-dirent</emphasis> (also known as <emphasis role="italic">dirdata</emphasis>) in
-      which the FID is stored as part of the name of the file in the parent directory. This feature
-      significantly improves performance for <literal>ls</literal> command executions by reducing
-      disk I/O. The FID-in-dirent is generated at the time the file is created.</para>
+    <title>
+    <indexterm>
+      <primary>Lustre</primary>
+      <secondary>storage</secondary>
+    </indexterm>
+    <indexterm>
+      <primary>Lustre</primary>
+      <secondary>I/O</secondary>
+    </indexterm>Lustre File System Storage and I/O</title>
+    <para>In Lustre software release 2.0, Lustre file identifiers (FIDs) were
+    introduced to replace UNIX inode numbers for identifying files or objects.
+    A FID is a 128-bit identifier that contains a unique 64-bit sequence
+    number, a 32-bit object ID (OID), and a 32-bit version number. The sequence
+    number is unique across all Lustre targets in a file system (OSTs and
+    MDTs). This change enabled future support for multiple MDTs (introduced in
+    Lustre software release 2.4) and ZFS (introduced in Lustre software release
+    2.4).</para>
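The seq:OID:version structure of a FID can be seen directly on a client with the `lfs path2fid` command (and reversed with `lfs fid2path`). The sketch below only parses an example FID string in plain bash to label its three components; the FID value itself is hypothetical:

```shell
#!/bin/bash
# A FID prints as [sequence:OID:version]; this example value is
# hypothetical, not taken from a real file system.
fid="0x200000400:0x1:0x0"

# Split on the colons into the three components.
IFS=: read -r seq oid ver <<< "$fid"
echo "sequence=$seq oid=$oid version=$ver"
```

On a live client, `lfs path2fid <file>` reports the same three fields for a real file.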
+    <para>Also introduced in release 2.0 is a feature called
+    <emphasis role="italic">FID-in-dirent</emphasis> (also known as
+    <emphasis role="italic">dirdata</emphasis>) in which the FID is stored as
+    part of the name of the file in the parent directory. This feature
+    significantly improves performance for
+    <literal>ls</literal> command executions by reducing disk I/O. The
+    FID-in-dirent is generated at the time the file is created.</para>
     <note>
-      <para>The FID-in-dirent feature is not compatible with the Lustre software release 1.8 format.
-        Therefore, when an upgrade from Lustre software release 1.8 to a Lustre software release 2.x
-        is performed, the FID-in-dirent feature is not automatically enabled. For upgrades from
-        Lustre software release 1.8 to Lustre software releases 2.0 through 2.3, FID-in-dirent can
-        be enabled manually but only takes effect for new files. </para>
-      <para>For more information about upgrading from Lustre software release 1.8 and enabling
-        FID-in-dirent for existing files, see <xref xmlns:xlink="http://www.w3.org/1999/xlink"
-          linkend="upgradinglustre"/>Chapter 16 “Upgrading a Lustre File System”.</para>
+      <para>The FID-in-dirent feature is not compatible with the Lustre
+      software release 1.8 format. Therefore, when an upgrade from Lustre
+      software release 1.8 to a Lustre software release 2.x is performed, the
+      FID-in-dirent feature is not automatically enabled. For upgrades from
+      Lustre software release 1.8 to Lustre software releases 2.0 through 2.3,
+      FID-in-dirent can be enabled manually but only takes effect for new
+      files.</para>
+      <para>For more information about upgrading from Lustre software release
+      1.8 and enabling FID-in-dirent for existing files, see
+      <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+      linkend="upgradinglustre" />.</para>
     </note>
-    <para condition="l24">The LFSCK 1.5 file system administration tool released with Lustre
-      software release 2.4 provides functionality that enables FID-in-dirent for existing files. It
-      includes the following functionality:<itemizedlist>
-        <listitem>
-          <para>Generates IGIF mode FIDs for existing release 1.8 files.</para>
-        </listitem>
-        <listitem>
-          <para>Verifies the FID-in-dirent for each file to determine when it doesn’t exist or is
-            invalid and then regenerates the FID-in-dirent if needed.</para>
-        </listitem>
-        <listitem>
-          <para>Verifies the linkEA entry for each file to determine when it is missing or invalid
-            and then regenerates the linkEA if needed. The <emphasis role="italic">linkEA</emphasis>
-            consists of the file name plus its parent FID and is stored as an extended attribute in
-            the file itself. Thus, the linkEA can be used to parse out the full path name of a file
-            from root.</para>
-        </listitem>
-      </itemizedlist></para>
-    <para>Information about where file data is located on the OST(s) is stored as an extended
-      attribute called layout EA in an MDT object identified by the FID for the file (see <xref
-        xmlns:xlink="http://www.w3.org/1999/xlink" linkend="Fig1.3_LayoutEAonMDT"/>). If the file is
-      a data file (not a directory or symbol link), the MDT object points to 1-to-N OST object(s) on
-      the OST(s) that contain the file data. If the MDT file points to one object, all the file data
-      is stored in that object. If the MDT file points to more than one object, the file data is
-        <emphasis role="italic">striped</emphasis> across the objects using RAID 0, and each object
-      is stored on a different OST. (For more information about how striping is implemented in a
-      Lustre file system, see <xref linkend="dbdoclet.50438250_89922"/>.</para>
+    <para condition="l24">The LFSCK file system consistency checking tool
+    released with Lustre software release 2.4 provides functionality that
+    enables FID-in-dirent for existing files. It includes the following
+    functionality:
+    <itemizedlist>
+      <listitem>
+        <para>Generates IGIF mode FIDs for files created on a Lustre software
+        release 1.8 file system.</para>
+      </listitem>
+      <listitem>
+        <para>Verifies the FID-in-dirent for each file and regenerates the
+        FID-in-dirent if it is invalid or missing.</para>
+      </listitem>
+      <listitem>
+        <para>Verifies the linkEA entry for each file and regenerates the
+        linkEA if it is invalid or missing. The
+        <emphasis role="italic">linkEA</emphasis> consists of the file name and
+        the FID of its parent directory. It is stored as an extended attribute
+        in the file itself. Thus, the linkEA can be used to reconstruct the
+        full path name of a file.</para>
+      </listitem>
+    </itemizedlist></para>
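These checks are run with the `lctl lfsck_start` command on the MDS; a minimal sketch follows, assuming a file system named <literal>lustre</literal> (the device name is hypothetical):

```shell
# Start a namespace LFSCK scan on the first MDT of the
# hypothetical file system "lustre" (run on the MDS node).
lctl lfsck_start -M lustre-MDT0000 -t namespace

# Stop the scan if needed.
lctl lfsck_stop -M lustre-MDT0000
```

See the troubleshooting and recovery chapter for the full set of LFSCK options and status reporting.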
+    <para>Information about where file data is located on the OST(s) is stored
+    as an extended attribute called layout EA in an MDT object identified by
+    the FID for the file (see
+    <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+    linkend="Fig1.3_LayoutEAonMDT" />). If the file is a regular file (not a
+    directory or symbolic link), the MDT object points to 1-to-N OST object(s)
+    on the OST(s) that contain the file data. If the MDT file points to one
+    object, all the file data is stored in that object. If the MDT file points
+    to more than one object, the file data is
+    <emphasis role="italic">striped</emphasis> across the objects using RAID 0,
+    and each object is stored on a different OST. (For more information about
+    how striping is implemented in a Lustre file system, see
+    <xref linkend="dbdoclet.50438250_89922" />.)</para>
     <figure xml:id="Fig1.3_LayoutEAonMDT">
       <title>Layout EA on MDT pointing to file data on OSTs</title>
       <mediaobject>
         <imageobject>
-          <imagedata scalefit="1" width="80%" fileref="./figures/Metadata_File.png"/>
+          <imagedata scalefit="1" width="80%"
+          fileref="./figures/Metadata_File.png" />
         </imageobject>
         <textobject>
-          <phrase> Layout EA on MDT pointing to file data on OSTs </phrase>
+          <phrase>Layout EA on MDT pointing to file data on OSTs</phrase>
         </textobject>
       </mediaobject>
     </figure>
-    <para>When a client wants to read from or write to a file, it first fetches the layout EA from
-      the MDT object for the file. The client then uses this information to perform I/O on the file,
-      directly interacting with the OSS nodes where the objects are stored.
-      <?oxy_custom_start type="oxy_content_highlight" color="255,255,0"?>This process is illustrated
-      in <xref xmlns:xlink="http://www.w3.org/1999/xlink" linkend="Fig1.4_ClientReqstgData"
-      /><?oxy_custom_end?>.</para>
+    <para>When a client wants to read from or write to a file, it first fetches
+    the layout EA from the MDT object for the file. The client then uses this
+    information to perform I/O on the file, directly interacting with the OSS
+    nodes where the objects are stored.
+    This process is illustrated in
+    <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+    linkend="Fig1.4_ClientReqstgData" />.</para>
     <figure xml:id="Fig1.4_ClientReqstgData">
       <title>Lustre client requesting file data</title>
       <mediaobject>
         <imageobject>
-          <imagedata scalefit="1" width="75%" fileref="./figures/File_Write.png"/>
+          <imagedata scalefit="1" width="75%"
+          fileref="./figures/File_Write.png" />
         </imageobject>
         <textobject>
-          <phrase> Lustre client requesting file data </phrase>
+          <phrase>Lustre client requesting file data</phrase>
         </textobject>
       </mediaobject>
     </figure>
-    <para>The available bandwidth of a Lustre file system is determined as follows:</para>
+    <para>The available bandwidth of a Lustre file system is determined as
+    follows:</para>
     <itemizedlist>
       <listitem>
-        <para>The <emphasis>network bandwidth</emphasis> equals the aggregated bandwidth of the OSSs
-          to the targets.</para>
+        <para>The
+        <emphasis>network bandwidth</emphasis> equals the aggregated bandwidth
+        of the OSSs to the targets.</para>
       </listitem>
       <listitem>
-        <para>The <emphasis>disk bandwidth</emphasis> equals the sum of the disk bandwidths of the
-          storage targets (OSTs) up to the limit of the network bandwidth.</para>
+        <para>The
+        <emphasis>disk bandwidth</emphasis> equals the sum of the disk
+        bandwidths of the storage targets (OSTs) up to the limit of the network
+        bandwidth.</para>
       </listitem>
       <listitem>
-        <para>The <emphasis>aggregate bandwidth</emphasis> equals the minimum of the disk bandwidth
-          and the network bandwidth.</para>
+        <para>The
+        <emphasis>aggregate bandwidth</emphasis> equals the minimum of the disk
+        bandwidth and the network bandwidth.</para>
       </listitem>
       <listitem>
-        <para>The <emphasis>available file system space</emphasis> equals the sum of the available
-          space of all the OSTs.</para>
+        <para>The
+        <emphasis>available file system space</emphasis> equals the sum of the
+        available space of all the OSTs.</para>
       </listitem>
     </itemizedlist>
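The rules above reduce to taking the minimum of two sums. A minimal shell sketch, with hypothetical bandwidth figures:

```shell
#!/bin/bash
# Hypothetical sizing estimate: aggregate bandwidth is the minimum
# of total network bandwidth and total disk bandwidth.
network_bw=20000   # MB/s, aggregated OSS network bandwidth (assumed)
disk_bw=25000      # MB/s, sum of OST disk bandwidths (assumed)

aggregate=$(( network_bw < disk_bw ? network_bw : disk_bw ))
echo "aggregate bandwidth: ${aggregate} MB/s"
```

Here the network is the bottleneck, so adding OSTs alone would not raise the aggregate bandwidth.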
     <section xml:id="dbdoclet.50438250_89922">
       <title>
-        <indexterm>
-          <primary>Lustre</primary>
-          <secondary>striping</secondary>
-        </indexterm>
-        <indexterm>
-          <primary>striping</primary>
-          <secondary>overview</secondary>
-        </indexterm> Lustre File System and Striping</title>
-      <para>One of the main factors leading to the high performance of Lustre file systems is the
-        ability to stripe data across multiple OSTs in a round-robin fashion. Users can optionally
-        configure for each file the number of stripes, stripe size, and OSTs that are used.</para>
-      <para>Striping can be used to improve performance when the aggregate bandwidth to a single
-        file exceeds the bandwidth of a single OST. The ability to stripe is also useful when a
-        single OST does not have enough free space to hold an entire file. For more information
-        about benefits and drawbacks of file striping, see <xref linkend="dbdoclet.50438209_48033"
-        />.</para>
-      <para>Striping allows segments or &apos;chunks&apos; of data in a file to be stored on
-        different OSTs, as shown in <xref linkend="understandinglustre.fig.filestripe"/>. In the
-        Lustre file system, a RAID 0 pattern is used in which data is &quot;striped&quot; across a
-        certain number of objects. The number of objects in a single file is called the
-          <literal>stripe_count</literal>.</para>
-      <para>Each object contains a chunk of data from the file. When the chunk of data being written
-        to a particular object exceeds the <literal>stripe_size</literal>, the next chunk of data in
-        the file is stored on the next object.</para>
-      <para>Default values for <literal>stripe_count</literal> and <literal>stripe_size</literal>
-        are set for the file system. The default value for <literal>stripe_count</literal> is 1
-        stripe for file and the default value for <literal>stripe_size</literal> is 1MB. The user
-        may change these values on a per directory or per file basis. For more details, see <xref
-          linkend="dbdoclet.50438209_78664"/>.</para>
-      <para><xref linkend="understandinglustre.fig.filestripe"/>, the <literal>stripe_size</literal>
-        for File C is larger than the <literal>stripe_size</literal> for File A, allowing more data
-        to be stored in a single stripe for File C. The <literal>stripe_count</literal> for File A
-        is 3, resulting in data striped across three objects, while the
-          <literal>stripe_count</literal> for File B and File C is 1.</para>
-      <para>No space is reserved on the OST for unwritten data. File A in <xref
-          linkend="understandinglustre.fig.filestripe"/>.</para>
+      <indexterm>
+        <primary>Lustre</primary>
+        <secondary>striping</secondary>
+      </indexterm>
+      <indexterm>
+        <primary>striping</primary>
+        <secondary>overview</secondary>
+      </indexterm>Lustre File System and Striping</title>
+      <para>One of the main factors leading to the high performance of Lustre
+      file systems is the ability to stripe data across multiple OSTs in a
+      round-robin fashion. Users can optionally configure for each file the
+      number of stripes, stripe size, and OSTs that are used.</para>
+      <para>Striping can be used to improve performance when the aggregate
+      bandwidth to a single file exceeds the bandwidth of a single OST. The
+      ability to stripe is also useful when a single OST does not have enough
+      free space to hold an entire file. For more information about benefits
+      and drawbacks of file striping, see
+      <xref linkend="dbdoclet.50438209_48033" />.</para>
+      <para>Striping allows segments or 'chunks' of data in a file to be stored
+      on different OSTs, as shown in
+      <xref linkend="understandinglustre.fig.filestripe" />. In the Lustre file
+      system, a RAID 0 pattern is used in which data is "striped" across a
+      certain number of objects. The number of objects in a single file is
+      called the
+      <literal>stripe_count</literal>.</para>
+      <para>Each object contains a chunk of data from the file. When the chunk
+      of data being written to a particular object exceeds the
+      <literal>stripe_size</literal>, the next chunk of data in the file is
+      stored on the next object.</para>
+      <para>Default values for
+      <literal>stripe_count</literal> and
+      <literal>stripe_size</literal> are set for the file system. The default
+      value for
+      <literal>stripe_count</literal> is 1 stripe per file and the default
+      value for
+      <literal>stripe_size</literal> is 1 MB. The user may change these values
+      on a per-directory or per-file basis. For more details, see
+      <xref linkend="dbdoclet.50438209_78664" />.</para>
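Per-directory and per-file layouts are set with the `lfs setstripe` command and inspected with `lfs getstripe`; the mount point and paths below are hypothetical:

```shell
# Set a default layout on a directory (hypothetical path):
# new files created there get 4 stripes of 4 MB each.
lfs setstripe -c 4 -S 4M /mnt/lustre/dir

# Inspect the layout of an existing file or directory.
lfs getstripe /mnt/lustre/dir
```

A file inherits the layout of its parent directory at creation time; changing the directory layout afterward does not restripe existing files.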
+      <para>
+      As shown in
+      <xref linkend="understandinglustre.fig.filestripe" />, the
+      <literal>stripe_size</literal> for File C is larger than the
+      <literal>stripe_size</literal> for File A, allowing more data to be stored
+      in a single stripe for File C. The
+      <literal>stripe_count</literal> for File A is 3, resulting in data striped
+      across three objects, while the
+      <literal>stripe_count</literal> for File B and File C is 1.</para>
+      <para>No space is reserved on the OST for unwritten data, as shown by
+      File A in
+      <xref linkend="understandinglustre.fig.filestripe" />, which is sparse
+      and missing chunk 6.</para>
       <figure>
-        <title xml:id="understandinglustre.fig.filestripe">File striping on a Lustre file
-          system</title>
+        <title xml:id="understandinglustre.fig.filestripe">File striping on a
+        Lustre file system</title>
         <mediaobject>
           <imageobject>
-            <imagedata scalefit="1" width="100%" fileref="./figures/File_Striping.png"/>
+            <imagedata scalefit="1" width="100%"
+            fileref="./figures/File_Striping.png" />
           </imageobject>
           <textobject>
-            <phrase>File striping pattern across three OSTs for three different data files. The file
-              is sparse and missing chunk 6. </phrase>
+            <phrase>File striping pattern across three OSTs for three different
+            data files. The file is sparse and missing chunk 6.</phrase>
           </textobject>
         </mediaobject>
       </figure>
-      <para>The maximum file size is not limited by the size of a single target. In a Lustre file
-        system, files can be striped across multiple objects (up to 2000), and each object can be
-        up to 16 TB in size with ldiskfs, or up to 256PB with ZFS. This leads to a maximum file size of 31.25 PB for ldiskfs or 8EB with ZFS. Note that
-        a Lustre file system can support files up to 2^63 bytes (8EB), limited
-       only by the space available on the OSTs.</para>
+      <para>The maximum file size is not limited by the size of a single
+      target. In a Lustre file system, files can be striped across multiple
+      objects (up to 2000), and each object can be up to 16 TB in size with
+      ldiskfs, or up to 256 PB with ZFS. This leads to a maximum file size of
+      31.25 PB for ldiskfs or 8 EB with ZFS. Note that a Lustre file system
+      can support files up to 2^63 bytes (8 EB), limited only by the space
+      available on the OSTs.</para>
       <note>
-        <para>Versions of the Lustre software prior to Release 2.2 limited the  maximum stripe count
-          for a single file to 160 OSTs.</para>
+        <para>Versions of the Lustre software prior to Release 2.2 limited the
+        maximum stripe count for a single file to 160 OSTs.</para>
       </note>
-      <para>Although a single file can only be striped over 2000 objects, Lustre file systems can
-        have thousands of OSTs. The I/O bandwidth to access a single file is the aggregated I/O
-        bandwidth to the objects in a file, which can be as much as a bandwidth of up to 2000
-        servers. On systems with more than 2000 OSTs, clients can do I/O using multiple files to
-        utilize the full file system bandwidth.</para>
-      <para>For more information about striping, see <xref linkend="managingstripingfreespace"
-        />.</para>
+      <para>Although a single file can only be striped over 2000 objects,
+      Lustre file systems can have thousands of OSTs. The I/O bandwidth to
+      access a single file is the aggregated I/O bandwidth to the objects in a
+      file, which can be as much as the combined bandwidth of up to 2000
+      servers. On systems with more than 2000 OSTs, clients can do I/O using
+      multiple files to utilize the full file system bandwidth.</para>
+      <para>For more information about striping, see
+      <xref linkend="managingstripingfreespace" />.</para>
     </section>
   </section>
 </chapter>
index 1409d3d..637785f 100644 (file)
-<?xml version='1.0' encoding='UTF-8'?><chapter xmlns="http://docbook.org/ns/docbook" xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US" xml:id="upgradinglustre">
+<?xml version='1.0' encoding='utf-8'?>
+<chapter xmlns="http://docbook.org/ns/docbook"
+xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US"
+xml:id="upgradinglustre">
   <title xml:id="upgradinglustre.title">Upgrading a Lustre File System</title>
-  <para>This chapter describes interoperability between Lustre software releases. It also provides
-    procedures for upgrading from Lustre software release 1.8  to Lustre softeware release 2.x ,
-    from  a Lustre software release 2.x  to a more recent Lustre software release 2.x (major release
-    upgrade), and from a a Lustre software release 2.x.y  to a more recent Lustre software release
-    2.x.y (minor release upgrade). It includes the following sections:</para>
+  <para>This chapter describes interoperability between Lustre software
+  releases. It also provides procedures for upgrading from Lustre software
+  release 1.8 to Lustre software release 2.x, from a Lustre software release
+  2.x to a more recent Lustre software release 2.x (major release upgrade), and
+  from a Lustre software release 2.x.y to a more recent Lustre software
+  release 2.x.y (minor release upgrade). It includes the following
+  sections:</para>
   <itemizedlist>
     <listitem>
-      <para><xref linkend="dbdoclet.50438205_82079"/></para>
+      <para>
+        <xref linkend="dbdoclet.50438205_82079" />
+      </para>
     </listitem>
     <listitem>
-      <para><xref xmlns:xlink="http://www.w3.org/1999/xlink" linkend="Upgrading_2.x"/></para>
+      <para>
+        <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+        linkend="Upgrading_2.x" />
+      </para>
     </listitem>
     <listitem>
-      <para><xref xmlns:xlink="http://www.w3.org/1999/xlink" linkend="Upgrading_2.x.x"/></para>
+      <para>
+        <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+        linkend="Upgrading_2.x.x" />
+      </para>
     </listitem>
   </itemizedlist>
   <section xml:id="dbdoclet.50438205_82079">
-      <title><indexterm>
-        <primary>Lustre</primary>
-        <secondary>upgrading</secondary>
-        <see>upgrading</see>
-      </indexterm><indexterm>
-        <primary>upgrading</primary>
-      </indexterm>Release Interoperability and Upgrade Requirements</title>
-    <para><emphasis role="italic"><emphasis role="bold">Lustre software release 2.x (major)
-          upgrade:</emphasis></emphasis><itemizedlist>
+    <title>
+    <indexterm>
+      <primary>Lustre</primary>
+      <secondary>upgrading</secondary>
+      <see>upgrading</see>
+    </indexterm>
+    <indexterm>
+      <primary>upgrading</primary>
+    </indexterm>Release Interoperability and Upgrade Requirements</title>
+    <para>
+      <emphasis role="italic">
+        <emphasis role="bold">Lustre software release 2.x (major)
+        upgrade:</emphasis>
+      </emphasis>
+      <itemizedlist>
         <listitem>
-          <para>All servers must be upgraded at the same time, while some or all clients may be
-            upgraded.</para>
+          <para>All servers must be upgraded at the same time, while some or
+          all clients may be upgraded.</para>
         </listitem>
         <listitem>
-          <para>All servers must be be upgraded to a Linux kernel supported by the Lustre software.
-            See the Linux Test Matrix at <xref xmlns:xlink="http://www.w3.org/1999/xlink"
-              linkend="LustreTestMatrixTable"/> for a list of tested Lustre distributions.</para>
+          <para>All servers must be upgraded to a Linux kernel supported by
+          the Lustre software. See the Linux Test Matrix at
+          <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+          linkend="LustreTestMatrixTable" /> for a list of tested Lustre
+          distributions.</para>
         </listitem>
         <listitem>
-          <para>Clients to be upgraded to the Lustre software release 2.4 or higher must be running
-            a compatible Linux distribution. See the Linux Test Matrix at <xref
-              xmlns:xlink="http://www.w3.org/1999/xlink" linkend="LustreTestMatrixTable"/> for a
-            list of tested Linux distributions.</para>
+          <para>Clients to be upgraded to the Lustre software release 2.4 or
+          higher must be running a compatible Linux distribution. See the Linux
+          Test Matrix at
+          <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+          linkend="LustreTestMatrixTable" /> for a list of tested Linux
+          distributions.</para>
         </listitem>
-      </itemizedlist></para>
-    <para><emphasis role="italic"><emphasis role="bold">Lustre software release 2.x.y release
-          (minor) upgrade:</emphasis></emphasis></para>
+      </itemizedlist>
+    </para>
+    <para>
+      <emphasis role="italic">
+        <emphasis role="bold">Lustre software release 2.x.y release (minor)
+        upgrade:</emphasis>
+      </emphasis>
+    </para>
     <itemizedlist>
       <listitem>
-        <para>All servers must be upgraded at the same time, while some or all clients may be
-          upgraded.</para>
+        <para>All servers must be upgraded at the same time, while some or all
+        clients may be upgraded.</para>
       </listitem>
       <listitem>
-        <para>Rolling upgrades are supported for minor releases allowing individual servers and
-          clients to be upgraded without stopping the Lustre file system.</para>
+        <para>Rolling upgrades are supported for minor releases allowing
+        individual servers and clients to be upgraded without stopping the
+        Lustre file system.</para>
       </listitem>
     </itemizedlist>
   </section>
   <section xml:id="Upgrading_2.x">
-    <title><indexterm>
-        <primary>upgrading</primary>
-        <secondary>major release (2.x to 2.x)</secondary>
-      </indexterm><indexterm>
-        <primary>wide striping</primary>
-      </indexterm><indexterm>
-        <primary>MDT</primary>
-        <secondary>multiple MDSs</secondary>
-      </indexterm><indexterm>
-        <primary>large_xattr</primary>
-        <secondary>ea_inode</secondary>
-      </indexterm><indexterm>
-        <primary>wide striping</primary>
-        <secondary>large_xattr</secondary>
-        <tertiary>ea_inode</tertiary>
-      </indexterm>Upgrading to Lustre Software Release 2.x (Major Release)</title>
-    <para> The procedure for upgrading from a Lustre software release 2.x to a more recent 2.x
-      release of the Lustre software is described in this section. </para>
+    <title>
+    <indexterm>
+      <primary>upgrading</primary>
+      <secondary>major release (2.x to 2.x)</secondary>
+    </indexterm>
+    <indexterm>
+      <primary>wide striping</primary>
+    </indexterm>
+    <indexterm>
+      <primary>MDT</primary>
+      <secondary>multiple MDSs</secondary>
+    </indexterm>
+    <indexterm>
+      <primary>large_xattr</primary>
+      <secondary>ea_inode</secondary>
+    </indexterm>
+    <indexterm>
+      <primary>wide striping</primary>
+      <secondary>large_xattr</secondary>
+      <tertiary>ea_inode</tertiary>
+    </indexterm>Upgrading to Lustre Software Release 2.x (Major
+    Release)</title>
+    <para>The procedure for upgrading from a Lustre software release 2.x to a
+    more recent 2.x release of the Lustre software is described in this
+    section.</para>
     <note>
-      <para>This procedure can also be used to upgrade Lustre software release 1.8.6-wc1 or later to
-        any Lustre software release 2.x. To upgrade other versions of Lustre software release 1.8.x,
-        contact your support provider.</para>
+      <para>This procedure can also be used to upgrade Lustre software release
+      1.8.6-wc1 or later to any Lustre software release 2.x. To upgrade other
+      versions of Lustre software release 1.8.x, contact your support
+      provider.</para>
     </note>
     <note>
-      <para condition="l22">In Lustre software release 2.2, a feature has been added that allows
-        striping across up to 2000 OSTs. By default, this "wide striping" feature is disabled. It is
-        activated by setting the <literal>large_xattr</literal> or <literal>ea_inode</literal>
-        option on the MDT using either
-          <literal>mkfs.lustre</literal> or <literal>tune2fs</literal>. For example after upgrading
-        an existing file system to Lustre software release 2.2 or later, wide striping can be
-        enabled by running the following command on the MDT device before mounting
-        it:<screen>tune2fs -O large_xattr</screen>Once the wide striping feature is enabled and in
-        use on the MDT, it is not possible to directly downgrade the MDT file system to an earlier
-        version of the Lustre software that does not support wide striping. To disable wide striping:<orderedlist>
-          <listitem>
-            <para>Delete all wide-striped files. </para>
-            <para>OR </para>
-            <para>Use <literal>lfs_migrate</literal> with the option <literal>-c</literal>
-              <replaceable>stripe_count</replaceable> (set <replaceable>stripe_count</replaceable>
-              to 160) to move the files to another location.</para>
-          </listitem>
-          <listitem>
-            <para>Unmount the MDT.</para>
-          </listitem>
-          <listitem>
-            <para>Run the following command to turn off the <literal>large_xattr</literal>
-              option:<screen>tune2fs -O ^large_xattr</screen></para>
-          </listitem>
-        </orderedlist>
-        Using either <literal>mkfs.lustre</literal> or <literal>tune2fs</literal> with 
-          <literal>large_xattr</literal> or <literal>ea_inode</literal> option reseults in 
-          <literal>ea_inode</literal> in the file system feature list.
-      </para></note>
+      <para condition="l22">In Lustre software release 2.2, a feature has been
+      added that allows striping across up to 2000 OSTs. By default, this "wide
+      striping" feature is disabled. It is activated by setting the 
+      <literal>large_xattr</literal> or 
+      <literal>ea_inode</literal> option on the MDT using either 
+      <literal>mkfs.lustre</literal> or 
+      <literal>tune2fs</literal>. For example, after upgrading an existing file
+      system to Lustre software release 2.2 or later, wide striping can be
+      enabled by running the following command on the MDT device before
+      mounting it:
+      <screen>tune2fs -O large_xattr</screen>
+      Once the wide striping feature is enabled and in use on the MDT, it is
+      not possible to directly downgrade the MDT file system to an earlier 
+      version of the Lustre software that does not support wide striping. To 
+      disable wide striping:
+      <orderedlist>
+        <listitem>
+          <para>Delete all wide-striped files.</para>
+          <para>OR</para>
+          <para>Use
+          <literal>lfs_migrate</literal> with the option
+          <literal>-c</literal>
+          <replaceable>stripe_count</replaceable> (set
+          <replaceable>stripe_count</replaceable> to 160) to move the files to
+          another location.</para>
+        </listitem>
+        <listitem>
+          <para>Unmount the MDT.</para>
+        </listitem>
+        <listitem>
+          <para>Run the following command to turn off the 
+          <literal>large_xattr</literal> option:
+          <screen>tune2fs -O ^large_xattr</screen></para>
+        </listitem>
+      </orderedlist>Using either
+      <literal>mkfs.lustre</literal> or
+      <literal>tune2fs</literal> with the
+      <literal>large_xattr</literal> or
+      <literal>ea_inode</literal> option results in
+      <literal>ea_inode</literal> in the file system feature list.</para>
+    </note>
     <note condition="l23">
-      <para>To generate a list of all files with more than 160 stripes use <literal>lfs
-          find</literal> with the <literal>--stripe-count</literal>
-        option:<screen>lfs find ${mountpoint} --stripe-count=+160</screen></para>
+      <para>To generate a list of all files with more than 160 stripes use 
+      <literal>lfs find</literal> with the 
+      <literal>--stripe-count</literal> option:
+      <screen>lfs find ${mountpoint} --stripe-count=+160</screen></para>
     </note>
     <note condition="l24">
-      <para>In Lustre software release 2.4, a new feature allows using multiple MDTs, which can each
-        serve one or more remote sub-directories in the file system. The <literal>root</literal>
-        directory is always located on MDT0. </para>
-      <para>Note that clients running a release prior to the Lustre software release 2.4 can only
-        see the namespace hosted by MDT0 and will return an IO error if an attempt is made to access
-        a directory on another MDT.</para>
+      <para>In Lustre software release 2.4, a new feature allows using multiple
+      MDTs, which can each serve one or more remote sub-directories in the file
+      system. The 
+      <literal>root</literal> directory is always located on MDT0.</para>
+      <para>Note that clients running a release prior to the Lustre software
+      release 2.4 can only see the namespace hosted by MDT0 and will return an
+      IO error if an attempt is made to access a directory on another
+      MDT.</para>
     </note>
-    <para>To upgrade a Lustre software release 2.x to a more recent major release, complete these
-      steps:</para>
+    <para>To upgrade a Lustre software release 2.x to a more recent major
+    release, complete these steps:</para>
     <orderedlist>
       <listitem>
-        <para>Create a complete, restorable file system backup. </para>
+        <para>Create a complete, restorable file system backup.</para>
         <caution>
-          <para>Before installing the Lustre software, back up ALL data. The Lustre software
-            contains kernel modifications that interact with storage devices and may introduce
-            security issues and data loss if not installed, configured, or administered properly. If
-            a full backup of the file system is not practical, a device-level backup of the MDT file
-            system is recommended. See  <xref linkend="backupandrestore"/> for a procedure.</para>
+          <para>Before installing the Lustre software, back up ALL data. The
+          Lustre software contains kernel modifications that interact with
+          storage devices and may introduce security issues and data loss if
+          not installed, configured, or administered properly. If a full backup
+          of the file system is not practical, a device-level backup of the MDT
+          file system is recommended. See
+          <xref linkend="backupandrestore" /> for a procedure.</para>
         </caution>
       </listitem>
       <listitem>
-        <para>Shut down the file system by unmounting all clients and servers in the order shown
-          below (unmounting a block device causes the Lustre software to be shut down on that
-          node):</para>
+        <para>Shut down the file system by unmounting all clients and servers
+        in the order shown below (unmounting a block device causes the Lustre
+        software to be shut down on that node):</para>
         <orderedlist numeration="loweralpha">
           <listitem>
             <para>Unmount the clients. On each client node, run:</para>
       </listitem>
       <listitem>
         <para>Upgrade the Linux operating system on all servers to a compatible
-          (tested) Linux distribution and reboot. See the Linux Test Matrix at <xref
-            xmlns:xlink="http://www.w3.org/1999/xlink" linkend="LustreTestMatrixTable"/>.</para>
+        (tested) Linux distribution and reboot. See the Linux Test Matrix at 
+        <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+        linkend="LustreTestMatrixTable" />.</para>
       </listitem>
       <listitem>
-        <para>Upgrade the Linux operating system on all clients to Red Hat Enterprise Linux 6 or
-          other compatible (tested) distribution and reboot. See the Linux Test Matrix at <xref
-            xmlns:xlink="http://www.w3.org/1999/xlink" linkend="LustreTestMatrixTable"/>.</para>
+        <para>Upgrade the Linux operating system on all clients to Red Hat
+        Enterprise Linux 6 or other compatible (tested) distribution and
+        reboot. See the Linux Test Matrix at 
+        <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+        linkend="LustreTestMatrixTable" />.</para>
       </listitem>
       <listitem>
-        <para>Download the Lustre server RPMs for your platform from the <link
-            xl:href="https://wiki.hpdd.intel.com/display/PUB/Lustre+Releases">Lustre Releases</link>
-          repository. See <xref xmlns:xlink="http://www.w3.org/1999/xlink"
-            linkend="table_cnh_5m3_gk"/> for a list of required packages.</para>
+        <para>Download the Lustre server RPMs for your platform from the
+        <link xl:href="https://wiki.hpdd.intel.com/display/PUB/Lustre+Releases">
+        Lustre Releases</link> repository. See
+        <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+        linkend="table_cnh_5m3_gk" /> for a list of required packages.</para>
       </listitem>
       <listitem>
-        <para>Install the Lustre server packages on all Lustre servers (MGS, MDSs, and OSSs).</para>
+        <para>Install the Lustre server packages on all Lustre servers (MGS,
+        MDSs, and OSSs).</para>
         <orderedlist numeration="loweralpha">
           <listitem>
-            <para>Log onto a Lustre server as the <literal>root</literal> user</para>
+            <para>Log onto a Lustre server as the 
+            <literal>root</literal> user</para>
           </listitem>
           <listitem>
-            <para>Use the <literal>yum</literal> command to install the packages:</para>
+            <para>Use the 
+            <literal>yum</literal> command to install the packages:</para>
             <para>
-              <screen># yum --nogpgcheck install pkg1.rpm pkg2.rpm ...</screen>
+              <screen># yum --nogpgcheck install pkg1.rpm pkg2.rpm ...</screen>
             </para>
           </listitem>
           <listitem>
         </orderedlist>
       </listitem>
       <listitem>
-        <para>Download the Lustre client RPMs for your platform from the <link
-            xl:href="https://wiki.hpdd.intel.com/display/PUB/Lustre+Releases">Lustre Releases</link>
-          repository. See <xref xmlns:xlink="http://www.w3.org/1999/xlink"
-            linkend="table_j3r_ym3_gk"/> for a list of required packages.</para>
+        <para>Download the Lustre client RPMs for your platform from the
+        <link xl:href="https://wiki.hpdd.intel.com/display/PUB/Lustre+Releases">
+        Lustre Releases</link> repository. See
+        <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+        linkend="table_j3r_ym3_gk" /> for a list of required packages.</para>
         <note>
-          <para>The version of the kernel running on a Lustre client must be the same as the version
-            of the <literal>lustre-client-modules-</literal><replaceable>ver</replaceable> package
-            being installed. If not, a compatible kernel must be installed on the client before the
-            Lustre client packages are installed.</para>
+          <para>The version of the kernel running on a Lustre client must be
+          the same as the version of the
+          <literal>lustre-client-modules-</literal>
+          <replaceable>ver</replaceable> package being installed. If not, a
+          compatible kernel must be installed on the client before the Lustre
+          client packages are installed.</para>
         </note>
       </listitem>
       <listitem>
-        <para>Install the Lustre client packages on each of the Lustre clients to be
-          upgraded.</para>
+        <para>Install the Lustre client packages on each of the Lustre clients
+        to be upgraded.</para>
         <orderedlist numeration="loweralpha">
           <listitem>
-            <para>Log onto a Lustre client as the <literal>root</literal> user.</para>
+            <para>Log onto a Lustre client as the 
+            <literal>root</literal> user.</para>
           </listitem>
           <listitem>
-            <para>Use the <literal>yum</literal> command to install the packages:</para>
+            <para>Use the 
+            <literal>yum</literal> command to install the packages:</para>
             <para>
-              <screen># yum --nogpgcheck install pkg1.rpm pkg2.rpm ...</screen>
+              <screen># yum --nogpgcheck install pkg1.rpm pkg2.rpm ...</screen>
             </para>
           </listitem>
           <listitem>
         </orderedlist>
       </listitem>
       <listitem>
-        <para>(Optional) For upgrades to Lustre software release 2.2 or higher, to enable wide
-          striping on an existing MDT, run the following command on the MDT
-          :<screen>mdt# tune2fs -O large_xattr <replaceable>device</replaceable></screen></para>
-        <para>For more information about wide striping, see <xref
-            xmlns:xlink="http://www.w3.org/1999/xlink" linkend="section_syy_gcl_qk"/>.</para>
+        <para>(Optional) For upgrades to Lustre software release 2.2 or higher,
+        to enable wide striping on an existing MDT, run the following command
+        on the MDT:
+        <screen>mdt# tune2fs -O large_xattr <replaceable>device</replaceable></screen></para>
+        <para>For more information about wide striping, see 
+        <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+        linkend="section_syy_gcl_qk" />.</para>
       </listitem>
       <listitem>
-        <para>(Optional) For upgrades to Lustre software release 2.4 or higher, to format an
-          additional MDT, complete these steps:<orderedlist numeration="loweralpha">
-            <listitem>
-              <para>Determine the index used for the first MDT (each MDT must have unique index).
-                Enter:<screen>client$ lctl dl | grep mdc
+        <para>(Optional) For upgrades to Lustre software release 2.4 or higher,
+        to format an additional MDT, complete these steps:
+        <orderedlist numeration="loweralpha">
+          <listitem>
+            <para>Determine the index used for the first MDT (each MDT must
+            have a unique index). Enter:
+            <screen>client$ lctl dl | grep mdc
 36 UP mdc lustre-MDT0000-mdc-ffff88004edf3c00 
       4c8be054-144f-9359-b063-8477566eb84e 5</screen></para>
-              <para>In this example, the next available index is 1.</para>
-            </listitem>
-            <listitem>
-              <para>Add the new block device as a new MDT at the next available index by entering
-                (on one
-                line):<screen>mds# mkfs.lustre --reformat --fsname=<replaceable>filesystem_name</replaceable> --mdt \
-    --mgsnode=<replaceable>mgsnode</replaceable> --index <replaceable>1</replaceable> <replaceable>/dev/mdt1_device</replaceable>
-               </screen></para>
-            </listitem>
-          </orderedlist></para>
+            <para>In this example, the next available index is 1.</para>
+          </listitem>
+          <listitem>
+            <para>Add the new block device as a new MDT at the next available
+            index by entering (on one line):
+            <screen>mds# mkfs.lustre --reformat --fsname=<replaceable>filesystem_name</replaceable> --mdt \
+    --mgsnode=<replaceable>mgsnode</replaceable> --index <replaceable>1</replaceable> 
+<replaceable>/dev/mdt1_device</replaceable></screen></para>
+          </listitem>
+        </orderedlist></para>
       </listitem>
       <listitem>
-        <para>(Optional) If you are upgrading to Lustre software release 2.3 or higher from Lustre
-          software release 2.2 or earlier and want to enable the quota feature, complete these
-          steps: <orderedlist numeration="loweralpha">
-            <listitem>
-              <para>Before setting up the file system, enter on both the MDS and
-                OSTs:<screen>tunefs.lustre --quota</screen></para>
-            </listitem>
-            <listitem>
-              <para>When setting up the file system,
-                enter:<screen>conf_param $FSNAME.quota.mdt=$QUOTA_TYPE
+        <para>(Optional) If you are upgrading to Lustre software release 2.3 or
+        higher from Lustre software release 2.2 or earlier and want to enable
+        the quota feature, complete these steps: 
+        <orderedlist numeration="loweralpha">
+          <listitem>
+            <para>Before setting up the file system, enter on both the MDS and
+            OSTs:
+            <screen>tunefs.lustre --quota</screen></para>
+          </listitem>
+          <listitem>
+            <para>When setting up the file system, enter:
+            <screen>conf_param $FSNAME.quota.mdt=$QUOTA_TYPE
 conf_param $FSNAME.quota.ost=$QUOTA_TYPE</screen></para>
-            </listitem>
-          </orderedlist></para>
+          </listitem>
+        </orderedlist></para>
       </listitem>
       <listitem>
-        <para>(Optional) If you are upgrading from Lustre software release 1.8, you must manually
-          enable the FID-in-dirent feature. On the MDS,
-          enter:<screen>tune2fs –O dirdata /dev/<replaceable>mdtdev</replaceable></screen></para>
+        <para>(Optional) If you are upgrading from Lustre software release 1.8,
+        you must manually enable the FID-in-dirent feature. On the MDS, enter:
+        <screen>tune2fs -O dirdata /dev/<replaceable>mdtdev</replaceable></screen>
         <warning>
-          <para>This step is not reversible. Do not complete this step until you are sure you will
-            not be downgrading the Lustre software.</para>
+          <para>This step is not reversible. Do not complete this step until
+          you are sure you will not be downgrading the Lustre software.</para>
         </warning>
-        <para>This step only enables FID-in-dirent for newly created files. If you are upgrading to
-          Lustre software release 2.4, you can use LFSCK 1.5 to enable FID-in-dirent for existing
-          files. For more information about FID-in-dirent and related functionalities in LFSCK 1.5,
-          see <xref xmlns:xlink="http://www.w3.org/1999/xlink"
-            linkend="understandinglustre.storageio"/>.</para>
+        <para condition="l24">This step only enables FID-in-dirent for newly 
+        created files. If you are upgrading to Lustre software release 2.4, 
+        you can use LFSCK to enable FID-in-dirent for existing files. For 
+        more information about FID-in-dirent and related functionalities in 
+        LFSCK, see <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+        linkend="understandinglustre.storageio" />.</para>
       </listitem>
       <listitem>
-        <para>Start the Lustre file system by starting the components in the order shown in the
-          following steps:</para>
+        <para>Start the Lustre file system by starting the components in the
+        order shown in the following steps:</para>
         <orderedlist numeration="loweralpha">
           <listitem>
-            <para>Mount the MGT. On the MGS, run<screen>mgs# mount -a -t lustre</screen></para>
+            <para>Mount the MGT. On the MGS, run
+            <screen>mgs# mount -a -t lustre</screen></para>
           </listitem>
           <listitem>
-            <para>Mount the MDT(s). On each MDT, run:<screen>mds# mount -a -t lustre</screen></para>
+            <para>Mount the MDT(s). On each MDT, run:
+            <screen>mds# mount -a -t lustre</screen></para>
           </listitem>
           <listitem>
             <para>Mount all the OSTs. On each OSS node, run:</para>
             <screen>oss# mount -a -t lustre</screen>
             <note>
-              <para>This command assumes that all the OSTs are listed in the
-                  <literal>/etc/fstab</literal> file. OSTs that are not listed in the
-                  <literal>/etc/fstab</literal> file, must be mounted individually by running the
-                mount command:</para>
-              <screen> mount -t lustre <replaceable>/dev/block_device</replaceable> <replaceable>/mount_point</replaceable> </screen>
+              <para>This command assumes that all the OSTs are listed in the
+              <literal>/etc/fstab</literal> file. OSTs that are not listed in
+              the
+              <literal>/etc/fstab</literal> file must be mounted individually
+              by running the mount command:</para>
+              <screen>mount -t lustre <replaceable>/dev/block_device</replaceable> <replaceable>/mount_point</replaceable></screen>
             </note>
           </listitem>
           <listitem>
-            <para>Mount the file system on the clients. On each client node, run:</para>
+            <para>Mount the file system on the clients. On each client node,
+            run:</para>
             <screen>client# mount -a -t lustre</screen>
           </listitem>
         </orderedlist>
       </listitem>
     </orderedlist>
     <note>
-      <para>The mounting order described in the steps above must be followed for the intial mount
-        and registration of a Lustre file system after an upgrade.  For a normal start of a Lustre
-        file system, the  mounting order is MGT, OSTs, MDT(s), clients.</para>
+      <para>The mounting order described in the steps above must be followed
+      for the initial mount and registration of a Lustre file system after an
+      upgrade. For a normal start of a Lustre file system, the mounting order
+      is MGT, OSTs, MDT(s), clients.</para>
     </note>
-    <para>If you have a problem upgrading a Lustre file system, see <xref
-        xmlns:xlink="http://www.w3.org/1999/xlink" linkend="dbdoclet.50438198_30989"/> for some ways
-      to get help.</para>
+    <para>If you have a problem upgrading a Lustre file system, see 
+    <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+    linkend="dbdoclet.50438198_30989" /> for some ways to get help.</para>
   </section>
   <section xml:id="Upgrading_2.x.x">
-    <title><indexterm>
-        <primary>upgrading</primary>
-        <secondary>2.X.y to 2.X.y (minor release)</secondary>
-      </indexterm>Upgrading to Lustre Software Release 2.x.y (Minor Release)</title>
-    <para>Rolling upgrades are supported for upgrading from any Lustre software release 2.x.y to a
-      more recent Lustre software release 2.X.y. This allows the Lustre file system to continue to
-      run while individual servers (or their failover partners) and clients are upgraded one at a
-      time. The procedure for upgrading a Lustre software release 2.x.y to a more recent minor
-      release is described in this section.</para>
-    <para>To upgrade Lustre software release 2.x.y to a more recent minor release, complete these
-      steps:</para>
+    <title>
+    <indexterm>
+      <primary>upgrading</primary>
+      <secondary>2.x.y to 2.x.y (minor release)</secondary>
+    </indexterm>Upgrading to Lustre Software Release 2.x.y (Minor
+    Release)</title>
+    <para>Rolling upgrades are supported for upgrading from any Lustre software
+    release 2.x.y to a more recent Lustre software release 2.x.y. This allows
+    the Lustre file system to continue to run while individual servers (or
+    their failover partners) and clients are upgraded one at a time. The
+    procedure for upgrading a Lustre software release 2.x.y to a more recent
+    minor release is described in this section.</para>
+    <para>To upgrade Lustre software release 2.x.y to a more recent minor
+    release, complete these steps:</para>
     <orderedlist>
       <listitem>
-        <para>Create a complete, restorable file system backup. </para>
+        <para>Create a complete, restorable file system backup.</para>
         <caution>
-          <para>Before installing the Lustre software, back up ALL data. The Lustre software
-            contains kernel modifications that interact with storage devices and may introduce
-            security issues and data loss if not installed, configured, or administered properly. If
-            a full backup of the file system is not practical, a device-level backup of the MDT file
-            system is recommended. See  <xref linkend="backupandrestore"/> for a procedure.</para>
+          <para>Before installing the Lustre software, back up ALL data. The
+          Lustre software contains kernel modifications that interact with
+          storage devices and may introduce security issues and data loss if
+          not installed, configured, or administered properly. If a full backup
+          of the file system is not practical, a device-level backup of the MDT
+          file system is recommended. See 
+          <xref linkend="backupandrestore" /> for a procedure.</para>
         </caution>
       </listitem>
       <listitem>
-        <para>Download the Lustre server RPMs for your platform from the <link
-            xl:href="https://wiki.hpdd.intel.com/display/PUB/Lustre+Releases">Lustre Releases</link>
-          repository. See <xref xmlns:xlink="http://www.w3.org/1999/xlink"
-            linkend="table_cnh_5m3_gk"/> for a list of required packages.</para>
+        <para>Download the Lustre server RPMs for your platform from the 
+        <link xl:href="https://wiki.hpdd.intel.com/display/PUB/Lustre+Releases">
+        Lustre Releases</link> repository. See 
+        <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+        linkend="table_cnh_5m3_gk" /> for a list of required packages.</para>
       </listitem>
       <listitem>
-        <para>For a rolling upgrade, complete any procedures required to keep the Lustre file system
-          running while the server to be upgraded is offline, such as failing over a primary server
-          to its secondary partner. </para>
+        <para>For a rolling upgrade, complete any procedures required to keep
+        the Lustre file system running while the server to be upgraded is
+        offline, such as failing over a primary server to its secondary
+        partner.</para>
       </listitem>
       <listitem>
-        <para>Unmount the Lustre server to be upgraded (MGS, MDS, or OSS)</para>
+        <para>Unmount the Lustre server to be upgraded (MGS, MDS, or
+        OSS).</para>
       </listitem>
       <listitem>
         <para>Install the Lustre server packages on the Lustre server.</para>
         <orderedlist numeration="loweralpha">
           <listitem>
-            <para>Log onto the Lustre server as the <literal>root</literal> user</para>
+            <para>Log onto the Lustre server as the 
+            <literal>root</literal> user.</para>
           </listitem>
           <listitem>
-            <para>Use the <literal>yum</literal> command to install the packages:</para>
+            <para>Use the 
+            <literal>yum</literal> command to install the packages:</para>
             <para>
-              <screen># yum --nogpgcheck install pkg1.rpm pkg2.rpm ...</screen>
+              <screen># yum --nogpgcheck install pkg1.rpm pkg2.rpm ...</screen>
             </para>
           </listitem>
           <listitem>
@@ -378,7 +466,8 @@ conf_param $FSNAME.quota.ost=$QUOTA_TYPE</screen></para>
           </listitem>
           <listitem>
             <para>Mount the Lustre server to restart the Lustre software on the
-              server:<screen>server# mount -a -t lustre</screen></para>
+            server:
+            <screen>server# mount -a -t lustre</screen></para>
           </listitem>
           <listitem>
             <para>Repeat these steps on each Lustre server.</para>
@@ -386,22 +475,25 @@ conf_param $FSNAME.quota.ost=$QUOTA_TYPE</screen></para>
         </orderedlist>
       </listitem>
       <listitem>
-        <para>Download the Lustre client RPMs for your platform from the <link
-            xl:href="https://wiki.hpdd.intel.com/display/PUB/Lustre+Releases">Lustre Releases</link>
-          repository. See <xref xmlns:xlink="http://www.w3.org/1999/xlink"
-            linkend="table_j3r_ym3_gk"/> for a list of required packages.</para>
+        <para>Download the Lustre client RPMs for your platform from the 
+        <link xl:href="https://wiki.hpdd.intel.com/display/PUB/Lustre+Releases">
+        Lustre Releases</link> repository. See 
+        <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+        linkend="table_j3r_ym3_gk" /> for a list of required packages.</para>
       </listitem>
       <listitem>
-        <para>Install the Lustre client packages on each of the Lustre clients to be
-          upgraded.</para>
+        <para>Install the Lustre client packages on each of the Lustre clients
+        to be upgraded.</para>
         <orderedlist numeration="loweralpha">
           <listitem>
-            <para>Log onto a Lustre client as the <literal>root</literal> user.</para>
+            <para>Log onto a Lustre client as the 
+            <literal>root</literal> user.</para>
           </listitem>
           <listitem>
-            <para>Use the <literal>yum</literal> command to install the packages:</para>
+            <para>Use the 
+            <literal>yum</literal> command to install the packages:</para>
             <para>
-              <screen># yum --nogpgcheck install pkg1.rpm pkg2.rpm ...</screen>
+              <screen># yum --nogpgcheck install pkg1.rpm pkg2.rpm ...</screen>
             </para>
           </listitem>
           <listitem>
@@ -412,7 +504,8 @@ conf_param $FSNAME.quota.ost=$QUOTA_TYPE</screen></para>
           </listitem>
           <listitem>
             <para>Mount the Lustre client to restart the Lustre software on the
-              client:<screen>client# mount -a -t lustre</screen></para>
+            client:
+            <screen>client# mount -a -t lustre</screen></para>
           </listitem>
           <listitem>
             <para>Repeat these steps on each Lustre client.</para>
@@ -420,8 +513,9 @@ conf_param $FSNAME.quota.ost=$QUOTA_TYPE</screen></para>
         </orderedlist>
       </listitem>
     </orderedlist>
-    <para>If you have a problem upgrading a Lustre file system, see <xref
-        xmlns:xlink="http://www.w3.org/1999/xlink" linkend="dbdoclet.50438198_30989"/> for some
-      suggestions for how to get help.</para>
+    <para>If you have a problem upgrading a Lustre file system, see 
+    <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+    linkend="dbdoclet.50438198_30989" /> for suggestions on how to get
+    help.</para>
   </section>
 </chapter>