LUDOC-388 stats: Correct extents_stats information

[doc/manual.git] / LustreProc.xml
diff --git a/LustreProc.xml b/LustreProc.xml

index ca1c8d7..1d33d99 100644
--- a/LustreProc.xml
+++ b/LustreProc.xml
@@ -799,13 +799,14 @@ getxattr               61169 samples [regs]
          <para>By default, statistics are not collected in the <literal>offset_stats</literal>,
              <literal>extents_stats</literal>, and <literal>extents_stats_per_process</literal> files
            to reduce monitoring overhead when this information is not needed.  The collection of
-          statistics in all three of these files is activated by writing anything into any one of
-          the files.</para>
+          statistics in all three of these files is activated by writing
+          anything, except for 0 (zero) and "disable", into any one of the
+          files.</para>
        </note>
        <para><emphasis role="italic"><emphasis role="bold">Example:</emphasis></emphasis></para>
        <screen># lctl get_param llite.testfs-f57dee0.offset_stats
  snapshot_time: 1155748884.591028 (secs.usecs)
-             RANGE   RANGE    SMALLEST   LARGEST   
+             RANGE   RANGE    SMALLEST   LARGEST
  R/W   PID    START   END      EXTENT     EXTENT    OFFSET
  R     8385   0       128      128        128       0
  R     8385   0       224      224        224       -128
@@ -902,20 +903,22 @@ R     8385   500     600      100        100       -610</screen>
          <para>By default, statistics are not collected in the <literal>offset_stats</literal>,
              <literal>extents_stats</literal>, and <literal>extents_stats_per_process</literal> files
            to reduce monitoring overhead when this information is not needed.  The collection of
-          statistics in all three of these files is activated by writing anything into any one of
-          the files.</para>
+          statistics in all three of these files is activated by writing
+          anything, except for 0 (zero) and "disable", into any one of the
+          files.</para>
        </note>
        <section remap="h3">
          <title>Client-Based I/O Extent Size Survey</title>
-        <para>The <literal>extent_stats</literal> histogram in the <literal>llite</literal>
-          directory shows the statistics for the sizes of the read/write I/O extents. This file does
-          not maintain the per-process statistics.</para>
+        <para>The <literal>extents_stats</literal> histogram in the
+          <literal>llite</literal> directory shows the statistics for the sizes
+          of the read/write I/O extents. This file does not maintain the per
+          process statistics.</para>
          <para><emphasis role="italic"><emphasis role="bold">Example:</emphasis></emphasis></para>
          <screen># lctl get_param llite.testfs-*.extents_stats
  snapshot_time:                     1213828728.348516 (secs.usecs)
                         read           |            write
  extents          calls  %      cum%   |     calls  %     cum%
- 
+
  0K - 4K :        0      0      0      |     2      2     2
  4K - 8K :        0      0      0      |     0      0     2
  8K - 16K :       0      0      0      |     0      0     2
@@ -930,10 +933,10 @@ extents          calls  %      cum%   |     calls  %     cum%
            was read. The table shows cumulative extents organized according to size with statistics
            provided separately for reads and writes. Each row in the table shows the number of RPCs
            for reads and writes respectively (<literal>calls</literal>), the relative percentage of
-          total calls (<literal>%</literal>), and the cumulative percentage to that point in the
-          table of calls (<literal>cum %</literal>). </para>
-        <para> The file can be cleared by issuing the following
-          command:<screen># lctl set_param llite.testfs-*.extents_stats=0</screen></para>
+          total calls (<literal>%</literal>), and the cumulative percentage to
+          that point in the table of calls (<literal>cum %</literal>). </para>
+        <para> The file can be cleared by issuing the following command:
+        <screen># lctl set_param llite.testfs-*.extents_stats=1</screen></para>
        </section>
        <section>
          <title>Per-Process Client I/O Statistics</title>
@@ -1246,86 +1249,88 @@ write RPCs in flight: 0
            <primary>proc</primary>
            <secondary>readahead</secondary>
          </indexterm>Tuning File Readahead and Directory Statahead</title>
-      <para>File readahead and directory statahead enable reading of data into memory before a
-        process requests the data. File readahead reads file content data into memory and directory
-        statahead reads metadata into memory. When readahead and statahead work well, a process that
-        accesses data finds that the information it needs is available immediately when requested in
-        memory without the delay of network I/O.</para>
-      <para condition="l22">In Lustre software release 2.2.0, the directory statahead feature was
-        improved to enhance directory traversal performance. The improvements primarily addressed
-        two issues: <orderedlist>
-          <listitem>
-            <para>A race condition existed between the statahead thread and other VFS operations
-              while processing asynchronous <literal>getattr</literal> RPC replies, causing
-              duplicate entries in dcache. This issue was resolved by using statahead local dcache.
-            </para>
-          </listitem>
-          <listitem>
-            <para>File size/block attributes pre-fetching was not supported, so the traversing
-              thread had to send synchronous glimpse size RPCs to OST(s). This issue was resolved by
-              using asynchronous glimpse lock (AGL) RPCs to pre-fetch file size/block attributes
-              from OST(s).</para>
-          </listitem>
-        </orderedlist>
+      <para>File readahead and directory statahead enable reading of data
+      into memory before a process requests the data. File readahead prefetches
+      file content data into memory for <literal>read()</literal> related
+      calls, while directory statahead fetches file metadata into memory for
+      <literal>readdir()</literal> and <literal>stat()</literal> related
+      calls.  When readahead and statahead work well, a process that accesses
+      data finds that the information it needs is available immediately in
+      memory on the client when requested without the delay of network I/O.
        </para>
        <section remap="h4">
          <title>Tuning File Readahead</title>
-        <para>File readahead is triggered when two or more sequential reads by an application fail
-          to be satisfied by data in the Linux buffer cache. The size of the initial readahead is 1
-          MB. Additional readaheads grow linearly and increment until the readahead cache on the
-          client is full at 40 MB.</para>
+        <para>File readahead is triggered when two or more sequential reads
+        by an application fail to be satisfied by data in the Linux buffer
+        cache. The size of the initial readahead is 1 MB. Additional
+        readaheads grow linearly and increment until the readahead cache on
+        the client is full at 40 MB.</para>
          <para>Readahead tunables include:</para>
          <itemizedlist>
            <listitem>
-            <para><literal>llite.<replaceable>fsname-instance</replaceable>.max_read_ahead_mb</literal>
-              - Controls the maximum amount of data readahead on a file. Files are read ahead in
-              RPC-sized chunks (1 MB or the size of the <literal>read()</literal> call, if larger)
-              after the second sequential read on a file descriptor. Random reads are done at the
-              size of the <literal>read()</literal> call only (no readahead). Reads to
-              non-contiguous regions of the file reset the readahead algorithm, and readahead is not
-              triggered again until sequential reads take place again. </para>
-            <para>To disable readahead, set this tunable to 0. The default value is 40 MB.</para>
+            <para><literal>llite.<replaceable>fsname-instance</replaceable>.max_read_ahead_mb</literal> -
+              Controls the maximum amount of data readahead on a file.
+              Files are read ahead in RPC-sized chunks (1 MB or the size of
+              the <literal>read()</literal> call, if larger) after the second
+              sequential read on a file descriptor. Random reads are done at
+              the size of the <literal>read()</literal> call only (no
+              readahead). Reads to non-contiguous regions of the file reset
+              the readahead algorithm, and readahead is not triggered again
+              until sequential reads take place again.
+            </para>
+            <para>To disable readahead, set
+            <literal>max_read_ahead_mb=0</literal>. The default value is 40 MB.
+            </para>
            </listitem>
            <listitem>
-            <para><literal>llite.<replaceable>fsname-instance</replaceable>.max_read_ahead_whole_mb</literal>
-              - Controls the maximum size of a file that is read in its entirety, regardless of the
-              size of the <literal>read()</literal>.</para>
+            <para><literal>llite.<replaceable>fsname-instance</replaceable>.max_read_ahead_whole_mb</literal> -
+              Controls the maximum size of a file that is read in its entirety,
+              regardless of the size of the <literal>read()</literal>.  This
+              avoids multiple small read RPCs on relatively small files, when
+              it is not possible to efficiently detect a sequential read
+              pattern before the whole file has been read.
+            </para>
            </listitem>
          </itemizedlist>
        </section>
        <section>
          <title>Tuning Directory Statahead and AGL</title>
-        <para>Many system commands, such as <literal>ls –l</literal>, <literal>du</literal>, and
-            <literal>find</literal>, traverse a directory sequentially. To make these commands run
-          efficiently, the directory statahead and asynchronous glimpse lock (AGL) can be enabled to
-          improve the performance of traversing.</para>
+        <para>Many system commands, such as <literal>ls –l</literal>,
+        <literal>du</literal>, and <literal>find</literal>, traverse a
+        directory sequentially. To make these commands run efficiently, the
+        directory statahead can be enabled to improve the performance of
+        directory traversal.</para>
          <para>The statahead tunables are:</para>
          <itemizedlist>
            <listitem>
-            <para><literal>statahead_max</literal> - Controls whether directory statahead is enabled
-              and the maximum statahead window size (i.e., how many files can be pre-fetched by the
-              statahead thread). By default, statahead is enabled and the value of
-                <literal>statahead_max</literal> is 32.</para>
-            <para>To disable statahead, run:</para>
+            <para><literal>statahead_max</literal> -
+            Controls the maximum number of file attributes that will be
+            prefetched by the statahead thread. By default, statahead is
+            enabled and <literal>statahead_max</literal> is 32 files.</para>
+            <para>To disable statahead, set <literal>statahead_max</literal>
+            to zero via the following command on the client:</para>
              <screen>lctl set_param llite.*.statahead_max=0</screen>
-            <para>To set the maximum statahead window size (<replaceable>n</replaceable>),
-              run:</para>
+            <para>To change the maximum statahead window size on a client:</para>
              <screen>lctl set_param llite.*.statahead_max=<replaceable>n</replaceable></screen>
-            <para>The maximum value of <replaceable>n</replaceable> is 8192.</para>
-            <para>The AGL can be controlled by entering:</para>
-            <screen>lctl set_param llite.*.statahead_agl=<replaceable>n</replaceable></screen>
-            <para>The default value for <replaceable>n</replaceable> is 1, which enables the AGL. If
-                <replaceable>n</replaceable> is 0, the AGL is disabled.</para>
+            <para>The maximum <literal>statahead_max</literal> is 8192 files.
+            </para>
+            <para>The directory statahead thread will also prefetch the file
+            size/block attributes from the OSTs, so that all file attributes
+            are available on the client when requested by an application.
+            This is controlled by the asynchronous glimpse lock (AGL) setting.
+            The AGL behaviour can be disabled by setting:</para>
+            <screen>lctl set_param llite.*.statahead_agl=0</screen>
            </listitem>
            <listitem>
-            <para><literal>statahead_stats</literal> - A read-only interface that indicates the
-              current statahead and AGL statistics, such as how many times statahead/AGL has been
-              triggered since the last mount, how many statahead/AGL failures have occurred due to
-              an incorrect prediction or other causes.</para>
+            <para><literal>statahead_stats</literal> -
+            A read-only interface that provides current statahead and AGL
+            statistics, such as how many times statahead/AGL has been triggered
+            since the last mount, how many statahead/AGL failures have occurred
+            due to an incorrect prediction or other causes.</para>
              <note>
-              <para>The AGL is affected by statahead because the inodes processed by AGL are built
-                by the statahead thread, which means the statahead thread is the input of the AGL
-                pipeline. So if statahead is disabled, then the AGL is disabled by force.</para>
+              <para>AGL behaviour is affected by statahead since the inodes
+              processed by AGL are built by the statahead thread.  If
+              statahead is disabled, then AGL is also disabled.</para>
              </note>
            </listitem>
          </itemizedlist>
@@ -1542,7 +1547,7 @@ obdfilter.lol-OST0001.sync_on_lock_cancel=never</screen>
            minimum setting is 1 and maximum setting is 256.</para>
          <para>To set the <literal>max_rpcs_in_flight</literal> parameter, run
            the following command on the Lustre client:</para>
-        <screen>client$ lctl set_param mdc.*.max_rcps_in_flight=16</screen>
+        <screen>client$ lctl set_param mdc.*.max_rpcs_in_flight=16</screen>
          <para>The MDC <literal>max_mod_rpcs_in_flight</literal> parameter
            defines the maximum number of file system modifying RPCs that can be
            sent in parallel by a client to a MDT target. For example, the Lustre
@@ -1552,7 +1557,7 @@ obdfilter.lol-OST0001.sync_on_lock_cancel=never</screen>
            256.</para>
          <para>To set the <literal>max_mod_rpcs_in_flight</literal> parameter,
            run the following command on the Lustre client:</para>
-        <screen>client$ lctl set_param mdc.*.max_mod_rcps_in_flight=12</screen>
+        <screen>client$ lctl set_param mdc.*.max_mod_rpcs_in_flight=12</screen>
          <para>The <literal>max_mod_rpcs_in_flight</literal> value must be
            strictly less than the <literal>max_rpcs_in_flight</literal> value.
            It must also be less or equal to the MDT
@@ -1821,9 +1826,9 @@ req_timeout               6 samples [sec] 1 10 15 105
                messages or enable printing of <literal>D_NETERROR</literal> messages to the console
                using:<screen>lctl set_param printk=+neterror</screen></para>
              <para>Congested routers can be a source of spurious LND timeouts. To avoid this
-              situation, increase the number of LNET router buffers to reduce back-pressure and/or
+              situation, increase the number of LNet router buffers to reduce back-pressure and/or
                increase LND timeouts on all nodes on all connected networks. Also consider increasing
-              the total number of LNET router nodes in the system so that the aggregate router
+              the total number of LNet router nodes in the system so that the aggregate router
                bandwidth matches the aggregate server bandwidth.</para>
            </listitem>
            <listitem>
@@ -1912,12 +1917,12 @@ req_timeout               6 samples [sec] 1 10 15 105
    <section remap="h3">
      <title><indexterm>
          <primary>proc</primary>
-        <secondary>LNET</secondary>
+        <secondary>LNet</secondary>
        </indexterm><indexterm>
-        <primary>LNET</primary>
+        <primary>LNet</primary>
          <secondary>proc</secondary>
-      </indexterm>Monitoring LNET</title>
-    <para>LNET information is located in <literal>/proc/sys/lnet</literal> in these files:<itemizedlist>
+      </indexterm>Monitoring LNet</title>
+    <para>LNet information is located in <literal>/proc/sys/lnet</literal> in these files:<itemizedlist>
          <listitem>
            <para><literal>peers</literal> - Shows all NIDs known to this node and provides
              information on the queue state.</para>
@@ -2032,7 +2037,7 @@ nid                refs   state  max  rtr  min   tx    min   queue
              </tgroup>
            </informaltable>
            <para>Credits are initialized to allow a certain number of operations (in the example
-            above the table, eight as shown in the <literal>max</literal> column. LNET keeps track
+            above the table, eight as shown in the <literal>max</literal> column. LNet keeps track
              of the minimum number of credits ever seen over time showing the peak congestion that
              has occurred during the time monitored. Fewer available credits indicates a more
              congested resource. </para>
@@ -2047,7 +2052,7 @@ nid                refs   state  max  rtr  min   tx    min   queue
              credits (<literal>rtr/tx</literal>) that is less than <literal>max</literal> indicates
              operations are in progress. If the ratio <literal>rtr/tx</literal> is greater than
                <literal>max</literal>, operations are blocking.</para>
-          <para>LNET also limits concurrent sends and number of router buffers allocated to a single
+          <para>LNet also limits concurrent sends and number of router buffers allocated to a single
              peer so that no peer can occupy all these resources.</para>
          </listitem>
          <listitem>