</tbody>
</tgroup>
</informaltable>
+ <para>Buddy group cache information found in
+ <literal>/proc/fs/ldiskfs/<replaceable>disk_device</replaceable>/mb_groups</literal> may
+ be useful for assessing on-disk fragmentation. For
+ example:<screen>cat /proc/fs/ldiskfs/loop0/mb_groups
+#group: free free frags first pa [ 2^0 2^1 2^2 2^3 2^4 2^5 2^6 2^7 2^8 2^9
+ 2^10 2^11 2^12 2^13]
+#0 : 2936 2936 1 42 0 [ 0 0 0 1 1 1 1 2 0 1
+ 2 0 0 0 ]</screen></para>
+ <para>In this example, the columns show:<itemizedlist>
+ <listitem>
+ <para>#group number</para>
+ </listitem>
+ <listitem>
+ <para>Available blocks in the group</para>
+ </listitem>
+ <listitem>
+ <para>Blocks free on a disk</para>
+ </listitem>
+ <listitem>
+ <para>Number of free fragments</para>
+ </listitem>
+ <listitem>
+ <para>First free block in the group</para>
+ </listitem>
+ <listitem>
+ <para>Number of preallocated chunks (not blocks)</para>
+ </listitem>
+ <listitem>
+ <para>A series of available chunks of different sizes</para>
+ </listitem>
+ </itemizedlist></para>
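+      <para>As a quick check on the example above, the power-of-two chunk counts account for all
+        of the free blocks in the group: 1 x 2^3 + 1 x 2^4 + 1 x 2^5 + 1 x 2^6 + 2 x 2^7 +
+        1 x 2^9 + 2 x 2^10 = 2936 blocks.</para>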
</section>
<section>
<title>Monitoring Lustre File System I/O</title>
<para>The header information includes:</para>
<itemizedlist>
<listitem>
- <para><literal>snapshot_time</literal> - UNIX* epoch instant the file was read.</para>
+ <para><literal>snapshot_time</literal> - UNIX epoch instant the file was read.</para>
</listitem>
<listitem>
<para><literal>read RPCs in flight</literal> - Number of read RPCs issued by the OSC, but
<para>For information about optimizing the client I/O RPC stream, see <xref
xmlns:xlink="http://www.w3.org/1999/xlink" linkend="TuningClientIORPCStream"/>.</para>
</section>
- <section remap="h3">
- <title><indexterm>
- <primary>proc</primary>
- <secondary>read/write survey</secondary>
- </indexterm>Monitoring Client Read-Write Offset Statistics</title>
- <para>The <literal>offset_stats</literal> parameter maintains statistics for occurrences of a
- series of read or write calls from a process that did not access the next sequential
- location. The <literal>OFFSET</literal> field is reset to 0 (zero) whenever a different file
- is read or written.</para>
- <para>Read/write offset statistics are "off" by default. The statistics can be activated by
- writing anything into the <literal>offset_stats</literal> file.</para>
- <para>The <literal>offset_stats</literal> file can be cleared by
- entering:<screen>lctl set_param llite.*.offset_stats=0</screen></para>
- <para><emphasis role="italic"><emphasis role="bold">Example:</emphasis></emphasis></para>
- <screen># lctl get_param llite.testfs-f57dee0.offset_stats
-snapshot_time: 1155748884.591028 (secs.usecs)
- RANGE RANGE SMALLEST LARGEST
-R/W PID START END EXTENT EXTENT OFFSET
-R 8385 0 128 128 128 0
-R 8385 0 224 224 224 -128
-W 8385 0 250 50 100 0
-W 8385 100 1110 10 500 -150
-W 8384 0 5233 5233 5233 0
-R 8385 500 600 100 100 -610</screen>
- <para>In this example, <literal>snapshot_time</literal> is the UNIX epoch instant the file was
- read. The tabular data is described in the table below.</para>
- <informaltable frame="all">
- <tgroup cols="2">
- <colspec colname="c1" colwidth="50*"/>
- <colspec colname="c2" colwidth="50*"/>
- <thead>
- <row>
- <entry>
- <para><emphasis role="bold">Field</emphasis></para>
- </entry>
- <entry>
- <para><emphasis role="bold">Description</emphasis></para>
- </entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry>
- <para>R/W</para>
- </entry>
- <entry>
- <para>Indicates if the non-sequential call was a read or write</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>PID </para>
- </entry>
- <entry>
- <para>Process ID of the process that made the read/write call.</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>RANGE START/RANGE END</para>
- </entry>
- <entry>
- <para>Range in which the read/write calls were sequential.</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>SMALLEST EXTENT </para>
- </entry>
- <entry>
- <para>Smallest single read/write in the corresponding range (in bytes).</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>LARGEST EXTENT </para>
- </entry>
- <entry>
- <para>Largest single read/write in the corresponding range (in bytes).</para>
- </entry>
- </row>
- <row>
- <entry>
- <para>OFFSET </para>
- </entry>
- <entry>
- <para>Difference between the previous range end and the current range start.</para>
- </entry>
- </row>
- </tbody>
- </tgroup>
- </informaltable>
- <para><emphasis role="italic"><emphasis role="bold">Analysis:</emphasis></emphasis></para>
- <para>This data provides an indication of how contiguous or fragmented the data is. For
- example, the fourth entry in the example above shows the writes for this RPC were sequential
- in the range 100 to 1110 with the minimum write 10 bytes and the maximum write 500 bytes.
- The range started with an offset of -150 from the <literal>RANGE END</literal> of the
- previous entry in the example.</para>
- </section>
<section xml:id="lustreproc.clientstats" remap="h3">
<title><indexterm>
<primary>proc</primary>
    <para>The <literal>stats</literal> file maintains statistics accumulated during typical
      operation of a client across the VFS interface of the Lustre file system. Only non-zero
      parameters are displayed in the file.</para>
- <para>Client statistics are enabled by default. The statistics can be cleared by echoing an
- empty string into the <literal>stats</literal> file or by using the command:
- <screen>lctl set_param llite.*.stats=0</screen></para>
+ <para>Client statistics are enabled by default.</para>
<note>
<para>Statistics for all mounted file systems can be discovered by
entering:<screen>lctl get_param llite.*.stats</screen></para>
setxattr 19059 samples [regs]
getxattr 61169 samples [regs]
</screen>
+ <para> The statistics can be cleared by echoing an empty string into the
+ <literal>stats</literal> file or by using the command:
+ <screen>lctl set_param llite.*.stats=0</screen></para>
<para>The statistics displayed are described in the table below.</para>
<informaltable frame="all">
<tgroup cols="2">
<title><indexterm>
<primary>proc</primary>
<secondary>read/write survey</secondary>
+ </indexterm>Monitoring Client Read-Write Offset Statistics</title>
+ <para>When the <literal>offset_stats</literal> parameter is set, statistics are maintained for
+ occurrences of a series of read or write calls from a process that did not access the next
+ sequential location. The <literal>OFFSET</literal> field is reset to 0 (zero) whenever a
+ different file is read or written.</para>
+ <note>
+ <para>By default, statistics are not collected in the <literal>offset_stats</literal>,
+ <literal>extents_stats</literal>, and <literal>extents_stats_per_process</literal> files
+ to reduce monitoring overhead when this information is not needed. The collection of
+ statistics in all three of these files is activated by writing anything into any one of
+ the files.</para>
+ </note>
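+      <para>For example, collection for all three files can be enabled by writing a value into
+        any one of them, such
+        as:<screen>lctl set_param llite.*.offset_stats=1</screen></para>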
+ <para><emphasis role="italic"><emphasis role="bold">Example:</emphasis></emphasis></para>
+ <screen># lctl get_param llite.testfs-f57dee0.offset_stats
+snapshot_time: 1155748884.591028 (secs.usecs)
+ RANGE RANGE SMALLEST LARGEST
+R/W PID START END EXTENT EXTENT OFFSET
+R 8385 0 128 128 128 0
+R 8385 0 224 224 224 -128
+W 8385 0 250 50 100 0
+W 8385 100 1110 10 500 -150
+W 8384 0 5233 5233 5233 0
+R 8385 500 600 100 100 -610</screen>
+ <para>In this example, <literal>snapshot_time</literal> is the UNIX epoch instant the file was
+ read. The tabular data is described in the table below.</para>
+ <para>The <literal>offset_stats</literal> file can be cleared by
+ entering:<screen>lctl set_param llite.*.offset_stats=0</screen></para>
+ <informaltable frame="all">
+ <tgroup cols="2">
+ <colspec colname="c1" colwidth="50*"/>
+ <colspec colname="c2" colwidth="50*"/>
+ <thead>
+ <row>
+ <entry>
+ <para><emphasis role="bold">Field</emphasis></para>
+ </entry>
+ <entry>
+ <para><emphasis role="bold">Description</emphasis></para>
+ </entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry>
+ <para>R/W</para>
+ </entry>
+ <entry>
+              <para>Indicates whether the non-sequential call was a read or write.</para>
+ </entry>
+ </row>
+ <row>
+ <entry>
+ <para>PID </para>
+ </entry>
+ <entry>
+ <para>Process ID of the process that made the read/write call.</para>
+ </entry>
+ </row>
+ <row>
+ <entry>
+ <para>RANGE START/RANGE END</para>
+ </entry>
+ <entry>
+ <para>Range in which the read/write calls were sequential.</para>
+ </entry>
+ </row>
+ <row>
+ <entry>
+ <para>SMALLEST EXTENT </para>
+ </entry>
+ <entry>
+ <para>Smallest single read/write in the corresponding range (in bytes).</para>
+ </entry>
+ </row>
+ <row>
+ <entry>
+ <para>LARGEST EXTENT </para>
+ </entry>
+ <entry>
+ <para>Largest single read/write in the corresponding range (in bytes).</para>
+ </entry>
+ </row>
+ <row>
+ <entry>
+ <para>OFFSET </para>
+ </entry>
+ <entry>
+              <para>Difference between the current range start and the previous range end.</para>
+ </entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </informaltable>
+ <para><emphasis role="italic"><emphasis role="bold">Analysis:</emphasis></emphasis></para>
+      <para>This data provides an indication of how contiguous or fragmented the data is. For
+        example, the fourth entry in the example above shows that the writes for this RPC were
+        sequential in the range 100 to 1110 with the minimum write 10 bytes and the maximum write
+        500 bytes. The range started with an offset of -150 from the
+        <literal>RANGE END</literal> of the previous entry in the example (a
+        <literal>RANGE START</literal> of 100 minus the previous <literal>RANGE END</literal> of
+        250).</para>
+ </section>
+ <section remap="h3">
+ <title><indexterm>
+ <primary>proc</primary>
+ <secondary>read/write survey</secondary>
</indexterm>Monitoring Client Read-Write Extent Statistics</title>
<para>For in-depth troubleshooting, client read-write extent statistics can be accessed to
obtain more detail about read/write I/O extents for the file system or for a particular
process.</para>
+ <note>
+ <para>By default, statistics are not collected in the <literal>offset_stats</literal>,
+ <literal>extents_stats</literal>, and <literal>extents_stats_per_process</literal> files
+ to reduce monitoring overhead when this information is not needed. The collection of
+ statistics in all three of these files is activated by writing anything into any one of
+ the files.</para>
+ </note>
<section remap="h3">
<title>Client-Based I/O Extent Size Survey</title>
- <para>The <literal>rw_extent_stats</literal> histogram in the <literal>llite</literal>
- directory shows the statistics for the sizes of the read?write I/O extents. This file does
- not maintain the per-process statistics. The file can be cleared by issuing the following
- command:<screen># lctl set_param llite.testfs-*.extents_stats=0</screen></para>
+        <para>The <literal>extents_stats</literal> histogram in the <literal>llite</literal>
+          directory shows the statistics for the sizes of the read/write I/O extents. This file
+          does not maintain per-process statistics.</para>
<para><emphasis role="italic"><emphasis role="bold">Example:</emphasis></emphasis></para>
<screen># lctl get_param llite.testfs-*.extents_stats
snapshot_time: 1213828728.348516 (secs.usecs)
for reads and writes respectively (<literal>calls</literal>), the relative percentage of
total calls (<literal>%</literal>), and the cumulative percentage to that point in the
table of calls (<literal>cum %</literal>). </para>
+ <para> The file can be cleared by issuing the following
+ command:<screen># lctl set_param llite.testfs-*.extents_stats=0</screen></para>
</section>
<section>
<title>Per-Process Client I/O Statistics</title>
statahead reads metadata into memory. When readahead and statahead work well, a process that
accesses data finds that the information it needs is available immediately when requested in
memory without the delay of network I/O.</para>
- <para condition="l22">In Lustre release 2.2.0, the directory statahead feature was improved to
- enhance directory traversal performance. The improvements primarily addressed two
- issues:
- <orderedlist>
- <listitem>
- <para>A race condition existed between the statahead thread and other VFS operations while
- processing asynchronous <literal>getattr</literal> RPC replies, causing duplicate
- entries in dcache. This issue was resolved by using statahead local dcache. </para>
- </listitem>
- <listitem>
- <para>File size/block attributes pre-fetching was not supported, so the traversing thread
- had to send synchronous glimpse size RPCs to OST(s). This issue was resolved by using
- asynchronous glimpse lock (AGL) RPCs to pre-fetch file size/block attributes from
- OST(s).</para>
- </listitem>
- </orderedlist>
- </para>
+ <para condition="l22">In Lustre software release 2.2.0, the directory statahead feature was
+ improved to enhance directory traversal performance. The improvements primarily addressed
+ two issues: <orderedlist>
+ <listitem>
+ <para>A race condition existed between the statahead thread and other VFS operations
+ while processing asynchronous <literal>getattr</literal> RPC replies, causing
+ duplicate entries in dcache. This issue was resolved by using statahead local dcache.
+ </para>
+ </listitem>
+ <listitem>
+ <para>File size/block attributes pre-fetching was not supported, so the traversing
+ thread had to send synchronous glimpse size RPCs to OST(s). This issue was resolved by
+ using asynchronous glimpse lock (AGL) RPCs to pre-fetch file size/block attributes
+ from OST(s).</para>
+ </listitem>
+ </orderedlist>
+ </para>
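+      <para>The readahead and statahead tunables discussed in the sections below can be inspected
+        on a client with <literal>lctl get_param</literal>; for example, assuming the standard
+        <literal>llite</literal> parameter
+        names:<screen>client# lctl get_param llite.*.max_read_ahead_mb llite.*.statahead_max</screen></para>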
<section remap="h4">
<title>Tuning File Readahead</title>
<para>File readahead is triggered when two or more sequential reads by an application fail
<para>To re-enable the writethrough cache on one OST, run:</para>
<screen>root@oss1# lctl set_param obdfilter.{OST_name}.writethrough_cache_enable=1</screen>
<para>To check if the writethrough cache is enabled, run:</para>
- <screen>root@oss1# lctl set_param obdfilter.*.writethrough_cache_enable=1</screen>
+ <screen>root@oss1# lctl get_param obdfilter.*.writethrough_cache_enable</screen>
</listitem>
<listitem>
<para><literal>readcache_max_filesize</literal> - Controls the maximum size of a file