Server-Side Advice and Hinting
</title>
<section><title>Overview</title>
- <para>Use the <literal>lfs ladvise</literal> command give file access
+ <para>Use the <literal>lfs ladvise</literal> command to give file access
advices or hints to servers.</para>
<screen>lfs ladvise [--advice|-a ADVICE ] [--background|-b]
[--start|-s START[kMGT]]
cache</para>
<para><literal>dontneed</literal> to cleanup data cache on
server</para>
+ <para><literal>lockahead</literal> to request an LDLM extent lock
+ of the given mode on the given byte range</para>
+ <para><literal>noexpand</literal> to disable extent lock expansion
+ for I/O on this file descriptor</para>
+ <para><literal>unset</literal> to clear a previously set advice
+ (currently only <literal>LU_LADVISE_LOCKNOEXPAND</literal> is supported)</para>
</entry>
</row>
<row>
<literal>-e</literal> option.</para>
</entry>
</row>
+ <row>
+ <entry>
+ <para><literal>-m</literal>, <literal>--mode=</literal>
+ <literal>MODE</literal></para>
+ </entry>
+ <entry>
+ <para>Lock mode for a <literal>lockahead</literal> request, either
+ <literal>READ</literal> or <literal>WRITE</literal>.</para>
+ </entry>
+ </row>
</tbody>
</tgroup>
</informaltable>
random IO is a net benefit. Fetching that data into each client cache with
fadvise() may not be, due to much more data being sent to the client.
</para>
+ <para>
+ <literal>ladvise lockahead</literal> is different in that it attempts to
+ control LDLM locking behavior by explicitly requesting LDLM locks in
+ advance of use. This does not directly affect caching behavior; instead,
+ it is used in special cases to avoid pathological results (lock exchange)
+ from the normal LDLM locking behavior.
+ </para>
+ <para>
+ Note that the <literal>noexpand</literal> and <literal>unset</literal>
+ advices apply to a specific file descriptor, so setting them through
+ <literal>lfs</literal> has no effect. They must be set on the file
+ descriptor that is actually used for I/O.
+ </para>
<para>The main difference between the Linux <literal>fadvise()</literal>
system call and <literal>lfs ladvise</literal> is that
<literal>fadvise()</literal> is only a client side mechanism that does
cache of the file in the memory.</para>
<screen>client1$ lfs ladvise -a dontneed -s 0 -e 1048576000 /mnt/lustre/file1
</screen>
+ <para>The following example requests an LDLM read lock on the first
+ 1 MiB of <literal>/mnt/lustre/file1</literal>. The request is sent
+ to the OST holding that region of the file.</para>
+ <screen>client1$ lfs ladvise -a lockahead -m READ -s 0 -e 1M /mnt/lustre/file1
+ </screen>
+ <para>The following example requests an LDLM write lock on the byte
+ range from 3 MiB to 10 MiB of <literal>/mnt/lustre/file1</literal>.
+ The request is sent to the OST holding that region of the
+ file.</para>
+ <screen>client1$ lfs ladvise -a lockahead -m WRITE -s 3M -e 10M /mnt/lustre/file1
+ </screen>
</section>
</section>
<section condition="l29">
<entry>
<para>Additional arguments for future advice types and
should be set to zero if not explicitly required for a given
- advice type.</para>
+ advice type. Advice-specific names for these fields
+ follow.</para>
+ </entry>
+ </row>
+ <row>
+ <entry>
+ <para> <literal>lla_lockahead_mode</literal></para>
+ </entry>
+ <entry>
+ <para>When using <literal>LU_LADVISE_LOCKAHEAD</literal>, the
+ <literal>lla_value1</literal> field is used to communicate the
+ requested lock mode, and can be referred to as
+ <literal>lla_lockahead_mode</literal>.</para>
+ </entry>
+ </row>
+ <row>
+ <entry>
+ <para> <literal>lla_peradvice_flags</literal></para>
+ </entry>
+ <entry>
+ <para>When using advices that support them, the
+ <literal>lla_value2</literal> field is used to communicate
+ per-advice flags, and can be referred to as
+ <literal>lla_peradvice_flags</literal>.</para>
+ </entry>
+ </row>
+ <row>
+ <entry>
+ <para> <literal>lla_lockahead_result</literal></para>
+ </entry>
+ <entry>
+ <para>When using <literal>LU_LADVISE_LOCKAHEAD</literal>, the
+ <literal>lla_value3</literal> field is used to communicate the
+ result of the request, and can be referred to as
+ <literal>lla_lockahead_result</literal>.</para>
</entry>
</row>
</tbody>
<emphasis>fadvise()</emphasis> may not be beneficial, due to much more
data being sent to the clients.
</para>
+ <para>
+ <literal>LU_LADVISE_LOCKAHEAD</literal> merits a special comment.
+ While it is possible and encouraged to use it directly in your
+ application to avoid lock contention (primarily when writing to a
+ single file from multiple clients), it will also be available in the
+ MPI-I/O / MPICH library from ANL for use with the I/O aggregation
+ mode of that library. This is intended (eventually) to be the
+ primary way this feature is used.
+ </para>
+ <para>
+ At the time of writing, this support is proposed as a patch but is
+ not yet merged into the public ANL code base. Users are encouraged
+ to check their MPICH documentation and/or check with their library
+ provider about support.
+ </para>
<para>While conceptually similar to the
<emphasis>posix_fadvise</emphasis> and Linux
<emphasis>fadvise</emphasis> system calls, the main difference of