lustre/doc/llapi_ladvise.3

   1 .TH llapi_ladvise 3 "2015 Dec 15" "Lustre User API"
   2 .SH NAME
   3 llapi_ladvise \- give IO advice/hints on a Lustre file to the server
   4 .SH SYNOPSIS
   5 .nf
   6 .B #include <lustre/lustreapi.h>
   7 .sp
   8 .BI "int llapi_ladvise(int " fd ", unsigned long long " flags ,
   9 .BI "                  int " num_advise ",
  10 .BI "                  struct llapi_lu_ladvise *" ladvise ");"
  11 .sp
  12 .fi
  13 .SH DESCRIPTION
  14 .LP
  15 .B llapi_ladvise()
  16 passes an array of
  17 .I num_advise
  18 I/O hints (up to a maximum of
  19 .BR LAH_COUNT_MAX
  20 items) in
  21 .I ladvise
  22 for the file descriptor
  23 .I fd
  24 from an application to one or more Lustre servers.  Optionally,
  25 .I flags
  26 can modify how the advice will be processed via bitwise-or'd values:
  27 .TP
  28 .B LF_ASYNC
  29 Client returns to userspace immediately after submitting ladvise RPCs, leaving
  30 server threads to handle the advices asynchronously.
  31 .TP
  32 .B LF_UNSET
  33 Unset/clear a previous advice (currently supported only for LU_LADVISE_LOCKNOEXPAND).
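The bitwise-or combination of flags described above can be sketched as follows. This is a minimal illustration, not the library's definition: the SKETCH_LF_* names and their numeric values are stand-ins invented here so the fragment compiles without Lustre headers. Real programs should use LF_ASYNC and LF_UNSET from <lustre/lustreapi.h>.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in flag bits for illustration only (assumed values); real code
 * takes LF_ASYNC and LF_UNSET from <lustre/lustreapi.h>. */
enum sketch_ladvise_flag {
	SKETCH_LF_ASYNC = 0x1,
	SKETCH_LF_UNSET = 0x2,
};

/* Build a flags word by OR-ing together the requested modifiers. */
static uint64_t sketch_build_flags(int want_async, int want_unset)
{
	uint64_t flags = 0;

	if (want_async)
		flags |= SKETCH_LF_ASYNC;
	if (want_unset)
		flags |= SKETCH_LF_UNSET;

	/* Only the two known bits may ever be set. */
	assert((flags & ~(uint64_t)(SKETCH_LF_ASYNC | SKETCH_LF_UNSET)) == 0);
	return flags;
}
```

The resulting value would be passed as the flags argument; passing 0 requests the default synchronous behavior.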
  34 .PP
  35 Each of the
  36 .I ladvise
  37 elements is an
  38 .B llapi_lu_ladvise
  39 structure, which contains the following fields:
  40 .PP
  41 .in +4n
  42 .nf
  43 struct llapi_lu_ladvise {
  44         __u16 lla_advice;       /* advice type */
  45         __u16 lla_value1;       /* values for different advice types */
  46         __u32 lla_value2;
  47         __u64 lla_start;        /* first byte of extent for advice */
  48         __u64 lla_end;          /* last byte of extent for advice */
  49         __u32 lla_value3;
  50         __u32 lla_value4;
  51 };
  52 .fi
  53 .in
  54 .TP
  55 .I lla_advice
  56 specifies the advice for the given file range, currently one of:
  57 .TP
  58 .B LU_LADVISE_WILLREAD
  59 Prefetch data into server cache using optimum I/O size for the server.
  60 .TP
  61 .B LU_LADVISE_DONTNEED
  62 Clean cached data for the specified file range(s) on the server.
  63 .TP
  64 .B LU_LADVISE_LOCKAHEAD
  65 Request an LDLM extent lock of the given mode on the given byte range.
  66 .TP
  67 .B LU_LADVISE_LOCKNOEXPAND
  68 Disable extent lock expansion behavior for I/O to this file descriptor.
  69 .TP
  70 .I lla_start
  71 is the offset in bytes for the start of this advice.
  72 .TP
  73 .I lla_end
  74 is the offset in bytes (non-inclusive) for the end of this advice.
  75 .TP
  76 .IR lla_value1 , " lla_value2" , " lla_value3" , " lla_value4"
  77 are additional arguments reserved for specific or future advice types, and
  78 should be set to zero if not explicitly required for a given advice type.
  79 Advice-specific names for these fields follow.
  80 .TP
  81 .IR lla_lockahead_mode
  82 When using LU_LADVISE_LOCKAHEAD, the 'lla_value1' field is used to
  83 communicate the requested lock mode, and can be referred to as
  84 lla_lockahead_mode.
  85 .TP
  86 .IR lla_peradvice_flags
  87 When using advices which support them, the 'lla_value2' field is
  88 used to communicate per-advice flags and can be referred to as
  89 lla_peradvice_flags.  Both LF_ASYNC and LF_UNSET are supported
  90 as peradvice flags.
  91 .TP
  92 .IR lla_lockahead_result
  93 When using LU_LADVISE_LOCKAHEAD, the 'lla_value3' field is used to
  94 communicate the result of the request, and can be referred to as lla_lockahead_result.
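As a rough sketch of how one advice element might be prepared, the fragment below zeroes every field and then sets only the advice type and byte range. It assumes a local mirror of the structure (struct sketch_ladvise) and a placeholder advice number (SKETCH_ADVICE_WILLREAD), both invented here so the code compiles without Lustre headers. Real code would use struct llapi_lu_ladvise and LU_LADVISE_WILLREAD.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the layout shown above (an assumption for this sketch);
 * real code uses struct llapi_lu_ladvise from the Lustre headers. */
struct sketch_ladvise {
	uint16_t lla_advice;
	uint16_t lla_value1;
	uint32_t lla_value2;
	uint64_t lla_start;
	uint64_t lla_end;
	uint32_t lla_value3;
	uint32_t lla_value4;
};

/* Placeholder advice number, not taken from the Lustre headers. */
#define SKETCH_ADVICE_WILLREAD 1

/* Describe a "will read" hint for the byte range [start, end), treating
 * end as non-inclusive as in the lla_end description. */
static void sketch_fill_willread(struct sketch_ladvise *adv,
				 uint64_t start, uint64_t end)
{
	assert(adv != NULL && start < end);

	/* Zero everything first so unused lla_value* fields stay zero. */
	memset(adv, 0, sizeof(*adv));
	adv->lla_advice = SKETCH_ADVICE_WILLREAD;
	adv->lla_start = start;
	adv->lla_end = end;
}
```

Zeroing the whole element before filling it in is a simple way to honor the requirement that unused value fields be zero.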
  95 .PP
  96 .PP
  97 .B llapi_ladvise()
  98 forwards the advice to Lustre servers without guaranteeing how and when
  99 servers will react to the advice. Actions may or may not be triggered when the
 100 advices are received, depending on the type of the advice as well as the
 101 real-time decision of the affected server-side components.
 102
 103 Typical usage of
 104 .B llapi_ladvise()
 105 is to enable applications and users (via
 106 .BR lfs-ladvise (1))
 107 with external knowledge about application I/O patterns to intervene in
 108 server-side I/O handling. For example, if a group of different clients
 109 are doing small random reads of a file, prefetching pages into OSS cache
 110 with big linear reads before the random I/O is a net benefit. Fetching
 111 that data into each client cache with
 112 .B fadvise()
 113 may not be, due to much more data being sent to the clients.
 114
 115 LU_LADVISE_LOCKAHEAD merits a special comment. While it is possible and
 116 encouraged to use it directly in your application to avoid lock contention
 117 (primarily for writing to a single file from multiple clients), it will
 118 also be available in the MPI-I/O / MPICH library from ANL for use with the
 119 I/O aggregation mode of that library. This is intended (eventually) to be
 120 the primary way this feature is used.
 121
 122 While conceptually similar to the
 123 .BR posix_fadvise (2)
 124 and Linux
 125 .BR fadvise (2)
 126 system calls, the main difference of
 127 .B llapi_ladvise()
 128 is that
 129 .BR fadvise() / posix_fadvise()
 130 are client side mechanisms that do not pass advice to the filesystem, while
 131 .B llapi_ladvise()
 132 sends advice or hints to one or more Lustre servers on which the file
 133 is stored. In some cases it may be desirable to use both interfaces.
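For comparison, the client-side half of such a combined approach might look like the sketch below, which uses only the standard posix_fadvise() call. The helper name sketch_client_willneed is invented for illustration. Unlike llapi_ladvise(), posix_fadvise() returns an error number directly instead of setting errno.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <fcntl.h>

/* Client-side hint only: asks the local kernel to start reading the range
 * [offset, offset + len) of fd into the client page cache. Nothing is
 * sent to a file server by this call.
 * Returns 0 on success or a positive error number; errno is not set. */
static int sketch_client_willneed(int fd, off_t offset, off_t len)
{
	assert(fd >= 0);
	return posix_fadvise(fd, offset, len, POSIX_FADV_WILLNEED);
}
```

An application could issue this hint on the client that will actually read the data, and separately send server-side advice for the same range.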
 134 .PP
 135 .SH RETURN VALUES
 136 .PP
 137 .B llapi_ladvise()
 138 returns 0 on success, or -1 if an error occurred (in which case, errno is set
 139 appropriately).
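One way to handle this 0 or -1-with-errno convention is sketched below. To keep the fragment self-contained, the advice call is represented by a hypothetical callback type (sketch_advise_fn) and two stub functions. None of these names come from the Lustre API, and a real caller would invoke llapi_ladvise() at that point instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback with the same contract as described above:
 * 0 on success, -1 with errno set on failure. */
typedef int (*sketch_advise_fn)(void *arg);

/* Run the call and convert a failure into a negative errno value,
 * printing a diagnostic to stderr. */
static int sketch_checked_advise(sketch_advise_fn fn, void *arg)
{
	int rc;

	assert(fn != NULL);
	errno = 0;
	rc = fn(arg);
	if (rc == 0)
		return 0;

	/* Save errno before calling anything that might change it. */
	rc = -errno;
	fprintf(stderr, "ladvise failed: %s\n", strerror(-rc));
	return rc;
}

/* Stubs used only to exercise both paths of the sketch. */
static int sketch_stub_ok(void *arg)
{
	(void)arg;
	return 0;
}

static int sketch_stub_einval(void *arg)
{
	(void)arg;
	errno = EINVAL;
	return -1;
}
```

Because advice is only a hint, a caller may reasonably log such a failure and continue with normal I/O.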
 140 .SH ERRORS
 141 .TP 15
 142 .SM ENOMEM
 143 Insufficient memory to complete operation.
 144 .TP
 145 .SM EINVAL
 146 One or more invalid arguments are given.
 147 .TP
 148 .SM EFAULT
 149 The memory region pointed to by
 150 .I ladvise
 151 is not properly mapped.
 152 .TP
 153 .SM ENOTSUPP
 154 Advice type is not supported.
 155 .SH "SEE ALSO"
 156 .BR lfs-ladvise (1),
 157 .BR lustreapi (7)