<?xml version='1.0' encoding='UTF-8'?><chapter xmlns="http://docbook.org/ns/docbook" xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US" xml:id="lsom">
<title xml:id="lsom.title">Lazy Size on MDT (LSoM)</title>
<para>This chapter describes Lazy Size on MDT (LSoM).</para>
<section xml:id="dbdoclet.lsomintro">
<title><indexterm>
<primary>lsom</primary>
<secondary>intro</secondary>
</indexterm>Introduction to Lazy Size on MDT (LSoM)</title>
<para>In the Lustre file system, MDSs store the ctime, mtime, owner,
and other file attributes. The OSSs store the size and number of
blocks used for each file. To obtain the correct file size, the client
must contact each OST that the file is stored across, which requires
multiple RPCs to fetch the size and block count when a file is
striped over multiple OSTs. The Lazy Size on MDT (LSoM) feature stores
the file size on the MDS and avoids the need to fetch the file size
from the OST(s) in cases where the application understands that the
size may not be accurate. "Lazy" means that there is no guarantee of the
accuracy of the attributes stored on the MDS.</para>
<para>Since many Lustre installations use SSD for MDT storage, the
motivation for the LSoM work is to reduce the time it takes to get
the size of a file from the Lustre file system by storing that data on
the MDTs. We expect this feature to be initially used by Lustre policy
engines that scan the backend MDT storage, make decisions based on broad
size categories, and do not depend on a totally accurate file size.
Examples include Lester, Robinhood, Zester, and various vendor offerings.
Future improvements will allow the LSoM data to be accessed by tools such
as <literal>lfs find</literal>.</para>
</section>
<section xml:id="dbdoclet.enablelsom">
<title><indexterm><primary>lsom</primary>
<secondary>enablelsom</secondary></indexterm>Enable LSoM</title>
<para>LSoM is always enabled, and nothing needs to be done to enable the
feature for fetching the LSoM data when scanning the MDT inodes with a
policy engine. The LSoM data can also be accessed on the client
via the <literal>lfs getsom</literal> command. Because the LSoM data is
currently accessed on the client via the xattr interface, the
<literal>xattr_cache</literal> will cache the file size and block count on
the client as long as the inode is cached. In most cases this is
desirable, since it improves access to the LSoM data. However, it also
means that the LSoM data may be stale if the file size is changed after the
xattr is first accessed or if the xattr is accessed shortly after the file
is first created.</para>
<para>If it is necessary to access up-to-date LSoM data that has gone
stale, it is possible to flush the xattr cache from the client by
cancelling the MDC locks via
<literal>lctl set_param ldlm.namespaces.*mdc*.lru_size=clear</literal>.
Otherwise, the file attributes will be dropped from the client cache if
the file has not been accessed before the LDLM lock timeout. The timeout
can be checked via
<literal>lctl get_param ldlm.namespaces.*mdc*.lru_max_age</literal>.</para>
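<para>As a rough sketch of the steps described above, the following
client-side session checks the lock timeout, cancels the cached MDC locks,
and then reads the LSoM data again. The path
<literal>/mnt/lustre/testfile</literal> is a hypothetical example and
should be replaced with a file on the local Lustre mount. The commands are
assumed to be run as root on the client.</para>
<screen># show the maximum age of unused MDC locks on this client
lctl get_param ldlm.namespaces.*mdc*.lru_max_age

# cancel the MDC locks, which also drops the cached xattrs
lctl set_param ldlm.namespaces.*mdc*.lru_size=clear

# fetch the LSoM data again (hypothetical path)
lfs getsom /mnt/lustre/testfile</screen>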
<para>If repeated access to LSoM attributes is needed for files that are
recently created or frequently modified from a specific client, such as an
HSM agent node, it is possible to disable xattr caching on that client via
<literal>lctl set_param llite.*.xattr_cache=0</literal>. This may cause
extra overhead when accessing files, and is not recommended for normal
usage.</para>
</section>
<section xml:id="dbdoclet.usercmds">
<title><indexterm><primary>lsom</primary>
<secondary>usercommands</secondary></indexterm>User Commands</title>
<para>Lustre provides the <literal>lfs getsom</literal> command to list
file attributes that are stored on the MDT.</para>
<para>The <literal>llsom_sync</literal> command allows the user to sync
the file attributes on the MDT with the valid/up-to-date data on the
OSTs. <literal>llsom_sync</literal> is called on the client with the
Lustre file system mount point. <literal>llsom_sync</literal> uses Lustre
MDS changelogs and, thus, a changelog user must be registered to use this
utility.</para>
<section xml:id="dbdoclet.lfsgetsom">
<title><indexterm><primary>lsom</primary>
<secondary>lfsgetsom</secondary></indexterm>lfs getsom for LSoM data
</title>
<para>The <literal>lfs getsom</literal> command lists file attributes
that are stored on the MDT. <literal>lfs getsom</literal> is called
with the full path and file name for a file on the Lustre file
system. If no flags are used, then all file attributes stored on the
MDS will be shown.</para>
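<para>For illustration, the calls below query a single file, first for
all LSoM attributes and then for the size, blocks, and flags values
individually. The path <literal>/mnt/lustre/testfile</literal> is a
hypothetical example; the output is not shown here.</para>
<screen># all LSoM attributes stored on the MDT for this file
lfs getsom /mnt/lustre/testfile

# one attribute at a time: size, blocks, flags
lfs getsom -s /mnt/lustre/testfile
lfs getsom -b /mnt/lustre/testfile
lfs getsom -f /mnt/lustre/testfile</screen>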
<section><title>lfs getsom Command</title>
<para><screen>lfs getsom [-s] [-b] [-f] &lt;filename&gt;</screen>
The various <literal>lfs getsom</literal> options are listed and
described below.</para>
<informaltable frame="all">
<tgroup cols="2">
<colspec colname="c1" colwidth="3*" />
<colspec colname="c2" colwidth="7*" />
<thead>
<row>
<entry><para><emphasis role="bold">Option</emphasis></para></entry>
<entry><para><emphasis role="bold">Description</emphasis></para></entry>
</row>
</thead>
<tbody>
<row>
<entry><para><literal>-s</literal></para></entry>
<entry><para>Only show the size value of the LSoM data for a given
file. This is an optional flag.</para></entry>
</row>
<row>
<entry><para><literal>-b</literal></para></entry>
<entry><para>Only show the blocks value of the LSoM data for a
given file. This is an optional flag.</para></entry>
</row>
<row>
<entry><para><literal>-f</literal></para></entry>
<entry><para>Only show the flag value of the LSoM data for a given
file. This is an optional flag. Valid flags are:</para>
<para>SOM_FL_UNKNOWN = 0x0000 - Unknown or no SoM data,
must get size from OSTs.</para>
<para>SOM_FL_STRICT = 0x0001 - Known strictly correct, FLR
file (SoM guaranteed).</para>
<para>SOM_FL_STALE = 0x0002 - Known stale: was right at
some point in the past, but it is known (or likely) to be
incorrect now (e.g. opened for write).</para>
<para>SOM_FL_LAZY = 0x0004 - Approximate, may never have
been strictly correct, need to sync SOM data to achieve
eventual consistency.</para></entry>
</row>
</tbody>
</tgroup>
</informaltable>
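<para>Because the flag values listed for <literal>-f</literal> are
distinct bits, a flags value can be decoded with a few bitwise tests. The
helper below is only a sketch: the function name
<literal>decode_som_flags</literal> is hypothetical, and it assumes the
flags value has already been extracted as a plain number (for example
<literal>0x4</literal>) from the <literal>lfs getsom -f</literal>
output.</para>
<screen><![CDATA[#!/bin/bash
# decode_som_flags VALUE: print the names of the SOM flag bits set in VALUE.
# Hypothetical helper; the bit values are those listed for the -f option.
decode_som_flags() {
    local flags=$(( $1 ))   # accepts decimal or 0x-prefixed hex
    local names=()

    # SOM_FL_UNKNOWN means that no flag bit is set at all.
    if (( flags == 0 )); then
        echo "SOM_FL_UNKNOWN"
        return 0
    fi
    (( flags & 0x0001 )) && names+=("SOM_FL_STRICT")
    (( flags & 0x0002 )) && names+=("SOM_FL_STALE")
    (( flags & 0x0004 )) && names+=("SOM_FL_LAZY")
    echo "${names[*]}"
}

decode_som_flags 0x4   # prints: SOM_FL_LAZY
]]></screen>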
</section>
</section>
<section><title>Syncing LSoM data</title>
<para>The <literal>llsom_sync</literal> command allows the user to sync
the file attributes on the MDT with the valid/up-to-date data on the
OSTs. <literal>llsom_sync</literal> is called on the client with the
client mount point for the Lustre file system.
<literal>llsom_sync</literal> uses Lustre MDS changelogs and, thus, a
changelog user must be registered to use this utility.</para>
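<para>A typical workflow might look like the sketch below. The MDT name
<literal>lustre-MDT0000</literal>, the changelog user id
<literal>cl1</literal>, and the mount point
<literal>/mnt/lustre</literal> are all assumptions chosen for
illustration. Changelog registration with
<literal>lctl changelog_register</literal> is not covered in this chapter;
it is assumed here to be run on the MDS, and the user id it reports is the
one to pass to <literal>llsom_sync</literal>.</para>
<screen># on the MDS: register a changelog user for the MDT (assumed MDT name)
lctl --device lustre-MDT0000 changelog_register

# on the client: sync LSoM data using the registered user (assumed id cl1)
llsom_sync -m lustre-MDT0000 -u cl1 /mnt/lustre</screen>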
<section><title>llsom_sync Command</title>
<screen>llsom_sync --mdt|-m &lt;mdt&gt; --user|-u &lt;user_id&gt;
[--daemonize|-d] [--verbose|-v] [--interval|-i] [--min-age|-a]
[--max-cache|-c] [--sync|-s] &lt;lustre_mount_point&gt;</screen>
<para>The various <literal>llsom_sync</literal> options are
listed and described below.</para>
<informaltable frame="all">
<tgroup cols="2">
<colspec colname="c1" colwidth="3*" />
<colspec colname="c2" colwidth="7*" />
<thead>
<row>
<entry><para><emphasis role="bold">Option</emphasis></para></entry>
<entry><para><emphasis role="bold">Description</emphasis></para></entry>
</row>
</thead>
<tbody>
<row>
<entry><para><literal>--mdt | -m &lt;mdt&gt;</literal></para></entry>
<entry><para>The metadata device whose files need their LSoM
xattr synced. A changelog user must be registered for
this device. Required flag.</para></entry>
</row>
<row>
<entry><para><literal>--user | -u &lt;user_id&gt;</literal></para></entry>
<entry><para>The changelog user id for the MDT device. Required
flag.</para></entry>
</row>
<row>
<entry><para><literal>--daemonize | -d</literal></para></entry>
<entry><para>Optional flag to “daemonize” the program. In daemon
mode, the utility will periodically scan and process the changelog
records and sync the LSoM xattr for files.</para></entry>
</row>
<row>
<entry><para><literal>--verbose | -v</literal></para></entry>
<entry><para>Optional flag to produce verbose output.</para></entry>
</row>
<row>
<entry><para><literal>--interval | -i</literal></para></entry>
<entry><para>Optional flag for the time interval at which to scan the
Lustre changelog and process the log records in daemon
mode.</para></entry>
</row>
<row>
<entry><para><literal>--min-age | -a</literal></para></entry>
<entry><para>Optional flag for the minimum age of a file. The
<literal>llsom_sync</literal> tool will not try to sync the
LSoM data for any file that was closed less than this many seconds
ago. The default min-age value is 600s (10 minutes).</para></entry>
</row>
<row>
<entry><para><literal>--max-cache | -c</literal></para></entry>
<entry><para>Optional flag for the total memory used for the FID
cache, which can be given with a suffix [KkGgMm]. The default
max-cache value is 256MB. A parameter value &lt; 100
is taken as the percentage of total memory size used for
the FID cache instead of the cache size.</para></entry>
</row>
<row>
<entry><para><literal>--sync | -s</literal></para></entry>
<entry><para>Optional flag to sync file data, flushing dirty
data out of cache to ensure the blocks count is correct
when updating the file LSoM xattr. This option could hurt
server performance significantly if thousands of fsync
requests are sent.</para></entry>
</row>
</tbody>
</tgroup>
</informaltable>