<?xml version='1.0' encoding='UTF-8'?>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US"
xml:id="dataonmdt" condition="l2B">
<title xml:id="dataonmdt.title">Data on MDT (DoM)</title>
<para>This chapter describes Data on MDT (DoM).</para>
<section xml:id="domintro">
<title>
<indexterm>
<primary>dom</primary>
</indexterm>
<indexterm>
<primary>dom</primary>
<secondary>intro</secondary>
</indexterm>Introduction to Data on MDT (DoM)</title>
<para>The Lustre Data on MDT (DoM) feature improves small file IO by
placing small files directly on the MDT. It also improves large file IO
by keeping small random IO, which can cause device seeking and hurt
streaming IO performance, off the OSTs. Users can therefore
expect more consistent performance for both small file IO and mixed IO
patterns.</para>
<para>The layout of a DoM file is stored on disk as a composite layout
and is a special case of Progressive File Layout (PFL). Please see
<xref linkend="pfl" /> for more information on PFL. For DoM files, the
file layout is composed of the first component of the file, which is
placed on an MDT, and the remaining components, which are placed on OSTs
if needed. The first component is stored in the MDT object data blocks.
This component always has one stripe, with a stripe size equal to the
component size. A component with an MDT layout can only be the first
component in a composite layout. The remaining components are placed on
OSTs as usual with a RAID0 layout. The OST components are not
instantiated until a client writes or truncates the file beyond the size
of the MDT component.</para>
<para>When specifying a DoM layout, it might be assumed that the
remaining layout will automatically go to the OSTs,
but this is not the case. As with regular PFL layouts, if an
<literal>EOF</literal> component is <emphasis>not</emphasis> present,
then writes beyond the end of the last existing component will fail
with error <literal>ENODATA</literal> ("No data available").
For example, a DoM file created with a component end at 1 MiB will
not be writable beyond 1 MiB:</para>
<screen>
$ lfs setstripe -E 1M -L mdt /mnt/testfs/domdir
$ dd if=/dev/zero of=/mnt/testfs/domdir/testfile bs=1M
dd: error writing '/mnt/testfs/domdir/testfile': No data available
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.00186441 s, 562 MB/s
</screen>
<para>To allow the file to grow beyond 1 MiB, add one or more regular
OST components with an <literal>EOF</literal> component at the end:
<screen>
lfs setstripe -E 1M -L mdt -E 1G -c1 -E eof -c4 /mnt/testfs/domdir
</screen></para>
</section>
<section xml:id="usercommands">
<title>
<indexterm>
<primary>dom</primary>
</indexterm>
<indexterm>
<primary>dom</primary>
<secondary>usercommands</secondary>
</indexterm>User Commands</title>
<para>Lustre provides the <literal>lfs setstripe</literal> command for
users to create DoM files. As usual, the
<literal>lfs getstripe</literal> command can be used to list the
striping/component information for a given file, while the
<literal>lfs find</literal> command can be used to search the directory
tree rooted at the given directory or file name for files that match
the given DoM component parameters, e.g. layout type.</para>
<section xml:id="lfssetstripe">
<title>
<indexterm>
<primary>dom</primary>
<secondary>lfssetstripe</secondary>
</indexterm>lfs setstripe for DoM files</title>
<para>The <literal>lfs setstripe</literal> command is used to create
DoM files.</para>
<section><title>Command</title>
<para><screen>
lfs setstripe --component-end|-E end1 --layout|-L mdt \
[--component-end|-E end2 [STRIPE_OPTIONS] ...] &lt;filename&gt;
</screen>
The command above creates a file with a special composite
layout, which defines the first component as an MDT component. The
MDT component must start at offset 0 and end at
<replaceable>end1</replaceable>. The value of
<replaceable>end1</replaceable> is also the stripe size of this
component, and is limited by the
<literal>lod.*.dom_stripesize</literal> of the MDT on which the file is
created. No other options are required for this component.
The rest of the components use the normal syntax for composite
layouts.</para>
<note><para>If the next component doesn't specify striping, such as:
<screen>lfs setstripe -E 1M -L mdt -E EOF &lt;filename&gt;</screen>
then that component gets its settings from the default filesystem
striping.</para></note>
</section>
<section><title>Example</title>
<para>The command below creates a file with a DoM layout. The first
component has an <literal>mdt</literal> layout and is placed on the
MDT, covering [0, 1M). The second component covers [1M, EOF) and is
striped over all available OSTs.</para>
<para><screen>client$ lfs setstripe -E 1M -L mdt -E -1 -S 4M -c -1 \
/mnt/lustre/domfile</screen></para>
<para>The resulting layout is illustrated by
<xref xmlns:xlink="http://www.w3.org/1999/xlink"
linkend="dataonmdt.fig.layout1" />.</para>
<figure xml:id="dataonmdt.fig.layout1">
<title>Resulting file layout</title>
<mediaobject>
<imageobject>
<imagedata scalefit="1" width="50%"
fileref="./figures/DoM_Layout1.png" />
</imageobject>
<textobject>
<phrase>Resulting file layout</phrase>
</textobject>
</mediaobject>
</figure>
<para>The resulting layout can also be checked with
<literal>lfs getstripe</literal> as shown below:</para>
<screen>client$ lfs getstripe /mnt/lustre/domfile
lcme_extent.e_start: 0
lcme_extent.e_end: 1048576
lmm_stripe_size: 1048576
lcme_extent.e_start: 1048576
lcme_extent.e_end: EOF
lmm_stripe_size: 4194304
lmm_layout_gen: 65535
lmm_stripe_offset: -1</screen>
<para>The output above shows that the first component has a size of 1MB
and its pattern is 'mdt'. The second component is not instantiated yet,
as shown by <literal>lcme_flags: 0</literal>.</para>
<para>If more than 1MB of data is written to the file, the
<literal>lfs getstripe</literal> output changes accordingly:</para>
<screen>client$ lfs getstripe /mnt/lustre/domfile
lcme_extent.e_start: 0
lcme_extent.e_end: 1048576
lmm_stripe_size: 1048576
lcme_extent.e_start: 1048576
lcme_extent.e_end: EOF
lmm_stripe_size: 4194304
- 0: { l_ost_idx: 0, l_fid: [0x100000000:0x2:0x0] }
- 1: { l_ost_idx: 1, l_fid: [0x100010000:0x2:0x0] }</screen>
<para>The output above shows that the second component now has objects
on OSTs with a 4MB stripe.</para>
</section>
</section>
<section><title>Setting a default DoM layout on an existing directory</title>
<para>A DoM layout can be set on an existing directory as well. When
set, all files created afterwards will inherit this layout by
default.</para>
<section><title>Command</title>
<screen>lfs setstripe --component-end|-E end1 --layout|-L mdt \
[--component-end|-E end2 [STRIPE_OPTIONS] ...] &lt;dirname&gt;</screen>
</section>
<section><title>Example</title>
<screen>client$ mkdir /mnt/lustre/domdir
client$ touch /mnt/lustre/domdir/normfile
client$ lfs setstripe -E 1M -L mdt -E -1 /mnt/lustre/domdir/
client$ lfs getstripe -d /mnt/lustre/domdir
lcme_extent.e_start: 0
lcme_extent.e_end: 1048576
stripe_count: 0 stripe_size: 1048576 \
pattern: mdt stripe_offset: -1
lcme_extent.e_start: 1048576
lcme_extent.e_end: EOF
stripe_count: 1 stripe_size: 1048576 \
pattern: raid0 stripe_offset: -1
</screen>
<para>The output above shows that the directory has
a default layout with a DoM component.</para>
<para>The following example checks the layouts of files in that
directory:</para>
<screen>client$ touch /mnt/lustre/domdir/domfile
client$ lfs getstripe /mnt/lustre/domdir/normfile
/mnt/lustre/domdir/normfile
lmm_stripe_size: 1048576
obdidx objid objid group
client$ lfs getstripe /mnt/lustre/domdir/domfile
/mnt/lustre/domdir/domfile
lcme_extent.e_start: 0
lcme_extent.e_end: 1048576
lmm_stripe_size: 1048576
lcme_extent.e_start: 1048576
lcme_extent.e_end: EOF
lmm_stripe_size: 1048576
lmm_layout_gen: 65535
lmm_stripe_offset: -1</screen>
<para>The first file,
<emphasis role="bold">normfile</emphasis>, has an ordinary layout
because it was created before the default layout was set, whereas
<emphasis role="bold">domfile</emphasis> inherits the directory
default layout and is a DoM file.</para>
<note><para>The directory default layout setting will be inherited
by new files even if the server DoM size limit is later set to a
lower value.</para></note>
</section>
</section>
<section xml:id="domstripesize">
<title>
<indexterm>
<primary>dom</primary>
</indexterm>
<indexterm>
<primary>dom</primary>
<secondary>domstripesize</secondary>
</indexterm>DoM Stripe Size Restrictions</title>
<para>The maximum size of a DoM component is restricted in several
ways to protect the MDT from eventually being filled with large files.
</para>
<section><title>LFS limits for DoM component size</title>
<para><literal>lfs setstripe</literal> allows the component size for
MDT layouts to be set up to 1GB (this is a compile-time
limit to avoid improper configuration). The size must
also be a multiple of 64KB due to the minimum stripe size in Lustre
(see <xref linkend="settinguplustresystem.tab2"/>
<literal>Minimum stripe size</literal>). The limit
imposed on each file by <literal>lfs setstripe -E end</literal>
may also be smaller than the MDT-imposed limit if this is better
for a particular usage.</para>
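<para>The two <literal>lfs setstripe</literal> limits described above
can be checked before a layout is created. The following is a minimal
sketch in plain shell arithmetic, not part of Lustre:
<literal>dom_size_check</literal> is a hypothetical helper, sizes are
assumed to be given in bytes, and the 1GB limit is assumed to be
inclusive. It does not consult the per-MDT
<literal>dom_stripesize</literal> limit.</para>

```shell
# Hypothetical helper (not part of lfs): check a proposed DoM component
# size, in bytes, against the limits described in the text.
DOM_ALIGN=$((64 * 1024))             # 64KB minimum stripe size
DOM_MAX=$((1024 * 1024 * 1024))      # 1GB limit (assumed inclusive)

dom_size_check() {
    size=$1
    if [ "$size" -le 0 ] || [ "$size" -gt "$DOM_MAX" ]; then
        echo "out of range"
    elif [ $((size % DOM_ALIGN)) -ne 0 ]; then
        echo "not aligned to 64KB"
    else
        echo "ok"
    fi
}

dom_size_check 1048576     # 1MB, a multiple of 65536
dom_size_check 100000      # not a multiple of 65536
```

<para>A size that passes this check can still be rejected by the MDT
if it exceeds the server-side limit.</para>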
</section>
<section><title>MDT Server Limits</title>
<para>The <literal>lod.$fsname-MDTxxxx.dom_stripesize</literal>
parameter controls the per-MDT maximum size for a DoM component.
Larger DoM components specified by the user will be truncated to
the MDT-specified limit. The limit may be set differently on each
MDT to balance DoM space usage on each MDT separately, if needed.
It is 1MB by default and can be changed with the
<literal>lctl</literal> tool. For more information on setting
<literal>dom_stripesize</literal> please see
<xref linkend="dom_stripesize" />.</para>
</section>
</section>
<section xml:id="domlfsgetstripe">
<title>
<indexterm>
<primary>dom</primary>
</indexterm>
<indexterm>
<primary>dom</primary>
<secondary>lfsgetstripe</secondary>
</indexterm>lfs getstripe for DoM files</title>
<para>The <literal>lfs getstripe</literal> command is used to list
the striping/component information for a given file. For DoM files, it
can be used to check the layout and size of the DoM component.</para>
<section><title>Command</title>
<para><screen>lfs getstripe [--component-id|-I [comp_id]] [--layout|-L] \
[--stripe-size|-S] &lt;dirname|filename&gt;</screen></para>
</section>
<section><title>Examples</title>
<screen>client$ lfs getstripe -I1 /mnt/lustre/domfile
lcme_extent.e_start: 0
lcme_extent.e_end: 1048576
lmm_stripe_size: 1048576
lmm_objects:</screen>
<para>A short summary of the layout and size of the DoM component can
be obtained by using the <literal>-L</literal> option
together with the <literal>-S</literal> or <literal>-E</literal> option:
<screen>client$ lfs getstripe -I1 -L -S /mnt/lustre/domfile
lmm_stripe_size: 1048576
client$ lfs getstripe -I1 -L -E /mnt/lustre/domfile
lcme_extent.e_end: 1048576
lmm_pattern: mdt</screen>
Both commands return the layout type and its size. For DoM files, the
stripe size is equal to the extent size of the component, so
either can be used to get the size on the MDT.</para>
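<para>In a script, the short output form is easy to parse. The sketch
below assumes the <literal>key: value</literal> line format shown in the
examples above; <literal>getstripe_field</literal> is a hypothetical
helper, and the sample text stands in for real command output so that
the snippet runs without a Lustre client.</para>

```shell
# Hypothetical helper: read "key: value" lines in the style of
# "lfs getstripe" output on stdin and print the value of one key.
getstripe_field() {
    awk -v key="$1" '$1 == (key ":") { print $2 }'
}

# Sample text in the same format as the examples in this section. On a
# live client this would instead be piped from, for example:
#   lfs getstripe -I1 -L -S /mnt/lustre/domfile
sample='lmm_stripe_size: 1048576
lmm_pattern: mdt'

printf '%s\n' "$sample" | getstripe_field lmm_stripe_size
printf '%s\n' "$sample" | getstripe_field lmm_pattern
```

<para>Matching on the key name rather than on line position keeps the
helper independent of the order in which fields are printed.</para>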
</section>
</section>
<section xml:id="domlfsfind">
<title>
<indexterm>
<primary>dom</primary>
</indexterm>
<indexterm>
<primary>dom</primary>
<secondary>lfsfind</secondary>
</indexterm>lfs find for DoM files</title>
<para>The <literal>lfs find</literal> command can be used to search
the directory tree rooted at the given directory or file name for
files that match the given parameters. The command below shows the new
parameters for DoM files; their usage is similar to that of the
<literal>lfs getstripe</literal> command.</para>
<section><title>Command</title>
<para><screen>lfs find &lt;directory|filename&gt; [--layout|-L] [...]
</screen></para>
</section>
<section><title>Examples</title>
<para>Find all files with DoM layout under directory
<literal>/mnt/lustre</literal>:
<screen>client$ lfs find -L mdt /mnt/lustre
/mnt/lustre/domdir/domfile
client$ lfs find -L mdt -type f /mnt/lustre
/mnt/lustre/domdir/domfile
client$ lfs find -L mdt -type d /mnt/lustre
/mnt/lustre/domdir</screen>
These commands find all DoM objects, only DoM
files, or only directories with a default DoM layout, respectively.</para>
<para>Find the DoM files/dirs with a particular stripe size:
<screen>client$ lfs find -L mdt -S -1200K -type f /mnt/lustre
/mnt/lustre/domdir/domfile
client$ lfs find -L mdt -S +200K -type f /mnt/lustre
/mnt/lustre/domdir/domfile</screen>
The first command finds all DoM files with a stripe size less
than 1200KB. The second command does the same for files
with a stripe size greater than 200KB. In both cases, all DoM
files are found because their DoM size is 1MB.</para>
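<para>The arithmetic behind these two matches can be written out as a
small sketch. <literal>matches_size_arg</literal> is a hypothetical
helper, not an <literal>lfs</literal> feature; it assumes that
<literal>K</literal> means 1024 bytes and that the comparison is a plain
byte comparison, which may differ in detail from how
<literal>lfs find</literal> evaluates <literal>-S</literal>.</para>

```shell
# Hypothetical helper: does SIZE (in bytes) satisfy a size argument in
# the style of "lfs find -S", such as -1200K or +200K?
# Assumes K = 1024 bytes and a plain byte comparison.
matches_size_arg() {
    size=$1
    arg=$2
    case "$arg" in
        -*K) limit=${arg#-}
             limit=$(( ${limit%K} * 1024 ))
             [ "$size" -lt "$limit" ] ;;
        +*K) limit=${arg#+}
             limit=$(( ${limit%K} * 1024 ))
             [ "$size" -gt "$limit" ] ;;
        *)   return 2 ;;
    esac
}

dom_size=1048576    # 1MB DoM component from the examples above
for arg in -1200K +200K; do
    if matches_size_arg "$dom_size" "$arg"; then
        printf '%s matches\n' "$arg"
    fi
done
```

<para>With a 1MB component, 1048576 is below 1228800 (1200KB) and above
204800 (200KB), so both arguments match.</para>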
</section>
</section>
<section xml:id="dom_stripesize">
<title>
<indexterm>
<primary>dom</primary>
</indexterm>
<indexterm>
<primary>dom</primary>
<secondary>dom_stripesize</secondary>
</indexterm>The dom_stripesize parameter</title>
<para>The MDT controls the default maximum DoM size on the server via
the <literal>dom_stripesize</literal> parameter in the LOD device.
The <literal>dom_stripesize</literal> value can be set differently for
each MDT, if necessary. The default value of the parameter is 1MB and
can be changed with the <literal>lctl</literal> tool.</para>
<section><title>Get Command</title>
<para><screen>lctl get_param lod.*MDT&lt;index&gt;*.dom_stripesize
</screen></para>
</section>
<section><title>Get Examples</title>
<para>The commands below get the maximum allowed DoM size on the
server. The final command attempts to create a file with a DoM
component larger than the parameter setting and, as expected, fails.
<screen>mds# lctl get_param lod.*MDT0000*.dom_stripesize
lod.lustre-MDT0000-mdtlov.dom_stripesize=1048576
mds# lctl get_param -n lod.*MDT0000*.dom_stripesize
client$ lfs setstripe -E 2M -L mdt /mnt/lustre/dom2mb
Create composite file /mnt/lustre/dom2mb failed. Invalid argument
error: setstripe: create composite file '/mnt/lustre/dom2mb' failed:
Invalid argument</screen></para>
</section>
<section><title>Temporary Set Command</title>
<para>To temporarily set the value of the parameter, use the
<literal>lctl set_param</literal> command:
<screen>lctl set_param lod.*MDT&lt;index&gt;*.dom_stripesize=&lt;value&gt;
</screen></para>
</section>
<section><title>Temporary Set Examples</title>
<para>The example below changes the default DoM limit on
the server to 64KB and then tries to create a file with a 1MB DoM
component, which fails:
<screen>mds# lctl set_param -n lod.*MDT0000*.dom_stripesize=64K
mds# lctl get_param -n lod.*MDT0000*.dom_stripesize
client$ lfs setstripe -E 1M -L mdt /mnt/lustre/dom
Create composite file /mnt/lustre/dom failed. Invalid argument
error: setstripe: create composite file '/mnt/lustre/dom' failed:
Invalid argument</screen></para>
</section>
<section><title>Persistent Set Command</title>
<para>To persistently set the value of the parameter on a
specific MDT, the
<literal>lctl set_param -P</literal> command is used:
<screen>
lctl set_param -P lod.<replaceable>fsname</replaceable>-MDT<replaceable>index</replaceable>.dom_stripesize=<replaceable>value</replaceable>
</screen>
A wildcard '<literal>*</literal>' can also be used for the
<replaceable>index</replaceable> to apply the setting to all MDTs.
</para>
</section>
<section><title>Persistent Set Examples</title>
<para>The new value of the parameter is saved permanently in the MGS
parameters log:
<screen>
mgs# lctl set_param -P lod.lustre-MDT0000.dom_stripesize=512K
mds# lctl get_param -n lod.*MDT0000*.dom_stripesize
</screen>
and is applied on the matching MDTs within a few seconds.
</para>
</section>
</section>
<section xml:id="disabledom">
<title>
<indexterm>
<primary>dom</primary>
</indexterm>
<indexterm>
<primary>dom</primary>
<secondary>disabledom</secondary>
</indexterm>Disable DoM</title>
<para>When <literal>lctl set_param</literal> (with or without
<literal>-P</literal>) sets
<literal>dom_stripesize</literal> to <literal>0</literal>, DoM
component creation is disabled on the specified server(s), and
any <emphasis>new</emphasis> layouts with a specified DoM component
will have that component removed from the file layout. Existing
files and layouts with DoM components on that MDT are not changed.
</para>
<note><para>DoM files can still be created in existing directories
with a default DoM layout.</para></note>
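<para>As an illustration, disabling and later restoring DoM on one MDT
might look like the sketch below. <literal>set_dom_limit</literal> is a
hypothetical wrapper, not a Lustre command; it only prints the
<literal>lctl set_param</literal> command line unless
<literal>RUN=1</literal> is set, and it assumes the parameter naming and
size suffixes used in the examples earlier in this chapter. Restoring a
non-zero value to re-enable DoM is inferred from the description
above.</para>

```shell
# Hypothetical wrapper: print (default) or run the lctl command that
# sets dom_stripesize on one MDT. Set RUN=1 on an MDS to execute it.
set_dom_limit() {
    idx=$1      # MDT index as four digits, e.g. 0000
    value=$2    # 0 disables DoM; a non-zero value such as 1M sets a limit
    param="lod.*MDT${idx}*.dom_stripesize=${value}"
    if [ "${RUN:-0}" = "1" ]; then
        lctl set_param "$param"
    else
        printf 'lctl set_param %s\n' "$param"
    fi
}

set_dom_limit 0000 0     # disable DoM component creation on MDT0000
set_dom_limit 0000 1M    # restore the 1MB default limit
```

<para>Printing the command first makes it easy to review the parameter
name before it is applied on a production MDS.</para>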
</section>
</section>
</chapter>
<!--
vim:expandtab:shiftwidth=2:tabstop=8:
-->