From: Vitaly Fertman Date: Wed, 23 Oct 2019 14:58:18 +0000 (-0400) Subject: LUDOC-436 sel: add self-extending layout documentation X-Git-Tag: 2.13.0~2 X-Git-Url: https://git.whamcloud.com/?a=commitdiff_plain;h=3473eeb8ca9f7bc70d9c758445723764434d45f2;p=doc%2Fmanual.git LUDOC-436 sel: add self-extending layout documentation This patch adds the feature documentation for the Self-Extending Layout feature added in LU-10070. Signed-off-by: Joseph Gmitter Signed-off-by: Vitaly Fertman Change-Id: I22e7fab4dfc868374c288b1034ab5cbbb6f58367 Reviewed-on: https://review.whamcloud.com/36561 Tested-by: jenkins Reviewed-by: Andreas Dilger --- diff --git a/ManagingStripingFreeSpace.xml b/ManagingStripingFreeSpace.xml index 55f7ae2..b9965f5 100644 --- a/ManagingStripingFreeSpace.xml +++ b/ManagingStripingFreeSpace.xml @@ -1279,6 +1279,575 @@ $ lfs setstripe -c 1 /mnt/testfs/testdir/dir_3comp/commnfile flag ^init here. + +
+ + <indexterm><primary>striping</primary><secondary>SEL</secondary> + </indexterm>Self-Extending Layout (SEL) + The Lustre Self-Extending Layout (SEL) feature is an extension of the + Progressive File Layout (PFL) feature, which allows the MDS to change the defined + PFL layout dynamically. With this feature, the MDS monitors the used space + on OSTs and swaps the OSTs for the current file when they are low on space. + This avoids ENOSPC problems for SEL files when + applications are writing to them. + Whereas PFL delays the instantiation of some components until an IO + operation occurs on their region, SEL allows splitting such non-instantiated + components into two parts: an “extendable” component and an “extension” + component. The extendable component is a regular PFL component that initially + covers only a small part of the region. The extension (or SEL) + component is a new component type which is always non-instantiated and + unassigned, covering the rest of the region. When a write reaches this + unassigned space and the client asks the MDS to instantiate it, the + MDS decides whether to grant additional space to the extendable + component. The granted region moves from the head of the extension + component to the tail of the extendable component, so the extendable + component grows and the SEL component shrinks. This allows the file + to continue on the same OSTs or, when space is low on one of + the current OSTs, to modify the layout and switch to a new component on new + OSTs. In particular, it lets IO automatically spill over to a large HDD OST + pool once a small SSD OST pool is getting low on space. + The default extension policy modifies the layout in the following + ways: + + + Extension: continue on the same OSTs – used when none of the OSTs + of the current component is low on space; a particular extent is + granted to the extendable component. 
+ + + Spill over: switch to next component OSTs – used only for + a component that is not the last one, when at least one + of the current OSTs is low on space; the whole region of the SEL + component moves to the next component and the SEL component itself is + removed. + + + Repeating: create a new component with the same layout but on + free OSTs – used only for the last component when + at least one of the current OSTs is low on space; the new + component has the same layout but is instantiated on different OSTs (from + the same pool) which have enough space. + + + Forced extension: continue with the current component OSTs despite + the low-on-space condition – used only for the last component when + a repeating attempt has detected the low-on-space condition as well; spillover + is impossible and repeating would not help. + + + The SEL feature does not require clients to understand the SEL + format of already created files; only MDS support is needed, which was + introduced in Lustre 2.13. However, old clients will have some limitations + because their Lustre tools do not support SEL. +
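The four policy cases listed above can be summarized as a small decision function. This is only an illustrative sketch of the rules as described in this section, not the MDS implementation; the names `Action` and `choose_action` and the three boolean inputs are hypothetical.

```python
from enum import Enum


class Action(Enum):
    EXTENSION = "extension"
    SPILLOVER = "spill over"
    REPEATING = "repeating"
    FORCED_EXTENSION = "forced extension"


def choose_action(current_osts_low: bool,
                  is_last_component: bool,
                  free_osts_available: bool) -> Action:
    """Pick the layout change for a write that reaches a SEL component.

    Hypothetical inputs (assumptions for this sketch):
      current_osts_low    - at least one OST of the current component is low on space
      is_last_component   - the extendable component is the last one in the layout
      free_osts_available - a repeating attempt could find OSTs with enough space
    """
    if not current_osts_low:
        # Enough space everywhere: keep writing to the same OSTs.
        return Action.EXTENSION
    if not is_last_component:
        # Low on space and a next component exists: hand the region over to it.
        return Action.SPILLOVER
    if free_osts_available:
        # Last component, but other OSTs have space: repeat the layout on them.
        return Action.REPEATING
    # Last component and no better OSTs: extend in place anyway.
    return Action.FORCED_EXTENSION
```

For instance, `choose_action(True, False, True)` yields `Action.SPILLOVER`, which corresponds to the SSD-to-HDD pool case mentioned above.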
+ <literal>lfs setstripe</literal> + The lfs setstripe command is used to create files + with composite layouts, as well as add or delete components to or from an + existing file. It is extended to support SEL components. +
+ Create a SEL file + Command + lfs setstripe +[--component-end|-E end1] [STRIPE_OPTIONS] ... filename + +STRIPE OPTIONS: +--extension-size, --ext-size, -z <ext_size> + The -z option specifies the size of + the region which is granted to the extendable component on each + iteration. When given while declaring a component, this option turns the declared + component into a pair of components: an extendable one and an extension one. + Example + The following command creates 2 pairs of extendable and + extension components: + # lfs setstripe -E 1G -z 64M -E -1 -z 256M /mnt/lustre/file +
+ Figure. Example: create a SEL file +
+
+ As usual, only the first PFL component is instantiated at + the creation time, thus it is immediately extended to the extension + size (64M for the first component), whereas the third component is left + zero-length. + # lfs getstripe /mnt/lustre/file +/mnt/lustre/file + lcm_layout_gen: 4 + lcm_mirror_count: 1 + lcm_entry_count: 4 + lcme_id: 1 + lcme_mirror_id: 0 + lcme_flags: init + lcme_extent.e_start: 0 + lcme_extent.e_end: 67108864 + lmm_stripe_count: 1 + lmm_stripe_size: 1048576 + lmm_pattern: raid0 + lmm_layout_gen: 0 + lmm_stripe_offset: 0 + lmm_objects: + - 0: { l_ost_idx: 0, l_fid: [0x100000000:0x5:0x0] } + + lcme_id: 2 + lcme_mirror_id: 0 + lcme_flags: extension + lcme_extent.e_start: 67108864 + lcme_extent.e_end: 1073741824 + lmm_stripe_count: 0 + lmm_extension_size: 67108864 + lmm_pattern: raid0 + lmm_layout_gen: 0 + lmm_stripe_offset: -1 + + lcme_id: 3 + lcme_mirror_id: 0 + lcme_flags: 0 + lcme_extent.e_start: 1073741824 + lcme_extent.e_end: 1073741824 + lmm_stripe_count: 1 + lmm_stripe_size: 1048576 + lmm_pattern: raid0 + lmm_layout_gen: 0 + lmm_stripe_offset: -1 + + lcme_id: 4 + lcme_mirror_id: 0 + lcme_flags: extension + lcme_extent.e_start: 1073741824 + lcme_extent.e_end: EOF + lmm_stripe_count: 0 + lmm_extension_size: 268435456 + lmm_pattern: raid0 + lmm_layout_gen: 0 + lmm_stripe_offset: -1 +
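The extents in the listing above follow directly from the `-E`/`-z` pairs on the command line. The sketch below reproduces that arithmetic under the rule stated in the text (only the first component is instantiated at creation and is extended once by its extension size). The helper name `sel_initial_layout` and the use of `-1` for EOF are assumptions made for this illustration.

```python
MIB = 1 << 20
GIB = 1 << 30
EOF = -1  # stand-in for "-E -1" (no upper limit); an assumption of this sketch


def sel_initial_layout(specs):
    """Return (kind, start, end) tuples for a list of (end, ext_size) pairs.

    Each declared component becomes an extendable component followed by an
    extension (SEL) component. Only the first extendable component receives
    an initial grant of one ext_size; later ones start out zero-length.
    """
    layout = []
    start = 0
    for index, (end, ext_size) in enumerate(specs):
        if index == 0:
            boundary = start + ext_size
            if end != EOF:
                boundary = min(boundary, end)  # never grow past the declared end
        else:
            boundary = start  # not instantiated yet: zero-length
        layout.append(("extendable", start, boundary))
        layout.append(("extension", boundary, end))
        start = end
    return layout


# Equivalent of: lfs setstripe -E 1G -z 64M -E -1 -z 256M
for kind, start, end in sel_initial_layout([(GIB, 64 * MIB), (EOF, 256 * MIB)]):
    print(kind, start, "EOF" if end == EOF else end)
```

The four tuples correspond to `lcme_id` 1 through 4 in the `lfs getstripe` output, with the same `e_start` and `e_end` values.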
+
+ Create a SEL layout template + Similar to PFL, it is possible to set a SEL layout template to + a directory. After that, all the files created under it will inherit this + layout by default. + # lfs setstripe -E 1G -z 64M -E -1 -z 256M /mnt/lustre/dir +# ./lustre/utils/lfs getstripe /mnt/lustre/dir +/mnt/lustre/dir + lcm_layout_gen: 0 + lcm_mirror_count: 1 + lcm_entry_count: 4 + lcme_id: N/A + lcme_mirror_id: N/A + lcme_flags: 0 + lcme_extent.e_start: 0 + lcme_extent.e_end: 67108864 + stripe_count: 1 stripe_size: 1048576 pattern: raid0 stripe_offset: -1 + + lcme_id: N/A + lcme_mirror_id: N/A + lcme_flags: extension + lcme_extent.e_start: 67108864 + lcme_extent.e_end: 1073741824 + stripe_count: 1 extension_size: 67108864 pattern: raid0 stripe_offset: -1 + + lcme_id: N/A + lcme_mirror_id: N/A + lcme_flags: 0 + lcme_extent.e_start: 1073741824 + lcme_extent.e_end: 1073741824 + stripe_count: 1 stripe_size: 1048576 pattern: raid0 stripe_offset: -1 + + lcme_id: N/A + lcme_mirror_id: N/A + lcme_flags: extension + lcme_extent.e_start: 1073741824 + lcme_extent.e_end: EOF + stripe_count: 1 extension_size: 268435456 pattern: raid0 stripe_offset: -1 + +
+
+
+ <literal>lfs getstripe</literal> + lfs getstripe commands can be used to list the + striping/component information for a given SEL file. Here, only those parameters + new for SEL files are shown. + Command + lfs getstripe +[--extension-size|--ext-size|-z] filename + The -z option is added to print the extension + size in bytes. For composite files this is the extension size of the + first extension component. If a particular component is identified by + other options (--component-id, --component-start, + etc.), the extension size of that component is printed. + Example 1: List the information of a SEL component + + Suppose we already have a composite file + /mnt/lustre/file, created by the following command: + # lfs setstripe -E 1G -z 64M -E -1 -z 256M /mnt/lustre/file + The 2nd component could be listed with the following command: + # lfs getstripe -I2 /mnt/lustre/file +/mnt/lustre/file + lcm_layout_gen: 4 + lcm_mirror_count: 1 + lcm_entry_count: 4 + lcme_id: 2 + lcme_mirror_id: 0 + lcme_flags: extension + lcme_extent.e_start: 67108864 + lcme_extent.e_end: 1073741824 + lmm_stripe_count: 0 + lmm_extension_size: 67108864 + lmm_pattern: raid0 + lmm_layout_gen: 0 + lmm_stripe_offset: -1 + + As you can see, the SEL components are marked by the + extension flag, and the lmm_extension_size field + holds the specified extension size. + Example 2: List the extension size + Having the same file as in the above example, the extension size of + the second component could be listed with: + # lfs getstripe -z -I2 /mnt/lustre/file +67108864 + Example 3: Extension + Having the same file as in the above example, suppose there is a + write which crosses the end of the first component (64M), and then another + write which crosses the new end of the first component (128M). + The layout changes as follows: +
+ Figure. Example: an extension of a SEL file +
+ The layout can be printed out by the following command: + # lfs getstripe /mnt/lustre/file +/mnt/lustre/file + lcm_layout_gen: 6 + lcm_mirror_count: 1 + lcm_entry_count: 4 + lcme_id: 1 + lcme_mirror_id: 0 + lcme_flags: init + lcme_extent.e_start: 0 + lcme_extent.e_end: 201326592 + lmm_stripe_count: 1 + lmm_stripe_size: 1048576 + lmm_pattern: raid0 + lmm_layout_gen: 0 + lmm_stripe_offset: 0 + lmm_objects: + - 0: { l_ost_idx: 0, l_fid: [0x100000000:0x5:0x0] } + + lcme_id: 2 + lcme_mirror_id: 0 + lcme_flags: extension + lcme_extent.e_start: 201326592 + lcme_extent.e_end: 1073741824 + lmm_stripe_count: 0 + lmm_extension_size: 67108864 + lmm_pattern: raid0 + lmm_layout_gen: 0 + lmm_stripe_offset: -1 + + lcme_id: 3 + lcme_mirror_id: 0 + lcme_flags: 0 + lcme_extent.e_start: 1073741824 + lcme_extent.e_end: 1073741824 + lmm_stripe_count: 1 + lmm_stripe_size: 1048576 + lmm_pattern: raid0 + lmm_layout_gen: 0 + lmm_stripe_offset: -1 + + lcme_id: 4 + lcme_mirror_id: 0 + lcme_flags: extension + lcme_extent.e_start: 1073741824 + lcme_extent.e_end: EOF + lmm_stripe_count: 0 + lmm_extension_size: 268435456 + lmm_pattern: raid0 + lmm_layout_gen: 0 + lmm_stripe_offset: -1 + Example 4: Spillover + In the case where OST0 is low on space and an IO + reaches a SEL component, a spillover happens: the full region of the + SEL component is added to the next component. In the example above, + the next layout modification will look like this: +
+ Figure. Example: a spillover in a SEL file +
+ Although the third component was originally [1G, 1G], it is not + instantiated yet, so it is not extended backward. Instead, it is + moved back to the start of the previous SEL component (192M) and + extended by its extension size (256M) from that position, so it becomes + [192M, 448M]. + # lfs getstripe /mnt/lustre/file +/mnt/lustre/file + lcm_layout_gen: 7 + lcm_mirror_count: 1 + lcm_entry_count: 3 + lcme_id: 1 + lcme_mirror_id: 0 + lcme_flags: init + lcme_extent.e_start: 0 + lcme_extent.e_end: 201326592 + lmm_stripe_count: 1 + lmm_stripe_size: 1048576 + lmm_pattern: raid0 + lmm_layout_gen: 0 + lmm_stripe_offset: 0 + lmm_objects: + - 0: { l_ost_idx: 0, l_fid: [0x100000000:0x5:0x0] } + + lcme_id: 3 + lcme_mirror_id: 0 + lcme_flags: init + lcme_extent.e_start: 201326592 + lcme_extent.e_end: 469762048 + lmm_stripe_count: 1 + lmm_stripe_size: 1048576 + lmm_pattern: raid0 + lmm_layout_gen: 0 + lmm_stripe_offset: 1 + lmm_objects: + - 0: { l_ost_idx: 1, l_fid: [0x100010000:0x8:0x0] } + + lcme_id: 4 + lcme_mirror_id: 0 + lcme_flags: extension + lcme_extent.e_start: 469762048 + lcme_extent.e_end: EOF + lmm_stripe_count: 0 + lmm_extension_size: 268435456 + lmm_pattern: raid0 + lmm_layout_gen: 0 + lmm_stripe_offset: -1 + Example 5: Repeating + Suppose that, in the example above, OST0 has + regained enough free space but OST1 is low on space. + The next write to the last SEL component then leads to the allocation of + a new component before the SEL component; it repeats the previous + component layout but is instantiated on free OSTs: +
+ Figure. Example: repeat a SEL component +
+ # lfs getstripe /mnt/lustre/file +/mnt/lustre/file + lcm_layout_gen: 9 + lcm_mirror_count: 1 + lcm_entry_count: 4 + lcme_id: 1 + lcme_mirror_id: 0 + lcme_flags: init + lcme_extent.e_start: 0 + lcme_extent.e_end: 201326592 + lmm_stripe_count: 1 + lmm_stripe_size: 1048576 + lmm_pattern: raid0 + lmm_layout_gen: 0 + lmm_stripe_offset: 0 + lmm_objects: + - 0: { l_ost_idx: 0, l_fid: [0x100000000:0x5:0x0] } + + lcme_id: 3 + lcme_mirror_id: 0 + lcme_flags: init + lcme_extent.e_start: 201326592 + lcme_extent.e_end: 469762048 + lmm_stripe_count: 1 + lmm_stripe_size: 1048576 + lmm_pattern: raid0 + lmm_layout_gen: 0 + lmm_stripe_offset: 1 + lmm_objects: + - 0: { l_ost_idx: 1, l_fid: [0x100010000:0x8:0x0] } + + lcme_id: 8 + lcme_mirror_id: 0 + lcme_flags: init + lcme_extent.e_start: 469762048 + lcme_extent.e_end: 738197504 + lmm_stripe_count: 1 + lmm_stripe_size: 1048576 + lmm_pattern: raid0 + lmm_layout_gen: 65535 + lmm_stripe_offset: 0 + lmm_objects: + - 0: { l_ost_idx: 0, l_fid: [0x100000000:0x6:0x0] } + + lcme_id: 4 + lcme_mirror_id: 0 + lcme_flags: extension + lcme_extent.e_start: 738197504 + lcme_extent.e_end: EOF + lmm_stripe_count: 0 + lmm_extension_size: 268435456 + lmm_pattern: raid0 + lmm_layout_gen: 0 + lmm_stripe_offset: -1 + Example 6: Forced extension + Suppose that, in the example above, both OST0 and + OST1 are low on space. The next write to the + last SEL component then behaves as an extension, because repeating + would not help. +
+ Figure. Example: forced extension in a SEL file +
+ # lfs getstripe /mnt/lustre/file +/mnt/lustre/file + lcm_layout_gen: 11 + lcm_mirror_count: 1 + lcm_entry_count: 4 + lcme_id: 1 + lcme_mirror_id: 0 + lcme_flags: init + lcme_extent.e_start: 0 + lcme_extent.e_end: 201326592 + lmm_stripe_count: 1 + lmm_stripe_size: 1048576 + lmm_pattern: raid0 + lmm_layout_gen: 0 + lmm_stripe_offset: 0 + lmm_objects: + - 0: { l_ost_idx: 0, l_fid: [0x100000000:0x5:0x0] } + + lcme_id: 3 + lcme_mirror_id: 0 + lcme_flags: init + lcme_extent.e_start: 201326592 + lcme_extent.e_end: 469762048 + lmm_stripe_count: 1 + lmm_stripe_size: 1048576 + lmm_pattern: raid0 + lmm_layout_gen: 0 + lmm_stripe_offset: 1 + lmm_objects: + - 0: { l_ost_idx: 1, l_fid: [0x100010000:0x8:0x0] } + + lcme_id: 8 + lcme_mirror_id: 0 + lcme_flags: init + lcme_extent.e_start: 469762048 + lcme_extent.e_end: 1006632960 + lmm_stripe_count: 1 + lmm_stripe_size: 1048576 + lmm_pattern: raid0 + lmm_layout_gen: 65535 + lmm_stripe_offset: 0 + lmm_objects: + - 0: { l_ost_idx: 0, l_fid: [0x100000000:0x6:0x0] } + + lcme_id: 4 + lcme_mirror_id: 0 + lcme_flags: extension + lcme_extent.e_start: 1006632960 + lcme_extent.e_end: EOF + lmm_stripe_count: 0 + lmm_extension_size: 268435456 + lmm_pattern: raid0 + lmm_layout_gen: 0 + lmm_stripe_offset: -1 +
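The component boundaries seen in Examples 3 to 6 can be checked with simple arithmetic. The sketch below replays the sequence using the extension sizes from the `lfs setstripe` command (64M and 256M); the helper names `grant` and `replay_examples` are hypothetical, and the code models only the extent arithmetic, not OST selection.

```python
MIB = 1 << 20


def grant(boundary, ext_size, limit=None):
    """Move a component boundary forward by one extension size.

    `limit` is the end of the SEL component the region is taken from;
    None stands for EOF (no limit).
    """
    new_boundary = boundary + ext_size
    return new_boundary if limit is None else min(new_boundary, limit)


def replay_examples():
    """Replay Examples 3-6 and return the resulting boundaries in bytes."""
    ext_first, ext_last = 64 * MIB, 256 * MIB
    first_limit = 1024 * MIB  # "-E 1G" on the first component pair

    end = grant(0, ext_first, first_limit)        # creation: [0, 64M]
    end = grant(end, ext_first, first_limit)      # Example 3, first write: 128M
    end = grant(end, ext_first, first_limit)      # Example 3, second write: 192M
    results = {"extension": end}

    # Example 4: the next component is moved to 192M and extended by 256M.
    spill_end = grant(end, ext_last)
    results["spillover"] = spill_end

    # Example 5: a repeated component starts where the previous one ends.
    repeat_end = grant(spill_end, ext_last)
    results["repeating"] = repeat_end

    # Example 6: the repeated component itself grows by another 256M.
    results["forced extension"] = grant(repeat_end, ext_last)
    return results


for step, boundary in replay_examples().items():
    print(f"{step}: {boundary} bytes ({boundary // MIB}M)")
```

Each value matches the `lcme_extent.e_end` of the growing component in the corresponding `lfs getstripe` listing.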
+
+ <literal>lfs find</literal> + lfs find commands can be used to search for + the files that match the given SEL component parameters. Here, only + those parameters new for the SEL files are shown. + lfs find +[[!] --extension-size|--ext-size|-z [+-]ext-size[KMG]] +[[!] --component-flags=extension] + The -z option is added to specify the extension + size to search for. The files which have any component whose + extension size matches the given criteria are printed out. As usual, the + “+” and “-” signs can be used to specify a lower and an upper bound on the size, respectively. + + A new extension component flag is added. Only + files which have at least one SEL component are printed. + The negative search for flags finds the files which + have a non-SEL component (not files + which do not have any SEL component). + + Example + # lfs setstripe --extension-size 64M -c 1 -E -1 /mnt/lustre/file + +# lfs find --comp-flags extension /mnt/lustre/* +/mnt/lustre/file + +# lfs find ! --comp-flags extension /mnt/lustre/* +/mnt/lustre/file + +# lfs find -z 64M /mnt/lustre/* +/mnt/lustre/file + +# lfs find -z +64M /mnt/lustre/* + +# lfs find -z -64M /mnt/lustre/* + +# lfs find -z +63M /mnt/lustre/* +/mnt/lustre/file + +# lfs find -z -65M /mnt/lustre/* +/mnt/lustre/file + +# lfs find -z 65M /mnt/lustre/* + +# lfs find ! -z 64M /mnt/lustre/* + +# lfs find ! -z +64M /mnt/lustre/* +/mnt/lustre/file + +# lfs find ! -z -64M /mnt/lustre/* +/mnt/lustre/file + +# lfs find ! -z +63M /mnt/lustre/* + +# lfs find ! -z -65M /mnt/lustre/* + +# lfs find ! -z 65M /mnt/lustre/* +/mnt/lustre/file +
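The results of the `-z` searches above are consistent with a plain strict comparison: `+N` means larger than N, `-N` means smaller than N, and a bare N means equal. The sketch below models just that reading for a single extension size. It is an assumption inferred from the example output, not the `lfs find` implementation (which may, for instance, round sizes to the given unit); `parse_size` and `ext_size_matches` are hypothetical names.

```python
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a size such as '64M' or '1048576' to bytes."""
    text = text.strip()
    suffix = text[-1].upper()
    if suffix in UNITS:
        return int(text[:-1]) * UNITS[suffix]
    return int(text)


def ext_size_matches(ext_size, expr, negate=False):
    """Evaluate a '-z' style expression against one extension size in bytes.

    '+N' -> strictly larger than N, '-N' -> strictly smaller than N,
    'N' -> exactly N. `negate` models a leading '!'.
    """
    if expr.startswith("+"):
        result = ext_size > parse_size(expr[1:])
    elif expr.startswith("-"):
        result = ext_size < parse_size(expr[1:])
    else:
        result = ext_size == parse_size(expr)
    return result != negate


ext = parse_size("64M")  # the file from the example above
for expr in ("64M", "+64M", "-64M", "+63M", "-65M", "65M"):
    print(expr, ext_size_matches(ext, expr), ext_size_matches(ext, expr, negate=True))
```

The first boolean on each line corresponds to whether `lfs find -z expr` printed the file above, and the second to the `!` form.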
+
<indexterm> <primary>space</primary> diff --git a/figures/SEL_Createfile.png b/figures/SEL_Createfile.png new file mode 100644 index 0000000..5ec4471 Binary files /dev/null and b/figures/SEL_Createfile.png differ diff --git a/figures/SEL_extension.png b/figures/SEL_extension.png new file mode 100644 index 0000000..799bc8e Binary files /dev/null and b/figures/SEL_extension.png differ diff --git a/figures/SEL_forced.png b/figures/SEL_forced.png new file mode 100644 index 0000000..f04c2e7 Binary files /dev/null and b/figures/SEL_forced.png differ diff --git a/figures/SEL_repeating.png b/figures/SEL_repeating.png new file mode 100644 index 0000000..4380eef Binary files /dev/null and b/figures/SEL_repeating.png differ diff --git a/figures/SEL_spillover.png b/figures/SEL_spillover.png new file mode 100644 index 0000000..7a8b83a Binary files /dev/null and b/figures/SEL_spillover.png differ