X-Git-Url: https://git.whamcloud.com/?a=blobdiff_plain;f=ManagingStripingFreeSpace.xml;h=68f88db4e3ac3c42cb174a7a7bc94e60b9322caf;hb=654501d481668b43200e7ec5b8a8254353c2160c;hp=26945dd6a97fbd3516c6f0d1b26f93993ee2aa6c;hpb=e6fccfdd0c802fda15ca28de3916a2128f4a431c;p=doc%2Fmanual.git
diff --git a/ManagingStripingFreeSpace.xml b/ManagingStripingFreeSpace.xml
index 26945dd..68f88db 100644
--- a/ManagingStripingFreeSpace.xml
+++ b/ManagingStripingFreeSpace.xml
@@ -20,7 +20,7 @@
-
+
@@ -54,11 +54,14 @@
objects on OSTs with more free space. (This can reduce I/O performance until space usage is
rebalanced again.) For a more detailed description of how striping is allocated, see .
- Files can only be striped over a finite number of OSTs. Prior to Lustre
- software release 2.2, the maximum number of OSTs that a file could be striped across was
- limited to 160. As of Lustre software release 2.2, the maximum number of OSTs is 2000. For
- more information, see .
+ Files can only be striped over a finite number of OSTs, based on the
+ maximum size of the attributes that can be stored on the MDT. If the MDT
+ is ldiskfs-based without the ea_inode feature, a file
+ can be striped across at most 160 OSTs. With a ZFS-based MDT, or if the
+ ea_inode feature is enabled for an ldiskfs-based MDT,
+ a file can be striped across up to 2000 OSTs. For more information, see
+ .
+
@@ -361,6 +364,891 @@ osc.lustre-OST0002-osc.ost_conn_uuid=192.168.20.1@tcp
provided in the section .
+
+
+ striping
+ PFL
+ Progressive File Layout (PFL)
+ The Lustre Progressive File Layout (PFL) feature simplifies the use
+ of Lustre so that users can expect reasonable performance for a variety of
+ normal file IO patterns without needing to understand their IO
+ model or Lustre usage details in advance. In particular, users do not
+ need to know the size or concurrency of output files before
+ creating them, or to specify an optimal layout for
+ each file explicitly, in order to achieve good performance for both highly
+ concurrent shared-single-large-file IO and parallel IO to many smaller
+ per-process files.
+ The layout of a PFL file is stored on disk as a composite
+ layout. A PFL file is essentially an array of
+ sub-layout components, with each sub-layout component
+ being a plain layout covering a different, non-overlapping extent of
+ the file. Because the file layout is composed of a series of
+ components, it is possible that some file extents are
+ not described by any component.
+ An example of how data blocks of PFL files are mapped to OST objects
+ of components is shown in the following PFL object mapping diagram:
+
+ The PFL file in has 3
+ components and shows the mapping for the blocks of a 2055MB file.
+ The stripe size for the first two components is 1MB, while the stripe size
+ for the third component is 4MB. The stripe count increases with each
+ successive component. The first component holds only two 1MB blocks, so its
+ single object has a size of 2MB. The second component holds the next 254MB
+ of the file spread over 4 separate OST objects in RAID-0; each one will
+ have a size of 256MB / 4 objects = 64MB per object. Note the first two
+ objects obj 2,0 and obj 2,1
+ have a 1MB hole at the start where the data is stored in the first
+ component. The final component holds the remaining 1799MB spread over 32 OST
+ objects. There is a 256MB / 32 = 8MB hole at the start of each one for the
+ data stored in the first two components. Each object will be
+ 2048MB / 32 objects = 64MB per object, except
+ obj 3,0, which holds an extra 4MB chunk, and
+ obj 3,1, which holds an extra 3MB chunk. If more data
+ were written to the file, only the objects in component 3 would increase
+ in size.
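+ The object sizes quoted above can be checked with a little arithmetic.
+ The sketch below is illustrative only and is not part of Lustre:
+ obj_size_mb is a hypothetical helper that assumes plain RAID-0 striping,
+ sizes given in whole MB, and that the object in question holds at least
+ one chunk of data inside its component's extent.

```shell
# obj_size_mb LEN_MB STRIPE_MB COUNT INDEX
# Hypothetical helper (not a Lustre tool): size in MB of the object at
# stripe INDEX when LEN_MB of file data, measured from offset 0, is
# striped RAID-0 in STRIPE_MB-sized chunks over COUNT objects.
# For a PFL component, LEN_MB is the smaller of the file size and the
# component end; the result includes the hole left at the start of the
# object for data that lives in earlier components.
obj_size_mb() {
    len=$1; ssize=$2; count=$3; idx=$4
    width=$((ssize * count))              # data held by one full stripe
    full=$((len / width))                 # number of complete stripes
    rem=$((len % width - idx * ssize))    # leftover data reaching this object
    if [ "$rem" -lt 0 ]; then rem=0; fi
    if [ "$rem" -gt "$ssize" ]; then rem=$ssize; fi
    echo $((full * ssize + rem))
}

obj_size_mb 2 1 1 0       # component 1, its single object
obj_size_mb 256 1 4 0     # component 2, obj 2,0
obj_size_mb 2055 4 32 0   # component 3, obj 3,0
obj_size_mb 2055 4 32 1   # component 3, obj 3,1
obj_size_mb 2055 4 32 2   # component 3, obj 3,2
```

+ With the figures from the example, the five calls print 2, 64, 68, 67
+ and 64, matching the 64MB per object plus the extra 4MB and 3MB chunks.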
+ When a file range covered by a defined but not yet instantiated component is
+ accessed, the client sends a Layout Intent RPC to the MDT, and the MDT
+ instantiates the objects of the components covering that range.
+
+ The following sections introduce the commands for working with PFL files and
+ illustrate some possible composite layouts.
+ Lustre provides commands
+ lfs setstripe and lfs migrate for
+ users to operate on PFL files. lfs setstripe commands
+ are used to create PFL files and to add or delete components to or from an
+ existing composite file; lfs migrate commands are used
+ to re-layout the data in existing files using new layout parameters by
+ copying the data from the existing OST(s) to the new OST(s). Also,
+ as introduced in the previous sections, lfs getstripe
+ commands can be used to list the striping/component information for a
+ given PFL file, and lfs find commands can be used to
+ search the directory tree rooted at the given directory or file name for
+ the files that match the given PFL component parameters.
+ Using PFL files requires both the client and server to
+ understand the PFL file layout, which isn't available in Lustre 2.9 and
+ earlier. This does not prevent older clients from accessing non-PFL
+ files in the filesystem.
+
+ lfs setstripe
+ lfs setstripe commands are used to create PFL
+ files and to add or delete components to or from an existing composite file.
+ (The following examples assume 8 OSTs and a default stripe size of
+ 1MB.)
+
+ Create a PFL file
+ Command
+ lfs setstripe
+[--component-end|-E end1] [STRIPE_OPTIONS]
+[--component-end|-E end2] [STRIPE_OPTIONS] ... filename
+ The -E option is used to specify the end offset
+ (in bytes or using a suffix "kMGTP", e.g. 256M) of each component, and
+ it also indicates that the following STRIPE_OPTIONS are
+ for this component. Each component defines the stripe pattern of the
+ file in the range of [start, end). The first component must start from
+ offset 0 and all components must be adjacent to each other; no holes
+ are allowed, so each extent starts at the end of the previous extent.
+ An end offset of -1 or eof indicates
+ that this is the last component, extending to the end of file.
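+ The way a list of -E end offsets partitions a file into adjacent
+ [start, end) extents can be sketched in a few lines of shell. The helpers
+ to_bytes and extents below are hypothetical and not part of lfs; they
+ assume binary (1024-based) multipliers, which is consistent with 4M being
+ reported as 4194304 by lfs getstripe, and they do not check that the
+ offsets are increasing.

```shell
# to_bytes SIZE: convert "4M", "256k", "4194304", ... to bytes, assuming
# binary (1024-based) multipliers; -1 and eof map to the string EOF.
to_bytes() {
    case $1 in
        -1|eof) echo EOF; return 0 ;;
    esac
    num=${1%[kMGTP]}                      # strip a trailing unit suffix
    case $1 in
        *k) echo $((num * 1024)) ;;
        *M) echo $((num * 1024 * 1024)) ;;
        *G) echo $((num * 1024 * 1024 * 1024)) ;;
        *T) echo $((num * 1024 * 1024 * 1024 * 1024)) ;;
        *P) echo $((num * 1024 * 1024 * 1024 * 1024 * 1024)) ;;
        *)  echo "$num" ;;
    esac
}

# extents END...: print one "[start, end)" line per component. Each
# extent starts where the previous one ended, so no holes can appear.
extents() {
    start=0
    for end in "$@"; do
        end_b=$(to_bytes "$end")
        echo "[$start, $end_b)"
        start=$end_b
    done
}

extents 4M 64M -1
```

+ For the offsets 4M, 64M and -1 this prints [0, 4194304),
+ [4194304, 67108864) and [67108864, EOF), one per line.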
+ Example
+ $ lfs setstripe -E 4M -c 1 -E 64M -c 4 -E -1 -c -1 -i 4 \
+/mnt/testfs/create_comp
+ This command creates a file with the composite layout illustrated in
+ the following figure. The first component has 1 stripe and covers
+ [0, 4M), the second component has 4 stripes and covers [4M, 64M), and
+ the last component starts at OST4, stripes across all available
+ OSTs, and covers [64M, EOF).
+
+ The composite layout can be output by the following command:
+ $ lfs getstripe /mnt/testfs/create_comp
+/mnt/testfs/create_comp
+ lcm_layout_gen: 3
+ lcm_entry_count: 3
+ lcme_id: 1
+ lcme_flags: init
+ lcme_extent.e_start: 0
+ lcme_extent.e_end: 4194304
+ lmm_stripe_count: 1
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: 0
+ lmm_objects:
+ - 0: { l_ost_idx: 0, l_fid: [0x100000000:0x2:0x0] }
+
+ lcme_id: 2
+ lcme_flags: 0
+ lcme_extent.e_start: 4194304
+ lcme_extent.e_end: 67108864
+ lmm_stripe_count: 4
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: -1
+ lcme_id: 3
+ lcme_flags: 0
+ lcme_extent.e_start: 67108864
+ lcme_extent.e_end: EOF
+ lmm_stripe_count: -1
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: 4
+ Only the first component's OST objects of the PFL file are
+ instantiated when the layout is being set. Instantiation of the other
+ components is delayed until later write/truncate operations.
+ If we write 128MB of data to this PFL file, the second and third
+ components will be instantiated:
+ $ dd if=/dev/zero of=/mnt/testfs/create_comp bs=1M count=128
+$ lfs getstripe /mnt/testfs/create_comp
+/mnt/testfs/create_comp
+ lcm_layout_gen: 5
+ lcm_entry_count: 3
+ lcme_id: 1
+ lcme_flags: init
+ lcme_extent.e_start: 0
+ lcme_extent.e_end: 4194304
+ lmm_stripe_count: 1
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: 0
+ lmm_objects:
+ - 0: { l_ost_idx: 0, l_fid: [0x100000000:0x2:0x0] }
+
+ lcme_id: 2
+ lcme_flags: init
+ lcme_extent.e_start: 4194304
+ lcme_extent.e_end: 67108864
+ lmm_stripe_count: 4
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: 1
+ lmm_objects:
+ - 0: { l_ost_idx: 1, l_fid: [0x100010000:0x2:0x0] }
+ - 1: { l_ost_idx: 2, l_fid: [0x100020000:0x2:0x0] }
+ - 2: { l_ost_idx: 3, l_fid: [0x100030000:0x2:0x0] }
+ - 3: { l_ost_idx: 4, l_fid: [0x100040000:0x2:0x0] }
+
+ lcme_id: 3
+ lcme_flags: init
+ lcme_extent.e_start: 67108864
+ lcme_extent.e_end: EOF
+ lmm_stripe_count: 8
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: 4
+ lmm_objects:
+ - 0: { l_ost_idx: 4, l_fid: [0x100040000:0x3:0x0] }
+ - 1: { l_ost_idx: 5, l_fid: [0x100050000:0x2:0x0] }
+ - 2: { l_ost_idx: 6, l_fid: [0x100060000:0x2:0x0] }
+ - 3: { l_ost_idx: 7, l_fid: [0x100070000:0x2:0x0] }
+ - 4: { l_ost_idx: 0, l_fid: [0x100000000:0x3:0x0] }
+ - 5: { l_ost_idx: 1, l_fid: [0x100010000:0x3:0x0] }
+ - 6: { l_ost_idx: 2, l_fid: [0x100020000:0x3:0x0] }
+ - 7: { l_ost_idx: 3, l_fid: [0x100030000:0x3:0x0] }
+
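+ The delayed instantiation shown in the two listings above can be modelled
+ with a short sketch. The helper instantiated is hypothetical and not part
+ of lfs; it assumes a single sequential write starting at offset 0 and
+ reports component positions rather than lcme_id values.

```shell
# instantiated WRITTEN_BYTES END...
# Hypothetical helper (not part of lfs): given the number of bytes
# written sequentially from offset 0 and the component end offsets in
# bytes (use EOF for the last one), print the 1-based positions of the
# components expected to carry the "init" flag. The first component is
# always listed because its objects are created together with the layout.
instantiated() {
    written=$1; shift
    start=0; pos=1; out=""
    for end in "$@"; do
        # A later component is needed once data reaches past its start.
        if [ "$pos" -eq 1 ] || [ "$written" -gt "$start" ]; then
            out="$out $pos"
        fi
        if [ "$end" = EOF ]; then break; fi
        start=$end
        pos=$((pos + 1))
    done
    echo "${out# }"
}

instantiated 0 4194304 67108864 EOF           # right after lfs setstripe
instantiated 134217728 4194304 67108864 EOF   # after writing 128MB
```

+ The two calls print "1" and "1 2 3", which agrees with the lcme_flags
+ values in the lfs getstripe output before and after the dd command.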
+
+ Add component(s) to an existing composite file
+ Command
+ lfs setstripe --component-add
+[--component-end|-E end1] [STRIPE_OPTIONS]
+[--component-end|-E end2] [STRIPE_OPTIONS] ... filename
+ The option --component-add is used to add
+ components to an existing composite file. The extent start of
+ the first component to be added is equal to the extent end of the last
+ component in the existing file, and all components to be added must
+ be adjacent to each other.
+ If the last existing component is specified by
+ -E -1 or -E eof, which covers
+ to the end of the file, it must be deleted before a new one is added.
+
+ Example
+ $ lfs setstripe -E 4M -c 1 -E 64M -c 4 /mnt/testfs/add_comp
+$ lfs setstripe --component-add -E -1 -c 4 -o 6-7,0,5 \
+/mnt/testfs/add_comp
+ This command adds a new component which extends from the end of
+ the last existing component to the end of file. The layout of this
+ example is illustrated in
+ . The last component
+ stripes across 4 OSTs in the sequence OST6, OST7, OST0 and OST5, and covers
+ [64M, EOF).
+
+ The layout can be printed out by the following command:
+ $ lfs getstripe /mnt/testfs/add_comp
+/mnt/testfs/add_comp
+ lcm_layout_gen: 5
+ lcm_entry_count: 3
+ lcme_id: 1
+ lcme_flags: init
+ lcme_extent.e_start: 0
+ lcme_extent.e_end: 4194304
+ lmm_stripe_count: 1
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: 0
+ lmm_objects:
+ - 0: { l_ost_idx: 0, l_fid: [0x100000000:0x2:0x0] }
+
+ lcme_id: 2
+ lcme_flags: init
+ lcme_extent.e_start: 4194304
+ lcme_extent.e_end: 67108864
+ lmm_stripe_count: 4
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: 1
+ lmm_objects:
+ - 0: { l_ost_idx: 1, l_fid: [0x100010000:0x2:0x0] }
+ - 1: { l_ost_idx: 2, l_fid: [0x100020000:0x2:0x0] }
+ - 2: { l_ost_idx: 3, l_fid: [0x100030000:0x2:0x0] }
+ - 3: { l_ost_idx: 4, l_fid: [0x100040000:0x2:0x0] }
+
+ lcme_id: 5
+ lcme_flags: 0
+ lcme_extent.e_start: 67108864
+ lcme_extent.e_end: EOF
+ lmm_stripe_count: 4
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: -1
+ The component ID "lcme_id" changes as layout generation
+ changes. It is not necessarily sequential and does not imply ordering
+ of individual components.
+ Similar to specifying a full-file composite layout at file
+ creation time, --component-add won't instantiate
+ OST objects; the instantiation is delayed until later write/truncate
+ operations. For example, after writing beyond the 64MB start of the
+ file's last component, the new component has had objects allocated:
+
+ $ lfs getstripe -I5 /mnt/testfs/add_comp
+/mnt/testfs/add_comp
+ lcm_layout_gen: 6
+ lcm_entry_count: 3
+ lcme_id: 5
+ lcme_flags: init
+ lcme_extent.e_start: 67108864
+ lcme_extent.e_end: EOF
+ lmm_stripe_count: 4
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: 6
+ lmm_objects:
+ - 0: { l_ost_idx: 6, l_fid: [0x100060000:0x4:0x0] }
+ - 1: { l_ost_idx: 7, l_fid: [0x100070000:0x4:0x0] }
+ - 2: { l_ost_idx: 0, l_fid: [0x100000000:0x5:0x0] }
+ - 3: { l_ost_idx: 5, l_fid: [0x100050000:0x4:0x0] }
+
+
+ Delete component(s) from an existing file
+ Command
+ lfs setstripe --component-del
+[--component-id|-I comp_id | --component-flags comp_flags]
+filename
+ The option --component-del is used to remove
+ the component(s) specified by component ID or flags from an existing
+ file. Any data stored in the deleted
+ component will be lost.
+ The ID specified by the -I option is the unique numerical
+ ID of the component, which can be obtained with the
+ lfs getstripe -I command, and the flag specified by the
+ --component-flags option selects a certain type of
+ component, which can be listed with
+ lfs getstripe --component-flags. For now, there are only
+ two flags, init and ^init,
+ for instantiated and un-instantiated components respectively.
+ Deletion must start with the last component because holes are
+ not allowed.
+ Example
+ $ lfs getstripe -I /mnt/testfs/del_comp
+1
+2
+5
+$ lfs setstripe --component-del -I 5 /mnt/testfs/del_comp
+ This example deletes the component with ID 5 from the file
+ /mnt/testfs/del_comp. Continuing from the previous
+ example, the final result is illustrated in
+ .
+
+ If you try to delete a non-last component, you will see the
+ following error:
+ $ lfs setstripe --component-del -I 2 /mnt/testfs/del_comp
+Delete component 0x2 from /mnt/testfs/del_comp failed. Invalid argument
+error: setstripe: delete component of file '/mnt/testfs/del_comp' failed: Invalid argument
+
+
+ Set default PFL layout to an existing directory
+ Similar to creating a PFL file, you can set a default PFL layout on
+ an existing directory. After that, all files created in the directory will
+ inherit this layout by default.
+ Command
+ lfs setstripe
+[--component-end|-E end1] [STRIPE_OPTIONS]
+[--component-end|-E end2] [STRIPE_OPTIONS] ... dirname
+ Example
+ $ mkdir /mnt/testfs/pfldir
+$ touch /mnt/testfs/pfldir/commonfile
+$ lfs setstripe -E 64M -c 2 -i 0 -E -1 -c 4 -i 0 /mnt/testfs/pfldir
+ When you run lfs getstripe, you will see:
+
+ $ lfs getstripe /mnt/testfs/pfldir
+/mnt/testfs/pfldir
+ lcm_layout_gen: 0
+ lcm_entry_count: 2
+ lcme_id: N/A
+ lcme_flags: 0
+ lcme_extent.e_start: 0
+ lcme_extent.e_end: 67108864
+ stripe_count: 2 stripe_size: 1048576 stripe_offset: 0
+ lcme_id: N/A
+ lcme_flags: 0
+ lcme_extent.e_start: 67108864
+ lcme_extent.e_end: EOF
+ stripe_count: 4 stripe_size: 1048576 stripe_offset: 0
+/mnt/testfs/pfldir/commonfile
+lmm_stripe_count: 1
+lmm_stripe_size: 1048576
+lmm_pattern: 1
+lmm_layout_gen: 0
+lmm_stripe_offset: 0
+ obdidx objid objid group
+ 0 9 0x9 0
+ If you create a file under /mnt/testfs/pfldir,
+ the layout of that file will inherit 2 components from its parent
+ directory:
+ $ touch /mnt/testfs/pfldir/pflfile
+$ lfs getstripe /mnt/testfs/pfldir/pflfile
+/mnt/testfs/pfldir/pflfile
+ lcm_layout_gen: 2
+ lcm_entry_count: 2
+ lcme_id: 1
+ lcme_flags: init
+ lcme_extent.e_start: 0
+ lcme_extent.e_end: 67108864
+ lmm_stripe_count: 2
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: 0
+ lmm_objects:
+ - 0: { l_ost_idx: 0, l_fid: [0x100000000:0xa:0x0] }
+ - 1: { l_ost_idx: 1, l_fid: [0x100010000:0x9:0x0] }
+
+ lcme_id: 2
+ lcme_flags: 0
+ lcme_extent.e_start: 67108864
+ lcme_extent.e_end: EOF
+ lmm_stripe_count: 4
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: 0
+
+ lfs setstripe --component-add/del can't be run
+ on a directory, because the default layout of a directory is like a config,
+ which can be arbitrarily changed by lfs setstripe,
+ while the layout of a file may have data (OST objects) attached. If you want
+ to delete the default layout of a directory, run lfs
+ setstripe -d dirname, for example:
+ $ lfs setstripe -d /mnt/testfs/pfldir
+$ lfs getstripe -d /mnt/testfs/pfldir
+/mnt/testfs/pfldir
+stripe_count: 1 stripe_size: 1048576 stripe_offset: -1
+/mnt/testfs/pfldir/commonfile
+lmm_stripe_count: 1
+lmm_stripe_size: 1048576
+lmm_pattern: 1
+lmm_layout_gen: 0
+lmm_stripe_offset: 0
+ obdidx objid objid group
+ 0 9 0x9 0
+
+
+
+
+ lfs migrate
+ lfs migrate commands are used to re-layout the
+ data in the existing files with the new layout parameter by copying the
+ data from the existing OST(s) to the new OST(s).
+ Command
+ lfs migrate [--component-end|-E comp_end] [STRIPE_OPTIONS] ...
+filename
+ The difference between migrate and
+ setstripe is that migrate
+ re-lays out the data in existing files, while
+ setstripe creates new files with the specified
+ layout.
+ Example
+ Case1. Migrate a normal layout to a composite
+ layout
+ $ lfs setstripe -c 1 -S 128K /mnt/testfs/norm_to_2comp
+$ dd if=/dev/urandom of=/mnt/testfs/norm_to_2comp bs=1M count=5
+$ lfs getstripe /mnt/testfs/norm_to_2comp --yaml
+/mnt/testfs/norm_to_2comp
+lmm_stripe_count: 1
+lmm_stripe_size: 131072
+lmm_pattern: 1
+lmm_layout_gen: 0
+lmm_stripe_offset: 7
+lmm_objects:
+ - l_ost_idx: 7
+ l_fid: 0x100070000:0x2:0x0
+$ lfs migrate -E 1M -S 512K -c 1 -E -1 -S 1M -c 2 \
+/mnt/testfs/norm_to_2comp
+ In this example, a 5MB file with 1 stripe and a 128K stripe size
+ is migrated to a composite layout file with 2 components, illustrated in
+ .
+
+ The stripe information after migration looks like:
+ $ lfs getstripe /mnt/testfs/norm_to_2comp
+/mnt/testfs/norm_to_2comp
+ lcm_layout_gen: 4
+ lcm_entry_count: 2
+ lcme_id: 1
+ lcme_flags: init
+ lcme_extent.e_start: 0
+ lcme_extent.e_end: 1048576
+ lmm_stripe_count: 1
+ lmm_stripe_size: 524288
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: 0
+ lmm_objects:
+ - 0: { l_ost_idx: 0, l_fid: [0x100000000:0x2:0x0] }
+
+ lcme_id: 2
+ lcme_flags: init
+ lcme_extent.e_start: 1048576
+ lcme_extent.e_end: EOF
+ lmm_stripe_count: 2
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: 2
+ lmm_objects:
+ - 0: { l_ost_idx: 2, l_fid: [0x100020000:0x2:0x0] }
+ - 1: { l_ost_idx: 3, l_fid: [0x100030000:0x2:0x0] }
+ Case2. Migrate a composite layout to another
+ composite layout
+ $ lfs setstripe -E 1M -S 512K -c 1 -E -1 -S 1M -c 2 \
+/mnt/testfs/2comp_to_3comp
+$ dd if=/dev/urandom of=/mnt/testfs/2comp_to_3comp bs=1M count=5
+$ lfs migrate -E 1M -S 1M -c 2 -E 4M -S 1M -c 2 -E -1 -S 3M -c 3 \
+/mnt/testfs/2comp_to_3comp
+ In this example, a composite layout file with 2 components is
+ migrated to a composite layout file with 3 components. Continuing
+ from the example in Case1, the migration process is illustrated in
+ .
+
+ The stripe information looks like:
+ $ lfs getstripe /mnt/testfs/2comp_to_3comp
+/mnt/testfs/2comp_to_3comp
+ lcm_layout_gen: 6
+ lcm_entry_count: 3
+ lcme_id: 1
+ lcme_flags: init
+ lcme_extent.e_start: 0
+ lcme_extent.e_end: 1048576
+ lmm_stripe_count: 2
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: 4
+ lmm_objects:
+ - 0: { l_ost_idx: 4, l_fid: [0x100040000:0x2:0x0] }
+ - 1: { l_ost_idx: 5, l_fid: [0x100050000:0x2:0x0] }
+
+ lcme_id: 2
+ lcme_flags: init
+ lcme_extent.e_start: 1048576
+ lcme_extent.e_end: 4194304
+ lmm_stripe_count: 2
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: 6
+ lmm_objects:
+ - 0: { l_ost_idx: 6, l_fid: [0x100060000:0x2:0x0] }
+ - 1: { l_ost_idx: 7, l_fid: [0x100070000:0x3:0x0] }
+
+ lcme_id: 3
+ lcme_flags: init
+ lcme_extent.e_start: 4194304
+ lcme_extent.e_end: EOF
+ lmm_stripe_count: 3
+ lmm_stripe_size: 3145728
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: 0
+ lmm_objects:
+ - 0: { l_ost_idx: 0, l_fid: [0x100000000:0x3:0x0] }
+ - 1: { l_ost_idx: 1, l_fid: [0x100010000:0x2:0x0] }
+ - 2: { l_ost_idx: 2, l_fid: [0x100020000:0x3:0x0] }
+ Case3. Migrate a composite layout to a
+ normal one
+ $ lfs setstripe -E 1M -S 1M -c 2 -E 4M -S 1M -c 2 -E -1 -S 3M -c 3 \
+/mnt/testfs/3comp_to_norm
+$ dd if=/dev/urandom of=/mnt/testfs/3comp_to_norm bs=1M count=5
+$ lfs migrate -c 2 -S 2M /mnt/testfs/3comp_to_norm
+ In this example, a composite file with 3 components is migrated to
+ a normal file with 2 stripes and 2M stripe size. If we still use the
+ example in Case2, the migration process is illustrated in
+ .
+
+ The stripe information looks like:
+ $ lfs getstripe /mnt/testfs/3comp_to_norm --yaml
+/mnt/testfs/3comp_to_norm
+lmm_stripe_count: 2
+lmm_stripe_size: 2097152
+lmm_pattern: 1
+lmm_layout_gen: 7
+lmm_stripe_offset: 4
+lmm_objects:
+ - l_ost_idx: 4
+ l_fid: 0x100040000:0x3:0x0
+ - l_ost_idx: 5
+ l_fid: 0x100050000:0x3:0x0
+
+
+ lfs getstripe
+ lfs getstripe commands can be used to list the
+ striping/component information for a given PFL file. Here, only those
+ parameters new for PFL files are shown.
+ Command
+ lfs getstripe
+[--component-id|-I [comp_id]]
+[--component-flags [comp_flags]]
+[--component-count]
+[--component-start [+-][N][kMGTPE]]
+[--component-end|-E [+-][N][kMGTPE]]
+dirname|filename
+ Example
+ Suppose we already have a composite file
+ /mnt/testfs/3comp, created by the following
+ command:
+ $ lfs setstripe -E 4M -c 1 -E 64M -c 4 -E -1 -c -1 -i 4 \
+/mnt/testfs/3comp
+ Then write some data to it:
+ $ dd if=/dev/zero of=/mnt/testfs/3comp bs=1M count=5
+ Case1. List component ID and its related
+ information
+
+
+ List the IDs of all components
+ $ lfs getstripe -I /mnt/testfs/3comp
+1
+2
+3
+
+
+ List the detailed striping information of component ID=2
+ $ lfs getstripe -I2 /mnt/testfs/3comp
+/mnt/testfs/3comp
+ lcm_layout_gen: 4
+ lcm_entry_count: 3
+ lcme_id: 2
+ lcme_flags: init
+ lcme_extent.e_start: 4194304
+ lcme_extent.e_end: 67108864
+ lmm_stripe_count: 4
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: 5
+ lmm_objects:
+ - 0: { l_ost_idx: 5, l_fid: [0x100050000:0x2:0x0] }
+ - 1: { l_ost_idx: 6, l_fid: [0x100060000:0x2:0x0] }
+ - 2: { l_ost_idx: 7, l_fid: [0x100070000:0x2:0x0] }
+ - 3: { l_ost_idx: 0, l_fid: [0x100000000:0x2:0x0] }
+
+
+ List the stripe offset and stripe count of component ID=2
+ $ lfs getstripe -I2 -i -c /mnt/testfs/3comp
+ lmm_stripe_count: 4
+ lmm_stripe_offset: 5
+
+
+ Case2. List the component which contains the
+ specified flag
+
+
+ List the flag of each component
+ $ lfs getstripe --component-flags -I /mnt/testfs/3comp
+ lcme_id: 1
+ lcme_flags: init
+ lcme_id: 2
+ lcme_flags: init
+ lcme_id: 3
+ lcme_flags: 0
+
+
+ List the component(s) that are not instantiated
+ $ lfs getstripe --component-flags=^init /mnt/testfs/3comp
+/mnt/testfs/3comp
+ lcm_layout_gen: 4
+ lcm_entry_count: 3
+ lcme_id: 3
+ lcme_flags: 0
+ lcme_extent.e_start: 67108864
+ lcme_extent.e_end: EOF
+ lmm_stripe_count: -1
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 4
+ lmm_stripe_offset: 4
+
+
+ Case3. List the total number of all the
+ component(s)
+
+
+ List the total number of all the components
+ $ lfs getstripe --component-count /mnt/testfs/3comp
+3
+
+
+ Case4. List the component with the specified
+ extent start or end positions
+
+
+ List the start position in bytes of each component
+ $ lfs getstripe --component-start /mnt/testfs/3comp
+0
+4194304
+67108864
+
+
+ List the start position in bytes of component ID=3
+ $ lfs getstripe --component-start -I3 /mnt/testfs/3comp
+67108864
+
+
+ List the component with start = 64M
+ $ lfs getstripe --component-start=64M /mnt/testfs/3comp
+/mnt/testfs/3comp
+ lcm_layout_gen: 4
+ lcm_entry_count: 3
+ lcme_id: 3
+ lcme_flags: 0
+ lcme_extent.e_start: 67108864
+ lcme_extent.e_end: EOF
+ lmm_stripe_count: -1
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 4
+ lmm_stripe_offset: 4
+
+
+ List the component(s) with start > 5M
+ $ lfs getstripe --component-start=+5M /mnt/testfs/3comp
+/mnt/testfs/3comp
+ lcm_layout_gen: 4
+ lcm_entry_count: 3
+ lcme_id: 3
+ lcme_flags: 0
+ lcme_extent.e_start: 67108864
+ lcme_extent.e_end: EOF
+ lmm_stripe_count: -1
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 4
+ lmm_stripe_offset: 4
+
+
+ List the component(s) with start < 5M
+ $ lfs getstripe --component-start=-5M /mnt/testfs/3comp
+/mnt/testfs/3comp
+ lcm_layout_gen: 4
+ lcm_entry_count: 3
+ lcme_id: 1
+ lcme_flags: init
+ lcme_extent.e_start: 0
+ lcme_extent.e_end: 4194304
+ lmm_stripe_count: 1
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: 4
+ lmm_objects:
+ - 0: { l_ost_idx: 4, l_fid: [0x100040000:0x2:0x0] }
+
+ lcme_id: 2
+ lcme_flags: init
+ lcme_extent.e_start: 4194304
+ lcme_extent.e_end: 67108864
+ lmm_stripe_count: 4
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: 5
+ lmm_objects:
+ - 0: { l_ost_idx: 5, l_fid: [0x100050000:0x2:0x0] }
+ - 1: { l_ost_idx: 6, l_fid: [0x100060000:0x2:0x0] }
+ - 2: { l_ost_idx: 7, l_fid: [0x100070000:0x2:0x0] }
+ - 3: { l_ost_idx: 0, l_fid: [0x100000000:0x2:0x0] }
+
+
+ List the component(s) with start > 3M and end < 70M
+ $ lfs getstripe --component-start=+3M --component-end=-70M \
+/mnt/testfs/3comp
+/mnt/testfs/3comp
+ lcm_layout_gen: 4
+ lcm_entry_count: 3
+ lcme_id: 2
+ lcme_flags: init
+ lcme_extent.e_start: 4194304
+ lcme_extent.e_end: 67108864
+ lmm_stripe_count: 4
+ lmm_stripe_size: 1048576
+ lmm_pattern: 1
+ lmm_layout_gen: 0
+ lmm_stripe_offset: 5
+ lmm_objects:
+ - 0: { l_ost_idx: 5, l_fid: [0x100050000:0x2:0x0] }
+ - 1: { l_ost_idx: 6, l_fid: [0x100060000:0x2:0x0] }
+ - 2: { l_ost_idx: 7, l_fid: [0x100070000:0x2:0x0] }
+ - 3: { l_ost_idx: 0, l_fid: [0x100000000:0x2:0x0] }
+
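+ The comparison prefixes used with --component-start and --component-end
+ in the examples above can be summarised in a small sketch. The helper
+ match_offset is hypothetical and not part of lfs; it only mirrors the
+ behaviour visible in the examples (+N for greater than, -N for less than,
+ bare N for an exact match) and, for brevity, handles only byte counts and
+ the M suffix, taken here to mean binary megabytes.

```shell
# match_offset SPEC VALUE
# Hypothetical helper (not part of lfs): succeed if VALUE (in bytes)
# satisfies SPEC, where "+N" means greater than N, "-N" means less than
# N, and a bare N means equal to N.
match_offset() {
    spec=$1; value=$2
    case $spec in
        +*) op=-gt; n=${spec#+} ;;
        -*) op=-lt; n=${spec#-} ;;
        *)  op=-eq; n=$spec ;;
    esac
    case $n in
        *M) n=$(( ${n%M} * 1024 * 1024 )) ;;   # assumed binary megabytes
    esac
    [ "$value" "$op" "$n" ]
}

# Component start offsets of /mnt/testfs/3comp, taken from the listings.
for start in 0 4194304 67108864; do
    if match_offset +5M "$start"; then
        echo "start > 5M: $start"
    fi
done
```

+ Only the third component starts beyond 5M, so the loop prints a single
+ line, "start > 5M: 67108864", in line with the --component-start=+5M
+ example.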
+
+
+
+ lfs find
+ lfs find commands can be used to search the
+ directory tree rooted at the given directory or file name for the files
+ that match the given PFL component parameters. Here, only those
+ parameters new for PFL files are shown. Their usages are similar to
+ lfs getstripe commands.
+ Command
+ lfs find directory|filename
+[[!] --component-count [+-=]comp_cnt]
+[[!] --component-start [+-=]N[kMGTPE]]
+[[!] --component-end|-E [+-=]N[kMGTPE]]
+[[!] --component-flags=comp_flags]
+ If you use --component-xxx options, only
+ the composite files will be searched; but if you use
+ ! --component-xxx options, all the files will be
+ searched.
+ Example
+ We use the following directory and composite files to show how
+ lfs find works.
+ $ mkdir /mnt/testfs/testdir
+$ lfs setstripe -E 1M -E 10M -E eof /mnt/testfs/testdir/3comp
+$ lfs setstripe -E 4M -E 20M -E 30M -E eof /mnt/testfs/testdir/4comp
+$ mkdir -p /mnt/testfs/testdir/dir_3comp
+$ lfs setstripe -E 6M -E 30M -E eof /mnt/testfs/testdir/dir_3comp
+$ lfs setstripe -E 8M -E eof /mnt/testfs/testdir/dir_3comp/2comp
+$ lfs setstripe -c 1 /mnt/testfs/testdir/dir_3comp/commonfile
+ Case1. Find the files that match the specified
+ component count condition
+ Find the files under directory /mnt/testfs/testdir whose number of
+ components is not equal to 3.
+ $ lfs find /mnt/testfs/testdir ! --component-count=3
+/mnt/testfs/testdir
+/mnt/testfs/testdir/4comp
+/mnt/testfs/testdir/dir_3comp/2comp
+/mnt/testfs/testdir/dir_3comp/commonfile
+ Case2. Find the files/dirs that match the
+ specified component start/end condition
+ Find the file(s) under directory /mnt/testfs/testdir with component
+ start = 4M and end < 30M
+ $ lfs find /mnt/testfs/testdir --component-start=4M -E -30M
+/mnt/testfs/testdir/4comp
+ Case3. Find the files/dirs that match the
+ specified component flag condition
+ Find the file(s) under directory /mnt/testfs/testdir whose component
+ flags contain init
+ $ lfs find /mnt/testfs/testdir --component-flags=init
+/mnt/testfs/testdir/3comp
+/mnt/testfs/testdir/4comp
+/mnt/testfs/testdir/dir_3comp/2comp
+ Since lfs find uses
+ "!" to perform negative searches, the
+ flag ^init is not supported here.
+
+ space
@@ -585,24 +1473,31 @@ File 4: OST6, OST7, OST0
-
+ striping
+ wide striping
+ wide striping
+ Lustre Striping Internals
- For Lustre releases prior to Lustre software release 2.2, files can be striped across a
- maximum of 160 OSTs. Lustre inodes use an extended attribute to record the location of each
- object (the object ID and the number of the OST on which it is stored). The size of the
- extended attribute limits the maximum stripe count to 160 objects.
- In Lustre software release 2.2 and subsequent releases, the maximum number
- of OSTs over which files can be striped has been raised to 2000 by allocating a new block on
- which to store the extended attribute that holds the object information. This feature, known
- as "wide striping," only allocates the additional extended attribute data block if the file is
- striped with a stripe count greater than 160. The file layout (object ID, OST number) is
- stored on the new data block with a pointer to this block stored in the original Lustre inode
- for the file. For files smaller than 160 objects, the Lustre inode is used to store the file
- layout.
+ Individual files can only be striped over a finite number of OSTs,
+ based on the maximum size of the attributes that can be stored on the MDT.
+ If the MDT is ldiskfs-based without the ea_inode
+ feature, a file can be striped across at most 160 OSTs. With ZFS-based
+ MDTs, or if the ea_inode feature is enabled for an
+ ldiskfs-based MDT, a file can be striped across up to 2000 OSTs.
+
+ Lustre inodes use an extended attribute to record on which OST each
+ object is located, and the identifier of each object on that OST. The size of
+ the extended attribute is a function of the number of stripes.
+ If using an ldiskfs-based MDT, the maximum number of OSTs over which
+ files can be striped can be raised to 2000 by enabling the
+ ea_inode feature on the MDT:
+ tune2fs -O ea_inode /dev/mdtdev
+
+ The maximum stripe count for a single file does not limit the
+ maximum number of OSTs that are in the filesystem as a whole, only the
+ maximum possible size and maximum aggregate bandwidth for the file.
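+ Before relying on wide striping with an ldiskfs-based MDT, it can be
+ useful to check whether ea_inode is already enabled. The sketch below
+ assumes the feature list is read from the "Filesystem features:" line
+ printed by dumpe2fs -h (from e2fsprogs); has_ea_inode and max_stripes are
+ hypothetical helpers, and /dev/mdtdev is the same placeholder device
+ name used above.

```shell
# has_ea_inode: read dumpe2fs -h style output on stdin and succeed if
# the "Filesystem features:" line lists ea_inode.
has_ea_inode() {
    grep -E '^Filesystem features:' | grep -qw ea_inode
}

# max_stripes: print the per-file stripe limit quoted in this section
# for an ldiskfs-based MDT, depending on whether ea_inode is enabled.
max_stripes() {
    if has_ea_inode; then
        echo 2000
    else
        echo 160
    fi
}

# Example (needs read access to a real device; /dev/mdtdev is a placeholder):
#   dumpe2fs -h /dev/mdtdev 2>/dev/null | max_stripes
```

+ The functions only inspect text, so they can be tried on saved
+ dumpe2fs output without touching the MDT device itself.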
+