X-Git-Url: https://git.whamcloud.com/?a=blobdiff_plain;f=SystemConfigurationUtilities.xml;h=2e6a08fa92e7240c1fb950463827b8625f3ede2b;hb=654501d481668b43200e7ec5b8a8254353c2160c;hp=b7434f7d81eb25f57ee8edce71d42a5fbc3cd659;hpb=512ed0920228514e99f40bbc61b22853fdb72c17;p=doc%2Fmanual.git diff --git a/SystemConfigurationUtilities.xml b/SystemConfigurationUtilities.xml index b7434f7..2e6a08f 100644 --- a/SystemConfigurationUtilities.xml +++ b/SystemConfigurationUtilities.xml @@ -1,5 +1,4 @@ - System Configuration Utilities This chapter includes system configuration utilities and includes the following sections: @@ -66,7 +65,7 @@ <indexterm><primary>e2scan</primary></indexterm> e2scan The e2scan utility is an ext2 file system-modified inode scan program. The e2scan program uses libext2fs to find inodes with ctime or mtime newer than a given time and prints out their pathname. Use e2scan to efficiently generate lists of files that have been modified. The e2scan tool is included in the e2fsprogs package, located at: - http://downloads.whamcloud.com/public/e2fsprogs/latest/ + http://downloads.hpdd.intel.com/public/e2fsprogs/latest/
Synopsis e2scan [options] [-f file] block_device @@ -94,7 +93,7 @@ - -b inode buffer blocks + -b inode buffer blocks Sets the readahead inode blocks to get excellent performance when scanning the block device. @@ -103,7 +102,7 @@ - -o output file + -o output file If an output file is specified, modified pathnames are written to this file. Otherwise, modified parameters are written to stdout. @@ -111,7 +110,7 @@ - -t inode | pathname + -t inode| pathname Sets the e2scan type if type is inode. The e2scan utility prints modified inode numbers to stdout. By default, the type is set as pathname. @@ -120,7 +119,7 @@ - -u + -u Rebuilds the parent database from scratch. Otherwise, the current parent database is used. @@ -137,11 +136,16 @@ l_getidentity The l_getidentity utility handles Lustre user / group cache upcall.
 Synopsis
- l_getidentity {mdtname} {uid}
+ l_getidentity ${FSNAME}-MDT{xxxx} {uid}
 Description
- The group upcall file contains the path to an executable file that, when properly installed, is invoked to resolve a numeric UID to a group membership list. This utility should complete the mds_grp_downcall_data structure and write it to the /proc/fs/lustre/mdt/${FSNAME}-MDT{xxxx}/identity_info pseudo-file.
+ The group upcall file contains the path to an executable file that is invoked to resolve
+ a numeric UID to a group membership list. This utility opens
+ /proc/fs/lustre/mdt/${FSNAME}-MDT{xxxx}/identity_info and writes the
+ related identity_downcall_data structure (see .) The data is persisted with lctl set_param
+ mdt.${FSNAME}-MDT{xxxx}.identity_info.
 The l_getidentity utility is the reference implementation of the user or group cache upcall.
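As a rough illustration of the naming used in this description, the sketch below builds the `${FSNAME}-MDT{xxxx}` target name and the matching `lctl` parameter name. The helper names `mdt_target_name` and `mdt_identity_param` are hypothetical (they are not part of Lustre), and the sketch assumes that `{xxxx}` is the MDT index written as four hexadecimal digits, as in `testfs-MDT0000`.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Lustre): format MDT target and parameter names.

# mdt_target_name FSNAME INDEX -> e.g. testfs-MDT0000
# Assumption: the index is written as four hexadecimal digits.
mdt_target_name() {
    printf '%s-MDT%04x\n' "$1" "$2"
}

# mdt_identity_param FSNAME INDEX SUFFIX -> e.g. mdt.testfs-MDT0000.identity_upcall
mdt_identity_param() {
    printf 'mdt.%s.identity_%s\n' "$(mdt_target_name "$1" "$2")" "$3"
}

# Print the two parameter names mentioned in the text for MDT index 0 of "testfs".
mdt_identity_param testfs 0 upcall
mdt_identity_param testfs 0 info
```

On an MDS, a name produced this way could then be passed to `lctl get_param` or `lctl set_param`; the sketch itself only formats strings and does not touch a file system.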
@@ -163,7 +167,8 @@ l_getidentity - mdtname + + ${FSNAME}-MDT{xxxx} Metadata server target name @@ -171,7 +176,7 @@ l_getidentity - uid + uid User identifier @@ -193,8 +198,7 @@ lctl The lctl utility is used for root control and configuration. With lctl you can directly control Lustre via an ioctl interface, allowing various configuration, maintenance and debugging features to be accessed.
 Synopsis
- lctl
-lctl --device <devno> <command [args]>
+ lctl [--device devno] command [args]
Description @@ -202,37 +206,40 @@ lctl --device <devno> <command [args]> dl dk device -network <up/down> +network up|down list_nids -ping nidhelp +ping nidhelp quit - For a complete list of available commands, type help at the lctl prompt. To get basic help on command meaning and syntax, type helpcommand. Command completion is activated with the TAB key, and command history is available via the up- and down-arrow keys. + For a complete list of available commands, type help at the lctl prompt. To get basic help on command meaning and syntax, type help command. Command completion is activated with the TAB key (depending on compile options), and command history is available via the up- and down-arrow keys. For non-interactive use, use the second invocation, which runs the command after connecting to the device.
Setting Parameters with lctl Lustre parameters are not always accessible using the procfs interface, as it is platform-specific. As a solution, lctl {get,set}_param has been introduced as a platform-independent interface to the Lustre tunables. Avoid direct references to /proc/{fs,sys}/{lustre,lnet}. For future portability, use lctl {get,set}_param . - When the file system is running, use the lctl set_param command to set temporary parameters (mapping to items in /proc/{fs,sys}/{lnet,lustre}). The lctl set_param command uses this syntax: - lctl set_param [-n] <obdtype>.<obdname>.<proc_file_name>=<value> + When the file system is running, use the lctl set_param command on the affected node(s) to temporarily set parameters (mapping to items in /proc/{fs,sys}/{lnet,lustre}). The lctl set_param command uses this syntax: + lctl set_param [-n] [-P] [-d] obdtype.obdname.property=value For example: - $ lctl set_param ldlm.namespaces.*osc*.lru_size=$((NR_CPU*100)) - Many permanent parameters can be set with lctl conf_param. In general, lctl conf_param can be used to specify any parameter settable in a /proc/fs/lustre file, with its own OBD device. The lctl conf_param command uses this syntax: - <obd|fsname>.<obdtype>.<proc_file_name>=<value>) + mds# lctl set_param mdt.testfs-MDT0000.identity_upcall=NONE + Use -P option to set parameters permanently. Option -d deletes permanent parameters. For example: + mgs# lctl set_param -P mdt.testfs-MDT0000.identity_upcall=NONE +mgs# lctl set_param -P -d mdt.testfs-MDT0000.identity_upcall + Many permanent parameters can be set with lctl conf_param. In general, lctl conf_param can be used to specify any OBD device parameter settable in a /proc/fs/lustre file. 
The lctl conf_param command must be run on the MGS node, and uses this syntax: + obd|fsname.obdtype.property=value) For example: - $ lctl conf_param testfs-MDT0000.mdt.group_upcall=NONE + mgs# lctl conf_param testfs-MDT0000.mdt.identity_upcall=NONE $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - The lctlconf_param command permanently sets parameters in the file system configuration. + The lctl conf_param command permanently sets parameters in the file system configuration for all nodes of the specified type. - To get current Lustre parameter settings, use the lctl get_param command with this syntax: - lctl get_param [-n] <obdtype>.<obdname>.<proc_file_name> + To get current Lustre parameter settings, use the lctl get_param command on the desired node with the same parameter name as lctl set_param: + lctl get_param [-n] obdtype.obdname.parameter For example: - $ lctl get_param -n ost.*.ost_io.timeouts - To list Lustre parameters that are available to set, use the lctl list_param command, with this syntax: - lctl list_param [-n] <obdtype>.<obdname> - For example: - $ lctl list_param obdfilter.lustre-OST0000 - For more information on using lctl to set temporary and permanent parameters, see (Setting Parameters with lctl). + mds# lctl get_param mdt.testfs-MDT0000.identity_upcall + To list Lustre parameters that are available to set, use the lctl list_param command, with this syntax: + lctl list_param [-R] [-F] obdtype.obdname.* + For example, to list all of the parameters on the MDT: + oss# lctl list_param -RF mdt + For more information on using lctl to set temporary and permanent parameters, see . Network Configuration @@ -251,23 +258,23 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - network <up/down>|<tcp/elan/myrinet> + network up|down|tcp|elan - Starts or stops LNET, or selects a network type for other lctl LNET commands. + Starts or stops LNet, or selects a network type for other lctl LNet commands. 
- list_nids + list_nids - Prints all NIDs on the local node. LNET must be running. + Prints all NIDs on the local node. LNet must be running. - which_nid <nidlist> + which_nid nidlist From a list of NIDs for a remote node, identifies the NID on which interface communication will occur. @@ -275,47 +282,47 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - ping <nid> + ping nid - Checks LNET connectivity via an LNET ping. This uses the fabric appropriate to the specified NID. + Checks LNet connectivity via an LNet ping. This uses the fabric appropriate to the specified NID. - interface_list + interface_list - Prints the network interface information for a given network type. + Prints the network interface information for a given network type. - peer_list + peer_list - Prints the known peers for a given network type. + Prints the known peers for a given network type. - conn_list + conn_list - Prints all the connected remote NIDs for a given network type. + Prints all the connected remote NIDs for a given network type. - active_tx + active_tx - This command prints active transmits. It is only used for the Elan network type. + This command prints active transmits. It is only used for the Elan network type. - route_list + route_list Prints the complete routing table. @@ -346,7 +353,7 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - device <devname> + device devname   @@ -357,13 +364,13 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - device_list + device_list   - Shows the local Lustre OBDs, a/k/a dl. + Shows the local Lustre OBDs, a/k/a dl. @@ -388,10 +395,10 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - list_param[-F|-R] <param_path ...> + list_param [-F|-R] parameter [parameter ...] - Lists the Lustre or LNET parameter name. + Lists the Lustre or LNet parameter name.   @@ -400,7 +407,7 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16   - -F + -F Adds '/', '@' or '=' for directories, symlinks and writeable files, respectively. 
@@ -411,18 +418,18 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16   - -R + -R - Recursively lists all parameters under the specified path. If param_path is unspecified, all parameters are shown. + Recursively lists all parameters under the specified path. If param_path is unspecified, all parameters are shown. - get_param[-n|-N|-F] <param_path ...> + get_param [-n|-N|-F] parameter [parameter ...] - Gets the value of a Lustre or LNET parameter from the specified path. + Gets the value of a Lustre or LNet parameter from the specified path. @@ -430,7 +437,7 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16   - -n + -n Prints only the parameter value and not the parameter name. @@ -441,7 +448,7 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16   - -N + -N Prints only matched parameter names and not the values; especially useful when using patterns. @@ -452,18 +459,18 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16   - -F + -F - When -N is specified, adds '/', '@' or '=' for directories, symlinks and writeable files, respectively. + When -N is specified, adds '/', '@' or '=' for directories, symlinks and writeable files, respectively. - set_param[-n]<param_path=value...> + set_param [-n] parameter=value - Sets the value of a Lustre or LNET parameter from the specified path. + Sets the value of a Lustre or LNet parameter from the specified path. @@ -471,7 +478,7 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16   - -n + -n Disables printing of the key name when printing values. @@ -479,12 +486,12 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - conf_param[-d] <device|fsname>.<parameter>=<value> + conf_param [-d] device|fsname parameter=value Sets a permanent configuration parameter for any device via the MGS. This command must be run on the MGS node. - All writeable parameters under lctl list_param (e.g. lctl list_param -F osc.*.* | grep =) can be permanently set using lctl conf_param, but the format is slightly different. 
For conf_param, the device is specified first, then the obdtype. Wildcards are not supported. Additionally, failover nodes may be added (or removed), and some system-wide parameters may be set as well (sys.at_max, sys.at_min, sys.at_extra, sys.at_early_margin, sys.at_history, sys.timeout, sys.ldlm_timeout). For system-wide parameters, <device> is ignored. - For more information on setting permanent parameters and lctl conf_param command examples, see (Setting Permanent Parameters). + All writeable parameters under lctl list_param (e.g. lctl list_param -F osc.*.* | grep =) can be permanently set using lctl conf_param, but the format is slightly different. For conf_param, the device is specified first, then the obdtype. Wildcards are not supported. Additionally, failover nodes may be added (or removed), and some system-wide parameters may be set as well (sys.at_max, sys.at_min, sys.at_extra, sys.at_early_margin, sys.at_history, sys.timeout, sys.ldlm_timeout). For system-wide parameters, device is ignored. + For more information on setting permanent parameters and lctl conf_param command examples, see (Setting Permanent Parameters). @@ -492,24 +499,24 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16   - -d <device|fsname>.<parameter> + -d device|fsname.parameter   - Deletes a parameter setting (use the default value at the next restart). A null value for <value> also deletes the parameter setting. + Deletes a parameter setting (use the default value at the next restart). A null value for value also deletes the parameter setting. - activate + activate - Re-activates an import after the deactivate operation. This setting is only effective until the next restart (see conf_param). + Re-activates an import after the deactivate operation. This setting is only effective until the next restart (see conf_param). - deactivate + deactivate Deactivates an import, in particular meaning do not assign new file stripes to an OSC. 
Running lctl deactivate on the MDS stops new objects from being allocated on the OST. Running lctl deactivate on Lustre clients causes them to return -EIO when accessing objects on the OST instead of waiting for recovery. @@ -517,7 +524,7 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - abort_recovery + abort_recovery Aborts the recovery process on a re-starting MDT or OST. @@ -527,7 +534,7 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - Lustre tunables are not always accessible using the procfs interface, as it is platform-specific. As a solution, lctl {get,set,list}_param has been introduced as a platform-independent interface to the Lustre tunables. Avoid direct references to /proc/{fs,sys}/{lustre,lnet}. For future portability, use lctl {get,set,list}_param instead. + Lustre tunables are not always accessible using the procfs interface, as it is platform-specific. As a solution, lctl {get,set,list}_param has been introduced as a platform-independent interface to the Lustre tunables. Avoid direct references to /proc/{fs,sys}/{lustre,lnet}. For future portability, use lctl {get,set,list}_param instead. Virtual Block Device Operations Lustre can emulate a virtual block device upon a regular file. This emulation is needed when you are trying to set up a swap space via the file. @@ -548,15 +555,15 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - blockdev_attach<file name> <device node> + blockdev_attach filename /dev/lloop_device - Attaches a regular Lustre file to a block device. If the device node does not exist, lctl creates it. We recommend that you create the device node by lctl since the emulator uses a dynamical major number. + Attaches a regular Lustre file to a block device. If the device node does not exist, lctl creates it. It is recommend that a device node is created by lctl since the emulator uses a dynamical major number. - blockdev_detach<device node> + blockdev_detach /dev/lloop_device Detaches the virtual block device. 
@@ -564,7 +571,7 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - blockdev_info<device node> + blockdev_info /dev/lloop_device Provides information about the Lustre file attached to the device node. @@ -591,18 +598,28 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - changelog_register + changelog_register - Registers a new changelog user for a particular device. Changelog entries are not purged beyond a registered user's set point (see lfs changelog_clear). + Registers a new changelog user for a particular device. + Changelog entries are saved persistently on the MDT with each + filesystem operation, and are only purged beyond all registered + user's minimum set point (see + lfs changelog_clear). This may cause the + Changelog to consume a large amount of space, eventually + filling the MDT, if a changelog user is registered but never + consumes those records. - changelog_deregister<id> + changelog_deregister id - Unregisters an existing changelog user. If the user's "clear" record number is the minimum for the device, changelog records are purged until the next minimum. + Unregisters an existing changelog user. If the + user's "clear" record number is the minimum for + the device, changelog records are purged until the next minimum. + @@ -626,7 +643,7 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - debug_daemon + debug_daemon Starts and stops the debug daemon, and controls the output filename and size. @@ -634,7 +651,7 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - debug_kernel[file] [raw] + debug_kernel [file] [raw] Dumps the kernel debug buffer to stdout or a file. @@ -642,7 +659,7 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - debug_file<input> [output] + debug_file input_file [output_file] Converts the kernel-dumped debug log from binary to plain text format. @@ -650,7 +667,7 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - clear + clear Clears the kernel debug buffer. 
@@ -658,7 +675,7 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - mark<text> + mark text Inserts marker text in the kernel debug buffer. @@ -666,7 +683,7 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - filter<subsystem id/debug mask> + filter subsystem_id|debug_mask Filters kernel debug messages by subsystem or mask. @@ -674,7 +691,7 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - show<subsystem id/debug mask> + show subsystem_id|debug_mask Shows specific types of messages. @@ -682,7 +699,7 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - debug_list<subs/types> + debug_list subsystems|types Lists all subsystem and debug types. @@ -690,7 +707,7 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - modules<path> + modules path Provides GDB-friendly module information. @@ -720,7 +737,7 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - --device + --device Device to be used for the operation (specified by name or number). See device_list. @@ -728,7 +745,7 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16 - --ignore_errors | ignore_errors + --ignore_errors | ignore_errors Ignores errors during script processing. @@ -740,7 +757,7 @@ $ lctl conf_param testfs.llite.max_read_ahead_mb=16
Examples - lctl + lctl $ lctl lctl > dl 0 UP mgc MGC192.168.0.20@tcp btbb24e3-7deb-2ffa-eab0-44dffe00f692 5 @@ -777,7 +794,10 @@ ll_decode_filter_fid
Description - The ll_decode_filter_fid utility decodes and prints the Lustre OST object ID, MDT FID, stripe index for the specified OST object(s), which is stored in the "trusted.fid" attribute on each OST object. This is accessible to ll_decode_filter_fid when the OST filesystem is mounted locally as type ldiskfs for maintenance. + The ll_decode_filter_fid utility decodes and prints the Lustre OST object ID, MDT FID, + stripe index for the specified OST object(s), which is stored in the "trusted.fid" + attribute on each OST object. This is accessible to ll_decode_filter_fid + when the OST file system is mounted locally as type ldiskfs for maintenance. The "trusted.fid" extended attribute is stored on each OST object when it is first modified (data written or attributes set), and is not accessed or modified by Lustre after that time. The OST object ID (objid) is useful in case of OST directory corruption, though normally the ll_recover_lost_found_objs(8) utility is able to reconstruct the entire OST object directory hierarchy. The MDS FID can be useful to determine which MDS inode an OST object is (or was) used by. The stripe index can be used in conjunction with other OST objects to reconstruct the layout of a file even if the MDT inode was lost.
@@ -789,7 +809,7 @@ root@oss1# ll_decode_filter_fid #12345[4,5,8] #123455: objid=614725 seq=0 parent=[0x18d11:0xebba84eb:0x1] #123458: objid=533088 seq=0 parent=[0x21417:0x19734d61:0x0] This shows that the three files in lost+found have decimal object IDs - 690670, 614725, and 533088, respectively. The object sequence number (formerly object group) is 0 for all current OST objects. - The MDT parent inode FIDs are hexdecimal numbers of the form sequence:oid:idx. Since the sequence number is below 0x100000000 in all these cases, the FIDs are in the legacy Inode and Generation In FID (IGIF) namespace and are mapped directly to the MDT inode = seq and generation = oid values; the MDT inodes are 0x751c5, 0x18d11, and 0x21417 respectively. For objects with MDT parent sequence numbers above 0x200000000, this indicates that the FID needs to be mapped via the MDT Object Index (OI) file on the MDT to determine the internal inode number. + The MDT parent inode FIDs are hexadecimal numbers of the form sequence:oid:idx. Since the sequence number is below 0x100000000 in all these cases, the FIDs are in the legacy Inode and Generation In FID (IGIF) namespace and are mapped directly to the MDT inode = seq and generation = oid values; the MDT inodes are 0x751c5, 0x18d11, and 0x21417 respectively. For objects with MDT parent sequence numbers above 0x200000000, this indicates that the FID needs to be mapped via the MDT Object Index (OI) file on the MDT to determine the internal inode number. The idx field shows the stripe number of this OST object in the Lustre RAID-0 striped file.
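The sequence-number rule described above can be sketched as a small shell function. This is only an illustration: `classify_parent_fid` is a hypothetical helper, the input is assumed to be a `parent=` value in the bracketed `[seq:oid:idx]` form shown in the example output, and sequence values between the two thresholds named in the text are reported as "other" because the text does not describe them.

```shell
#!/bin/sh
# Hypothetical helper: classify an MDT parent FID as printed by ll_decode_filter_fid.
classify_parent_fid() {
    fid=${1#\[}          # strip the leading '['
    fid=${fid%\]}        # strip the trailing ']'
    seq=${fid%%:*}       # first field: sequence
    rest=${fid#*:}
    oid=${rest%%:*}      # second field: object ID
    idx=${rest#*:}       # third field: stripe index
    if [ $(( $seq < 0x100000000 )) -eq 1 ]; then
        # IGIF namespace: MDT inode = seq, generation = oid
        echo "IGIF inode=$seq generation=$oid stripe=$idx"
    elif [ $(( $seq > 0x200000000 )) -eq 1 ]; then
        # Must be mapped through the MDT Object Index (OI) file
        echo "OI-lookup seq=$seq oid=$oid stripe=$idx"
    else
        # Range not covered by the text above
        echo "other seq=$seq oid=$oid stripe=$idx"
    fi
}

classify_parent_fid '[0x18d11:0xebba84eb:0x1]'
classify_parent_fid '[0x21417:0x19734d61:0x0]'
```

Both sample FIDs come from the example output and fall in the IGIF range, so the function reports the inode and generation directly instead of pointing at an OI lookup.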
@@ -810,8 +830,11 @@ ll_recover_lost_found_objs
 Description
- The first time Lustre writes to an object, it saves the MDS inode number and the objid as an extended attribute on the object, so in case of directory corruption of the OST, it is possible to recover the objects. Running e2fsck fixes the corrupted OST directory, but it puts all of the objects into a lost and found directory, where they are inaccessible to Lustre. Use the ll_recover_lost_found_objs utility to recover all (or at least most) objects from a lost and found directory and return them to the O/0/d* directories.
- To use ll_recover_lost_found_objs, mount the file system locally (using the -t ldiskfs command), run the utility and then unmount it again. The OST must not be mounted by Lustre when ll_recover_lost_found_objs is run.
+ The first time Lustre modifies an object, it saves the MDS inode number and the objid as an extended attribute on the object, so in case of directory corruption of the OST, it is possible to recover the objects. Running e2fsck fixes the corrupted OST directory, but it puts all of the objects into a lost and found directory, where they are inaccessible to Lustre. Use the ll_recover_lost_found_objs utility to recover all (or at least most) objects from a lost and found directory and return them to the O/0/d* directories.
+ To use ll_recover_lost_found_objs, mount the file system locally (using the -t ldiskfs, or -t zfs command), run the utility and then unmount it again. The OST must not be mounted by Lustre when ll_recover_lost_found_objs is run.
+ This utility is not needed for 2.6 and later,
+ since the LFSCK online scanning will move objects
+ from lost+found to the proper place in the OST.
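The mount, recover, unmount sequence described above might look like the following dry-run sketch. It only prints the commands and executes nothing; `recovery_plan` is a hypothetical helper, `/dev/sdX` and `/mnt/ost` are placeholder names, and an ldiskfs-backed OST that is not currently mounted by Lustre is assumed.

```shell
#!/bin/sh
# Dry-run sketch: print the commands for recovering OST objects from lost+found.
# Device and mount point are placeholders; nothing is executed here.
recovery_plan() {
    dev=$1
    mnt=$2
    echo "mount -t ldiskfs $dev $mnt"                      # mount the OST backing file system locally
    echo "ll_recover_lost_found_objs -d $mnt/lost+found"   # -d sets the lost and found directory path
    echo "umount $mnt"                                     # unmount before Lustre mounts the OST again
}

recovery_plan /dev/sdX /mnt/ost
```

Printing the plan first makes it easy to review the device and mount point before running the commands by hand as root.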
Options @@ -825,14 +848,14 @@ ll_recover_lost_found_objs Option -  Description + Description - -h + -h Prints a help message @@ -840,7 +863,7 @@ ll_recover_lost_found_objs - -v + -v Increases verbosity @@ -848,7 +871,7 @@ ll_recover_lost_found_objs - -d directory + -d directory Sets the lost and found directory path @@ -873,7 +896,7 @@ llobdstat
 Description
- The llobdstat utility displays a line of OST statistics for the given ost_name every interval seconds. It should be run directly on an OSS node. Type CTRL-C to stop statistics printing.
+ The llobdstat utility displays a line of OST statistics for the given ost_name every interval seconds. It should be run directly on an OSS node. Type CTRL-C to stop statistics printing.
Example @@ -895,13 +918,13 @@ Timestamp Read-delta ReadRate Write-delta WriteRate
 Files
- /proc/fs/lustre/obdfilter/<ostname>/stats
+ /proc/fs/lustre/obdfilter/ostname/stats
 <indexterm><primary>llog_reader</primary></indexterm> llog_reader
- The llog_reader utility parses Lustre's on-disk configuration logs.
+ The llog_reader utility translates a Lustre configuration log into human-readable form.
Synopsis llog_reader filename @@ -909,7 +932,7 @@ llog_reader
Description The llog_reader utility parses the binary format of Lustre's on-disk configuration logs. Llog_reader can only read logs; use tunefs.lustre to write to them. - To examine a log file on a stopped Lustre server, mount its backing file system as ldiskfs, then use llog_reader to dump the log file's contents, for example: + To examine a log file on a stopped Lustre server, mount its backing file system as ldiskfs or zfs, then use llog_reader to dump the log file's contents, for example: mount -t ldiskfs /dev/sda /mnt/mgs llog_reader /mnt/mgs/CONFIGS/tfs-client To examine the same log file on a running Lustre server, use the ldiskfs-enabled debugfs utility (called debug.ldiskfs on some distributions) to extract the file, for example: @@ -930,12 +953,12 @@ llstat The llstat utility displays Lustre statistics.
 Synopsis
- llstat [-c] [-g] [-i interval] stats_file
-
+ llstat [-c] [-g] [-i interval] stats_file
+
 Description
- The llstat utility displays statistics from any of the Lustre statistics files that share a common format and are updated at interval seconds. To stop statistics printing, use ctrl-c.
+ The llstat utility displays statistics from any of the Lustre statistics files that share a common format and are updated at interval seconds. To stop statistics printing, use ctrl-c.
Options @@ -949,14 +972,14 @@ llstat Option -  Description + Description - -c + -c Clears the statistics file. @@ -964,7 +987,7 @@ llstat - -i + -i Specifies the polling period (in seconds). @@ -972,7 +995,7 @@ llstat - -g + -g Specifies graphable output format. @@ -980,7 +1003,7 @@ llstat - -h + -h Displays help information. @@ -988,10 +1011,10 @@ llstat - stats_file + stats_file - Specifies either the full path to a statistics file or the shorthand reference, mds or ost + Specifies either the full path to a statistics file or the shorthand reference, mds or ost @@ -1007,7 +1030,7 @@ llstat Files The llstat files are located at: /proc/fs/lustre/mdt/MDS/*/stats -/proc/fs/lustre/mds/*/exports/*/stats +/proc/fs/lustre/mdt/*/exports/*/stats /proc/fs/lustre/mdc/*/stats /proc/fs/lustre/ldlm/services/*/stats /proc/fs/lustre/ldlm/namespaces/*/pool/stats @@ -1026,7 +1049,7 @@ llverdev The llverdev verifies a block device is functioning properly over its full size.
 Synopsis
- llverdev [-c chunksize] [-f] [-h] [-o offset] [-l] [-p] [-r] [-t timestamp] [-v] [-w] device
+ llverdev [-c chunksize] [-f] [-h] [-o offset] [-l] [-p] [-r] [-t timestamp] [-v] [-w] device
Description @@ -1058,7 +1081,7 @@ llverdev - -c|--chunksize + -c|--chunksize I/O chunk size in bytes (default value is 1048576). @@ -1066,7 +1089,7 @@ llverdev - -f|--force + -f|--force Forces the test to run without a confirmation that the device will be overwritten and all data will be permanently destroyed. @@ -1074,7 +1097,7 @@ llverdev - -h|--help + -h|--help Displays a brief help message. @@ -1082,7 +1105,7 @@ llverdev - -ooffset + -o offset Offset (in kilobytes) of the start of the test (default value is 0). @@ -1090,7 +1113,7 @@ llverdev - -l|--long + -l|--long Runs a full check, writing and then reading and verifying every block on the disk. @@ -1098,7 +1121,7 @@ llverdev - -p|--partial + -p|--partial Runs a partial check, only doing periodic checks across the device (1 GB steps). @@ -1106,23 +1129,25 @@ llverdev - -r|--read + -r|--read - Runs the test in read (verify) mode only, after having previously run the test in -w mode. + Runs the test in read (verify) mode only, after having previously run the test in -w mode. - -ttimestamp + -t timestamp - Sets the test start time as printed at the start of a previously-interrupted test to ensure that validation data is the same across the entire filesystem (default value is the current time()). + Sets the test start time as printed at the start of a previously-interrupted + test to ensure that validation data is the same across the entire file system + (default value is the current time()). - -v|--verbose + -v|--verbose Runs the test in verbose mode, listing each read and write operation. @@ -1130,7 +1155,7 @@ llverdev - -w|--write + -w|--write Runs the test in write (test-pattern) mode (default runs both read and write). @@ -1166,7 +1191,7 @@ lshowmount
 Description
- The lshowmount utility shows the hosts that have Lustre mounted to a server. Ths utility looks for exports from the MGS, MDS, and obdfilter.
+ The lshowmount utility shows the hosts that have Lustre mounted to a server. This utility looks for exports from the MGS, MDS, and obdfilter.
Options @@ -1187,7 +1212,7 @@ lshowmount - -e|--enumerate + -e|--enumerate Causes lshowmount to list each client mounted on a separate line instead of trying to compress the list of clients into a hostrange string. @@ -1195,7 +1220,7 @@ lshowmount - -h|--help + -h|--help Causes lshowmount to print out a usage message. @@ -1203,7 +1228,7 @@ lshowmount - -l|--lookup + -l|--lookup Causes lshowmount to try to look up the hostname for NIDs that look like IP addresses. @@ -1211,7 +1236,7 @@ lshowmount - -v|--verbose + -v|--verbose Causes lshowmount to output export information for each service instead of only displaying the aggregate information for all Lustre services on the server. @@ -1223,41 +1248,42 @@ lshowmount
 Files
- /proc/fs/lustre/mgs/<server>/exports/<uuid>/nid /proc/fs/lustre/mds/<server>/expo\
-rts/<uuid>/nid /proc/fs/lustre/obdfilter/<server>/exports/<uuid>/nid
+ /proc/fs/lustre/mgs/server/exports/uuid/nid
+/proc/fs/lustre/mds/server/exports/uuid/nid
+/proc/fs/lustre/obdfilter/server/exports/uuid/nid
 <indexterm><primary>lst</primary></indexterm> lst
- The lst utility starts LNET self-test.
+ The lst utility starts LNet self-test.
Synopsis lst
 Description
- LNET self-test helps site administrators confirm that Lustre Networking (LNET) has been properly installed and configured. The self-test also confirms that LNET and the network software and hardware underlying it are performing as expected.
- Each LNET self-test runs in the context of a session. A node can be associated with only one session at a time, to ensure that the session has exclusive use of the nodes on which it is running. A session is create, controlled and monitored from a single node; this is referred to as the self-test console.
+ LNet self-test helps site administrators confirm that Lustre Networking (LNet) has been properly installed and configured. The self-test also confirms that LNet and the network software and hardware underlying it are performing as expected.
+ Each LNet self-test runs in the context of a session. A node can be associated with only one session at a time, to ensure that the session has exclusive use of the nodes on which it is running. A session is created, controlled and monitored from a single node; this is referred to as the self-test console.
 Any node may act as the self-test console. Nodes are named and allocated to a self-test session in groups. This allows all nodes in a group to be referenced by a single name.
 Test configurations are built by describing and running test batches. A test batch is a named collection of tests, with each test composed of a number of individual point-to-point tests running in parallel. These individual point-to-point tests are instantiated according to the test type, source group, target group and distribution specified when the test is added to the test batch.
Modules - To run LNET self-test, load these modules: libcfs, lnet, lnet_selftest and any one of the klnds (ksocklnd, ko2iblnd...). To load all necessary modules, run modprobe lnet_selftest, which recursively loads the modules on which lnet_selftest depends. - There are two types of nodes for LNET self-test: the console node and test nodes. Both node types require all previously-specified modules to be loaded. (The userspace test node does not require these modules). + To run LNet self-test, load these modules: libcfs, lnet, lnet_selftest and any one of the klnds (ksocklnd, ko2iblnd...). To load all necessary modules, run modprobe lnet_selftest, which recursively loads the modules on which lnet_selftest depends. + There are two types of nodes for LNet self-test: the console node and test nodes. Both node types require all previously-specified modules to be loaded. (The userspace test node does not require these modules). Test nodes can run in either kernel space or userspace. A console user can invite a kernel test node to join the test session by running lst add_group NID, but the user cannot actively add a userspace test node to the test session. However, the console user can passively accept a test node to the test session while the test node runs lstclient to connect to the console.
Utilities - LNET self-test includes two user utilities, lst and lstclient. + LNet self-test includes two user utilities, lst and lstclient. lst is the user interface for the self-test console (run on the console node). It provides a list of commands to control the entire test system, such as create session, create test groups, etc. - lstclient is the userspace self-test program which is linked with userspace LNDs and LNET. A user can invoke lstclient to join a self-test session: + lstclient is the userspace self-test program which is linked with userspace LNDs and LNet. A user can invoke lstclient to join a self-test session: lstclient -sesid CONSOLE_NID group NAME
Example Script - This is a sample LNET self-test script which simulates the traffic pattern of a set of Lustre servers on a TCP network, accessed by Lustre clients on an IB network (connected via LNET routers), with half the clients reading and half the clients writing. + This is a sample LNet self-test script which simulates the traffic pattern of a set of Lustre servers on a TCP network, accessed by Lustre clients on an IB network (connected via LNet routers), with half the clients reading and half the clients writing. #!/bin/bash export LST_SESSION=$$ lst new_session read/write @@ -1280,7 +1306,7 @@ lst end_session
<indexterm><primary>lustre_rmmod.sh</primary></indexterm> lustre_rmmod.sh - The lustre_rmmod.sh utility removes all Lustre and LNET modules (assuming no Lustre services are running). It is located in /usr/bin. + The lustre_rmmod.sh utility removes all Lustre and LNet modules (assuming no Lustre services are running). It is located in /usr/bin. The lustre_rmmod.sh utility does not work if Lustre modules are being used or if you have manually run the lctl network up command. @@ -1291,15 +1317,15 @@ lustre_rsync The lustre_rsync utility synchronizes (replicates) a Lustre file system to a target file system.
Synopsis - lustre_rsync --source|-s <src> --target|-t <tgt> - --mdt|-m <mdt> [--user|-u <user id>] - [--xattr|-x <yes|no>] [--verbose|-v] - [--statuslog|-l <log>] [--dry-run] [--abort-on-err] + lustre_rsync --source|-s src --target|-t tgt + --mdt|-m mdt [--user|-u userid] + [--xattr|-x yes|no] [--verbose|-v] + [--statuslog|-l log] [--dry-run] [--abort-on-err] -lustre_rsync --statuslog|-l <log> +lustre_rsync --statuslog|-l log -lustre_rsync --statuslog|-l <log> --source|-s <source> - --target|-t <tgt> --mdt|-m <mdt> +lustre_rsync --statuslog|-l log --source|-s source + --target|-t tgt --mdt|-m mdt
Description @@ -1336,7 +1362,7 @@ lustre_rsync --statuslog|-l <log> --source|-s <source> - --source=<src> + --source=src The path to the root of the Lustre file system (source) which will be synchronized. This is a mandatory option if a valid status log created during a previous synchronization operation (--statuslog) is not specified. @@ -1344,7 +1370,7 @@ lustre_rsync --statuslog|-l <log> --source|-s <source> - --target=<tgt> + --target=tgt The path to the root where the source file system will be synchronized (target). This is a mandatory option if the status log created during a previous synchronization operation (--statuslog) is not specified. This option can be repeated if multiple synchronization targets are desired. @@ -1352,7 +1378,7 @@ lustre_rsync --statuslog|-l <log> --source|-s <source> - --mdt=<mdt> + --mdt=mdt The metadata device to be synchronized. A changelog user must be registered for this device. This is a mandatory option if a valid status log created during a previous synchronization operation (--statuslog) is not specified. @@ -1360,7 +1386,7 @@ lustre_rsync --statuslog|-l <log> --source|-s <source> - --user=<user id> + --user=userid The changelog user ID for the specified MDT. To use lustre_rsync, the changelog user must be registered. For details, see the changelog_register parameter in the lctl man page. This is a mandatory option if a valid status log created during a previous synchronization operation (--statuslog) is not specified. @@ -1368,15 +1394,15 @@ lustre_rsync --statuslog|-l <log> --source|-s <source> - --statuslog=<log> + --statuslog=log - A log file to which synchronization status is saved. When lustre_rsync starts, the state of a previous replication is read from here. If the status log from a previous synchronization operation is specified, otherwise mandatory options like --source, --target and --mdt options may be skipped. 
 By specifying options like --source, --target and/or --mdt in addition to the --statuslog option, parameters in the status log can be overriden. Command line options take precedence over options in the status log. + A log file to which synchronization status is saved. When lustre_rsync starts, the state of a previous replication is read from here. If the status log from a previous synchronization operation is specified, otherwise mandatory options like --source, --target and --mdt options may be skipped. By specifying options like --source, --target and/or --mdt in addition to the --statuslog option, parameters in the status log can be overridden. Command line options take precedence over options in the status log. - --xattr<yes|no> + --xattryes|no Specifies whether extended attributes (xattrs) are synchronized or not. The default is to synchronize extended attributes. @@ -1385,7 +1411,7 @@ lustre_rsync --statuslog|-l <log> --source|-s <source> - --verbose + --verbose Produces a verbose output. @@ -1393,7 +1419,7 @@ lustre_rsync --statuslog|-l <log> --source|-s <source> - --dry-run + --dry-run Shows the output of lustre_rsync commands (copy, mkdir, etc.) on the target file system without actually executing them. @@ -1401,7 +1427,7 @@ lustre_rsync --statuslog|-l <log> --source|-s <source> - --abort-on-err + --abort-on-err Stops processing the lustre_rsync operation if an error occurs. The default is to continue the operation. @@ -1459,11 +1485,11 @@ Changelog records consumed: 42
<indexterm><primary>mkfs.lustre</primary></indexterm> mkfs.lustre - The mkfs.lustre utility formats a disk for a Lustre service. + The mkfs.lustre utility formats a disk for a Lustre service.
Synopsis - mkfs.lustre <target_type> [options] device - where <target_type> is one of the following: + mkfs.lustre target_type [options] device + where target_type is one of the following: @@ -1481,23 +1507,23 @@ mkfs.lustre - --ost + --ost - Object Storage Target (OST) + Object storage target (OST) - --mdt + --mdt - Metadata Storage Target (MDT) + Metadata storage target (MDT) - --network=net,... + --network=net,... Network(s) to which to restrict this OST/MDT. This option can be repeated as necessary. @@ -1505,10 +1531,12 @@ mkfs.lustre - --mgs + --mgs - Configuration Management Service (MGS), one per site. This service can be combined with one --mdt service by specifying both types. + Configuration management service (MGS), one per site. This service can be + combined with one --mdt service by specifying both + types. @@ -1517,13 +1545,17 @@ mkfs.lustre
Description - mkfs.lustre is used to format a disk device for use as part of a Lustre file system. After formatting, a disk can be mounted to start the Lustre service defined by this command. - When the file system is created, parameters can simply be added as a --param option to the mkfs.lustre command. See . + mkfs.lustre is used to format a disk device for use as part of a + Lustre file system. After formatting, a disk can be mounted to start the Lustre service + defined by this command. + When the file system is created, parameters can simply be added as a + --param option to the mkfs.lustre command. See . - - - + + + @@ -1537,60 +1569,68 @@ mkfs.lustre - --backfstype=fstype + --backfstype=fstype - Forces a particular format for the backing file system (such as ext3, ldiskfs). + Forces a particular format for the backing file system such as ldiskfs (the default) or zfs. - --comment=comment + --comment=comment - Sets a user comment about this disk, ignored by Lustre. + Sets a user comment about this disk, ignored by the Lustre software. - --device-size=KB + --device-size=#>KB - Sets the device size for loop devices. + Sets the device size for loop devices. - --dryrun + --dryrun - Only prints what would be done; it does not affect the disk. + Only prints what would be done; it does not affect the disk. - - --failnode=nid,... - - - Sets the NID(s) of a failover partner. This option can be repeated as needed. - CAUTION: Cannot be used with --servicenode. - + --servicenode=nid,... + Sets the NID(s) of all service nodes, including primary and failover partner + service nodes. The --servicenode option cannot be used with + --failnode option. See for + more details. - --servicenode=nid,... + --failnode=nid,... - Sets the NID(s) of all service node, including failover partner as well as primary node service nids. This option can be repeated as needed. - CAUTION: Cannot be used with --failnode. 
+ Sets the NID(s) of a failover service node for a primary server for a target. + The --failnode option cannot be used with + --servicenode option. See + for more details. + When the --failnode option is used, certain + restrictions apply (see ). + - --fsname=filesystem_name + --fsname=filesystem_name - The Lustre file system of which this service/node will be a part. The default file system name is 'lustre'. + The Lustre file system of which this service/node will be a part. The default + file system name is lustre.   The file system name is limited to 8 characters. @@ -1599,15 +1639,17 @@ mkfs.lustre - --index=index + + --index=index_number - Specifies the OST or MDT number. This should always be used when formatting OSTs, in order to ensure that there is a simple mapping between the OST index and the OSS node and device it is located on. + Specifies the OST or MDT number (0...N). This allows mapping between the OSS + and MDS node and the device on which the OST or MDT is located. - --mkfsoptions=opts + --mkfsoptions=opts Formats options for the backing file system. For example, ext3 options could be set here. @@ -1615,20 +1657,23 @@ mkfs.lustre - --mountfsoptions=opts + --mountfsoptions=opts Sets the mount options used when the backing file system is mounted. - CAUTION: Unlike earlier versions of mkfs.lustre, this version completely replaces the default mount options with those specified on the command line, and issues a warning on stderr if any default mount options are omitted. + Unlike earlier versions of mkfs.lustre, this version completely replaces + the default mount options with those specified on the command line, and issues a + warning on stderr if any default mount options are omitted. The defaults for ldiskfs are: - OST: errors=remount-ro; - MGS/MDT: errors=remount-ro,iopen_nopriv,user_xattr - Do not alter the default mount options unless you know what you are doing. 
+ MGS/MDT: errors=remount-ro,iopen_nopriv,user_xattr + OST: errors=remount-ro,extents,mballoc + OST: errors=remount-ro + Use care when altering the default mount options. - --network=net,... + --network=net,...   @@ -1637,7 +1682,7 @@ mkfs.lustre - --mgsnode=nid,... + --mgsnode=nid,... Sets the NIDs of the MGS node, required for all targets other than the MGS. @@ -1645,10 +1690,10 @@ mkfs.lustre - --paramkey=value + --param key=value - Sets the permanent parameter key to value value. This option can be repeated as necessary. Typical options might include: + Sets the permanent parameter key to value value. This option can be repeated as necessary. Typical options might include: @@ -1656,7 +1701,7 @@ mkfs.lustre   - --param sys.timeout=40 + --param sys.timeout=40 System obd timeout. @@ -1667,7 +1712,7 @@ mkfs.lustre   - --param lov.stripesize=2M + --param lov.stripesize=2M Default stripe size. @@ -1678,7 +1723,7 @@ mkfs.lustre   - --param lov.stripecount=2 + --param lov.stripecount=2 Default stripe count. @@ -1689,7 +1734,7 @@ mkfs.lustre   - --param failover.mode=failout + --param failover.mode=failout Returns errors instead of waiting for recovery. @@ -1697,7 +1742,7 @@ mkfs.lustre - --quiet + --quiet Prints less information. @@ -1705,7 +1750,7 @@ mkfs.lustre - --reformat + --reformat Reformats an existing Lustre disk. @@ -1713,7 +1758,7 @@ mkfs.lustre - --stripe-count-hint=stripes + --stripe-count-hint=stripes Used to optimize the MDT's inode size. @@ -1721,7 +1766,7 @@ mkfs.lustre - --verbose + --verbose Prints more information. @@ -1733,13 +1778,14 @@ mkfs.lustre
Examples - Creates a combined MGS and MDT for file system testfs on, e.g., node cfs21: + Creates a combined MGS and MDT for file system testfs on, e.g., node cfs21: mkfs.lustre --fsname=testfs --mdt --mgs /dev/sda1 - Creates an OST for file system testfs on any node (using the above MGS): + Creates an OST for file system testfs on any node (using the above + MGS): mkfs.lustre --fsname=testfs --mgsnode=cfs21@tcp0 --ost --index=0 /dev/sdb - Creates a standalone MGS on, e.g., node cfs22: + Creates a standalone MGS on, e.g., node cfs22: mkfs.lustre --mgs /dev/sda1 - Creates an MDT for file system myfs1 on any node (using the above MGS): + Creates an MDT for file system myfs1 on any node (using the above MGS): mkfs.lustre --fsname=myfs1 --mdt --mgsnode=cfs22@tcp0 /dev/sda2
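The 8-character limit on --fsname noted in the options above can be checked before a device is formatted. The following is a minimal sketch, not part of the Lustre tools: the function name valid_fsname is hypothetical, and it checks only the length limit stated above, plus non-emptiness, which is an added assumption.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with Lustre: succeed only if the proposed
# file system name is non-empty and at most 8 characters long.
valid_fsname() {
    name=$1
    [ -n "$name" ] && [ "${#name}" -le 8 ]
}

# Example: check candidate names before passing one to mkfs.lustre --fsname=...
for candidate in testfs myfs1 averylongname; do
    if valid_fsname "$candidate"; then
        echo "$candidate: ok"
    else
        echo "$candidate: too long or empty"
    fi
done
```

Running such a check first avoids finding out about the limit only after a mkfs.lustre invocation has been prepared.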
@@ -1763,7 +1809,7 @@ mount.lustre The mount.lustre utility starts a Lustre client or target service.
Synopsis - mount -t lustre [-o options] directory + mount -t lustre [-o options] device mountpoint
@@ -1787,19 +1833,44 @@ mount.lustre - <mgsspec>:/<fsname> -   + mgsname:/fsname[/subdir] - Mounts the Lustre file system named fsname on the client by contacting the Management Service at mgsspec on the pathname given by directory. The format for mgsspec is defined below. A mounted client file system appears in fstab(5) and is usable, like any local file system, and provides a full POSIX-compliant interface. + Mounts the Lustre file system named + fsname (optionally starting at + subdirectory subdir within the + filesystem, if specified) on the client at the directory + mountpoint, by contacting the Lustre + Management Service at mgsname. The + format for mgsname is defined below. A + client file system can be listed in fstab(5) + for automatic mount at boot time, is usable like any local file + system, and provides a full POSIX standard-compliant interface. + - <disk_device> + block_device - Starts the target service defined by the mkfs.lustre command on the physical disk disk_device. A mounted target service file system is only useful for df(1) operations and appears in fstab(5) to show the device is in use. + Starts the target service defined by the + mkfs.lustre(8) command on the physical disk + block_device. The + block_device may be specified using + -L label to find + the first block device with that label (e.g. + testfs-MDT0000), or by UUID using the + -U uuid option. + Care should be taken if there is a device-level backup of the + target filesystem on the same node, which would have a + duplicate label and UUID if it has not been changed with + tune2fs(8) or similar. The mounted target + service filesystem mounted at + mountpoint is only useful for + df(1) operations and appears in + /proc/mounts to show the device is in use. + @@ -1825,25 +1896,79 @@ mount.lustre - <mgsspec>:=<mgsnode>[:<mgsnode>] -   + mgsname=mgsnode[:mgsnode] + + + mgsname is a colon-separated + list of mgsnode names where the MGS + service may run. 
Multiple mgsnode + values can be specified if the MGS service is configured for + HA failover and may be running on any one of the nodes. + + + + + + mgsnode=mgsnid[,mgsnid] - The MGS specification may be a colon-separated list of nodes. + Each mgsnode may specify a + comma-separated list of NIDs, if there are different LNet + interfaces for that mgsnode. + - <mgsnode>:=<mgsnid>[,<mgsnid>] + mgssec=flavor - Each node may be specified by a comma-separated list of NIDs. + Specifies the encryption flavor for the initial network + RPC connection to the MGS. Non-security flavors are: + null, plain, and + gssnull, which respectively disable, or + have no encryption or integrity features for testing purposes. + Kerberos flavors are: krb5n, + krb5a, krb5i, and + krb5p. Shared-secret key flavors are: + skn, ska, + ski, and skpi, see the + for more details. The security + flavor for client-to-server connections is specified in the + filesystem configuration that the client fetches from the MGS. + + + + + + skpath=file|directory + + + + Path to a file or directory with the keyfile(s) to load for + this mount command. Keys are inserted into the + KEY_SPEC_SESSION_KEYRING keyring in the + kernel with a description containing + lustre: and a suffix which depends on + whether the context of the mount command is for an MGS, + MDT/OST, or client. + + + + + + exclude=ostlist + + + Starts a client or MDT with a colon-separated list of + known inactive OSTs that it will not try to connect to. - In addition to the standard mount options, Lustre understands the following client-specific options: + In addition to the standard mount(8) options, Lustre understands + the following client-specific options: @@ -1861,64 +1986,159 @@ mount.lustre - flock + always_ping - Enables full flock support, coherent across all client nodes. + The client will periodically ping the server when it is + idle, even if the server ptlrpc module + is configured with the suppress_pings + option. 
This allows clients to reliably use the filesystem + even if they are not part of an external client health + monitoring mechanism. + - localflock + flock - Enables local flock support, using only client-local flock (faster, for applications that require flock, but do not run on multiple nodes). + Enables advisory file locking support between + participating applications using the flock(2) + system call. This causes file locking to be coherent across all + client nodes also using this mount option. This is useful if + applications need coherent userspace file locking across + multiple client nodes, but also imposes communications overhead + in order to maintain locking consistency between client nodes. + - noflock + localflock - Disables flock support entirely. Applications calling flock get an error. It is up to the administrator to choose either localflock (fastest, low impact, not coherent between nodes) or flock (slower, performance impact for use, coherent between nodes). + Enables client-local flock(2) support, + using only client-local advisory file locking. This is faster + than using the global flock option, and can + be used for applications that depend on functioning + flock(2) but run only on a single node. + It has minimal overhead using only the Linux kernel's locks. + - user_xattr + noflock - Enables get/set of extended attributes by regular users. See the attr(5) manual page. + Disables flock(2) support entirely, + and is the default option. Applications calling + flock(2) get an + ENOSYS error. It is up to the administrator + to choose either the localflock or + flock mount option based on their + requirements. It is possible to mount clients with different + options, and only those mounted with flock + will be coherent amongst each other. + - nouser_xattr + lazystatfs - Disables use of extended attributes by regular users. Root and system processes can still use extended attributes. 
+ Allows statfs(2) (as used by + df(1) and lfs-df(1)) to + return even if some OST or MDT is unresponsive or has been + temporarily or permanently disabled in the configuration. + This avoids blocking until all of the targets are available. + This is the default behavior since Lustre 2.9.0. + - acl + nolazystatfs - Enables POSIX Access Control List support. See the acl(5) manual page. + Requires that statfs(2) block until all + OSTs and MDTs are available and have returned space usage. + - noacl + user_xattr - Disables Access Control List support. + Enables get/set of extended attributes by regular users + in the user.* namespace. See the + attr(5) manual page for more details. + + + + + + nouser_xattr + + + Disables use of extended attributes in the + user.* namespace by regular users. Root + and system processes can still use extended attributes. + + + + + verbose + + + Enable extra mount/umount console messages. + + + + + noverbose + + + Disable mount/umount console messages. + + + + + user_fid2path + + + Enable FID to path translation by regular + users. Note: This option allows a potential security hole + because it allows regular users direct access to a file by its + File ID, bypassing POSIX path-based permission checks which + could otherwise prevent the user from accessing a file in a + directory that they do not have access to. Regular permission + checks are still performed on the file itself, so the user + cannot access a file to which they have no access rights. + + + + + + nouser_fid2path + + + Disable FID to path translation by + regular users. Root and processes with + CAP_DAC_READ_SEARCH can still perform FID + to path translation. + - In addition to the standard mount options and backing disk type (e.g. ext3) options, Lustre understands the following server-specific options: + In addition to the standard mount options and backing disk type + (e.g. 
ldiskfs) options, Lustre understands the following server-specific + mount options: @@ -1936,7 +2156,7 @@ mount.lustre - nosvc + nosvc Starts the MGC (and MGS, if co-located) for a target service, not the actual service. @@ -1944,7 +2164,7 @@ mount.lustre - nomsgs + nomgs Starts only the MDT (with a co-located MGS), without starting the MGS. @@ -1952,66 +2172,90 @@ mount.lustre - exclude=<ostlist> - - - Starts a client or MDT with a colon-separated list of known inactive OSTs. - - - - - nosvc - - - Only starts the MGC (and MGS, if co-located) for a target service, not the actual service. - - - - - nomsgs + abort_recov - Starts a MDT with a co-located MGS, without starting the MGS. + Aborts client recovery on that server and starts the target service immediately. - exclude=ostlist + max_sectors_kb=KB - Starts a client or MDT with a (colon-separated) list of known inactive OSTs. + Sets the block device parameter + max_sectors_kb limit for the MDT or OST + target being mounted to specified maximum number of kilobytes. + When max_sectors_kb isn't specified as a + mount option, it will automatically be set to the + max_hw_sectors_kb (up to a maximum of 16MiB) + for that block device. This default behavior is suited for + most users. When max_sectors_kb=0 is used, + the current value for this tunable will be kept. + - abort_recov + md_stripe_cache_size - Aborts client recovery and starts the target service immediately. + Sets the stripe cache size for server-side disk with a striped RAID configuration. - md_stripe_cache_size + recovery_time_soft=timeout - Sets the stripe cache size for server-side disk with a striped RAID configuration. + Allows timeout seconds for clients to + reconnect for recovery after a server crash. This timeout is + incrementally extended if it is about to expire and the server + is still handling new connections from recoverable clients. + + The default soft recovery timeout is 3 times the value + of the Lustre timeout parameter (see + ). 
The default Lustre + timeout is 100 seconds, which would make the soft recovery + timeout default to 300 seconds (5 minutes). The soft recovery + timeout is set at mount time and will not change if the Lustre + timeout is changed after mount time. + - recovery_time_soft=timeout + recovery_time_hard=timeout - Allows timeout seconds for clients to reconnect for recovery after a server crash. This timeout is incrementally extended if it is about to expire and the server is still handling new connections from recoverable clients. The default soft recovery timeout is 300 seconds (5 minutes). + The server is allowed to incrementally extend its timeout + up to a hard maximum of timeout + seconds. + + The default hard recovery timeout is 9 times the value + of the Lustre timeout parameter (see + ). The default Lustre + timeout is 100 seconds, which would make the hard recovery + timeout default to 900 seconds (15 minutes). The hard recovery + timeout is set at mount time and will not change if the Lustre + timeout is changed after mount time. + - recovery_time_hard=timeout + noscrub - The server is allowed to incrementally extend its timeout up to a hard maximum of timeout seconds. The default hard recovery timeout is set to 900 seconds (15 minutes). + Typically the MDT will detect restoration from a + file-level backup during mount. This mount option prevents + the OI Scrub from starting automatically when the MDT is + mounted. Manually starting LFSCK after mounting provides finer + control over the starting conditions. This mount option also + prevents OI scrub from occurring automatically when OI + inconsistency is detected (see + ). + @@ -2020,8 +2264,15 @@ mount.lustre
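The default recovery timeouts described above are simple multiples of the Lustre timeout parameter. As a minimal sketch of that arithmetic (the function names are hypothetical helpers, not Lustre commands, and the multipliers 3 and 9 are taken from the option descriptions above):

```shell
#!/bin/sh
# Hypothetical helpers (not Lustre commands): compute the default recovery
# timeouts, in seconds, from the Lustre timeout value using the 3x and 9x
# multipliers stated in the option descriptions.
default_recovery_time_soft() {
    echo $(( 3 * $1 ))
}

default_recovery_time_hard() {
    echo $(( 9 * $1 ))
}

timeout=100   # default Lustre timeout stated above
echo "soft: $(default_recovery_time_soft "$timeout") s"   # soft: 300 s
echo "hard: $(default_recovery_time_hard "$timeout") s"   # hard: 900 s
```

Because both values are fixed at mount time, a later change to the Lustre timeout takes effect for recovery only after the target is remounted, or if recovery_time_soft and recovery_time_hard are given explicitly.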
Examples - Starts a client for the Lustre file system testfs at mount point /mnt/myfilesystem. The Management Service is running on a node reachable from this client via the cfs21@tcp0 NID. - mount -t lustre cfs21@tcp0:/testfs /mnt/myfilesystem + Starts a client for the Lustre file system + chipfs at mount point + /mnt/chip. The Management Service is running on + a node reachable from this client via the cfs21@tcp0 NID. + mount -t lustre cfs21@tcp0:/chipfs /mnt/chip + Similar to the above example, but mounting a + subdirectory under chipfs as a fileset. + mount -t lustre cfs21@tcp0:/chipfs/v1_0 /mnt/chipv1_0 + Starts the Lustre metadata target service from /dev/sda1 on mount point /mnt/test/mdt. mount -t lustre /dev/sda1 /mnt/test/mdt Starts the testfs-MDT0000 service (using the disk label), but aborts the recovery process. @@ -2079,7 +2330,7 @@ plot-llstat - results_filename + results_filename Output generated by plot-llstat @@ -2087,7 +2338,7 @@ plot-llstat - parameter_index + parameter_index   @@ -2113,14 +2364,14 @@ routerstat The routerstat utility prints Lustre router statistics.
Synopsis - routerstat [interval] + routerstat [interval]
Description - The routerstat utility watches LNET router statistics. If no interval is specified, then statistics are sampled and printed only one time. Otherwise, statistics are sampled and printed at the specified interval (in seconds). + The routerstat utility displays LNet router statistics. If no interval is specified, then statistics are sampled and printed only one time. Otherwise, statistics are sampled and printed at the specified interval (in seconds).
- Options + Output The routerstat output includes the following fields: @@ -2129,7 +2380,7 @@ routerstat - Option + Output Description @@ -2139,50 +2390,117 @@ routerstat - M + M - msgs_alloc(msgs_max) + Number of messages currently being processed by LNet (The maximum number of messages ever processed by LNet concurrently) - E + E - errors + Number of LNet errors - S + S - send_count/send_length + Total size (length) of messages sent in bytes/ Number of messages sent - R + R - recv_count/recv_length + Total size (length) of messages received in bytes/ Number of messages received - F + F - route_count/route_length + Total size (length) of messages routed in bytes/ Number of messages routed - D + D - drop_count/drop_length + Total size (length) of messages dropped in bytes/ Number of messages dropped + + + + + + When an interval is specified, additional lines of statistics are printed including the following fields: + + + + + + + + Output + + + Description + + + + + + + M + + + Number of messages currently being processed by LNet (The maximum number of messages ever processed by LNet concurrently) + + + + + E + + + Number of LNet errors per second + + + + + S + + + Rate of data sent in Mbytes per second/ Count of messages sent per second + + + + + R + + + Rate of data received in Mbytes per second/ Count of messages received per second + + + + + F + + + Rate of data routed in Mbytes per second/ Count of messages routed per second + + + + + D + + + Rate of data dropped in Mbytes per second/ Count of messages dropped per second @@ -2190,6 +2508,21 @@ routerstat
+ Example + # routerstat 1 +M 0(13) E 0 S 117379184/4250 R 878480/4356 F 0/0 D 0/0 +M 0( 13) E 0 S 7.00/ 7 R 0.00/ 14 F 0.00/ 0 D 0.00/0 +M 0( 13) E 0 S 7.00/ 7 R 0.00/ 14 F 0.00/ 0 D 0.00/0 +M 0( 13) E 0 S 8.00/ 8 R 0.00/ 16 F 0.00/ 0 D 0.00/0 +M 0( 13) E 0 S 7.00/ 7 R 0.00/ 14 F 0.00/ 0 D 0.00/0 +M 0( 13) E 0 S 7.00/ 7 R 0.00/ 14 F 0.00/ 0 D 0.00/0 +M 0( 13) E 0 S 7.00/ 7 R 0.00/ 14 F 0.00/ 0 D 0.00/0 +M 0( 13) E 0 S 7.00/ 7 R 0.00/ 14 F 0.00/ 0 D 0.00/0 +M 0( 13) E 0 S 8.00/ 8 R 0.00/ 16 F 0.00/ 0 D 0.00/0 +M 0( 13) E 0 S 7.00/ 7 R 0.00/ 14 F 0.00/ 0 D 0.00/0 +... +
+
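The output lines shown in the example can be post-processed with standard text tools. The following is a sketch only: routerstat_field is a hypothetical helper, not part of Lustre, and it assumes each line has the "letter value" layout shown in the example above.

```shell
#!/bin/sh
# Hypothetical helper: print the value that follows a one-letter field tag
# (E, S, R, F or D) in a single line of routerstat output.
# $1 = field letter, $2 = one line of routerstat output.
routerstat_field() {
    printf '%s\n' "$2" |
        sed -e 's|/ *|/|g' |   # join "7.00/ 7" into "7.00/7"
        awk -v tag="$1" '{
            for (i = 1; i < NF; i++)
                if ($i == tag) { print $(i + 1); exit }
        }'
}

# First (cumulative) line from the example above.
line='M 0(13) E 0 S 117379184/4250 R 878480/4356 F 0/0 D 0/0'
sent=$(routerstat_field S "$line")
echo "bytes sent: ${sent%%/*}, messages sent: ${sent##*/}"
```

For the per-interval lines, the two halves of each value are instead a rate in Mbytes per second and a message count per second, as described in the second table.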
Files The routerstat utility extracts statistics data from: /proc/sys/lnet/stats @@ -2201,7 +2534,7 @@ tunefs.lustre The tunefs.lustre utility modifies configuration information on a Lustre target disk.
Synopsis - tunefs.lustre [options] <device> + tunefs.lustre [options] /dev/device
Description @@ -2210,9 +2543,9 @@ tunefs.lustre Changes made here affect a file system only when the target is mounted the next time. With tunefs.lustre, parameters are "additive" -- new parameters are specified in addition to old parameters, they do not replace them. To erase all old tunefs.lustre parameters and just use newly-specified parameters, run: - $ tunefs.lustre --erase-params --param=<new parameters> - The tunefs.lustre command can be used to set any parameter settable in a /proc/fs/lustre file and that has its own OBD device, so it can be specified as <obd|fsname>.<obdtype>.<proc_file_name>=<value>. For example: - $ tunefs.lustre --param mdt.group_upcall=NONE /dev/sda1 + $ tunefs.lustre --erase-params --param=new_parameters + The tunefs.lustre command can be used to set any parameter settable in a /proc/fs/lustre file and that has its own OBD device, so it can be specified as {obd|fsname}.obdtype.proc_file_name=value. For example: + $ tunefs.lustre --param mdt.identity_upcall=NONE /dev/sda1
Options @@ -2234,7 +2567,7 @@ tunefs.lustre - --comment=comment + --comment=comment Sets a user comment about this disk, ignored by Lustre. @@ -2242,7 +2575,7 @@ tunefs.lustre - --dryrun + --dryrun Only prints what would be done; does not affect the disk. @@ -2250,7 +2583,7 @@ tunefs.lustre - --erase-params + --erase-params Removes all previous parameter information. @@ -2258,33 +2591,41 @@ tunefs.lustre - --failnode=nid,... - - - Sets the NID(s) of a failover partner. This option can be repeated as needed. - CAUTION: Cannot be used with --servicenode. - + --servicenode=nid,... + Sets the NID(s) of all service nodes, including primary and failover partner + service nodes. The --servicenode option cannot be used with + --failnode option. See for + more details. - --servicenode=nid,... + --failnode=nid,... - Sets the NID(s) of all service node, including failover partner as well as local service nids. This option can be repeated as needed. - CAUTION: Cannot be used with --failnode. + Sets the NID(s) of a failover service node for a primary server for a target. + The --failnode option cannot be used with + --servicenode option. See + for more details. + When the --failnode option is used, certain + restrictions apply (see ). + - --fsname=filesystem_name + --fsname=filesystem_name - The Lustre file system of which this service will be a part. The default file system name is 'lustre'. + The Lustre file system of which this service will be a part. The default file + system name is lustre. - --index=index + --index=index Forces a particular OST or MDT index. @@ -2292,20 +2633,21 @@ tunefs.lustre - --mountfsoptions=opts + --mountfsoptions=opts Sets the mount options used when the backing file system is mounted. - CAUTION: Unlike earlier versions of tunefs.lustre, this version completely replaces the existing mount options with those specified on the command line, and issues a warning on stderr if any default mount options are omitted. 
+ Unlike earlier versions of tunefs.lustre, this version completely replaces the existing mount options with those specified on the command line, and issues a warning on stderr if any default mount options are omitted. The defaults for ldiskfs are: - OST: errors=remount-ro,mballoc,extents; - MGS/MDT: errors=remount-ro,iopen_nopriv,user_xattr + MGS/MDT: errors=remount-ro,iopen_nopriv,user_xattr + OST: errors=remount-ro,extents,mballoc + OST: errors=remount-ro Do not alter the default mount options unless you know what you are doing. - --network=net,... + --network=net,... Network(s) to which to restrict this OST/MDT. This option can be repeated as necessary. @@ -2313,7 +2655,7 @@ tunefs.lustre - --mgs + --mgs Adds a configuration management service to this target. @@ -2321,7 +2663,7 @@ tunefs.lustre - --msgnode=nid,... + --mgsnode=nid,... Sets the NID(s) of the MGS node; required for all targets other than the MGS. @@ -2329,7 +2671,7 @@ tunefs.lustre - --nomgs + --nomgs Removes a configuration management service from this target. @@ -2337,7 +2679,7 @@ tunefs.lustre - --quiet + --quiet Prints less information. @@ -2345,7 +2687,7 @@ tunefs.lustre - --verbose + --verbose Prints more information. @@ -2353,17 +2695,33 @@ tunefs.lustre - --writeconf + --writeconf - Erases all configuration logs for the file system to which this MDT belongs, and regenerates them. This is dangerous operation. All clients must be unmounted and servers for this file system should be stopped. All targets (OSTs/MDTs) must then be restarted to regenerate the logs. No clients should be started until all targets have restarted. -   + Erases all configuration logs for the file system to which this MDT belongs, + and regenerates them. This is a dangerous operation. All clients must be unmounted + and servers for this file system should be stopped. All targets (OSTs/MDTs) must + then be restarted to regenerate the logs. No clients should be started until all + targets have restarted. 
The correct order of operations is: - * Unmount all clients on the file system - * Unmount the MDT and all OSTs on the file system - * Run tunefs.lustre --writeconf <device> on every server - * Mount the MDT and OSTs - * Mount the clients + + + Unmount all clients on the file system + + + Unmount the MDT and all OSTs on the file system + + + Run tunefs.lustre --writeconf + device on every server + + + Mount the MDT and OSTs + + + Mount the clients + + @@ -2373,7 +2731,7 @@ tunefs.lustre
Examples Change the MGS's NID address. (This should be done on each target disk, since they should all contact the same MGS.) - tunefs.lustre --erase-param --mgsnode=<new_nid> --writeconf /dev/sda + tunefs.lustre --erase-param --mgsnode=new_nid --writeconf /dev/sda Add a failover NID location for this target. tunefs.lustre --param="failover.node=192.168.0.13@tcp0" /dev/sda
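The --writeconf order of operations listed in the Options section can be sketched as a dry run. The helper below only prints the five steps in order and executes nothing. The function name print_writeconf_plan, the mount point, and the device names are hypothetical placeholders chosen for illustration, not part of Lustre.

```shell
# Dry-run sketch: print the --writeconf sequence instead of executing it.
# print_writeconf_plan is a hypothetical helper, not a Lustre tool.
# Usage: print_writeconf_plan CLIENT_MOUNT DEVICE...
print_writeconf_plan() {
    client_mnt=$1
    shift
    echo "1. clients: umount $client_mnt"
    echo "2. servers: unmount the MDT and all OSTs"
    # One tunefs.lustre invocation per target device, each on its own server.
    for dev in "$@"; do
        echo "3. server: tunefs.lustre --writeconf $dev"
    done
    echo "4. servers: mount the MDT and OSTs"
    echo "5. clients: mount $client_mnt"
}

# Example with made-up names: one client mount point and two target devices.
print_writeconf_plan /mnt/client /dev/sda /dev/sdb
```

Printing the plan first makes it easy to review the ordering before running any step by hand, which matters because --writeconf is described above as a dangerous operation.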
@@ -2403,26 +2761,22 @@ Additional System Configuration Utilities <indexterm><primary>utilities</primary><secondary>application profiling</secondary></indexterm> Application Profiling Utilities The following utilities are located in /usr/bin. - lustre_req_history.sh + lustre_req_history.sh The lustre_req_history.sh utility (run from a client), assembles as much Lustre RPC request history as possible from the local node and from the servers that were contacted, providing a better picture of the coordinated network activity. - llstat.sh - The llstat.sh utility handles a wider range of statistics files, and has command line switches to produce more graphable output. - plot-llstat.sh - The plot-llstat.sh utility plots the output from llstat.sh using gnuplot.
More /proc Statistics for Application Profiling The following utilities provide additional statistics. - vfs_ops_stats + vfs_ops_stats The client vfs_ops_stats utility tracks Linux VFS operation calls into Lustre for a single PID, PPID, GID or everything. /proc/fs/lustre/llite/*/vfs_ops_stats /proc/fs/lustre/llite/*/vfs_track_[pid|ppid|gid] - extents_stats + extents_stats The client extents_stats utility shows the size distribution of I/O calls from the client (cumulative and by process). /proc/fs/lustre/llite/*/extents_stats, extents_stats_per_process - offset_stats + offset_stats The client offset_stats utility shows the read/write seek activity of a client by offsets and ranges. /proc/fs/lustre/llite/*/offset_stats @@ -2432,7 +2786,9 @@ Application Profiling Utilities Per-client statistics tracked on the servers - Each MDT and OST now tracks LDLM and operations statistics for every connected client, for comparisons and simpler collection of distributed job statistics. + Each MDS and OSS now tracks LDLM and operations statistics for + every connected client, for comparisons and simpler collection of + distributed job statistics. /proc/fs/lustre/mds|obdfilter/*/exports/ @@ -2440,8 +2796,9 @@ Application Profiling Utilities Improved MDT statistics - More detailed MDT operations statistics are collected for better profiling. - /proc/fs/lustre/mds/*/stats + More detailed MDT operations statistics are collected for better + profiling. + /proc/fs/lustre/mdt/*/md_stats
@@ -2450,238 +2807,116 @@ Application Profiling Utilities Testing / Debugging Utilities Lustre offers the following test and debugging utilities.
- <indexterm><primary>loadgen</primary></indexterm> -loadgen - The Load Generator (loadgen) is a test program designed to simulate large numbers of Lustre clients connecting and writing to an OST. The loadgen utility is located at lustre/utils/loadgen (in a build directory) or at /usr/sbin/loadgen (from an RPM). - Loadgen offers the ability to run this test: - - - Start an arbitrary number of (echo) clients. - - - Start and connect to an echo server, instead of a real OST. - - - Create/bulk_write/delete objects on any number of echo clients simultaneously. - - - Currently, the maximum number of clients is limited by MAX_OBD_DEVICES and the amount of memory available. -
-
- Usage - The loadgen utility can be run locally on the OST server machine or remotely from any LNET host. The device command can take an optional NID as a parameter; if unspecified, the first local NID found is used. - The obdecho module must be loaded by hand before running loadgen. - # cd lustre/utils/ -# insmod ../obdecho/obdecho.ko -# ./loadgen -loadgen> h -This is a test program used to simulate large numbers of clients. The echo \ -obds are used, so the obdecho module must be loaded. - -Typical usage would be: -loadgen> dev lustre-OST0000 set the target device -loadgen> start 20 start 20 echo clients -loadgen> wr 10 5 have 10 clients do simultaneous brw_write \ -tests 5 times each - -Available commands are: - device - dl - echosrv - start - verbose - wait - write - help - exit - quit - -For more help type: help command-name -loadgen> -loadgen> device lustre-OST0000 192.168.0.21@tcp -Added uuid OSS_UUID: 192.168.0.21@tcp -Target OST name is 'lustre-OST0000' -loadgen> -loadgen> st 4 -start 0 to 4 -./loadgen: running thread #1 -./loadgen: running thread #2 -./loadgen: running thread #3 -./loadgen: running thread #4 -loadgen> wr 4 5 -Estimate 76 clients before we run out of grant space (155872K / 2097152) -1: i0 -2: i0 -4: i0 -3: i0 -1: done (0) -2: done (0) -4: done (0) -3: done (0) -wrote 25MB in 1.419s (17.623 MB/s) -loadgen> - - The loadgen utility prints periodic status messages; message output can be controlled with the verbose command. - To insure that a file can be written to (a requirement of write cache), OSTs reserve ("grants"), chunks of space for each newly-created file. A grant may cause an OST to report that it is out of space, even though there is plenty of space on the disk, because the space is "reserved" by other files. The loadgen utility estimates the number of simultaneous open files as the disk size divided by the grant size and reports that number when the write tests are first started. 
- Echo Server - The loadgen utility can start an echo server. On another node, loadgen can specify the echo server as the device, thus creating a network-only test environment. - loadgen> echosrv -loadgen> dl - 0 UP obdecho echosrv echosrv 3 - 1 UP ost OSS OSS 3 - - On another node: - loadgen> device echosrv cfs21@tcp -Added uuid OSS_UUID: 192.168.0.21@tcp -Target OST name is 'echosrv' -loadgen> st 1 -start 0 to 1 -./loadgen: running thread #1 -loadgen> wr 1 -start a test_brw write test on X clients for Y iterations -usage: write <num_clients> <num_iter> [<delay>] -loadgen> wr 1 1 -loadgen> -1: i0 -1: done (0) -wrote 1MB in 0.029s (34.023 MB/s) - - Scripting - The threads all perform their actions in non-blocking mode; use the wait command to block for the idle state. For example: - #!/bin/bash -./loadgen << EOF -device lustre-OST0000 -st 1 -wr 1 10 -wait -quit -EOF - - Feature Requests - The loadgen utility is intended to grow into a more comprehensive test tool; feature requests are encouraged. The current feature requests include: - - - Locking simulation - - - - - Many (echo) clients cache locks for the specified resource at the same time. - - - Many (echo) clients enqueue locks for the specified resource simultaneously. - - - - - obdsurvey functionality - - - - - Fold the Lustre I/O kit's obdsurvey script functionality into loadgen - - - - -
-
- <indexterm><primary>llog_reader</primary></indexterm> -llog_reader - The llog_reader utility translates a Lustre configuration log into human-readable form. -
-
- Synopsis - llog_reader filename - -
-
- Description - llog_reader parses the binary format of Lustre's on-disk configuration logs. It can only read the logs. Use tunefs.lustre to write to them. - To examine a log file on a stopped Lustre server, mount its backing file system as ldiskfs, then use llog_reader to dump the log file's contents. For example: - mount -t ldiskfs /dev/sda /mnt/mgs -llog_reader /mnt/mgs/CONFIGS/tfs-client - - To examine the same log file on a running Lustre server, use the ldiskfs-enabled debugfs utility (called debug.ldiskfs on some distributions) to extract the file. For example: - debugfs -c -R 'dump CONFIGS/tfs-client /tmp/tfs-client' /dev/sda -llog_reader /tmp/tfs-client - - - Although they are stored in the CONFIGS directory, mountdata files do not use the config log format and will confuse llog_reader. - - See Also - -
-
<indexterm><primary>lr_reader</primary></indexterm> lr_reader - The lr_reader utility translates a last received (last_rcvd) file into human-readable form. - The following utilites are part of the Lustre I/O kit. For more information, see . + The lr_reader utility translates the content of the last_rcvd and reply_data files into human-readable form. + The following utilities are part of the Lustre I/O kit. For more information, see .
- <indexterm><primary>sgpdd_survey</primary></indexterm> -sgpdd_survey - The sgpdd_survey utility tests 'bare metal' performance, bypassing as much of the kernel as possible. The sgpdd_survey tool does not require Lustre, but it does require the sgp_dd package. + <indexterm> + <primary>sgpdd-survey</primary> + </indexterm> sgpdd-survey + The sgpdd-survey utility tests 'bare metal' performance, + bypassing as much of the kernel as possible. The sgpdd-survey tool does + not require Lustre, but it does require the sgp_dd package. - The sgpdd_survey utility erases all data on the device. + The sgpdd-survey utility erases all data on the device.
- <indexterm><primary>obdfilter_survey</primary></indexterm>obdfilter_survey - The obdfilter_survey utility is a shell script that tests performance of isolated OSTS, the network via echo clients, and an end-to-end test. + <indexterm> + <primary>obdfilter-survey</primary> + </indexterm>obdfilter-survey + The obdfilter-survey utility is a shell script that tests + performance of isolated OSTs, the network via echo clients, and an end-to-end test. 
<indexterm><primary>ior-survey</primary></indexterm>ior-survey The ior-survey utility is a script used to run the IOR benchmark. Lustre includes IOR version 2.8.6.
- <indexterm><primary>ost_survey</primary></indexterm>ost_survey - The ost_survey utility is an OST performance survey that tests client-to-disk performance of the individual OSTs in a Lustre file system. + <indexterm> + <primary>ost-survey</primary> + </indexterm>ost-survey + The ost-survey utility is an OST performance survey that tests + client-to-disk performance of the individual OSTs in a Lustre file system.
<indexterm><primary>stats-collect</primary></indexterm>stats-collect The stats-collect utility contains scripts used to collect application profiling information from Lustre clients and servers.
-
- <indexterm><primary>flock</primary></indexterm>Flock Feature - Lustre now includes the flock feature, which provides file locking support. Flock describes classes of file locks known as 'flocks'. Flock can apply or remove a lock on an open file as specified by the user. However, a single file may not, simultaneously, have both shared and exclusive locks. - By default, the flock utility is disabled on Lustre. Two modes are available. - - - - - - - - local mode - - - In this mode, locks are coherent on one node (a single-node flock), but not across all clients. To enable it, use -o localflock. This is a client-mount option. - - This mode does not impact performance and is appropriate for single-node databases. - - - - - - consistent mode - - - In this mode, locks are coherent across all clients. - To enable it, use the -o flock. This is a client-mount option. - CAUTION: This mode affects the performance of the file being flocked and may affect stability, depending on the Lustre version used. Consider using a newer Lustre version which is more stable. If the consistent mode is enabled and no applications are using flock, then it has no effect. - - - - - - A call to use flock may be blocked if another process is holding an incompatible lock. Locks created using flock are applicable for an open file table entry. Therefore, a single process may hold only one type of lock (shared or exclusive) on a single file. Subsequent flock calls on a file that is already locked converts the existing lock to the new lock mode. +
+ <indexterm><primary>fileset</primary></indexterm>Fileset Feature + With the fileset feature, Lustre now provides subdirectory mount + support. Subdirectory mounts, also referred to as filesets, allow a + client to mount a child directory of a parent filesystem, thereby limiting + the filesystem namespace visibility on a specific client. A common use + case is for a client to use a subdirectory mount when there is a desire to + limit the visibility of the entire filesystem namespace to aid in the + prevention of accidental file deletions outside of the subdirectory + mount. + It is important to note that invocation of the subdirectory mount is + voluntary by the client and does not prevent access to files that are + visible in multiple subdirectory mounts via hard links. Furthermore, it + does not prevent the client from subsequently mounting the whole file + system without a subdirectory being specified. + 
+ + <indexterm> + <primary>Lustre</primary> + <secondary>fileset</secondary> + </indexterm>Lustre fileset + + + + + + Lustre file system fileset feature + + +
- Example - $ mount -t lustre -o flock mds@tcp0:/lustre /mnt/client - You can check it in /etc/mtab. It should look like, - mds@tcp0:/lustre /mnt/client lustre rw,flock 0 0 + Examples + The following example will mount the + chipfs filesystem on client1 and create a + subdirectory v1_1 within that filesystem. Client2 + will then mount only the v1_1 subdirectory as a + fileset, thereby limiting access to anything else in the + chipfs filesystem from client2. + client1# mount -t lustre mgs@tcp:/chipfs /mnt/chip +client1# mkdir /mnt/chip/v1_1 + client2# mount -t lustre mgs@tcp:/chipfs/v1_1 /mnt/chipv1_1 + You can check the created mounts in /etc/mtab. The entries should look like + the following: + client1 +mds@tcp0:/chipfs/ /mnt/chip lustre rw 0 0 + +client2 +mds@tcp0:/chipfs/v1_1 /mnt/chipv1_1 lustre rw 0 0 + Create a directory under the /mnt/chip mount, and get its FID + client1# mkdir /mnt/chip/v1_2 +client1# lfs path2fid /mnt/chip/v1_2 +[0x200000400:0x2:0x0] + + If you try to resolve the FID of the /mnt/chip/v1_2 + path (as created in the example above) on client2, an error will be returned + as the FID cannot be resolved on client2 since it is not part of the + mounted fileset on that client. Recall that the fileset on client2 mounted + the v1_1 subdirectory beneath the top level + chipfs filesystem. + + client2# lfs fid2path /mnt/chip/v1_2 [0x200000400:0x2:0x0] +fid2path: error on FID [0x200000400:0x2:0x0]: No such file or directory + Subdirectory mounts do not have the .lustre + pseudo directory, which prevents clients from opening or accessing files + only by FID. + client1# ls /mnt/chip/.lustre + fid lost+found + client2# ls /mnt/chipv1_1/.lustre + ls: cannot access /mnt/chipv1_1/.lustre: No such file or directory + 
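To tell a fileset mount from a whole-filesystem mount, it can help to split the device field of a mount entry into its file system name and subdirectory. The following is a minimal sketch using only shell parameter expansion. The function name split_lustre_device is hypothetical, and it assumes the device has the form mgsnid:/fsname[/subdir] as in the examples above.

```shell
# Hypothetical helper, not part of Lustre: split a mount device such as
# mgs@tcp:/chipfs/v1_1 into "<fsname> <subdir>".
# subdir is printed as "/" when the whole file system is mounted.
split_lustre_device() {
    path=${1#*:/}             # drop "<mgsnid>:/", leaving "chipfs/v1_1"
    fsname=${path%%/*}        # first path component, e.g. "chipfs"
    subdir=${path#"$fsname"}  # remainder, e.g. "/v1_1", "/" or empty
    [ -n "$subdir" ] || subdir=/
    echo "$fsname $subdir"
}

split_lustre_device mgs@tcp:/chipfs/v1_1   # prints "chipfs /v1_1"
split_lustre_device mgs@tcp:/chipfs        # prints "chipfs /"
```

A subdir other than "/" indicates a subdirectory mount. On such a client, the .lustre pseudo directory and FIDs outside the fileset are not reachable, as the examples above show.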