X-Git-Url: https://git.whamcloud.com/?a=blobdiff_plain;f=LustreProc.xml;h=bc70d6fd9552b6dbb7968130ca38a729fb075a36;hb=00c99af814574fe85ae7bea886d9fffcce4d0261;hp=64b3f9f2887b86259eb0a6f5902c43667f0357e3;hpb=a239b0876f76e85a259765f2b47b1ddd588f1bcd;p=doc%2Fmanual.git

diff --git a/LustreProc.xml b/LustreProc.xml
index 64b3f9f..bc70d6f 100644
--- a/LustreProc.xml
+++ b/LustreProc.xml
@@ -1,14 +1,16 @@
-
+
 Lustre Parameters
-  The /proc and /sys file systems
-  acts as an interface to internal data structures in the kernel. This chapter
-  describes parameters and tunables that are useful for optimizing and
-  monitoring aspects of a Lustre file system. It includes these sections:
+  There are many parameters for Lustre that can tune client and server
+  performance, change behavior of the system, and report statistics about
+  various subsystems. This chapter describes the various parameters and
+  tunables that are useful for optimizing and monitoring aspects of a Lustre
+  file system. It includes these sections:
 
-
+
     .
 
@@ -23,9 +25,11 @@
   Typically, metrics are accessed via lctl get_param files and
   settings are changed via lctl set_param.
-  While it is possible to access parameters in /proc
+  They allow getting and setting multiple parameters with a single command,
+  through the use of wildcards in one or more parts of the parameter name.
+  While each of these parameters maps to files in /proc
   and /sys directly, the location of these parameters may
-  change between releases, so it is recommended to always use
+  change between Lustre releases, so it is recommended to always use
   lctl to access the parameters from userspace scripts.
   Some data is server-only, some data is client-only, and some data is
   exported from the client to the server and is thus duplicated in both
@@ -34,24 +38,29 @@
   In the examples in this chapter, # indicates a command is
   entered as root.  Lustre servers are named according to the convention
   fsname-MDT|OSTnumber.
-  The standard UNIX wildcard designation (*) is used.
+  The standard UNIX wildcard designation (*) is used to represent any
+  part of a single component of the parameter name, excluding
+  "." and "/".
+  It is also possible to use brace {} expansion
+  to specify a list of parameter names efficiently.
   Some examples are shown below:
-      To obtain data from a Lustre client:
-      # lctl list_param osc.*
-osc.testfs-OST0000-osc-ffff881071d5cc00
-osc.testfs-OST0001-osc-ffff881071d5cc00
-osc.testfs-OST0002-osc-ffff881071d5cc00
-osc.testfs-OST0003-osc-ffff881071d5cc00
-osc.testfs-OST0004-osc-ffff881071d5cc00
-osc.testfs-OST0005-osc-ffff881071d5cc00
-osc.testfs-OST0006-osc-ffff881071d5cc00
-osc.testfs-OST0007-osc-ffff881071d5cc00
-osc.testfs-OST0008-osc-ffff881071d5cc00
+      To list available OST targets on a Lustre client:
+      # lctl list_param -F osc.*
+osc.testfs-OST0000-osc-ffff881071d5cc00/
+osc.testfs-OST0001-osc-ffff881071d5cc00/
+osc.testfs-OST0002-osc-ffff881071d5cc00/
+osc.testfs-OST0003-osc-ffff881071d5cc00/
+osc.testfs-OST0004-osc-ffff881071d5cc00/
+osc.testfs-OST0005-osc-ffff881071d5cc00/
+osc.testfs-OST0006-osc-ffff881071d5cc00/
+osc.testfs-OST0007-osc-ffff881071d5cc00/
+osc.testfs-OST0008-osc-ffff881071d5cc00/
+      In this example, information about OST connections available
-      on a client is displayed (indicated by "osc").
+      on a client is displayed (indicated by "osc"). Each of these
+      connections may have numerous sub-parameters as well.
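As a sketch of the wildcard behaviour described above, a single command can address every OSC device of one file system at once; the file system name testfs, the max_rpcs_in_flight tunable, and the output shown are illustrative assumptions, not part of this patch:
client# lctl set_param osc.testfs-OST*.max_rpcs_in_flight=16
osc.testfs-OST0000-osc-ffff881071d5cc00.max_rpcs_in_flight=16
osc.testfs-OST0001-osc-ffff881071d5cc00.max_rpcs_in_flight=16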
@@ -71,12 +80,22 @@ osc.testfs-OST0000-osc-ffff881071d5cc00.rpc_stats
 
+        To see a specific subset of parameters, use braces, like:
+# lctl list_param osc.*.{checksum,connect}*
+osc.testfs-OST0000-osc-ffff881071d5cc00.checksum_type
+osc.testfs-OST0000-osc-ffff881071d5cc00.checksums
+osc.testfs-OST0000-osc-ffff881071d5cc00.connect_flags
+
         To view a specific file, use lctl get_param:
         # lctl get_param osc.lustre-OST0000*.rpc_stats
     For more information about using lctl, see .
+        xmlns:xlink="http://www.w3.org/1999/xlink" linkend="setting_param_with_lctl"/>.
     Data can also be viewed using the cat command
       with the full path to the file. The form of the cat
       command is similar to that of the lctl get_param
@@ -89,35 +108,18 @@ osc.testfs-OST0000-osc-ffff881071d5cc00.rpc_stats
    version and the Lustre version being used. The lctl command
    insulates scripts from these changes and is preferred over direct file
    access, unless as part of a high-performance monitoring system.
-    In the cat command:
-      Replace the dots in the path with slashes.
-      Prepend the path with the appropriate directory component:
-      /{proc,sys}/{fs,sys}/{lustre,lnet}
-    For example, an lctl get_param command may look like
-      this:# lctl get_param osc.*.uuid
-osc.testfs-OST0000-osc-ffff881071d5cc00.uuid=594db456-0685-bd16-f59b-e72ee90e9819
-osc.testfs-OST0001-osc-ffff881071d5cc00.uuid=594db456-0685-bd16-f59b-e72ee90e9819
-...
-    The equivalent cat command may look like this:
-    # cat /proc/fs/lustre/osc/*/uuid
-594db456-0685-bd16-f59b-e72ee90e9819
-594db456-0685-bd16-f59b-e72ee90e9819
-...
-    or like this:
-    # cat /sys/fs/lustre/osc/*/uuid
-594db456-0685-bd16-f59b-e72ee90e9819
-594db456-0685-bd16-f59b-e72ee90e9819
-...
+
+      Starting in Lustre 2.12, the
+      lctl get_param and lctl set_param
+      commands provide tab completion when using an
+      interactive shell with bash-completion installed.
+      This simplifies the use of get_param significantly,
+      since it provides an interactive list of available parameters.
+
     The llstat utility can be used to monitor some
      Lustre file system I/O activity over a specified time period. For more
      details, see 
-
+
     Some data is imported from attached clients and is available in a
      directory called exports located in the corresponding
      per-service directory on a Lustre server. For example:
@@ -373,7 +375,7 @@ testfs-MDT0000
          brw_stats – Histogram data characterizing I/O requests to the
          OSTs. For more details, see .
+          linkend="monitor_ost_block_io_stream"/>.
          rpc_stats – Histogram data showing information about RPCs made by
@@ -991,17 +993,19 @@ PID: 11429
     to that point in the table of calls (cum %).
-
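As a sketch of the llstat utility mentioned above, a stats file can be polled at a fixed interval; the 10-second interval and the mdt.testfs-MDT0000.md_stats parameter name are assumptions for illustration, not taken from this patch:
mds# llstat -i 10 mdt.testfs-MDT0000.md_stats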
+
<indexterm> <primary>proc</primary> <secondary>block I/O</secondary> </indexterm>Monitoring the OST Block I/O Stream - The brw_stats file in the obdfilter directory - contains histogram data showing statistics for number of I/O requests sent to the disk, - their size, and whether they are contiguous on the disk or not. + The brw_stats parameter file below the + osd-ldiskfs or osd-zfs directory + contains histogram data showing statistics for number of I/O requests + sent to the disk, their size, and whether they are contiguous on the + disk or not. Example: - Enter on the OSS: - # lctl get_param obdfilter.testfs-OST0000.brw_stats + Enter on the OSS or MDS: + oss# lctl get_param osd-*.*.brw_stats snapshot_time: 1372775039.769045 (secs.usecs) read | write pages per bulk r/w rpcs % cum % | rpcs % cum % @@ -1071,10 +1075,11 @@ disk I/O size ios % cum % | ios % cum % 512K: 0 0 100 | 24 0 0 1M: 0 0 100 | 23142 99 100 - The tabular data is described in the table below. Each row in the table shows the number - of reads and writes occurring for the statistic (ios), the relative - percentage of total reads or writes (%), and the cumulative percentage to - that point in the table for the statistic (cum %). + The tabular data is described in the table below. Each row in the + table shows the number of reads and writes occurring for the statistic + (ios), the relative percentage of total reads or + writes (%), and the cumulative percentage to that + point in the table for the statistic (cum %). @@ -1292,9 +1297,9 @@ osc.myth-OST0001-osc-ffff8804296c2800.checksum_type - llite.fsname_instance.max_cache_mb - - Maximum amount of inactive data cached by the client. The - default value is 3/4 of the client RAM. + llite.fsname_instance.max_cached_mb + - Maximum amount of read+write data cached by the client. The + default value is 1/2 of the client RAM. @@ -1343,7 +1348,7 @@ write RPCs in flight: 0 llite.fsname_instance.max_read_ahead_mb - - Controls the maximum amount of data readahead on a file. + - Controls the maximum amount of data readahead on all files. Files are read ahead in RPC-sized chunks (4 MiB, or the size of the read() call, if larger) after the second sequential read on a file descriptor. Random reads are done at @@ -1362,7 +1367,7 @@ write RPCs in flight: 0 llite.fsname_instance.max_read_ahead_per_file_mb - Controls the maximum number of megabytes (MiB) of data that should be prefetched by the client when sequential reads are - detected on a file. This is the per-file readahead limit and + detected on one file. This is the per-file readahead limit and cannot be larger than max_read_ahead_mb. @@ -1428,118 +1433,227 @@ write RPCs in flight: 0 <indexterm> <primary>proc</primary> <secondary>read cache</secondary> - </indexterm>Tuning OSS Read Cache - The OSS read cache feature provides read-only caching of data on an OSS. This - functionality uses the Linux page cache to store the data and uses as much physical memory + Tuning Server Read Cache + The server read cache feature provides read-only caching of file + data on an OSS or MDS (for Data-on-MDT). This functionality uses the + Linux page cache to store the data and uses as much physical memory as is allocated. - OSS read cache improves Lustre file system performance in these situations: + The server read cache can improves Lustre file system performance + in these situations: - Many clients are accessing the same data set (as in HPC applications or when - diskless clients boot from the Lustre file system). 
+ Many clients are accessing the same data set (as in HPC + applications or when diskless clients boot from the Lustre file + system). - One client is storing data while another client is reading it (i.e., clients are - exchanging data via the OST). + One client is writing data while another client is reading + it (i.e., clients are exchanging data via the filesystem). A client has very limited caching of its own. - OSS read cache offers these benefits: + The server read cache offers these benefits: - Allows OSTs to cache read data more frequently. + Allows servers to cache read data more frequently. - Improves repeated reads to match network speeds instead of disk speeds. + Improves repeated reads to match network speeds instead of + storage speeds. - Provides the building blocks for OST write cache (small-write aggregation). + Provides the building blocks for server write cache + (small-write aggregation).
- Using OSS Read Cache - OSS read cache is implemented on the OSS, and does not require any special support on - the client side. Since OSS read cache uses the memory available in the Linux page cache, - the appropriate amount of memory for the cache should be determined based on I/O patterns; - if the data is mostly reads, then more cache is required than would be needed for mostly - writes. - OSS read cache is managed using the following tunables: + Using Server Read Cache + The server read cache is implemented on the OSS and MDS, and does + not require any special support on the client side. Since the server + read cache uses the memory available in the Linux page cache, the + appropriate amount of memory for the cache should be determined based + on I/O patterns. If the data is mostly reads, then more cache is + beneficial on the server than would be needed for mostly writes. + + The server read cache is managed using the following tunables. + Many tunables are available for both osd-ldiskfs + and osd-zfs, but in some cases the implementation + of osd-zfs prevents their use. - read_cache_enable - Controls whether data read from disk during - a read request is kept in memory and available for later read requests for the same - data, without having to re-read it from disk. By default, read cache is enabled - (read_cache_enable=1). - When the OSS receives a read request from a client, it reads data from disk into - its memory and sends the data as a reply to the request. If read cache is enabled, - this data stays in memory after the request from the client has been fulfilled. When - subsequent read requests for the same data are received, the OSS skips reading data - from disk and the request is fulfilled from the cached data. The read cache is managed - by the Linux kernel globally across all OSTs on that OSS so that the least recently - used cache pages are dropped from memory when the amount of free memory is running - low. - If read cache is disabled (read_cache_enable=0), the OSS - discards the data after a read request from the client is serviced and, for subsequent - read requests, the OSS again reads the data from disk. - To disable read cache on all the OSTs of an OSS, run: - root@oss1# lctl set_param obdfilter.*.read_cache_enable=0 - To re-enable read cache on one OST, run: - root@oss1# lctl set_param obdfilter.{OST_name}.read_cache_enable=1 - To check if read cache is enabled on all OSTs on an OSS, run: - root@oss1# lctl get_param obdfilter.*.read_cache_enable + read_cache_enable - High-level control of + whether data read from storage during a read request is kept in + memory and available for later read requests for the same data, + without having to re-read it from storage. By default, read cache + is enabled (read_cache_enable=1) for HDD OSDs + and automatically disabled for flash OSDs + (nonrotational=1). + The read cache cannot be disabled for osd-zfs, + and as a result this parameter is unavailable for that backend. + + When the server receives a read request from a client, + it reads data from storage into its memory and sends the data + to the client. If read cache is enabled for the target, + and the RPC and object size also meet the other criterion below, + this data may stay in memory after the client request has + completed. If later read requests for the same data are received, + if the data is still in cache the server skips reading it from + storage. 
The cache is managed by the Linux kernel globally + across all targets on that server so that the infrequently used + cache pages are dropped from memory when the free memory is + running low. + If read cache is disabled + (read_cache_enable=0), or the read or object + is large enough that it will not benefit from caching, the server + discards the data after the read request from the client is + completed. For subsequent read requests the server again reads + the data from storage. + To disable read cache on all targets of a server, run: + + oss1# lctl set_param osd-*.*.read_cache_enable=0 + + To re-enable read cache on one target, run: + + oss1# lctl set_param osd-*.{target_name}.read_cache_enable=1 + + To check if read cache is enabled on targets on a server, run: + + + oss1# lctl get_param osd-*.*.read_cache_enable + - writethrough_cache_enable - Controls whether data sent to the - OSS as a write request is kept in the read cache and available for later reads, or if - it is discarded from cache when the write is completed. By default, the writethrough - cache is enabled (writethrough_cache_enable=1). - When the OSS receives write requests from a client, it receives data from the - client into its memory and writes the data to disk. If the writethrough cache is - enabled, this data stays in memory after the write request is completed, allowing the - OSS to skip reading this data from disk if a later read request, or partial-page write - request, for the same data is received. + writethrough_cache_enable - High-level + control of whether data sent to the server as a write request is + kept in the read cache and available for later reads, or if it is + discarded when the write completes. By default, writethrough + cache is enabled (writethrough_cache_enable=1) + for HDD OSDs and automatically disabled for flash OSDs + (nonrotational=1). + The write cache cannot be disabled for osd-zfs, + and as a result this parameter is unavailable for that backend. + + When the server receives write requests from a client, it + fetches data from the client into its memory and writes the data + to storage. If the writethrough cache is enabled for the target, + and the RPC and object size meet the other criterion below, + this data may stay in memory after the write request has + completed. If later read or partial-block write requests for this + same data are received, if the data is still in cache the server + skips reading it from storage. + If the writethrough cache is disabled - (writethrough_cache_enabled=0), the OSS discards the data after - the write request from the client is completed. For subsequent read requests, or - partial-page write requests, the OSS must re-read the data from disk. - Enabling writethrough cache is advisable if clients are doing small or unaligned - writes that would cause partial-page updates, or if the files written by one node are - immediately being accessed by other nodes. Some examples where enabling writethrough - cache might be useful include producer-consumer I/O models or shared-file writes with - a different node doing I/O not aligned on 4096-byte boundaries. - Disabling the writethrough cache is advisable when files are mostly written to the - file system but are not re-read within a short time period, or files are only written - and re-read by the same node, regardless of whether the I/O is aligned or not. 
- To disable the writethrough cache on all OSTs of an OSS, run: - root@oss1# lctl set_param obdfilter.*.writethrough_cache_enable=0 + (writethrough_cache_enabled=0), or the + write or object is large enough that it will not benefit from + caching, the server discards the data after the write request + from the client is completed. For subsequent read requests, or + partial-page write requests, the server must re-read the data + from storage. + Enabling writethrough cache is advisable if clients are doing + small or unaligned writes that would cause partial-page updates, + or if the files written by one node are immediately being read by + other nodes. Some examples where enabling writethrough cache + might be useful include producer-consumer I/O models or + shared-file writes that are not aligned on 4096-byte boundaries. + + Disabling the writethrough cache is advisable when files are + mostly written to the file system but are not re-read within a + short time period, or files are only written and re-read by the + same node, regardless of whether the I/O is aligned or not. + To disable writethrough cache on all targets on a server, run: + + + oss1# lctl set_param osd-*.*.writethrough_cache_enable=0 + To re-enable the writethrough cache on one OST, run: - root@oss1# lctl set_param obdfilter.{OST_name}.writethrough_cache_enable=1 + + oss1# lctl set_param osd-*.{OST_name}.writethrough_cache_enable=1 + To check if the writethrough cache is enabled, run: - root@oss1# lctl get_param obdfilter.*.writethrough_cache_enable + + oss1# lctl get_param osd-*.*.writethrough_cache_enable + - readcache_max_filesize - Controls the maximum size of a file - that both the read cache and writethrough cache will try to keep in memory. Files - larger than readcache_max_filesize will not be kept in cache for - either reads or writes. - Setting this tunable can be useful for workloads where relatively small files are - repeatedly accessed by many clients, such as job startup files, executables, log - files, etc., but large files are read or written only once. By not putting the larger - files into the cache, it is much more likely that more of the smaller files will - remain in cache for a longer time. - When setting readcache_max_filesize, the input value can be - specified in bytes, or can have a suffix to indicate other binary units such as - K (kilobytes), M (megabytes), - G (gigabytes), T (terabytes), or - P (petabytes). - To limit the maximum cached file size to 32 MB on all OSTs of an OSS, run: - root@oss1# lctl set_param obdfilter.*.readcache_max_filesize=32M - To disable the maximum cached file size on an OST, run: - root@oss1# lctl set_param obdfilter.{OST_name}.readcache_max_filesize=-1 - To check the current maximum cached file size on all OSTs of an OSS, run: - root@oss1# lctl get_param obdfilter.*.readcache_max_filesize + readcache_max_filesize - Controls the + maximum size of an object that both the read cache and + writethrough cache will try to keep in memory. Objects larger + than readcache_max_filesize will not be kept + in cache for either reads or writes regardless of the + read_cache_enable or + writethrough_cache_enable settings. + Setting this tunable can be useful for workloads where + relatively small objects are repeatedly accessed by many clients, + such as job startup objects, executables, log objects, etc., but + large objects are read or written only once. 
By not putting the + larger objects into the cache, it is much more likely that more + of the smaller objects will remain in cache for a longer time. + + When setting readcache_max_filesize, + the input value can be specified in bytes, or can have a suffix + to indicate other binary units such as + K (kibibytes), + M (mebibytes), + G (gibibytes), + T (tebibytes), or + P (pebibytes). + + To limit the maximum cached object size to 64 MiB on all OSTs of + a server, run: + + + oss1# lctl set_param osd-*.*.readcache_max_filesize=64M + + To disable the maximum cached object size on all targets, run: + + + oss1# lctl set_param osd-*.*.readcache_max_filesize=-1 + + + To check the current maximum cached object size on all targets of + a server, run: + + + oss1# lctl get_param osd-*.*.readcache_max_filesize + + + + readcache_max_io_mb - Controls the maximum + size of a single read IO that will be cached in memory. Reads + larger than readcache_max_io_mb will be read + directly from storage and bypass the page cache completely. + This avoids significant CPU overhead at high IO rates. + The read cache cannot be disabled for osd-zfs, + and as a result this parameter is unavailable for that backend. + + When setting readcache_max_io_mb, the + input value can be specified in mebibytes, or can have a suffix + to indicate other binary units such as + K (kibibytes), + M (mebibytes), + G (gibibytes), + T (tebibytes), or + P (pebibytes). + + + writethrough_max_io_mb - Controls the + maximum size of a single writes IO that will be cached in memory. + Writes larger than writethrough_max_io_mb will + be written directly to storage and bypass the page cache entirely. + This avoids significant CPU overhead at high IO rates. + The write cache cannot be disabled for osd-zfs, + and as a result this parameter is unavailable for that backend. + + When setting writethrough_max_io_mb, the + input value can be specified in mebibytes, or can have a suffix + to indicate other binary units such as + K (kibibytes), + M (mebibytes), + G (gibibytes), + T (tebibytes), or + P (pebibytes).
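As a sketch, the cache tunables described above can be inspected together and adjusted per target; the target name and the 8 MiB limit below are illustrative assumptions:
oss1# lctl get_param osd-*.*.read_cache_enable osd-*.*.readcache_max_io_mb
oss1# lctl set_param osd-ldiskfs.testfs-OST0000.readcache_max_io_mb=8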
@@ -1606,7 +1720,7 @@ obdfilter.lol-OST0001.sync_journal=0
$ lctl get_param obdfilter.*.sync_on_lock_cancel obdfilter.lol-OST0001.sync_on_lock_cancel=never
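A hedged example of changing this behavior on all OSTs of a server, assuming the documented sync_on_lock_cancel values of never, blocking, and always:
oss1# lctl set_param obdfilter.*.sync_on_lock_cancel=blocking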
-
+
<indexterm> <primary>proc</primary> @@ -2088,7 +2202,7 @@ nid refs state max rtr min tx min queue <literal>rtr </literal></para> </entry> <entry> - <para>Number of routing buffer credits.</para> + <para>Number of available routing buffer credits.</para> </entry> </row> <row> @@ -2106,7 +2220,7 @@ nid refs state max rtr min tx min queue <literal>tx </literal></para> </entry> <entry> - <para>Number of send credits.</para> + <para>Number of available send credits.</para> </entry> </row> <row> @@ -2130,27 +2244,41 @@ nid refs state max rtr min tx min queue </tbody> </tgroup> </informaltable> - <para>Credits are initialized to allow a certain number of operations (in the example - above the table, eight as shown in the <literal>max</literal> column. LNet keeps track - of the minimum number of credits ever seen over time showing the peak congestion that - has occurred during the time monitored. Fewer available credits indicates a more - congested resource. </para> - <para>The number of credits currently in flight (number of transmit credits) is shown in - the <literal>tx</literal> column. The maximum number of send credits available is shown - in the <literal>max</literal> column and never changes. The number of router buffers - available for consumption by a peer is shown in the <literal>rtr</literal> - column.</para> - <para>Therefore, <literal>rtr</literal> – <literal>tx</literal> is the number of transmits - in flight. Typically, <literal>rtr == max</literal>, although a configuration can be set - such that <literal>max >= rtr</literal>. The ratio of routing buffer credits to send - credits (<literal>rtr/tx</literal>) that is less than <literal>max</literal> indicates - operations are in progress. If the ratio <literal>rtr/tx</literal> is greater than - <literal>max</literal>, operations are blocking.</para> - <para>LNet also limits concurrent sends and number of router buffers allocated to a single - peer so that no peer can occupy all these resources.</para> + <para>Credits are initialized to allow a certain number of operations + (in the example above the table, eight as shown in the + <literal>max</literal> column. LNet keeps track of the minimum + number of credits ever seen over time showing the peak congestion + that has occurred during the time monitored. Fewer available credits + indicates a more congested resource. </para> + <para>The number of credits currently available is shown in the + <literal>tx</literal> column. The maximum number of send credits is + shown in the <literal>max</literal> column and never changes. The + number of currently active transmits can be derived by + <literal>(max - tx)</literal>, as long as + <literal>tx</literal> is greater than or equal to 0. Once + <literal>tx</literal> is less than 0, it indicates the number of + transmits on that peer which have been queued for lack of credits. + </para> + <para>The number of router buffer credits available for consumption + by a peer is shown in <literal>rtr</literal> column. The number of + routing credits can be configured separately at the LND level or at + the LNet level by using the <literal>peer_buffer_credits</literal> + module parameter for the appropriate module. If the routing credits + is not set explicitly, it'll default to the maximum transmit credits + defined by <literal>peer_credits</literal> module parameter. + Whenever a gateway routes a message from a peer, it decrements the + number of available routing credits for that peer. If that value + goes to zero, then messages will be queued. 
Negative values show the + number of queued message waiting to be routed. The number of + messages which are currently being routed from a peer can be derived + by <literal>(max_rtr_credits - rtr)</literal>.</para> + <para>LNet also limits concurrent sends and number of router buffers + allocated to a single peer so that no peer can occupy all resources. + </para> </listitem> <listitem> - <para><literal>nis</literal> - Shows the current queue health on this node.</para> + <para><literal>nis</literal> - Shows current queue health on the node. + </para> <para>Example:</para> <screen># lctl get_param nis nid refs peer max tx min @@ -2247,7 +2375,7 @@ nid refs peer max tx min </listitem> </itemizedlist></para> </section> - <section remap="h3" xml:id="dbdoclet.balancing_free_space"> + <section remap="h3" xml:id="balancing_free_space"> <title><indexterm> <primary>proc</primary> <secondary>free space</secondary> @@ -2288,8 +2416,9 @@ nid refs peer max tx min space is more than this. The default is 0.2% of total OST size.</para> </listitem> </itemizedlist> - <para>For more information about monitoring and managing free space, see <xref - xmlns:xlink="http://www.w3.org/1999/xlink" linkend="dbdoclet.50438209_10424"/>.</para> + <para>For more information about monitoring and managing free space, see + <xref xmlns:xlink="http://www.w3.org/1999/xlink" + linkend="file_striping.managing_free_space"/>.</para> </section> <section remap="h3"> <title><indexterm> @@ -2309,19 +2438,19 @@ nid refs peer max tx min <itemizedlist> <listitem> <para>To enable automatic LRU sizing, set the - <literal>lru_size</literal> parameter to 0. In this case, the - <literal>lru_size</literal> parameter shows the current number of locks + <literal>lru_size</literal> parameter to 0. In this case, the + <literal>lru_size</literal> parameter shows the current number of locks being used on the client. Dynamic LRU resizing is enabled by default. - </para> + </para> </listitem> <listitem> <para>To specify a maximum number of locks, set the - <literal>lru_size</literal> parameter to a value other than zero. - A good default value for compute nodes is around - <literal>100 * <replaceable>num_cpus</replaceable></literal>. + <literal>lru_size</literal> parameter to a value other than zero. + A good default value for compute nodes is around + <literal>100 * <replaceable>num_cpus</replaceable></literal>. 
It is recommended that you only set <literal>lru_size</literal> - to be signifivantly larger on a few login nodes where multiple - users access the file system interactively.</para> + to be signifivantly larger on a few login nodes where multiple + users access the file system interactively.</para> </listitem> </itemizedlist> <para>To clear the LRU on a single client, and, as a result, flush client @@ -2334,7 +2463,7 @@ nid refs peer max tx min <note> <para>The <literal>lru_size</literal> parameter can only be set temporarily using <literal>lctl set_param</literal>, it cannot be set - permanently.</para> + permanently.</para> </note> <para>To disable dynamic LRU resizing on the clients, run for example: </para> @@ -2362,7 +2491,7 @@ nid refs peer max tx min ldlm.namespaces.myth-MDT0000-mdc-ffff8804296c2800.lru_max_age=900000 </screen> </section> - <section xml:id="dbdoclet.50438271_87260"> + <section xml:id="tuning_setting_thread_count"> <title><indexterm> <primary>proc</primary> <secondary>thread counts</secondary> @@ -2460,15 +2589,18 @@ ldlm.namespaces.myth-MDT0000-mdc-ffff8804296c2800.lru_max_age=900000 <screen># lctl set_param <replaceable>service</replaceable>.threads_<replaceable>min|max|started=num</replaceable> </screen> </listitem> <listitem> - <para>To permanently set this tunable, run:</para> - <screen># lctl conf_param <replaceable>obdname|fsname.obdtype</replaceable>.threads_<replaceable>min|max|started</replaceable> </screen> - <para condition='l25'>For version 2.5 or later, run: - <screen># lctl set_param -P <replaceable>service</replaceable>.threads_<replaceable>min|max|started</replaceable></screen></para> + <para>To permanently set this tunable, run the following command on + the MGS: + <screen>mgs# lctl set_param -P <replaceable>service</replaceable>.threads_<replaceable>min|max|started</replaceable></screen></para> + <para condition='l25'>For Lustre 2.5 or earlier, run: + <screen>mgs# lctl conf_param <replaceable>obdname|fsname.obdtype</replaceable>.threads_<replaceable>min|max|started</replaceable></screen> + </para> </listitem> </itemizedlist> - <para>The following examples show how to set thread counts and get the number of running threads - for the service <literal>ost_io</literal> using the tunable - <literal><replaceable>service</replaceable>.threads_<replaceable>min|max|started</replaceable></literal>.</para> + <para>The following examples show how to set thread counts and get the + number of running threads for the service <literal>ost_io</literal> + using the tunable + <literal><replaceable>service</replaceable>.threads_<replaceable>min|max|started</replaceable></literal>.</para> <itemizedlist> <listitem> <para>To get the number of running threads, run:</para> @@ -2507,7 +2639,7 @@ ost.OSS.ost_io.threads_max=256</screen> </note> <para>See also <xref xmlns:xlink="http://www.w3.org/1999/xlink" linkend="lustretuning"/></para> </section> - <section xml:id="dbdoclet.50438271_83523"> + <section xml:id="enabling_interpreting_debugging_logs"> <title><indexterm> <primary>proc</primary> <secondary>debug</secondary> @@ -2608,8 +2740,8 @@ debug=neterror warning error emerg console</screen> <section> <title>Interpreting OST Statistics - See also (llobdstat) and - (collectl). + See also + (collectl). OST stats files can be used to provide statistics showing activity for each OST. For example: @@ -2868,8 +3000,8 @@ ost_write 21 2 59 [bytes] 7648424 15019 332725.08 910694 180397.87
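The per-OST counters shown above can be sampled and reset directly on the OSS; a sketch, where the target name is illustrative and clearing the counters by writing to the stats parameter is assumed from common lctl usage:
oss# lctl get_param obdfilter.testfs-OST0000.stats
oss# lctl set_param obdfilter.testfs-OST0000.stats=clear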
Interpreting MDT Statistics - See also (llobdstat) and - (collectl). + See also + (collectl). MDT stats files can be used to track MDT statistics for the MDS. The example below shows sample output from an @@ -2890,3 +3022,6 @@ notify 16 samples [reqs]
+
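Similarly, the MDT counters discussed in the last section can be sampled directly on the MDS; a sketch assuming the mdt.testfs-MDT0000.md_stats parameter name:
mds# lctl get_param mdt.testfs-MDT0000.md_stats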