From db4f659f94cdceabb259a0e6686072060a7fa1db Mon Sep 17 00:00:00 2001 From: Gregoire Pichon Date: Tue, 15 Sep 2015 16:48:31 +0200 Subject: [PATCH] LUDOC-304 tuning: support multiple modify RPCs in parallel Update the Lustre manual to reflect the support of multiple modify metadata RPCs in parallel, LU-5319: https://jira.hpdd.intel.com/browse/LU-5319. The following sections are added or modified: - Tuning the Client Metadata RPC Stream - Reply Reconstruction - lr_reader debugging utility Signed-off-by: Gregoire Pichon Change-Id: I3b4199424b2ca43b94e26ee715ae7cc9d5770080 Reviewed-on: http://review.whamcloud.com/16429 Reviewed-by: Richard Henwood Tested-by: Jenkins --- LustreProc.xml | 49 ++++++++++++++++++++++++++++++++++++++++ LustreRecovery.xml | 4 ++++ SystemConfigurationUtilities.xml | 2 +- 3 files changed, 54 insertions(+), 1 deletion(-) diff --git a/LustreProc.xml b/LustreProc.xml index d4ea8a1..efa0ec4 100644 --- a/LustreProc.xml +++ b/LustreProc.xml @@ -1495,6 +1495,55 @@ obdfilter.lol-OST0001.sync_journal=0 $ lctl get_param obdfilter.*.sync_on_lock_cancel obdfilter.lol-OST0001.sync_on_lock_cancel=never +
+ <indexterm><primary>proc</primary><secondary>client metadata performance</secondary></indexterm>Tuning the Client Metadata RPC Stream
+ The client metadata RPC stream represents the metadata RPCs issued in parallel by a client to an MDT target. The metadata RPCs fall into two categories: requests that do not modify the file system (such as the getattr operation), and requests that do modify the file system (such as the create, unlink, and setattr operations). To help optimize the client metadata RPC stream, several tuning variables are provided to adjust behavior according to network conditions and cluster size.
+ Note that increasing the number of metadata RPCs issued in parallel might improve the performance of metadata-intensive parallel applications, but as a consequence it will consume more memory on the client and on the MDS.
+
+ Configuring the Client Metadata RPC Stream
+ The MDC max_rpcs_in_flight parameter defines the maximum number of metadata RPCs, both modifying and non-modifying, that can be sent in parallel by a client to an MDT target. This includes all file system metadata operations, such as file or directory stat, creation, and unlink. The default setting is 8, the minimum setting is 1, and the maximum setting is 256.
+ To set the max_rpcs_in_flight parameter, run the following command on the Lustre client:
+ $ lctl set_param mdc.*.max_rpcs_in_flight=16
+ The MDC max_mod_rpcs_in_flight parameter defines the maximum number of file system modifying RPCs that can be sent in parallel by a client to an MDT target. For example, the Lustre client sends modify RPCs when it performs file or directory creation, unlink, access permission modification, or ownership modification. The default setting is 7, the minimum setting is 1, and the maximum setting is 256.
+ To set the max_mod_rpcs_in_flight parameter, run the following command on the Lustre client:
+ $ lctl set_param mdc.*.max_mod_rpcs_in_flight=12
+ The max_mod_rpcs_in_flight value must be strictly less than the max_rpcs_in_flight value. It must also be less than or equal to the MDT max_mod_rpcs_per_client value. If either of these conditions is not met, the setting fails and an explicit message is written to the Lustre log.
+ The MDT max_mod_rpcs_per_client parameter is a tunable of the mdt kernel module that defines the maximum number of file system modifying RPCs in flight allowed per client. The parameter can be updated at runtime, but the change takes effect for new client connections only. The default setting is 8.
+ To set the max_mod_rpcs_per_client parameter, run the following command on the MDS:
+ $ echo 12 > /sys/module/mdt/parameters/max_mod_rpcs_per_client
+
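The two conditions on max_mod_rpcs_in_flight can be checked before touching a client. The following is a minimal sketch in Python that only restates the rules described above; the function name check_mod_rpcs_setting and its return format are hypothetical and are not part of Lustre, which performs its own validation when the parameter is set.

```python
def check_mod_rpcs_setting(max_mod_rpcs_in_flight,
                           max_rpcs_in_flight,
                           max_mod_rpcs_per_client):
    """Return a list of violated rules; an empty list means the value is acceptable.

    Hypothetical helper: it mirrors the documented constraints only and
    does not query or modify any Lustre parameter.
    """
    problems = []
    # Documented range for the MDC parameter: minimum 1, maximum 256.
    if not 1 <= max_mod_rpcs_in_flight <= 256:
        problems.append("max_mod_rpcs_in_flight must be between 1 and 256")
    # Must be strictly less than max_rpcs_in_flight.
    if max_mod_rpcs_in_flight >= max_rpcs_in_flight:
        problems.append(
            "max_mod_rpcs_in_flight must be strictly less than max_rpcs_in_flight")
    # Must be less than or equal to the MDT max_mod_rpcs_per_client value.
    if max_mod_rpcs_in_flight > max_mod_rpcs_per_client:
        problems.append(
            "max_mod_rpcs_in_flight must not exceed max_mod_rpcs_per_client")
    return problems


# Values used in the commands above: 12 modify RPCs, 16 total, MDT limit 12.
print(check_mod_rpcs_setting(12, 16, 12))
# Same client values against the default MDT limit of 8.
print(check_mod_rpcs_setting(12, 16, 8))
```

With the default MDT limit of 8, a client value of 12 is rejected, which is why the max_mod_rpcs_per_client parameter has to be raised on the MDS first, and the client reconnected, before the larger client setting can succeed.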
+
+ Monitoring the Client Metadata RPC Stream
+ The rpc_stats file contains histogram data showing information about modify metadata RPCs. It can help identify the level of parallelism achieved by an application performing modify metadata operations.
+ Example:
+ $ lctl get_param mdc.testfs-MDT0000-mdt-ffff88077fb3a000.rpc_stats
+snapshot_time: 1441876896.567070 (secs.usecs)
+modify_RPCs_in_flight: 0
+
+                       modify
+rpcs in flight        rpcs   % cum %
+0:                       0   0   0
+1:                      56   0   0
+2:                      40   0   0
+3:                      70   0   0
+4:                      41   0   0
+5:                      51   0   1
+6:                      88   0   1
+7:                     366   1   2
+8:                    1321   5   8
+9:                    3624  15  23
+10:                   6482  27  50
+11:                   7321  30  81
+12:                   4540  18 100
+ The file information includes:
+
+ snapshot_time - UNIX epoch instant the file was read.
+ modify_RPCs_in_flight - Number of modify RPCs issued by the MDC, but not completed at the time of the snapshot. This value should always be less than or equal to max_mod_rpcs_in_flight.
+ rpcs in flight - Number of modify RPCs that are pending when an RPC is sent, the relative percentage (%) of total modify RPCs, and the cumulative percentage (cum %) to that point.
+
+ If a large proportion of modify metadata RPCs are issued while the number of pending modify RPCs is close to the max_mod_rpcs_in_flight value, the max_mod_rpcs_in_flight value could be increased to improve modify metadata performance.
+
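The "large proportion" criterion can be made concrete by computing, from the histogram, the share of modify RPCs that were sent while the number of pending RPCs was at or near the limit. The sketch below is in Python and assumes the text layout shown in the example output; the function names and the margin parameter are hypothetical, and no threshold is prescribed by the manual.

```python
SAMPLE = """\
snapshot_time: 1441876896.567070 (secs.usecs)
modify_RPCs_in_flight: 0

                       modify
rpcs in flight        rpcs   % cum %
0:                       0   0   0
1:                      56   0   0
2:                      40   0   0
3:                      70   0   0
4:                      41   0   0
5:                      51   0   1
6:                      88   0   1
7:                     366   1   2
8:                    1321   5   8
9:                    3624  15  23
10:                   6482  27  50
11:                   7321  30  81
12:                   4540  18 100
"""


def parse_modify_histogram(text):
    """Map 'rpcs in flight' to the modify RPC count for each histogram row."""
    hist = {}
    for line in text.splitlines():
        fields = line.split()
        # Histogram rows look like "<n>: <count> <pct> <cum pct>";
        # header lines such as "snapshot_time:" are skipped.
        if len(fields) >= 2 and fields[0].endswith(":") and fields[0][:-1].isdigit():
            hist[int(fields[0][:-1])] = int(fields[1])
    return hist


def share_near_limit(hist, limit, margin=1):
    """Fraction of RPCs sent with at least (limit - margin) RPCs in flight."""
    total = sum(hist.values())
    if total == 0:
        return 0.0
    near = sum(count for depth, count in hist.items() if depth >= limit - margin)
    return near / total


hist = parse_modify_histogram(SAMPLE)
# limit=12 matches the max_mod_rpcs_in_flight value set in the earlier example.
print(f"{share_near_limit(hist, limit=12):.1%} of modify RPCs were sent near the limit")
```

In the sample above, nearly half of the modify RPCs were sent with 11 or 12 RPCs in flight, which is the pattern that suggests a higher max_mod_rpcs_in_flight value might help.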
+
Configuring Timeouts in a Lustre File System diff --git a/LustreRecovery.xml b/LustreRecovery.xml index 36205b9..67a1832 100644 --- a/LustreRecovery.xml +++ b/LustreRecovery.xml @@ -286,6 +286,10 @@ The lock handle can be found by walking the list of granted locks for the resource looking for one with the appropriate remote file handle (present in the re-sent request). Verify that the lock has the right mode (determined by performing the disposition/request/status analysis above) and is granted to the proper client.
+
+ Multiple Reply Data per Client
+ Since Lustre 2.8, the MDS can save several reply data per client. The reply data are stored in the reply_data internal file of the MDT. In addition to the XID of the request, the transaction number, the result code, and the open "disposition", the reply data contains a generation number that identifies the client using the content of the last_rcvd file.
+
<indexterm><primary>Version-based recovery (VBR)</primary></indexterm>Version-based Recovery diff --git a/SystemConfigurationUtilities.xml b/SystemConfigurationUtilities.xml index bd11b10..f4a21bb 100644 --- a/SystemConfigurationUtilities.xml +++ b/SystemConfigurationUtilities.xml @@ -2723,7 +2723,7 @@ EOF
<indexterm><primary>lr_reader</primary></indexterm> lr_reader - The lr_reader utility translates a last received (last_rcvd) file into human-readable form. + The lr_reader utility translates the content of the last_rcvd and reply_data files into human-readable form. The following utilities are part of the Lustre I/O kit. For more information, see .
-- 1.8.3.1