From: Richard Henwood
Date: Wed, 18 May 2011 15:55:45 +0000 (-0500)
Subject: FIX: xrefs and tidying
X-Git-Tag: workingxslt~16
X-Git-Url: https://git.whamcloud.com/?a=commitdiff_plain;h=38099eb9f4935b720f9159849030ab0686a91441;p=doc%2Fmanual.git

FIX: xrefs and tidying
---

diff --git a/LNETSelfTest.xml b/LNETSelfTest.xml
index 6ab9f04..9db6b05 100644
--- a/LNETSelfTest.xml
+++ b/LNETSelfTest.xml
@@ -1,132 +1,89 @@
- + - Testing Lustre Network Performance (LNET Self-Test) + Testing Lustre Network Performance (LNET Self-Test) This chapter describes the LNET self-test, which is used by site administrators to confirm that Lustre Networking (LNET) has been properly installed and configured, and that underlying network software and hardware are performing according to expectations. The chapter includes: - LNET Self-Test Overview - Using LNET Self-Test - LNET Self-Test Command Reference - -
- <anchor xml:id="dbdoclet.50438223_pgfId-1295792" xreflabel=""/> -
- 23.1 <anchor xml:id="dbdoclet.50438223_91742" xreflabel=""/>LNET Self-Test Overview + + + + + + + + + +
+ 23.1 LNET Self-Test Overview LNET self-test is a kernel module that runs over LNET and the Lustre network drivers (LNDs). It is designed to: Test the connection ability of the Lustre network - - - + Run regression tests of the Lustre network - - - + Test performance of the Lustre network - - - + - After you have obtained performance results for your Lustre network, refer to Chapter 25: Lustre Tuning for information about parameters that can be used to tune LNET for optimum performance. - - - - - - Note -Apart from the performance impact, LNET self-test is invisible to Lustre. - - - - +After you have obtained performance results for your Lustre network, refer to for information about parameters that can be used to tune LNET for optimum performance. + Apart from the performance impact, LNET self-test is invisible to Lustre. + An LNET self-test cluster includes two types of nodes: Console node - A node used to control and monitor an LNET self-test cluster. The console node serves as the user interface of the LNET self-test system and can be any node in the test cluster. All self-test commands are entered from the console node. From the console node, a user can control and monitor the status of the entire LNET self-test cluster (session). The console node is exclusive in that a user cannot control two different sessions from one console node. - - - + Test nodes - The nodes on which the tests are run. Test nodes are controlled by the user from the console node; the user does not need to log into them directly. - - - + LNET self-test has two user utilities: lst - The user interface for the self-test console (run on the console node). It provides a list of commands to control the entire test system, including commands to create a session, create test groups, etc. - - - + lstclient - The userspace LNET self-test program (run on a test node). The lstclient utility is linked with userspace LNDs and LNET.
This utility is not needed if only kernel space LNET and LNDs are used. - - - + - - - - - - Note -Test nodes can be in either kernel or userspace. A console user can invite a kernel test node to join the test session by running lst add_group NID, but the console user cannot actively add a userspace test node to the test-session. However, the console user can passively accept a test node to the test session while the test node is running lstclient to connect to the console. - - - - + Test nodes can be in either kernel or userspace. A console user can invite a kernel test node to join the test session by running lst add_group NID, but the console user cannot actively add a userspace test node to the test-session. However, the console user can passively accept a test node to the test session while the test node is running lstclient to connect to the console. +
<anchor xml:id="dbdoclet.50438223_pgfId-1300634" xreflabel=""/>23.1.1 Prerequisites To run LNET self-test, these modules must be loaded: libcfs - - - + net - - - + lnet_selftest - - - + One of the klnds (i.e, ksocklnd, ko2iblnd...) as needed by your network configuration - - - + To load the required modules, run: modprobe lnet_selftest This command recursively loads the modules on which LNET self-test depends. - - - - - - Note -While the console and test nodes require all the prerequisite modules to be loaded, userspace test nodes do not require these modules. - - - - + While the console and test nodes require all the prerequisite modules to be loaded, userspace test nodes do not require these modules. +
-
- 23.2 <anchor xml:id="dbdoclet.50438223_48138" xreflabel=""/>Using LNET Self-Test +
+ 23.2 Using LNET Self-Test This section describes how to create and run an LNET self-test. The examples shown are for a test that simulates the traffic pattern of a set of Lustre servers on a TCP network accessed by Lustre clients on an InfiniBand network connected via LNET routers. In this example, half the clients are reading and half the clients are writing.
<anchor xml:id="dbdoclet.50438223_pgfId-1300917" xreflabel=""/>23.2.1 Creating a Session @@ -149,34 +106,19 @@ These three groups include: - Nodes that will function as “servers” to be accessed by “clients” during the LNET self-test session - - - - - - Nodes that will function as “clients” that will simulate reading data from the “servers” - - - + Nodes that will function as 'servers' to be accessed by 'clients' during the LNET self-test session + - Nodes that will function as “clients” that will simulate writing data to the “servers” + Nodes that will function as 'clients' that will simulate reading data from the 'servers' + - + Nodes that will function as 'clients' that will simulate writing data to the 'servers' + - - - - - - Note -A console user can associate kernel space test nodes with the session by running lst add_group NIDs, but a userspace test node cannot be actively added to the session. However, the console user can passively "accept" a test node to associate with a test session while the test node running lstclient connects to the console node (i.e., lstclient --sesid CONSOLE_NID --group NAME). - - - - + A console user can associate kernel space test nodes with the session by running lst add_group NIDs, but a userspace test node cannot be actively added to the session. However, the console user can passively "accept" a test node to associate with a test session while the test node running lstclient connects to the console node (i.e., lstclient --sesid CONSOLE_NID --group NAME).
<anchor xml:id="dbdoclet.50438223_pgfId-1296646" xreflabel=""/>23.2.3 <anchor xml:id="dbdoclet.50438223_42848" xreflabel=""/>Defining and Running the Tests @@ -188,15 +130,11 @@ ping - A ping generates a short request message, which results in a short response. Pings are useful to determine latency and small message overhead and to simulate Lustre metadata traffic. + - - - - brw - In a brw (“bulk read write”) test, data is transferred from the target to the source (brwread) or data is transferred from the source to the target (brwwrite). The size of the bulk transfer is set using the size parameter. A brw test is useful to determine network bandwidth and to simulate Lustre I/O traffic. - - - + brw - In a brw ('bulk read write') test, data is transferred from the target to the source (brwread) or data is transferred from the source to the target (brwwrite). The size of the bulk transfer is set using the size parameter. A brw test is useful to determine network bandwidth and to simulate Lustre I/O traffic. + In the example below, a batch is created called bulk_rw. Then two brw tests are added. In the first test, 1M of data is sent from the servers to the clients as a simulated read operation with a simple data validation check. In the second test, 4K of data is sent from the clients to the servers as a simulated write operation with a full data validation check. lst add_batch bulk_rw @@ -205,7 +143,7 @@ lst add_test --batch bulk_rw --from writers --to servers \ brw write check=full size=4K - The traffic pattern and test intensity is determined by several properties such as test type, distribution of test nodes, concurrency of test, and RDMA operation type. For more details, see Batch and Test Commands. +The traffic pattern and test intensity is determined by several properties such as test type, distribution of test nodes, concurrency of test, and RDMA operation type. For more details, see .
<anchor xml:id="dbdoclet.50438223_pgfId-1290855" xreflabel=""/>23.2.4 Sample Script @@ -229,20 +167,12 @@ # tear down lst end_session - - - - - - Note -This script can be easily adapted to pass the group NIDs by shell variables or command line arguments (making it good for general-purpose use). - - - - + This script can be easily adapted to pass the group NIDs by shell variables or command line arguments (making it good for general-purpose use). +
-
- 23.3 <anchor xml:id="dbdoclet.50438223_27277" xreflabel=""/>LNET Self-Test <anchor xml:id="dbdoclet.50438223_marker-1298562" xreflabel=""/>Command Reference +
+ 23.3 LNET Self-Test <anchor xml:id="dbdoclet.50438223_marker-1298562" xreflabel=""/>Command Reference The LNET self-test (lst) utility is used to issue LNET self-test commands. The lst utility takes a number of command line arguments. The first argument is the command name and subsequent arguments are command-specific.
<anchor xml:id="dbdoclet.50438223_pgfId-1290916" xreflabel=""/>23.3.1 <anchor xml:id="dbdoclet.50438223_91247" xreflabel=""/>Session Commands @@ -271,7 +201,7 @@ --force - Ends conflicting sessions. This determines who “wins” when one session conflicts with another. For example, if there is already an active session on this node, then the attempt to create a new session fails unless the -force flag is specified. If the -force flag is specified, then the active session is ended. Similarly, if a session attempts to add a node that is already “owned” by another session, the -force flag allows this session to “steal” the node. + Ends conflicting sessions. This determines who 'wins' when one session conflicts with another. For example, if there is already an active session on this node, then the attempt to create a new session fails unless the -force flag is specified. If the -force flag is specified, then the active session is ended. Similarly, if a session attempts to add a node that is already 'owned' by another session, the -force flag allows this session to 'steal' the node. <name> @@ -284,7 +214,7 @@ $ lst new_session --force read_write end_session - Stops all operations and tests in the current session and clears the session’s status. + Stops all operations and tests in the current session and clears the session's status. $ lst end_session show_session @@ -323,7 +253,7 @@ $ lst add_group servers 192.168.10.[35,40-45]@tcp$ lst add_group clients 192.168.1.[10-100]@tcp 192.168.[2,4].\[10-20]@tcp update_group<name>[--refresh] [--clean<status>] [--remove<NIDs>] - Updates the state of nodes in a group or adjusts a group’s membership. This command is useful if some nodes have crashed and should be excluded from the group. + Updates the state of nodes in a group or adjusts a group's membership. This command is useful if some nodes have crashed and should be excluded from the group. @@ -362,7 +292,7 @@ unknown - The node’s status has yet to be determined.
+ The node's status has yet to be determined. @@ -442,7 +372,7 @@ --sesid<NID> - The first console’s NID. + The first console's NID. --group<name> @@ -456,7 +386,8 @@ Example: - Console $ lst new_session testsessionClient1 $ lstclient --sesid 192.168.1.52@tcp --group clients + Console $ lst new_session testsession +Client1 $ lstclient --sesid 192.168.1.52@tcp --group clients Example: Client1 $ lstclient --sesid 192.168.1.52@tcp |--group clients --server_mode @@ -535,26 +466,35 @@ Examples showing use of the distribute parameter: - Clients: (C1, C2, C3, C4, C5, C6)Server: (S1, S2, S3)--distribute 1:1 (C1->S1), (C2->S2), (C3->S3), (C4->S1), (C5->S2),\(C6->S3) /* -> means test conversation */ --distribute 2:1 (C1,C2->S1), (C3,C\ -4->S2), (C5,C6->S3)--distribute 3:1 (C1,C2,C3->S1), (C4,C5,C6->S2), (NULL->S3)--distribute 3:2 (C1,C2,C3->S1,S2), (C4,C5,C6->S3,S1)--distribute 4:1 (C1,C2,C3,C4->S1), (C5,C6->S2), (NULL->S3)--distribute 4:2 (C1,C2,C3,C4->S1,S2), (C5, C6->S3, S1)--distribute 6:3 (C1,C2,C3,C4,C5,C6->S1,S2,S3) + +Clients: (C1, C2, C3, C4, C5, C6) +Server: (S1, S2, S3) +--distribute 1:1 (C1->S1), (C2->S2), (C3->S3), (C4->S1), (C5->S2), +\(C6->S3) /* -> means test conversation */ --distribute 2:1 (C1,C2->S1), (C3,C4->S2), (C5,C6->S3) +--distribute 3:1 (C1,C2,C3->S1), (C4,C5,C6->S2), (NULL->S3) +--distribute 3:2 (C1,C2,C3->S1,S2), (C4,C5,C6->S3,S1) +--distribute 4:1 (C1,C2,C3,C4->S1), (C5,C6->S2), (NULL->S3) +--distribute 4:2 (C1,C2,C3,C4->S1,S2), (C5, C6->S3, S1) +--distribute 6:3 (C1,C2,C3,C4,C5,C6->S1,S2,S3) The setting --distribute 1:1 is the default setting where each source node communicates with one target node. When the setting --distribute 1:<n> (where <n> is the size of the target group) is used, each source node communicates with every node in the target group. Note that if there are more source nodes than target nodes, some source nodes may share the same target nodes.
Also, if there are more target nodes than source nodes, some higher-ranked target nodes will be idle. Example showing a brw test: - $ lst add_group clients 192.168.1.[10-17]@tcp$ lst add_group servers 192.168.10.[100-103]@tcp$ lst add_batch bulkperf$ lst add_test --batch bulkperf --loop 100 --concurrency 4 \--distribute 4:2 --from clients brw WRITE size=16K + +$ lst add_group clients 192.168.1.[10-17]@tcp +$ lst add_group servers 192.168.10.[100-103]@tcp +$ lst add_batch bulkperf +$ lst add_test --batch bulkperf --loop 100 --concurrency 4 \ +--distribute 4:2 --from clients brw WRITE size=16K In the example above, a batch test called bulkperf is created that will do a 16 kbyte bulk write request. In this test, two groups of four clients (sources) each write to two of the four servers (targets) as shown below: 192.168.1.[10-13] will write to 192.168.10.[100,101] - - - + 192.168.1.[14-17] will write to 192.168.10.[102,103] - - - + list_batch [<name>] [--test<index>] [--active] [--invalid] [--server | client] @@ -647,7 +587,7 @@ <anchor xml:id="dbdoclet.50438223_pgfId-1291130" xreflabel=""/>23.3.4 Other Commands This section describes other lst commands. ping [-session] [--group<name>] [--nodes<NIDs>] [--batch<name>] [--server] [--timeout<seconds>] - Sends a “hello” query to the nodes. + Sends a 'hello' query to the nodes. @@ -778,5 +718,4 @@ se options. $ lst show_error clientsclients12345-192.168.1.15@tcp: [Session: 1 brw errors, 0 ping errors] \[RPC: 20 errors, 0 dropped,12345-192.168.1.16@tcp: [Session: 0 brw errors, 0 ping errors] \[RPC: 1 errors, 0 dropped, Total 2 error nodes in clients$ lst show_error --session clientsclients12345-192.168.1.15@tcp: [Session: 1 brw errors, 0 ping errors]Total 1 error nodes in clients
-
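Reviewer note (not part of the patch): the --distribute examples touched by this commit follow a pattern that can be sketched in a few lines of Python. This is only an illustration inferred from the C1..C6 / S1..S3 listing in the hunk above. The function name `distribute` and its parameters are hypothetical and are not taken from the LNET self-test source, so treat it as a reading aid rather than a description of how lst actually schedules test conversations.

```python
def distribute(sources, targets, src_per_group, tgt_per_group):
    """Pair consecutive groups of sources with consecutive groups of targets.

    Hypothetical helper inferred from the --distribute examples in this
    patch; it is not taken from the LNET self-test implementation.
    """
    pairs = []
    for group_index, start in enumerate(range(0, len(sources), src_per_group)):
        src_group = sources[start:start + src_per_group]
        first_target = group_index * tgt_per_group
        # Wrap around the target list when a group runs past its end.
        tgt_group = [targets[(first_target + k) % len(targets)]
                     for k in range(tgt_per_group)]
        pairs.append((src_group, tgt_group))
    return pairs


clients = ["C1", "C2", "C3", "C4", "C5", "C6"]
servers = ["S1", "S2", "S3"]

# Print one line per ratio used in the manual's example listing.
for ratio in ("1:1", "2:1", "3:1", "3:2", "4:1", "4:2", "6:3"):
    a, b = (int(n) for n in ratio.split(":"))
    mapping = ", ".join(
        "({}->{})".format(",".join(src), ",".join(tgt))
        for src, tgt in distribute(clients, servers, a, b)
    )
    print("--distribute", ratio, mapping)
```

Targets that no source group selects (S3 under 3:1 and 4:1) simply do not appear in the output of this sketch; the manual's listing shows them as (NULL->S3).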