From: Richard Henwood Date: Tue, 17 May 2011 20:58:48 +0000 (-0500) Subject: FIX: xrefs X-Git-Tag: workingxslt~30 X-Git-Url: https://git.whamcloud.com/?a=commitdiff_plain;h=ab64b1c6fc1d5b96c8dd2bf4cbc044e6a40b67cd;p=doc%2Fmanual.git FIX: xrefs --- diff --git a/ConfiguringLNET.xml b/ConfiguringLNET.xml index 4e28fa2..18e924c 100644 --- a/ConfiguringLNET.xml +++ b/ConfiguringLNET.xml @@ -1,66 +1,48 @@ - + Configuring Lustre Networking (LNET) + This chapter describes how to configure Lustre Networking (LNET). It includes the following sections: + + - Overview of LNET Module Parameters - - - - - - Setting the LNET Module networks Parameter - - - - - - Setting the LNET Module ip2nets Parameter - - - - - - Setting the LNET Module routes Parameter - - - - - - Testing the LNET Configuration - - - - - - Configuring the Router Checker - - - - - - Best Practices for LNET Options - - - - - - - - - - - Note -Configuring LNET is optional. LNET will, by default, use the first TCP/IP interface it discovers on a system (eth0). If this network configuration is sufficient, you do not need to configure LNET. LNET configuration is required if you are using Infiniband or multiple Ethernet interfaces. - - - - -
- <anchor xml:id="dbdoclet.50438216_pgfId-1304719" xreflabel=""/> -
- 9.1 <anchor xml:id="dbdoclet.50438216_33148" xreflabel=""/>Overview of LNET Module Parameters + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Configuring LNET is optional. LNET will, by default, use the first TCP/IP interface it discovers on a system (eth0). If this network configuration is sufficient, you do not need to configure LNET. LNET configuration is required if you are using Infiniband or multiple Ethernet interfaces. + + +
+ 9.1 Overview of LNET Module Parameters LNET kernel module (lnet) parameters specify how LNET is to be configured to work with Lustre, including which NICs will be configured to work with Lustre and the routing to be used with Lustre. Parameters for lnet are specified in the modprobe.conf or modules.conf file (depending on your Linux distribution) in one or more entries with the syntax: options lnet <parameter>=<parameter value> @@ -70,37 +52,23 @@ networks - Specifies the networks to be used. - - - ip2nets - Lists globally-available networks, each with a range of IP addresses. LNET then identifies locally-available networks through address list-matching lookup. - - - - See Setting the LNET Module networks Parameter and Setting the LNET Module ip2nets Parameter for more details. +See and Setting the LNET Module ip2nets Parameter for more details. To set up routing between networks, use: routes - Lists networks and the NIDs of routers that forward to them. - - - - See Setting the LNET Module routes Parameter for more details. - A router checker can be configured to enable Lustre nodes to detect router health status, avoid routers that appear dead, and reuse those that restore service after failures. See Configuring the Router Checker for more details. - For a complete reference to the LNET module parameters, see LNET Options. - - - - - - Note -We recommend that you use “dotted-quad” notation for IP addresses rather than host names to make it easier to read debug logs and debug configurations with multiple interfaces. - - - - +See for more details. + A router checker can be configured to enable Lustre nodes to detect router health status, avoid routers that appear dead, and reuse those that restore service after failures. See for more details. + For a complete reference to the LNET module parameters, see LNET Options. 
+ + +We recommend that you use 'dotted-quad' notation for IP addresses rather than host names to make it easier to read debug logs and debug configurations with multiple interfaces. + +
<anchor xml:id="dbdoclet.50438216_pgfId-1304745" xreflabel=""/>9.1.1 Using a Lustre Network Identifier (NID) to Identify a Node A Lustre network identifier (NID) is used to uniquely identify a Lustre network endpoint by node ID and network type. The format of the NID is: @@ -116,9 +84,9 @@ lctl which_nid <MDS NID>
-
- 9.2 <anchor xml:id="dbdoclet.50438216_46279" xreflabel=""/>Setting the LNET Module networks Parameter - If a node has more than one network interface, you’ll typically want to dedicate a specific interface to Lustre. You can do this by including an entry in the modprobe.conf file on the node that sets the LNET module networks parameter: +
+ 9.2 Setting the LNET Module networks Parameter + If a node has more than one network interface, you'll typically want to dedicate a specific interface to Lustre. You can do this by including an entry in the modprobe.conf file on the node that sets the LNET module networks parameter: options lnet networks=<comma-separated list of networks> This example specifies that a Lustre node will use a TCP/IP interface and an InfiniBand interface: @@ -131,16 +99,11 @@ options lnet networks=tcp0(eth2),tcp1(eth3) When more than one interface is available during the network setup, Lustre chooses the best route based on the hop count. Once the network connection is established, Lustre expects the network to stay connected. In a Lustre network, connections do not fail over to another interface, even if multiple interfaces are available on the same node. - - - - - - Note -LNET lines in modprobe.conf are only used by the local node to determine what to call its interfaces. They are not used for routing decisions. - - - - + + +LNET lines in modprobe.conf are only used by the local node to determine what to call its interfaces. They are not used for routing decisions. + +
<anchor xml:id="dbdoclet.50438216_pgfId-1304771" xreflabel=""/>9.2.1 <anchor xml:id="dbdoclet.50438216_74334" xreflabel=""/>Multihome Server Example If a server with multiple IP addresses (multihome server) is connected to a Lustre network, certain configuration setting are required. An example illustrating these setting consists of a network with the following nodes: @@ -148,34 +111,19 @@ Server svr1 with three TCP NICs (eth0, eth1, and eth2) and an InfiniBand NIC. - - - Server svr2 with three TCP NICs (eth0, eth1, and eth2) and an InfiniBand NIC. Interface eth2 will not be used for Lustre networking. - - - TCP clients, each with a single TCP interface. - - - InfiniBand clients, each with a single Infiniband interface and a TCP/IP interface for administration. - - - To set the networks option for this example: On each server, svr1 and svr2, include the following line in the modprobe.conf file: - - - options lnet networks=tcp0(eth0),tcp1(eth1),o2ib @@ -183,41 +131,23 @@ For TCP-only clients, the first available non-loopback IP interface is used for tcp0. Thus, TCP clients with only one interface do not need to have options defined in the modprobe.conf file. - - - On the InfiniBand clients, include the following line in the modprobe.conf file: - - - options lnet networks=o2ib - - - - - - Note -By default, Lustre ignores the loopback (lo0) interface. Lustre does not ignore IP addresses aliased to the loopback. If you alias IP addresses to the loopback interface, you must specify all Lustre networks using the LNET networks parameter. - - - - - - - - - - Note -If the server has multiple interfaces on the same subnet, the Linux kernel will send all traffic using the first configured interface. This is a limitation of Linux, not Lustre. In this case, network interface bonding should be used. For more information about network interface bonding, see Chapter 7: Setting Up Network Interface Bonding. - - - - + + + By default, Lustre ignores the loopback (lo0) interface. 
Lustre does not ignore IP addresses aliased to the loopback. If you alias IP addresses to the loopback interface, you must specify all Lustre networks using the LNET networks parameter. + + + If the server has multiple interfaces on the same subnet, the Linux kernel will send all traffic using the first configured interface. This is a limitation of Linux, not Lustre. In this case, network interface bonding should be used. For more information about network interface bonding, see . + +
-
- 9.3 <anchor xml:id="dbdoclet.50438216_31414" xreflabel=""/>Setting the LNET Module ip2nets Parameter +
+ 9.3 Setting the LNET Module ip2nets Parameter The ip2nets option is typically used when a single, universal modprobe.conf file is run on all servers and clients. Each node identifies the locally available networks based on the listed IP address patterns that match the node's local IP addresses. Note that the IP address patterns listed in the ip2nets option are only used to identify the networks that an individual node should instantiate. They are not used by LNET for any other communications purpose. For the example below, the nodes in the network have these IP addresses: @@ -225,38 +155,26 @@ Server svr1: eth0 IP address 192.168.0.2, IP over Infiniband (o2ib) address 132.6.1.2. - - - Server svr2: eth0 IP address 192.168.0.4, IP over Infiniband (o2ib) address 132.6.1.4. - - - TCP clients have IP addresses 192.168.0.5-255. - - - Infiniband clients have IP over Infiniband (o2ib) addresses 132.6.[2-3].2, .4, .6, .8. - - - The following entry is placed in the modprobe.conf file on each server and client: options lnet 'ip2nets="tcp0(eth0) 192.168.0.[2,4]; \ tcp0 192.168.0.*; o2ib0 132.6.[1-3].[2-8/2]"' - Each entry in ip2nets is referred to as a “rule”. + Each entry in ip2nets is referred to as a 'rule'. The order of LNET entries is important when configuring servers. If a server node can be reached using more than one network, the first network specified in modprobe.conf will be used. Because svr1 and svr2 match the first rule, LNET uses eth0 for tcp0 on those machines. (Although svr1 and svr2 also match the second rule, the first matching rule for a particular network is used). The [2-8/2] format indicates a range of 2-8 stepped by 2; that is 2,4,6,8. Thus, the clients at 132.6.3.5 will not find a matching o2ib network.
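The rule-matching behavior described in section 9.3 can be illustrated with a short Python sketch. This is a simplified model written for this note, not LNET's actual parser: the function names (expand_octet, ip_matches, networks_for) are hypothetical, and it assumes only the pattern forms used in the example entry (a literal octet, `*`, and bracketed lists or ranges with an optional `/step`).

```python
# Illustrative sketch only: a simplified model of ip2nets rule matching.
# This is NOT LNET's parser; all function names here are hypothetical.

# The three rules from the example ip2nets entry, in order.
RULES = [
    ("tcp0(eth0)", "192.168.0.[2,4]"),
    ("tcp0", "192.168.0.*"),
    ("o2ib0", "132.6.[1-3].[2-8/2]"),
]


def expand_octet(pattern):
    """Return the set of octet values matched by one octet pattern."""
    if pattern == "*":
        return set(range(256))
    if pattern.startswith("[") and pattern.endswith("]"):
        values = set()
        for item in pattern[1:-1].split(","):
            rng, _, step = item.partition("/")   # "2-8/2" -> "2-8", "2"
            lo, _, hi = rng.partition("-")       # "2-8"   -> "2", "8"
            hi = hi or lo                        # single value such as "4"
            values.update(range(int(lo), int(hi) + 1, int(step or 1)))
        return values
    return {int(pattern)}


def ip_matches(ip, pattern):
    """True if every octet of ip is matched by the corresponding pattern octet."""
    octets, parts = ip.split("."), pattern.split(".")
    if len(octets) != 4 or len(parts) != 4:
        return False
    return all(int(o) in expand_octet(p) for o, p in zip(octets, parts))


def networks_for(local_ips):
    """Return the network specs a node instantiates: first matching rule per network."""
    chosen = {}
    for net_spec, pattern in RULES:
        net = net_spec.split("(")[0]             # "tcp0(eth0)" -> "tcp0"
        if net in chosen:
            continue                             # an earlier rule already won
        if any(ip_matches(ip, pattern) for ip in local_ips):
            chosen[net] = net_spec
    return list(chosen.values())
```

Under these assumptions, networks_for(["192.168.0.2", "132.6.1.2"]) models svr1 and selects the first tcp0 rule (with eth0) plus o2ib0, while networks_for(["132.6.3.5"]) returns an empty list because 5 is not in the stepped range 2,4,6,8.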
-
- 9.4 <anchor xml:id="dbdoclet.50438216_71227" xreflabel=""/>Setting the LNET Module routes Parameter +
+ 9.4 Setting the LNET Module routes Parameter The LNET module routes parameter is used to identify routers in a Lustre configuration. These parameters are set in modprob.conf on each Lustre node. The LNET routes parameter specifies a colon-separated list of router definitions. Each route is defined as a network number, followed by a list of routers: routes=<net type> <router NID(s)> @@ -285,12 +203,12 @@
-
- 9.5 <anchor xml:id="dbdoclet.50438216_10523" xreflabel=""/>Testing the LNET Configuration - After configuring Lustre Networking, it is highly recommended that you test your LNET configuration using the LNET Self-Test provided with the Lustre software. For more information about using LNET Self-Test, see Chapter 23: Testing Lustre Network Performance (LNET Self-Test). +
+ 9.5 Testing the LNET Configuration + After configuring Lustre Networking, it is highly recommended that you test your LNET configuration using the LNET Self-Test provided with the Lustre software. For more information about using LNET Self-Test, see .
-
- 9.6 <anchor xml:id="dbdoclet.50438216_35668" xreflabel=""/>Configuring the Router Checker +
+ 9.6 Configuring the Router Checker In a Lustre configuration in which different types of networks, such as a TCP/IP network and an Infiniband network, are connected by routers, a router checker can be run on the clients and servers in the routed configuration to monitor the status of the routers. In a multi-hop routing configuration, router checkers can be configured on routers to monitor the health of their next-hop routers. A router checker is configured by setting lnet parameters in modprobe.conf by including an entry in this form: options lnet <router checker parameter>=<parameter value> @@ -299,55 +217,36 @@ live_router_check_interval - Specifies a time interval in seconds after which the router checker will ping the live routers. The default value is 0, meaning no checking is done. To set the value to 60, enter: - - - options lnet live_router_check_interval=60 dead_router_check_interval - Specifies a time interval in seconds after which the router checker will check for dead routers. The default value is 0, meaning no checking is done. To set the value to 60, enter: - - - options lnet dead_router_check_interval=60 auto_down - Enables/disables (1/0) the automatic marking of router state as up or down. The default value is 1. To disable router marking, enter: - - - options lnet auto_down=0 router_ping_timeout - Specifies a timeout for the router checker when it checks live or dead routers. The router checker sends a ping message to each dead or live router once every dead_router_check_interval or live_router_check_interval respectively. The default value is 50. To set the value to 60, enter: - - - options lnet router_ping_timeout=60 - - - - - - Note -The router_ping_timeout is consistent with the default LND timeouts. You may have to increase it on very large clusters if the LND timeout is also increased. For larger clusters, we suggest increasing the check interval. - - - - + + +The router_ping_timeout is consistent with the default LND timeouts. 
You may have to increase it on very large clusters if the LND timeout is also increased. For larger clusters, we suggest increasing the check interval. + + + check_routers_before_use - Specifies that routers are to be checked before use. Set to off by default. If this parameter is set to on, the dead_router_check_interval parameter must be given a positive integer value. - - - options lnet check_routers_before_use=on @@ -356,29 +255,23 @@ Time the router was disabled - - - Elapsed disable time - - - If the router checker does not get a reply message from the router within router_ping_timeout seconds, it considers the router to be down. If a router is marked “up” and responds to a ping, the timeout is reset. If 100 packets have been sent successfully through a router, the sent-packets counter for that router will have a value of 100.
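As a rough illustration of the router checker parameters listed in section 9.6, the Python sketch below assembles an `options lnet` entry and enforces the one constraint the text states: check_routers_before_use=on requires a positive dead_router_check_interval. The helper name router_checker_options is hypothetical, the defaults are the ones quoted in the section, and placing several parameters on a single options line is an assumption of this sketch.

```python
# Illustrative sketch only; router_checker_options is a hypothetical helper.
# Defaults are the values stated in section 9.6.
DEFAULTS = {
    "live_router_check_interval": 0,
    "dead_router_check_interval": 0,
    "auto_down": 1,
    "router_ping_timeout": 50,
    "check_routers_before_use": "off",
}


def router_checker_options(**overrides):
    """Build an 'options lnet ...' entry from the overridden parameters."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown router checker parameter(s): {sorted(unknown)}")

    params = {**DEFAULTS, **overrides}
    # Constraint from the text: checking routers before use needs a
    # positive dead_router_check_interval.
    if (params["check_routers_before_use"] == "on"
            and int(params["dead_router_check_interval"]) <= 0):
        raise ValueError(
            "check_routers_before_use=on requires dead_router_check_interval > 0"
        )

    if not overrides:
        return ""  # nothing to write; the defaults apply
    # Assumption: several parameters may share one options line.
    return "options lnet " + " ".join(f"{k}={v}" for k, v in overrides.items())
```

For example, router_checker_options(check_routers_before_use="on", dead_router_check_interval=60) yields a single entry carrying both settings, while passing check_routers_before_use="on" alone raises ValueError.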
-
- 9.7 <anchor xml:id="dbdoclet.50438216_15200" xreflabel=""/>Best Practices for LNET Options +
+ 9.7 Best Practices for LNET Options For the networks, ip2nets, and routes options, follow these best practices to avoid configuration errors.
<anchor xml:id="dbdoclet.50438216_pgfId-1304888" xreflabel=""/>Escaping commas with quotes Depending on the Linux distribution, commas may need to be escaped using single or double quotes. In the extreme case, the options entry would look like this: options lnet'networks="tcp0,elan0"' 'routes="tcp [2,10]@elan0"' Added quotes may confuse some distributions. Messages such as the following may indicate an issue related to added quotes: - lnet: Unknown parameter ‘'networks' - A “Refusing connection - no matching NID” message generally points to an error in the LNET module configuration. + lnet: Unknown parameter 'networks' + A 'Refusing connection - no matching NID' message generally points to an error in the LNET module configuration.
<anchor xml:id="dbdoclet.50438216_pgfId-1304894" xreflabel=""/>Including comments @@ -386,6 +279,5 @@ In this incorrect example, LNET silently ignores pt11 192.168.0.[92,96], resulting in these nodes not being properly initialized. No error message is generated. options lnet ip2nets=
-