1 <?xml version='1.0' encoding='UTF-8'?>
2 <chapter xmlns="http://docbook.org/ns/docbook" xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US" xml:id="configuringlnet">
3 <title xml:id="configuringlnet.title">Configuring Lustre Networking (LNet)</title>
4 <para>This chapter describes how to configure Lustre Networking (LNet). It includes the following sections:</para>
7 <para><xref linkend="dbdoclet.50438216_15201"/>
11 <para><xref linkend="dbdoclet.50438216_33148"/>
15 <para><xref linkend="dbdoclet.50438216_46279"/>
19 <para><xref linkend="dbdoclet.50438216_31414"/>
23 <para><xref linkend="dbdoclet.50438216_71227"/>
27 <para><xref linkend="dbdoclet.50438216_10523"/>
31 <para><xref linkend="dbdoclet.50438216_35668"/>
35 <para><xref linkend="dbdoclet.50438216_15200"/>
40 <para>Configuring LNet is optional.</para>
41 <para> LNet will use the first TCP/IP interface it discovers on a
42 system (<literal>eth0</literal>) if it is loaded using the
43 <literal>lctl network up</literal> command. If this network configuration is
44 sufficient, you do not need to configure LNet. LNet configuration is
45 required if you are using InfiniBand or multiple Ethernet
47 <para condition='l27'>The <literal>lnetctl</literal> utility can be used
48 to initialize LNet without bringing up any network interfaces. Network
49 interfaces can be added after configuring LNet via <literal>lnetctl</literal>.
50 <literal>lnetctl</literal> can also be used to manage an operational LNet.
51 However, if it wasn't initialized by <literal>lnetctl</literal> then
52 <literal>lnetctl lnet configure</literal> must be invoked before
53 <literal>lnetctl</literal> can be used to manage LNet.</para>
54 <para condition='l27'>Dynamic LNet Configuration (DLC) also introduces a C API to enable
55 configuring LNet programmatically. See <xref
56 linkend="lnetconfigurationapi"/>.</para>
58 <section xml:id="dbdoclet.50438216_15201" condition='l27'>
60 <primary>LNet</primary>
61 <secondary>Configuring LNet</secondary>
62 </indexterm>Configuring LNet via <literal>lnetctl</literal></title>
63 <para>The <literal>lnetctl</literal> utility can be used to initialize
64 and configure the LNet kernel module after it has been loaded via
65 <literal>modprobe</literal>. In general the lnetctl format is as
67 <screen>lnetctl cmd subcmd [options]</screen>
68 <para>The following configuration items are managed by the tool:</para>
72 <para>Configuring/unconfiguring LNet</para>
75 <para>Adding/removing/showing Networks</para>
78 <para>Adding/removing/showing Routes</para>
81 <para>Enabling/Disabling routing</para>
84 <para>Configuring Router Buffer Pools</para>
90 <primary>LNet</primary>
91 <secondary>cli</secondary>
92 </indexterm>Configuring LNet</title>
93 <para>After LNet has been loaded via <literal>modprobe</literal>,
94 the <literal>lnetctl</literal> utility can be used to configure LNet
95 without bringing up the networks that are specified in the module
96 parameters. It can also be used to configure network interfaces
97 specified in the module parameters by providing the
98 <literal>--all</literal> option.</para>
99 <screen>lnetctl lnet configure [--all]
100 # --all: load NI configuration from module parameters</screen>
101 <para>The <literal>lnetctl</literal> utility can also be used to
102 unconfigure LNet.</para>
103 <screen>lnetctl lnet unconfigure</screen>
105 <section xml:id="dbdoclet.lnetaddshowdelete">
106 <title><indexterm><primary>LNet</primary>
107 <secondary>cli</secondary></indexterm>Adding, Deleting and Showing
109 <para>Networks can be added, deleted, or shown after the LNet kernel
110 module is loaded.</para>
111 <para>The <emphasis role="bold"><literal>lnetctl net add</literal>
112 </emphasis> command is used to add networks:</para>
113 <screen>lnetctl net add: add a network
114 --net: net name (ex tcp0)
115 --if: physical interface (ex eth0)
116 --peer_timeout: time to wait before declaring a peer dead
117 --peer_credits: defines the max number of inflight messages
118 --peer_buffer_credits: the number of buffer credits per peer
119 --credits: Network Interface credits
120 --cpts: CPU Partitions configured net uses
121 --help: display this help text
124 lnetctl net add --net tcp2 --if eth0
125 --peer_timeout 180 --peer_credits 8</screen>
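<para>When <literal>lnetctl net add</literal> is driven from a script, it can
help to assemble the argument list in one place. The following Python sketch
rebuilds the example command shown above from the options listed in the help
text. The function name <literal>net_add_command</literal> and its parameters
are hypothetical and are not part of the Lustre tools. The sketch only prints
the command; it does not run it.</para>
<programlisting language="python"><![CDATA[
```python
import shlex
from typing import List, Optional


def net_add_command(net: str, interface: str,
                    peer_timeout: Optional[int] = None,
                    peer_credits: Optional[int] = None) -> List[str]:
    """Build an argument list for 'lnetctl net add' (hypothetical helper)."""
    argv = ["lnetctl", "net", "add", "--net", net, "--if", interface]
    # Optional tunables are appended only when a value was supplied.
    if peer_timeout is not None:
        argv += ["--peer_timeout", str(peer_timeout)]
    if peer_credits is not None:
        argv += ["--peer_credits", str(peer_credits)]
    return argv


command = net_add_command("tcp2", "eth0", peer_timeout=180, peer_credits=8)
# Print the command as a single shell-quoted line for review.
print(shlex.join(command))
```
]]></programlisting>
<para>On a node where <literal>lnetctl</literal> is installed, the returned
list could be handed to a process launcher such as
<literal>subprocess.run</literal>. Keeping the arguments as a list avoids
shell quoting problems.</para>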
126 <note condition='l2A'><para>With the addition of Software based Multi-Rail
127 in Lustre 2.10, the following should be noted:</para>
129 <listitem><para>--net: no longer needs to be unique since multiple
130 interfaces can be added to the same network.</para></listitem>
131 <listitem><para>--if: The same interface per network can be added
132 only once, however, more than one interface can now be specified
133 (separated by a comma) for a node. For example: eth0,eth1,eth2.
135 </itemizedlist></para>
136 <para>For examples on adding multiple interfaces via
137 <literal>lnetctl net add</literal> and/or YAML, please see
138 <xref linkend="dbdoclet.mrconfiguring" />
141 <para>Networks can be deleted with the
142 <emphasis role="bold"><literal>lnetctl net del</literal></emphasis>
144 <screen>net del: delete a network
145 --net: net name (ex tcp0)
146 --if: physical interface (e.g. eth0)
149 lnetctl net del --net tcp2</screen>
150 <note condition='l2A'><para>In a Software Multi-Rail configuration,
151 specifying only the <literal>--net</literal> argument will delete the
152 entire network and all interfaces under it. The new
153 <literal>--if</literal> switch should also be used in conjunction with
154 <literal>--net</literal> to specify deletion of a specific interface.
156 <para>All or a subset of the configured networks can be shown with the
157 <emphasis role="bold"><literal>lnetctl net show</literal></emphasis>
158 command. The output can be non-verbose or verbose.</para>
159 <screen>net show: show networks
160 --net: net name (ex tcp0) to filter on
161 --verbose: display detailed output per network
165 lnetctl net show --verbose
166 lnetctl net show --net tcp2 --verbose</screen>
167 <para>Below are examples of non-detailed and detailed network
168 configuration output.</para>
169 <screen># non-detailed show
170 > lnetctl net show --net tcp2
172 - nid: 192.168.205.130@tcp2
178 > lnetctl net show --net tcp2 --verbose
180 - nid: 192.168.205.130@tcp2
187 peer_buffer_credits: 0
188 credits: 256</screen>
190 <section condition='l2A'>
192 <primary>LNet</primary>
193 <secondary>cli</secondary>
194 </indexterm>Adding, Deleting and Showing Peers</title>
195 <para>The <emphasis role="bold"><literal>lnetctl peer add</literal>
196 </emphasis> command is used to add a remote peer to a software
197 multi-rail configuration.</para>
198 <para>When configuring peers, use the <literal>--prim_nid</literal>
199 option to specify the key or primary NID of the peer node. Then
200 follow that with the <literal>--nid</literal> option to specify a set
201 of comma-separated NIDs.</para>
202 <screen>peer add: add a peer
203 --prim_nid: primary NID of the peer
204 --nid: comma separated list of peer nids (e.g. 10.1.1.2@tcp0)
205 --non_mr: if specified this interface is created as a non multi-rail
206 capable peer. Only one NID can be specified in this case.</screen>
207 <para>For example:</para>
209 lnetctl peer add --prim_nid 10.10.10.2@tcp --nid 10.10.3.3@tcp1,10.4.4.5@tcp2
211 <para>The <literal>--prim_nid</literal> (primary NID for the peer
212 node) can go unspecified. In this case, the first listed NID in the
213 <literal>--nid</literal> option becomes the primary NID of the peer.
216 lnetctl peer add --nid 10.10.10.2@tcp,10.10.3.3@tcp1,10.4.4.5@tcp2</screen>
217 <para>YAML can also be used to configure peers:</para>
219 - primary nid: <key or primary nid>
224 - nid: <nid n></screen>
225 <para>As with all other commands, the result of the
226 <literal>lnetctl peer show</literal> command can be used to gather
227 information to aid in configuring or deleting a peer:</para>
228 <screen>lnetctl peer show -v</screen>
229 <para>Example output from the <literal>lnetctl peer show</literal>
232 - primary nid: 192.168.122.218@tcp
235 - nid: 192.168.122.218@tcp
238 available_tx_credits: 8
239 available_rtr_credits: 8
246 - nid: 192.168.122.78@tcp
249 available_tx_credits: 8
250 available_rtr_credits: 8
257 - nid: 192.168.122.96@tcp
260 available_tx_credits: 8
261 available_rtr_credits: 8
268 <para>Use the following <literal>lnetctl</literal> command to delete a
270 <screen>peer del: delete a peer
271 --prim_nid: Primary NID of the peer
272 --nid: comma separated list of peer nids (e.g. 10.1.1.2@tcp0)</screen>
273 <para><literal>prim_nid</literal> should always be specified. The
274 <literal>prim_nid</literal> identifies the peer. If the
275 <literal>prim_nid</literal> is the only one specified, then the entire
276 peer is deleted.</para>
277 <para>Example of deleting a single nid of a peer (10.10.10.3@tcp):
279 <screen>lnetctl peer del --prim_nid 10.10.10.2@tcp --nid 10.10.10.3@tcp</screen>
280 <para>Example of deleting the entire peer:</para>
281 <screen>lnetctl peer del --prim_nid 10.10.10.2@tcp</screen>
286 <primary>LNet</primary>
287 <secondary>cli</secondary>
288 </indexterm>Adding, Deleting and Showing Routes</title>
289 <para>A set of routes can be added to identify how LNet messages are
291 <screen>lnetctl route add: add a route
292 --net: net name (ex tcp0) LNet message is destined to.
293 This cannot be a local network.
294 --gateway: gateway node nid (ex 10.1.1.2@tcp) to route
295 all LNet messages destined for the identified
297 --hop: number of hops to final destination
298 (1 < hops < 255)
299 --priority: priority of route (0 - highest prio)
302 lnetctl route add --net tcp2 --gateway 192.168.205.130@tcp1 --hop 2 --priority 1</screen>
303 <para>Routes can be deleted via the following <literal>lnetctl</literal> command.</para>
304 <screen>lnetctl route del: delete a route
305 --net: net name (ex tcp0)
306 --gateway: gateway nid (ex 10.1.1.2@tcp)
309 lnetctl route del --net tcp2 --gateway 192.168.205.130@tcp1</screen>
310 <para>Configured routes can be shown via the following
311 <literal>lnetctl</literal> command.</para>
312 <screen>lnetctl route show: show routes
313 --net: net name (ex tcp0) to filter on
314 --gateway: gateway nid (ex 10.1.1.2@tcp) to filter on
315 --hop: number of hops to final destination
316 (1 < hops < 255) to filter on
317 --priority: priority of route (0 - highest prio)
319 --verbose: display detailed output per route
326 lnetctl route show --verbose</screen>
327 <para>When showing routes, the <literal>--verbose</literal> option
328 outputs more detailed information. All show and error output is in
329 YAML format. Below are examples of both non-detailed and detailed
330 route show output.</para>
331 <screen>#Non-detailed output
335 gateway: 192.168.205.130@tcp1
338 > lnetctl route show --verbose
341 gateway: 192.168.205.130@tcp1
348 <primary>LNet</primary>
349 <secondary>cli</secondary>
350 </indexterm>Enabling and Disabling Routing</title>
351 <para>When an LNet node is configured as a router, it will route LNet
352 messages not destined for itself. This feature can be enabled or
353 disabled as follows.</para>
354 <screen>lnetctl set routing [0 | 1]
355 # 0 - disable routing feature
356 # 1 - enable routing feature</screen>
360 <primary>LNet</primary>
361 <secondary>cli</secondary>
362 </indexterm>Showing routing information</title>
363 <para>When routing is enabled on a node, the tiny, small and large
364 routing buffers are allocated. See <xref
365 linkend="dbdoclet.50438272_73839"/> for more details on router
366 buffers. This information can be shown as follows:</para>
367 <screen>lnetctl routing show: show routing information
370 lnetctl routing show</screen>
371 <para>An example of the show output:</para>
372 <screen>> lnetctl routing show
394 <primary>LNet</primary>
395 <secondary>cli</secondary>
396 </indexterm>Configuring Routing Buffers</title>
397 <para>The configured routing buffer values specify the number of
398 buffers in each of the tiny, small and large groups.</para>
399 <para>It is often desirable to configure the tiny, small and large
400 routing buffers to some values other than the default. These values
401 are global; when set, they are used by all configured CPU
402 partitions. If routing is enabled, the values set take effect
403 immediately. If a larger number of buffers is specified, then
404 buffers are allocated to satisfy the configuration change. If fewer
405 buffers are configured, the excess buffers are freed as they
406 become unused. If routing is not enabled, the values are not changed.
407 The buffer values are reset to default if routing is turned off and
409 <para>The <literal>lnetctl</literal> 'set' command can be
410 used to set these buffer values. A VALUE greater than 0
411 will set the number of buffers accordingly. A VALUE of 0
412 will reset the number of buffers to system defaults.</para>
413 <screen>set tiny_buffers:
414 set tiny routing buffers
415 VALUE must be greater than or equal to 0
417 set small_buffers: set small routing buffers
418 VALUE must be greater than or equal to 0
420 set large_buffers: set large routing buffers
421 VALUE must be greater than or equal to 0</screen>
422 <para>Usage examples:</para>
423 <screen>> lnetctl set tiny_buffers 4096
424 > lnetctl set small_buffers 8192
425 > lnetctl set large_buffers 2048</screen>
426 <para>The buffers can be set back to the default values as follows:</para>
427 <screen>> lnetctl set tiny_buffers 0
428 > lnetctl set small_buffers 0
429 > lnetctl set large_buffers 0</screen>
433 <primary>LNet</primary>
434 <secondary>cli</secondary>
435 </indexterm>Importing YAML Configuration File</title>
436 <para>Configuration can be described in YAML format and can be fed
437 into the <literal>lnetctl</literal> utility. The
438 <literal>lnetctl</literal> utility parses the YAML file and performs
439 the specified operation on all entities described therein. If no
440 operation is defined in the command as shown below, the default
441 operation is 'add'. The YAML syntax is described in a later
442 section.</para> <screen>lnetctl import FILE.yaml
443 lnetctl import < FILE.yaml</screen>
444 <para>The '<literal>lnetctl</literal> import' command provides three
445 optional parameters to define the operation to be performed on the
446 configuration items described in the YAML file.</para>
447 <screen># if no options are given to the command the "add" command is assumed
449 lnetctl import --add FILE.yaml
450 lnetctl import --add < FILE.yaml
452 # to delete all items described in the YAML file
453 lnetctl import --del FILE.yaml
454 lnetctl import --del < FILE.yaml
456 # to show all items described in the YAML file
457 lnetctl import --show FILE.yaml
458 lnetctl import --show < FILE.yaml</screen>
462 <primary>LNet</primary>
463 <secondary>cli</secondary>
464 </indexterm>Exporting Configuration in YAML format</title>
465 <para>The <literal>lnetctl</literal> utility provides the 'export'
466 command to dump the current LNet configuration in YAML format.</para>
467 <screen>lnetctl export FILE.yaml
468 lnetctl export > FILE.yaml</screen>
472 <primary>LNet</primary>
473 <secondary>cli</secondary>
474 </indexterm>Showing LNet Traffic Statistics</title>
475 <para>The <literal>lnetctl</literal> utility can dump the LNet traffic
476 statistics as follows:</para>
477 <screen>lnetctl stats show</screen>
481 <primary>LNet</primary>
482 <secondary>yaml syntax</secondary>
483 </indexterm>YAML Syntax</title>
484 <para>The <literal>lnetctl</literal> utility can take in a YAML file
485 describing the configuration items that need to be operated on and
486 perform one of the following operations: add, delete or show on the
487 items described therein.</para>
488 <para>Net, routing and route YAML blocks are all defined as a YAML
489 sequence, as shown in the following sections. The stats YAML block
490 is a YAML object. Each sequence item can take a seq_no field. This
491 seq_no field is returned in the error block. This allows the caller
492 to associate the error with the item that caused the error. The
493 <literal>lnetctl</literal> utility makes a best effort at configuring
494 items defined in the YAML file. It does not stop processing the file
495 at the first error.</para>
496 <para>Below is the YAML syntax describing the various
497 configuration elements which can be operated on via DLC. Not all
498 YAML elements are required for all operations (add/delete/show).
499 The system ignores elements which are not pertinent to the requested
503 <primary>LNet</primary>
504 <secondary>network yaml syntax</secondary>
505 </indexterm>Network Configuration</title>
508 - net: <network. Ex: tcp or o2ib>
510 0: <physical interface>
511 detail: <This is only applicable for show command. 1 - output detailed info. 0 - basic output>
513 peer_timeout: <Integer. Timeout before considering a peer dead>
514 peer_credits: <Integer. Transmit credits for a peer>
515 peer_buffer_credits: <Integer. Credits available for receiving messages>
516 credits: <Integer. Network Interface credits>
517 SMP: <An array of integers of the form: "[x,y,...]", where each
518 integer represents the CPT to associate the network interface
519 with>
520 seq_no: <Integer. Optional. User generated, and is passed back in the YAML error block>
521 <para>Neither the seq_no nor the detail field appears in the show output.</para>
525 <primary>LNet</primary>
526 <secondary>buffer yaml syntax</secondary>
527 </indexterm>Enable Routing and Adjust Router Buffer Configuration</title>
530 - tiny: <Integer. Tiny buffers>
531 small: <Integer. Small buffers>
532 large: <Integer. Large buffers>
533 enable: <0 - disable routing. 1 - enable routing>
534 seq_no: <Integer. Optional. User generated, and is passed back in the YAML error block></screen>
535 <para>The seq_no field does not appear in the show output.</para>
539 <primary>LNet</primary>
540 <secondary>statistics yaml syntax</secondary>
541 </indexterm>Show Statistics</title>
544 seq_no: <Integer. Optional. User generated, and is passed back in the YAML error block></screen>
545 <para>The seq_no field does not appear in the show output.</para>
549 <primary>LNet</primary>
550 <secondary>router yaml syntax</secondary>
551 </indexterm>Route Configuration</title>
554 - net: <network. Ex: tcp or o2ib>
555 gateway: <nid of the gateway in the form <ip>@<net>: Ex: 192.168.29.1@tcp>
556 hop: <an integer between 1 and 255. Optional>
557 detail: <This is only applicable for show commands. 1 - output detailed info. 0 - basic output>
558 seq_no: <integer. Optional. User generated, and is passed back in the YAML error block></screen>
559 <para>Neither the seq_no nor the detail field appears in the show output.</para>
563 <section xml:id="dbdoclet.50438216_33148">
564 <title><indexterm><primary>LNet</primary></indexterm>
566 Overview of LNet Module Parameters</title>
567 <para>LNet kernel module (lnet) parameters specify how LNet is to be
568 configured to work with Lustre, including which NICs will be
569 configured to work with Lustre and the routing to be used with
571 <para>Parameters for LNet can be specified in the
572 <literal>/etc/modprobe.d/lustre.conf</literal> file. In some cases
573 the parameters may have been stored in
574 <literal>/etc/modprobe.conf</literal>, but this has been deprecated
575 since before RHEL5 and SLES10, and having a separate
576 <literal>/etc/modprobe.d/lustre.conf</literal> file simplifies
577 administration and distribution of the Lustre networking
578 configuration. This file contains one or more entries with the
580 <screen>options lnet <replaceable>parameter</replaceable>=<replaceable>value</replaceable></screen>
581 <para>To specify the network interfaces that are to be used for
582 Lustre, set either the <literal>networks</literal> parameter or the
583 <literal>ip2nets</literal> parameter (only one of these parameters can
584 be used at a time):</para>
587 <para><literal>networks</literal> - Specifies the networks to be used.</para>
590 <para><literal>ip2nets</literal> - Lists globally-available
591 networks, each with a range of IP addresses. LNet then identifies
592 locally-available networks through address list-matching
596 <para>See <xref linkend="dbdoclet.50438216_46279"/> and <xref linkend="dbdoclet.50438216_31414"/> for more details.</para>
597 <para>To set up routing between networks, use:</para>
600 <para><literal>routes</literal> - Lists networks and the NIDs of routers that forward to them.</para>
603 <para>See <xref linkend="dbdoclet.50438216_71227"/> for more details.</para>
604 <para>A <literal>router</literal> checker can be configured to enable
605 Lustre nodes to detect router health status, avoid routers that appear
606 dead, and reuse those that restore service after failures. See <xref
607 linkend="dbdoclet.50438216_35668"/> for more details.</para>
608 <para>For a complete reference to the LNet module parameters, see
609 <emphasis><xref linkend="configurationfilesmoduleparameters"/>LNet
610 Options</emphasis>.</para>
612 <para>We recommend that you use 'dotted-quad' notation for
613 IP addresses rather than host names to make it easier to read debug
614 logs and debug configurations with multiple interfaces.</para>
617 <title><indexterm><primary>LNet</primary><secondary>using
618 NID</secondary></indexterm>Using a Lustre Network Identifier (NID)
619 to Identify a Node</title>
620 <para>A Lustre network identifier (NID) is used to uniquely identify
621 a Lustre network endpoint by node ID and network type. The format of
623 <screen><replaceable>network_id</replaceable>@<replaceable>network_type</replaceable></screen>
624 <para>Examples are:</para>
625 <screen>10.67.73.200@tcp0
626 10.67.75.100@o2ib</screen>
627 <para>The first entry above identifies a TCP/IP node, while the
628 second entry identifies an InfiniBand node.</para>
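<para>The structure of a NID can be illustrated with a short Python sketch
that splits the two example NIDs above into their parts. The names
<literal>Nid</literal> and <literal>parse_nid</literal> are hypothetical and
are not part of any Lustre API. The sketch also assumes that a network type
written without a trailing number, such as <literal>o2ib</literal>, refers to
network number 0.</para>
<programlisting language="python"><![CDATA[
```python
import re
from typing import NamedTuple


class Nid(NamedTuple):
    address: str      # part before '@', for example an IP address
    net_type: str     # network type, for example 'tcp' or 'o2ib'
    net_number: int   # trailing network number; 0 when omitted (assumption)


# Network name = type ending in a letter, then an optional number.
_NETWORK_RE = re.compile(r"^([a-z0-9]*[a-z])(\d*)$")


def parse_nid(nid: str) -> Nid:
    """Split 'address@network' into its components (illustrative only)."""
    address, separator, network = nid.partition("@")
    if not separator or not address or not network:
        raise ValueError(f"not a NID: {nid!r}")
    match = _NETWORK_RE.match(network)
    if match is None:
        raise ValueError(f"unrecognized network name: {network!r}")
    net_type, number = match.groups()
    return Nid(address, net_type, int(number) if number else 0)


for example in ("10.67.73.200@tcp0", "10.67.75.100@o2ib"):
    print(parse_nid(example))
```
]]></programlisting>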
629 <para>When a mount command is run on a client, the client uses the
630 NID of the MDS to retrieve configuration information. If an MDS has
631 more than one NID, the client should use the appropriate NID for its
632 local network.</para>
633 <para>To determine the appropriate NID to specify in
634 the mount command, use the <literal>lctl</literal> command. To
635 display MDS NIDs, run the following on the MDS:</para>
636 <screen>lctl list_nids</screen>
637 <para>To determine if a client can reach the MDS using a particular NID, run on the client:</para>
638 <screen>lctl which_nid <replaceable>MDS_NID</replaceable></screen>
641 <section xml:id="dbdoclet.50438216_46279">
642 <title><indexterm><primary>LNet</primary><secondary>module parameters</secondary></indexterm>Setting the LNet Module networks Parameter</title>
643 <para>If a node has more than one network interface, you'll
644 typically want to dedicate a specific interface to Lustre. You can do
645 this by including an entry in the <literal>lustre.conf</literal> file
646 on the node that sets the LNet module <literal>networks</literal>
648 <screen>options lnet networks=<replaceable>comma-separated list of
649 networks</replaceable></screen>
650 <para>This example specifies that a Lustre node will use a TCP/IP
651 interface and an InfiniBand interface:</para>
652 <screen>options lnet networks=tcp0(eth0),o2ib(ib0)</screen>
653 <para>This example specifies that the Lustre node will use the TCP/IP interface <literal>eth1</literal>:</para>
654 <screen>options lnet networks=tcp0(eth1)</screen>
655 <para>Depending on the network design, it may be necessary to specify
656 explicit interfaces. To explicitly specify that interface
657 <literal>eth2</literal> be used for network <literal>tcp0</literal>
658 and <literal>eth3</literal> be used for <literal>tcp1</literal>, use
660 <screen>options lnet networks=tcp0(eth2),tcp1(eth3)</screen>
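<para>If <literal>lustre.conf</literal> entries are generated by a
provisioning script, the <literal>networks</literal> line can be composed from
a list of network and interface pairs. The Python sketch below reproduces the
entry shown above. The helper name <literal>networks_option</literal> is
hypothetical, and the sketch performs no validation of network or interface
names.</para>
<programlisting language="python"><![CDATA[
```python
from typing import Iterable, Optional, Tuple


def networks_option(entries: Iterable[Tuple[str, Optional[str]]]) -> str:
    """Format an 'options lnet networks=...' line (hypothetical helper)."""
    parts = []
    for net, interface in entries:
        # With no interface given, only the bare network name is emitted.
        parts.append(f"{net}({interface})" if interface else net)
    return "options lnet networks=" + ",".join(parts)


print(networks_option([("tcp0", "eth2"), ("tcp1", "eth3")]))
```
]]></programlisting>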
661 <para>When more than one interface is available during the network
662 setup, Lustre chooses the best route based on the hop count. Once the
663 network connection is established, Lustre expects the network to stay
664 connected. In a Lustre network, connections do not fail over to
665 another interface, even if multiple interfaces are available on the
668 <para>LNet lines in <literal>lustre.conf</literal> are only used by
669 the local node to determine what to call its interfaces. They are
670 not used for routing decisions.</para>
673 <title><indexterm><primary>configuring</primary><secondary>multihome</secondary></indexterm>Multihome Server Example</title>
674 <para>If a server with multiple IP addresses (multihome server) is
675 connected to a Lustre network, certain configuration settings are
676 required. An example illustrating these settings consists of a
677 network with the following nodes:</para>
680 <para> Server svr1 with three TCP NICs (<literal>eth0</literal>,
681 <literal>eth1</literal>, and <literal>eth2</literal>) and an
682 InfiniBand NIC.</para>
685 <para> Server svr2 with three TCP NICs (<literal>eth0</literal>,
686 <literal>eth1</literal>, and <literal>eth2</literal>) and an
687 InfiniBand NIC. Interface eth2 will not be used for Lustre
691 <para> TCP clients, each with a single TCP interface.</para>
694 <para> InfiniBand clients, each with a single InfiniBand
695 interface and a TCP/IP interface for administration.</para>
698 <para>To set the <literal>networks</literal> option for this example:</para>
701 <para> On each server, <literal>svr1</literal> and
702 <literal>svr2</literal>, include the following line in the
703 <literal>lustre.conf</literal> file:</para>
706 <screen>options lnet networks=tcp0(eth0),tcp1(eth1),o2ib</screen>
709 <para> For TCP-only clients, the first available non-loopback IP
710 interface is used for <literal>tcp0</literal>. Thus, TCP clients
711 with only one interface do not need to have options defined in
712 the <literal>lustre.conf</literal> file.</para>
715 <para> On the InfiniBand clients, include the following line in
716 the <literal>lustre.conf</literal> file:</para>
719 <screen>options lnet networks=o2ib</screen>
721 <para>By default, Lustre ignores the loopback
722 (<literal>lo0</literal>) interface. Lustre does not ignore IP
723 addresses aliased to the loopback. If you alias IP addresses to
724 the loopback interface, you must specify all Lustre networks using
725 the LNet networks parameter.</para>
728 <para>If the server has multiple interfaces on the same subnet,
729 the Linux kernel will send all traffic using the first configured
730 interface. This is a limitation of Linux, not Lustre. In this
731 case, network interface bonding should be used. For more
732 information about network interface bonding, see <xref
733 linkend="settingupbonding"/>.</para>
737 <section xml:id="dbdoclet.50438216_31414">
738 <title><indexterm><primary>LNet</primary><secondary>ip2nets</secondary></indexterm>Setting the LNet Module ip2nets Parameter</title>
739 <para>The <literal>ip2nets</literal> option is typically used when a
740 single, universal <literal>lustre.conf</literal> file is used on all
741 servers and clients. Each node identifies the locally available
742 networks based on the listed IP address patterns that match the
743 node's local IP addresses.</para>
744 <para>Note that the IP address patterns listed in the
745 <literal>ip2nets</literal> option are <emphasis>only</emphasis> used
746 to identify the networks that an individual node should instantiate.
747 They are <emphasis>not</emphasis> used by LNet for any other
748 communications purpose.</para>
749 <para>For the example below, the nodes in the network have these IP
753 <para> Server svr1: <literal>eth0</literal> IP address
754 <literal>192.168.0.2</literal>, IP over InfiniBand
755 (<literal>o2ib</literal>) address
756 <literal>132.6.1.2</literal>.</para>
759 <para> Server svr2: <literal>eth0</literal> IP address
760 <literal>192.168.0.4</literal>, IP over InfiniBand
761 (<literal>o2ib</literal>) address
762 <literal>132.6.1.4</literal>.</para>
765 <para> TCP clients have IP addresses
766 <literal>192.168.0.5-255</literal>.</para>
769 <para> InfiniBand clients have IP over InfiniBand
770 (<literal>o2ib</literal>) addresses <literal>132.6.[2-3].2, .4,
771 .6, .8</literal>.</para>
774 <para>The following entry is placed in the
775 <literal>lustre.conf</literal> file on each server and client:</para>
776 <screen>options lnet 'ip2nets="tcp0(eth0) 192.168.0.[2,4]; \
777 tcp0 192.168.0.*; o2ib0 132.6.[1-3].[2-8/2]"'</screen>
778 <para>Each entry in <literal>ip2nets</literal> is referred to as a 'rule'.</para>
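<para>The address patterns in the rules above combine plain octets, the
<literal>*</literal> wildcard, comma-separated lists such as
<literal>[2,4]</literal>, and ranges with an optional step such as
<literal>[2-8/2]</literal> (2 to 8 in steps of 2). The following Python sketch
shows one way such a pattern could be checked against an IPv4 address. It is a
simplified illustration, not the parser that LNet uses, and the function names
<literal>expand_octet</literal> and <literal>ip_matches</literal> are
hypothetical.</para>
<programlisting language="python"><![CDATA[
```python
from typing import Set


def expand_octet(expr: str) -> Set[int]:
    """Return the octet values matched by one dotted field of a pattern."""
    if expr == "*":
        return set(range(256))
    if expr.startswith("[") and expr.endswith("]"):
        values: Set[int] = set()
        for item in expr[1:-1].split(","):
            span, _, step = item.partition("/")    # '2-8/2' -> '2-8', '2'
            low, _, high = span.partition("-")     # '2-8'   -> '2', '8'
            start = int(low)
            stop = int(high) if high else start
            values.update(range(start, stop + 1, int(step) if step else 1))
        return values
    return {int(expr)}


def ip_matches(pattern: str, ip: str) -> bool:
    """Check a dotted IPv4 address against a dotted pattern, field by field."""
    pattern_fields = pattern.split(".")
    ip_fields = ip.split(".")
    if len(pattern_fields) != 4 or len(ip_fields) != 4:
        return False
    return all(int(octet) in expand_octet(expr)
               for expr, octet in zip(pattern_fields, ip_fields))


for address in ("132.6.1.2", "132.6.3.5"):
    print(address, ip_matches("132.6.[1-3].[2-8/2]", address))
```
]]></programlisting>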
779 <para>The order of LNet entries is important when configuring servers.
780 If a server node can be reached using more than one network, the first
781 network specified in <literal>lustre.conf</literal> will be
783 <para>Because <literal>svr1</literal> and <literal>svr2</literal>
784 match the first rule, LNet uses <literal>eth0</literal> for
785 <literal>tcp0</literal> on those machines. (Although
786 <literal>svr1</literal> and <literal>svr2</literal> also match the
787 second rule, the first matching rule for a particular network is
789 <para>The <literal>[2-8/2]</literal> format indicates a range of 2-8
790 stepped by 2; that is 2,4,6,8. Thus, the clients at
791 <literal>132.6.3.5</literal> will not find a matching o2ib
793 <note condition='l2A'>
794 <para>Multi-rail deprecates the kernel parsing of ip2nets. ip2nets
795 patterns are matched in user space and translated into network interfaces
796 to be added into the system.</para>
797 <para>The first interface that matches the IP pattern will be used when
798 adding a network interface.</para>
799 <para>If an interface is explicitly specified as well as a pattern, the
800 interface matched using the IP pattern will be sanitized against the
801 explicitly-defined interface.</para>
802 <para>For example, if the rule is <literal>tcp(eth0) 192.168.*.3</literal> and
803 the system has <literal>eth0 == 192.158.19.3</literal> and
804 <literal>eth1 == 192.168.3.3</literal>, then the configuration will fail,
805 because the pattern contradicts the interface specified.</para>
806 <para>A clear warning will be displayed if inconsistent configuration is
808 <para>You could use the following command to configure ip2nets:</para>
809 <screen>lnetctl import < ip2nets.yaml</screen>
810 <para>For example:</para>
823 0: 192.168.*.*</screen>
826 <section xml:id="dbdoclet.50438216_71227">
827 <title><indexterm><primary>LNet</primary><secondary>routes</secondary></indexterm>Setting
828 the LNet Module routes Parameter</title>
829 <para>The LNet module routes parameter is used to identify routers in
830 a Lustre configuration. This parameter is set in
831 <literal>lustre.conf</literal> on each Lustre node. </para>
832 <para>Routes are typically set to connect to segregated subnetworks
833 or to cross connect two different types of networks such as tcp and
835 <para>The LNet routes parameter specifies a semicolon-separated list of
836 router definitions. Each route is defined as a network number,
837 followed by a list of routers:</para>
838 <screen>routes=<replaceable>net_type router_NID(s)</replaceable></screen>
839 <para>This example specifies bi-directional routing in which TCP
840 clients can reach Lustre resources on the IB networks and IB servers
841 can access the TCP networks:</para>
842 <screen>options lnet 'ip2nets="tcp0 192.168.0.*; \
843 o2ib0(ib0) 132.6.1.[1-128]"' 'routes="tcp0 132.6.1.[1-8]@o2ib0; \
844 o2ib0 192.168.0.[1-8]@tcp0"'</screen>
845 <para>All LNet routers that bridge two networks are equivalent. They
846 are not configured as primary or secondary, and the load is balanced
847 across all available routers.</para>
848 <para>The number of LNet routers is not limited. Enough routers should
849 be used to handle the required file serving bandwidth plus a 25
850 percent margin for headroom.</para>
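<para>The sizing guideline above reduces to a small calculation: increase the
required bandwidth by 25 percent, divide by the bandwidth one router can
sustain, and round up. The Python sketch below illustrates this. The function
name <literal>routers_needed</literal> is hypothetical, and the per-router
bandwidth is an input that must be measured for the hardware in use; the
figures in the example call are illustrative only.</para>
<programlisting language="python"><![CDATA[
```python
import math


def routers_needed(required_bandwidth: float,
                   per_router_bandwidth: float,
                   headroom: float = 0.25) -> int:
    """Estimate the router count for a bandwidth target plus headroom.

    Both bandwidth arguments must use the same unit (for example GB/s).
    """
    if required_bandwidth < 0 or per_router_bandwidth <= 0:
        raise ValueError("bandwidth values must be positive")
    target = required_bandwidth * (1 + headroom)
    # Round up: a fractional router still requires a whole node.
    return math.ceil(target / per_router_bandwidth)


# Illustrative figures: 40 units required, 12 units per router.
print(routers_needed(40, 12))
```
]]></programlisting>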
852 <title><indexterm><primary>LNet</primary><secondary>routing
853 example</secondary></indexterm>Routing Example</title>
854 <para>On the clients, place the following entry in the
855 <literal>lustre.conf</literal> file:</para>
856 <screen>lnet networks="tcp" routes="o2ib0 192.168.0.[1-8]@tcp0"</screen>
857 <para>On the router nodes, use:</para>
858 <screen>lnet networks="tcp o2ib" forwarding=enabled </screen>
859 <para>On the MDS, use the reverse as shown below:</para>
860 <screen>lnet networks="o2ib0" routes="tcp0 132.6.1.[1-8]@o2ib0" </screen>
861 <para>To start the routers, run:</para>
862 <screen>modprobe lnet
863 lctl network configure</screen>
866 <section xml:id="dbdoclet.50438216_10523">
867 <title><indexterm><primary>LNet</primary><secondary>testing</secondary></indexterm>Testing
868 the LNet Configuration</title>
869 <para>After configuring Lustre Networking, it is highly recommended
870 that you test your LNet configuration using the LNet Self-Test
871 provided with the Lustre software. For more information about using
872 LNet Self-Test, see <xref linkend="lnetselftest"/>.</para>
874 <section xml:id="dbdoclet.50438216_35668">
875 <title><indexterm><primary>LNet</primary><secondary>route
876 checker</secondary></indexterm>Configuring the Router Checker</title>
877 <para>In a Lustre configuration in which different types of networks,
such as a TCP/IP network and an InfiniBand network, are connected by
879 routers, a router checker can be run on the clients and servers in the
880 routed configuration to monitor the status of the routers. In a
881 multi-hop routing configuration, router checkers can be configured on
882 routers to monitor the health of their next-hop routers.</para>
883 <para>A router checker is configured by setting LNet parameters in
884 <literal>lustre.conf</literal> by including an entry in this
887 <replaceable>router_checker_parameter</replaceable>=<replaceable>value</replaceable></screen>
888 <para>The router checker parameters are:</para>
891 <para><literal>live_router_check_interval</literal> - Specifies a
892 time interval in seconds after which the router checker will ping
893 the live routers. The default value is 0, meaning no checking is
894 done. To set the value to 60, enter:</para>
895 <screen>options lnet live_router_check_interval=60</screen>
898 <para><literal>dead_router_check_interval</literal> - Specifies a
899 time interval in seconds after which the router checker will check
900 for dead routers. The default value is 0, meaning no checking is
901 done. To set the value to 60, enter:</para>
902 <screen>options lnet dead_router_check_interval=60</screen>
<para><literal>auto_down</literal> - Enables/disables (1/0) the automatic marking of
router state as up or down. The default value is 1. To disable
router marking, enter:</para>
908 <screen>options lnet auto_down=0</screen>
<para><literal>router_ping_timeout</literal> - Specifies a
timeout, in seconds, for the router checker when it checks live or dead
routers. The router checker sends a ping message to each dead or
live router once every <literal>dead_router_check_interval</literal> or
<literal>live_router_check_interval</literal> seconds, respectively. The default value is 50.
To set the value to 60, enter:</para>
917 <screen>options lnet router_ping_timeout=60</screen>
919 <para>The <literal>router_ping_timeout</literal> is consistent
920 with the default LND timeouts. You may have to increase it on very
921 large clusters if the LND timeout is also increased. For larger
922 clusters, we suggest increasing the check interval.</para>
<para><literal>check_routers_before_use</literal> - Specifies
that routers are to be checked before use. The default is off. If this
parameter is set to on, the
<literal>dead_router_check_interval</literal> parameter must be set to a positive
integer value. To enable the check, enter:</para>
931 <screen>options lnet check_routers_before_use=on</screen>
934 <para>The router checker obtains the following information from each router:</para>
937 <para> Time the router was disabled</para>
940 <para> Elapsed disable time</para>
943 <para>If the router checker does not get a reply message from the
944 router within router_ping_timeout seconds, it considers the router to
946 <para>If a router is marked 'up' and responds to a ping, the
947 timeout is reset.</para>
948 <para>If 100 packets have been sent successfully through a router, the
949 sent-packets counter for that router will have a value of 100.</para>
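<para>The router checker parameters described in this section can be combined on a single <literal>options lnet</literal> line. The following Python sketch assembles such a line and enforces the one dependency stated above: enabling <literal>check_routers_before_use</literal> requires a positive <literal>dead_router_check_interval</literal>. The helper name <literal>router_checker_options</literal> is hypothetical, and the defaults simply mirror the values given in the parameter list.</para>

```python
def router_checker_options(live_interval=0, dead_interval=0,
                           ping_timeout=50, auto_down=1,
                           check_before_use=False):
    """Build one 'options lnet' line from router checker settings."""
    # Documented rule: checking routers before use needs dead-router checks.
    if check_before_use and not dead_interval > 0:
        raise ValueError(
            "check_routers_before_use=on requires a positive "
            "dead_router_check_interval")
    settings = [
        ("live_router_check_interval", live_interval),
        ("dead_router_check_interval", dead_interval),
        ("router_ping_timeout", ping_timeout),
        ("auto_down", auto_down),
        ("check_routers_before_use", "on" if check_before_use else "off"),
    ]
    pairs = " ".join(f"{name}={value}" for name, value in settings)
    return "options lnet " + pairs


# Example values only; choose intervals suited to the size of the cluster.
line = router_checker_options(live_interval=60, dead_interval=60,
                              check_before_use=True)
print(line)
```

<para>Generating the line from one place makes it harder to enable <literal>check_routers_before_use</literal> while leaving dead-router checking at its default of 0.</para>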
951 <section xml:id="dbdoclet.50438216_15200">
952 <title><indexterm><primary>LNet</primary><secondary>best
953 practice</secondary></indexterm>Best Practices for LNet
955 <para>For the <literal>networks</literal>, <literal>ip2nets</literal>,
956 and <literal>routes</literal> options, follow these best practices to
957 avoid configuration errors.</para>
959 <title><indexterm><primary>LNet</primary><secondary>escaping commas
960 with quotes</secondary></indexterm>Escaping commas with
962 <para>Depending on the Linux distribution, commas may need to be
963 escaped using single or double quotes. In the extreme case, the
964 <literal>options</literal> entry would look like this:</para>
<para><screen>options lnet 'networks="tcp0,elan0"' 'routes="tcp [2,10]@elan0"'</screen></para>
968 <para>Added quotes may confuse some distributions. Messages such as
969 the following may indicate an issue related to added quotes:</para>
970 <para><screen>lnet: Unknown parameter 'networks'</screen></para>
971 <para>A <literal>'Refusing connection - no matching
972 NID'</literal> message generally points to an error in the LNet
973 module configuration.</para>
976 <title><indexterm><primary>LNet</primary><secondary>comments</secondary></indexterm>Including
978 <para><emphasis>Place the semicolon terminating a comment
979 immediately after the comment.</emphasis> LNet silently ignores
980 everything between the <literal>#</literal> character at the
981 beginning of the comment and the next semicolon.</para>
982 <para>In this <emphasis>incorrect</emphasis> example, LNet silently
983 ignores <literal>pt11 192.168.0.[92,96]</literal>, resulting in
984 these nodes not being properly initialized. No error message is
<screen>options lnet ip2nets="pt10 192.168.0.[89,93]; # comment with semicolon BEFORE comment \
pt11 192.168.0.[92,96];</screen>
988 <para>This <emphasis role="italic">correct</emphasis> example shows
989 the required syntax: </para>
990 <para><screen>options lnet ip2nets="pt10 192.168.0.[89,93] \
991 # comment with semicolon AFTER comment; \
992 pt11 192.168.0.[92,96] # comment</screen></para>
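<para>The effect of semicolon placement can be reproduced with a small model of the rule described above, in which everything from a <literal>#</literal> up to the next semicolon is discarded. This Python sketch is an approximation of the documented behavior, not the LNet parser itself, and the function name <literal>effective_entries</literal> is hypothetical. Line continuations are already joined in the sample strings.</para>

```python
import re


def effective_entries(option_value):
    """Model the documented rule: text from '#' up to the next ';' is ignored."""
    without_comments = re.sub(r"#[^;]*", "", option_value)
    entries = [entry.strip() for entry in without_comments.split(";")]
    return [entry for entry in entries if entry]


incorrect = ("pt10 192.168.0.[89,93]; # comment with semicolon BEFORE comment "
             "pt11 192.168.0.[92,96];")
correct = ("pt10 192.168.0.[89,93] # comment with semicolon AFTER comment; "
           "pt11 192.168.0.[92,96] # comment")

print(effective_entries(incorrect))
print(effective_entries(correct))
```

<para>Under this model the first call returns only the <literal>pt10</literal> entry, because the <literal>pt11</literal> entry is swallowed by the comment, while the second call returns both entries.</para>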
<para><emphasis role="italic">Do not add an excessive number of
comments.</emphasis> The Linux kernel limits the length of character
strings used in module options (usually to 1 KB, but this may differ
between vendor kernels). If you exceed this limit, errors result and
the specified configuration may not be processed correctly.</para>
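<para>A simple length check before deploying a configuration can catch an option string that has grown too long. The sketch below assumes a limit of 1024 bytes, taken from the "usually 1 KB" figure above; the actual limit depends on the kernel, and both the constant and the function name <literal>fits_module_option_limit</literal> are hypothetical.</para>

```python
# Assumed limit based on the "usually 1 KB" figure; verify for your kernel.
ASSUMED_LIMIT_BYTES = 1024


def fits_module_option_limit(option_value, limit=ASSUMED_LIMIT_BYTES):
    """Return True if the option string is no longer than the assumed limit."""
    size = len(option_value.encode("utf-8"))
    return limit >= size


short_value = "pt10 192.168.0.[89,93]; pt11 192.168.0.[92,96]"
long_value = "pt10 192.168.0.[89,93] # " + "x" * 2000 + ";"

print(fits_module_option_limit(short_value))
print(fits_module_option_limit(long_value))
```

<para>Comments count toward the length, so trimming them is usually the easiest way to bring an oversized string back under the limit.</para>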