1 <?xml version='1.0' encoding='UTF-8'?>
2 <!-- This document was created with Syntext Serna Free. --><chapter xmlns="http://docbook.org/ns/docbook" xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US" xml:id="configuringlnet">
3 <title xml:id="configuringlnet.title">Configuring Lustre Networking (LNet)</title>
4 <para>This chapter describes how to configure Lustre Networking (LNet). It includes the following sections:</para>
7 <para><xref linkend="dbdoclet.50438216_15201"/>
11 <para><xref linkend="dbdoclet.50438216_33148"/>
15 <para><xref linkend="dbdoclet.50438216_46279"/>
19 <para><xref linkend="dbdoclet.50438216_31414"/>
23 <para><xref linkend="dbdoclet.50438216_71227"/>
27 <para><xref linkend="dbdoclet.50438216_10523"/>
31 <para><xref linkend="dbdoclet.50438216_35668"/>
35 <para><xref linkend="dbdoclet.50438216_15200"/>
40 <para>Configuring LNet is optional.</para>
<para>LNet will use the first TCP/IP interface it discovers on a
system (<literal>eth0</literal>) if it is loaded using the
<literal>lctl network up</literal> command. If this network configuration is
sufficient, you do not need to configure LNet. LNet configuration is
required if you are using InfiniBand or multiple Ethernet
interfaces.</para>
<para condition='l27'>The <literal>lnetctl</literal> utility can be used
to initialize LNet without bringing up any network interfaces. This
gives the user the flexibility to add interfaces after LNet has been
loaded.</para>
<para condition='l27'>DLC also introduces a C-API to enable
configuring LNet programmatically. See <xref
linkend="lnetconfigurationapi"/>.</para>
55 <section xml:id="dbdoclet.50438216_15201" condition='l27'>
57 <primary>LNet</primary>
58 <secondary>Configuring LNet</secondary>
59 </indexterm>Configuring LNet via <literal>lnetctl</literal></title>
<para>The <literal>lnetctl</literal> utility can be used to initialize
and configure the LNet kernel module after it has been loaded via
<literal>modprobe</literal>. In general, the <literal>lnetctl</literal>
command format is as follows:</para>
64 <screen>lnetctl cmd subcmd [options]</screen>
65 <para>The following configuration items are managed by the tool:</para>
69 <para>Configuring/unconfiguring LNet</para>
72 <para>Adding/removing/showing Networks</para>
75 <para>Adding/removing/showing Routes</para>
78 <para>Enabling/Disabling routing</para>
81 <para>Configuring Router Buffer Pools</para>
87 <primary>LNet</primary>
88 <secondary>cli</secondary>
89 </indexterm>Configuring LNet</title>
<para>After LNet has been loaded via <literal>modprobe</literal>, the
<literal>lnetctl</literal> utility can be used to configure LNet
without bringing up the networks that are specified in the module
parameters. It can also be used to configure the network interfaces
specified in the module parameters by providing the
<literal>--all</literal> option.</para>
96 <screen>lnetctl lnet configure [--all]
97 # --all: load NI configuration from module parameters</screen>
98 <para>The <literal>lnetctl</literal> utility can also be used to
99 unconfigure LNet.</para>
100 <screen>lnetctl lnet unconfigure</screen>
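<para>As a minimal sketch of how these commands fit together, the
sequence below loads the module, initializes LNet without any of the
networks from the module parameters, and then adds one network by hand.
The network name <literal>tcp0</literal> and the interface
<literal>eth0</literal> are placeholders chosen for illustration;
substitute the values that apply to your node.</para>
<screen># load the LNet kernel module
modprobe lnet
# initialize LNet without the networks from the module parameters
lnetctl lnet configure
# add a network on a chosen interface, then confirm that it is present
lnetctl net add --net tcp0 --if eth0
lnetctl net show --net tcp0</screen>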
102 <section xml:id="dbdoclet.lnetaddshowdelete">
103 <title><indexterm><primary>LNet</primary>
104 <secondary>cli</secondary></indexterm>Adding, Deleting and Showing
106 <para>Networks can be added, deleted, or shown after the LNet kernel
107 module is loaded.</para>
108 <para>The <emphasis role="bold"><literal>lnetctl net add</literal>
109 </emphasis> command is used to add networks:</para>
110 <screen>lnetctl net add: add a network
111 --net: net name (ex tcp0)
112 --if: physical interface (ex eth0)
113 --peer_timeout: time to wait before declaring a peer dead
114 --peer_credits: defines the max number of inflight messages
115 --peer_buffer_credits: the number of buffer credits per peer
116 --credits: Network Interface credits
117 --cpts: CPU Partitions configured net uses
118 --help: display this help text
121 lnetctl net add --net tcp2 --if eth0
122 --peer_timeout 180 --peer_credits 8</screen>
123 <note condition='l2A'><para>With the addition of Software based Multi-Rail
124 in Lustre 2.10, the following should be noted:</para>
126 <listitem><para>--net: no longer needs to be unique since multiple
127 interfaces can be added to the same network.</para></listitem>
128 <listitem><para>--if: The same interface per network can be added
129 only once, however, more than one interface can now be specified
130 (separated by a comma) for a node. For example: eth0,eth1,eth2.
132 </itemizedlist></para>
133 <para>For examples on adding multiple interfaces via
134 <literal>lnetctl net add</literal> and/or YAML, please see
135 <xref linkend="dbdoclet.mrconfiguring" />
138 <para>Networks can be deleted with the
139 <emphasis role="bold"><literal>lnetctl net del</literal></emphasis>
141 <screen>net del: delete a network
142 --net: net name (ex tcp0)
--if: physical interface (e.g. eth0)
146 lnetctl net del --net tcp2</screen>
147 <note condition='l2A'><para>In a Software Multi-Rail configuration,
148 specifying only the <literal>--net</literal> argument will delete the
149 entire network and all interfaces under it. The new
150 <literal>--if</literal> switch should also be used in conjunction with
151 <literal>--net</literal> to specify deletion of a specific interface.
153 <para>All or a subset of the configured networks can be shown with the
154 <emphasis role="bold"><literal>lnetctl net show</literal></emphasis>
155 command. The output can be non-verbose or verbose.</para>
156 <screen>net show: show networks
157 --net: net name (ex tcp0) to filter on
158 --verbose: display detailed output per network
162 lnetctl net show --verbose
163 lnetctl net show --net tcp2 --verbose</screen>
164 <para>Below are examples of non-detailed and detailed network
165 configuration show.</para>
166 <screen># non-detailed show
167 > lnetctl net show --net tcp2
169 - nid: 192.168.205.130@tcp2
175 > lnetctl net show --net tcp2 --verbose
177 - nid: 192.168.205.130@tcp2
184 peer_buffer_credits: 0
185 credits: 256</screen>
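<para condition='l2A'>The following sketch illustrates the Multi-Rail
notes above. It assumes Lustre 2.10 or later and a node with two
interfaces, here given the hypothetical names <literal>eth0</literal>
and <literal>eth1</literal>. Only options listed in the help text above
are used.</para>
<screen condition='l2A'># add two interfaces to the same network in one command
lnetctl net add --net tcp2 --if eth0,eth1
# remove only eth1, leaving tcp2 configured on eth0
lnetctl net del --net tcp2 --if eth1
# remove the whole network and every interface still under it
lnetctl net del --net tcp2</screen>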
187 <section condition='l2A'>
189 <primary>LNet</primary>
190 <secondary>cli</secondary>
191 </indexterm>Adding, Deleting and Showing Peers</title>
192 <para>The <emphasis role="bold"><literal>lnetctl peer add</literal>
193 </emphasis> command is used to add a remote peer to a software
194 multi-rail configuration.</para>
<para>When configuring peers, use the <literal>--prim_nid</literal>
option to specify the key or primary NID of the peer node. Then
follow that with the <literal>--nid</literal> option to specify a set
of comma-separated NIDs.</para>
199 <screen>peer add: add a peer
200 --prim_nid: primary NID of the peer
201 --nid: comma separated list of peer nids (e.g. 10.1.1.2@tcp0)
--non_mr: if specified this interface is created as a non multi-rail
capable peer. Only one NID can be specified in this case.</screen>
204 <para>For example:</para>
206 lnetctl peer add --prim_nid 10.10.10.2@tcp --nid 10.10.3.3@tcp1,10.4.4.5@tcp2
<para>The <literal>--prim_nid</literal> (primary NID for the peer
node) can go unspecified. In this case, the first listed NID in the
<literal>--nid</literal> option becomes the primary NID of the peer.
lnetctl peer add --nid 10.10.10.2@tcp,10.10.3.3@tcp1,10.4.4.5@tcp2</screen>
214 <para>YAML can also be used to configure peers:</para>
216 - primary nid: <key or primary nid>
221 - nid: <nid n></screen>
222 <para>As with all other commands, the result of the
223 <literal>lnetctl peer show</literal> command can be used to gather
224 information to aid in configuring or deleting a peer:</para>
225 <screen>lnetctl peer show -v</screen>
226 <para>Example output from the <literal>lnetctl peer show</literal>
229 - primary nid: 192.168.122.218@tcp
232 - nid: 192.168.122.218@tcp
235 available_tx_credits: 8
236 available_rtr_credits: 8
243 - nid: 192.168.122.78@tcp
246 available_tx_credits: 8
247 available_rtr_credits: 8
254 - nid: 192.168.122.96@tcp
257 available_tx_credits: 8
258 available_rtr_credits: 8
265 <para>Use the following <literal>lnetctl</literal> command to delete a
267 <screen>peer del: delete a peer
268 --prim_nid: Primary NID of the peer
269 --nid: comma separated list of peer nids (e.g. 10.1.1.2@tcp0)</screen>
270 <para><literal>prim_nid</literal> should always be specified. The
271 <literal>prim_nid</literal> identifies the peer. If the
272 <literal>prim_nid</literal> is the only one specified, then the entire
273 peer is deleted.</para>
274 <para>Example of deleting a single nid of a peer (10.10.10.3@tcp):
276 <screen>lnetctl peer del --prim_nid 10.10.10.2@tcp --nid 10.10.10.3@tcp</screen>
277 <para>Example of deleting the entire peer:</para>
278 <screen>lnetctl peer del --prim_nid 10.10.10.2@tcp</screen>
283 <primary>LNet</primary>
284 <secondary>cli</secondary>
</indexterm>Adding, Deleting and Showing Routes</title>
286 <para>A set of routes can be added to identify how LNet messages are
288 <screen>lnetctl route add: add a route
--net: net name (ex tcp0) LNet message is destined to.
This cannot be a local network.
--gateway: gateway node nid (ex 10.1.1.2@tcp) to route
all LNet messages destined for the identified
294 --hop: number of hops to final destination
295 (1 < hops < 255)
296 --priority: priority of route (0 - highest prio)
lnetctl route add --net tcp2 --gateway 192.168.205.130@tcp1 --hop 2 --priority 1</screen>
300 <para>Routes can be deleted via the following <literal>lnetctl</literal> command.</para>
301 <screen>lnetctl route del: delete a route
302 --net: net name (ex tcp0)
303 --gateway: gateway nid (ex 10.1.1.2@tcp)
306 lnetctl route del --net tcp2 --gateway 192.168.205.130@tcp1</screen>
307 <para>Configured routes can be shown via the following
308 <literal>lnetctl</literal> command.</para>
309 <screen>lnetctl route show: show routes
310 --net: net name (ex tcp0) to filter on
311 --gateway: gateway nid (ex 10.1.1.2@tcp) to filter on
312 --hop: number of hops to final destination
313 (1 < hops < 255) to filter on
314 --priority: priority of route (0 - highest prio)
316 --verbose: display detailed output per route
323 lnetctl route show --verbose</screen>
324 <para>When showing routes the <literal>--verbose</literal> option
325 outputs more detailed information. All show and error output are in
326 YAML format. Below are examples of both non-detailed and detailed
327 route show output.</para>
328 <screen>#Non-detailed output
332 gateway: 192.168.205.130@tcp1
335 > lnetctl route show --verbose
338 gateway: 192.168.205.130@tcp1
345 <primary>LNet</primary>
346 <secondary>cli</secondary>
347 </indexterm>Enabling and Disabling Routing</title>
348 <para>When an LNet node is configured as a router it will route LNet
349 messages not destined to itself. This feature can be enabled or
350 disabled as follows.</para>
351 <screen>lnetctl set routing [0 | 1]
352 # 0 - disable routing feature
353 # 1 - enable routing feature</screen>
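<para>For instance, a node that is to act as a router could have
routing switched on and the result checked as sketched below. Both
commands are the ones described in this chapter; the output of the
second command depends on the node and is not shown here.</para>
<screen># enable the routing feature on this node
lnetctl set routing 1
# confirm that the routing buffers have been allocated
lnetctl routing show</screen>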
357 <primary>LNet</primary>
358 <secondary>cli</secondary>
359 </indexterm>Showing routing information</title>
360 <para>When routing is enabled on a node, the tiny, small and large
361 routing buffers are allocated. See <xref
362 linkend="dbdoclet.50438272_73839"/> for more details on router
363 buffers. This information can be shown as follows:</para>
364 <screen>lnetctl routing show: show routing information
367 lnetctl routing show</screen>
368 <para>An example of the show output:</para>
369 <screen>> lnetctl routing show
391 <primary>LNet</primary>
392 <secondary>cli</secondary>
393 </indexterm>Configuring Routing Buffers</title>
<para>The configured routing buffer values specify the number of
buffers in each of the tiny, small and large groups.</para>
<para>It is often desirable to configure the tiny, small and large
routing buffers to values other than the defaults. These values
are global; when set, they are used by all configured CPU
partitions. If routing is enabled, the values set take effect
immediately. If a larger number of buffers is specified,
buffers are allocated to satisfy the configuration change. If fewer
buffers are configured, the excess buffers are freed as they
become unused. If routing is not enabled, the values are not changed.
404 The buffer values are reset to default if routing is turned off and
406 <para>The <literal>lnetctl</literal> 'set' command can be
407 used to set these buffer values. A VALUE greater than 0
408 will set the number of buffers accordingly. A VALUE of 0
409 will reset the number of buffers to system defaults.</para>
410 <screen>set tiny_buffers:
411 set tiny routing buffers
412 VALUE must be greater than or equal to 0
414 set small_buffers: set small routing buffers
415 VALUE must be greater than or equal to 0
417 set large_buffers: set large routing buffers
418 VALUE must be greater than or equal to 0</screen>
419 <para>Usage examples:</para>
420 <screen>> lnetctl set tiny_buffers 4096
421 > lnetctl set small_buffers 8192
422 > lnetctl set large_buffers 2048</screen>
423 <para>The buffers can be set back to the default values as follows:</para>
424 <screen>> lnetctl set tiny_buffers 0
425 > lnetctl set small_buffers 0
426 > lnetctl set large_buffers 0</screen>
430 <primary>LNet</primary>
431 <secondary>cli</secondary>
432 </indexterm>Importing YAML Configuration File</title>
<para>Configuration can be described in YAML format and fed
into the <literal>lnetctl</literal> utility. The
<literal>lnetctl</literal> utility parses the YAML file and performs
the specified operation on all entities described therein. If no
operation is specified in the command, as shown below, the default
operation is 'add'. The YAML syntax is described in a later
section.</para>
<screen>lnetctl import FILE.yaml
440 lnetctl import < FILE.yaml</screen>
441 <para>The '<literal>lnetctl</literal> import' command provides three
442 optional parameters to define the operation to be performed on the
443 configuration items described in the YAML file.</para>
444 <screen># if no options are given to the command the "add" command is assumed
446 lnetctl import --add FILE.yaml
447 lnetctl import --add < FILE.yaml
449 # to delete all items described in the YAML file
450 lnetctl import --del FILE.yaml
451 lnetctl import --del < FILE.yaml
453 # to show all items described in the YAML file
454 lnetctl import --show FILE.yaml
455 lnetctl import --show < FILE.yaml</screen>
459 <primary>LNet</primary>
460 <secondary>cli</secondary>
461 </indexterm>Exporting Configuration in YAML format</title>
<para>The <literal>lnetctl</literal> utility provides the 'export'
command to dump the current LNet configuration in YAML format:</para>
464 <screen>lnetctl export FILE.yaml
465 lnetctl export > FILE.yaml</screen>
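<para>One way the export and import commands can be combined is to save
the running configuration and restore it after LNet has been
reconfigured. The sketch below uses a hypothetical file name,
<literal>lnet-saved.yaml</literal>, and assumes that the exported file is
acceptable as input to the import command without further editing.</para>
<screen># save the running configuration
lnetctl export lnet-saved.yaml
# tear LNet down and bring it back up with no networks
lnetctl lnet unconfigure
lnetctl lnet configure
# re-add everything described in the saved file (default operation is add)
lnetctl import lnet-saved.yaml</screen>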
469 <primary>LNet</primary>
470 <secondary>cli</secondary>
471 </indexterm>Showing LNet Traffic Statistics</title>
<para>The <literal>lnetctl</literal> utility can dump the LNet traffic
statistics as follows:</para>
474 <screen>lnetctl stats show</screen>
478 <primary>LNet</primary>
479 <secondary>yaml syntax</secondary>
480 </indexterm>YAML Syntax</title>
<para>The <literal>lnetctl</literal> utility can take in a YAML file
describing the configuration items that need to be operated on and
perform one of the following operations on the items described
therein: add, delete or show.</para>
<para>Net, routing and route YAML blocks are all defined as a YAML
sequence, as shown in the following sections. The stats YAML block
is a YAML object. Each sequence item can take a seq_no field. This
seq_no field is returned in the error block, which allows the caller
to associate the error with the item that caused it. The
<literal>lnetctl</literal> utility makes a best effort to configure the
items defined in the YAML file. It does not stop processing the file
at the first error.</para>
493 <para>Below is the YAML syntax describing the various
494 configuration elements which can be operated on via DLC. Not all
495 YAML elements are required for all operations (add/delete/show).
496 The system ignores elements which are not pertinent to the requested
500 <primary>LNet</primary>
501 <secondary>network yaml syntax</secondary>
502 </indexterm>Network Configuration</title>
505 - net: <network. Ex: tcp or o2ib>
507 0: <physical interface>
508 detail: <This is only applicable for show command. 1 - output detailed info. 0 - basic output>
peer_timeout: <Integer. Timeout before considering a peer dead>
511 peer_credits: <Integer. Transmit credits for a peer>
512 peer_buffer_credits: <Integer. Credits available for receiving messages>
513 credits: <Integer. Network Interface credits>
SMP: <An array of integers of the form: "[x,y,...]", where each
integer represents the CPT to associate the network interface
with>
seq_no: <integer. Optional. User generated, and is
passed back in the YAML error block></screen>
<para>Neither the seq_no nor the detail field appears in the show output.</para>
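<para>As an illustration only, a filled-in network block for an add
operation might look like the sketch below. The values mirror the
<literal>lnetctl net add</literal> example earlier in this chapter. The
enclosing <literal>net</literal>, <literal>interfaces</literal> and
<literal>tunables</literal> keys are assumptions about the block layout
and should be checked against the output of
<literal>lnetctl export</literal> on your release.</para>
<screen>net:
    - net: tcp2
      interfaces:
          0: eth0
      tunables:
          peer_timeout: 180
          peer_credits: 8
      seq_no: 1</screen>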
522 <primary>LNet</primary>
523 <secondary>buffer yaml syntax</secondary>
524 </indexterm>Enable Routing and Adjust Router Buffer Configuration</title>
527 - tiny: <Integer. Tiny buffers>
528 small: <Integer. Small buffers>
529 large: <Integer. Large buffers>
530 enable: <0 - disable routing. 1 - enable routing>
531 seq_no: <Integer. Optional. User generated, and is passed back in the YAML error block></screen>
<para>The seq_no field does not appear in the show output.</para>
536 <primary>LNet</primary>
537 <secondary>statistics yaml syntax</secondary>
538 </indexterm>Show Statistics</title>
541 seq_no: <Integer. Optional. User generated, and is passed back in the YAML error block></screen>
<para>The seq_no field does not appear in the show output.</para>
546 <primary>LNet</primary>
547 <secondary>router yaml syntax</secondary>
548 </indexterm>Route Configuration</title>
551 - net: <network. Ex: tcp or o2ib>
552 gateway: <nid of the gateway in the form <ip>@<net>: Ex: 192.168.29.1@tcp>
553 hop: <an integer between 1 and 255. Optional>
detail: <This is only applicable for show commands. 1 - output detailed info. 0 - basic output>
555 seq_no: <integer. Optional. User generated, and is passed back in the YAML error block></screen>
<para>Neither the seq_no nor the detail field appears in the show output.</para>
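<para>A filled-in route block, sketched here with the same values as the
<literal>lnetctl route add</literal> example earlier in this chapter,
might look as follows. The enclosing <literal>route</literal> key is an
assumption about the block layout; the remaining fields are the ones
listed above.</para>
<screen>route:
    - net: tcp2
      gateway: 192.168.205.130@tcp1
      hop: 2
      seq_no: 1</screen>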
560 <section xml:id="dbdoclet.50438216_33148">
561 <title><indexterm><primary>LNet</primary></indexterm>
563 Overview of LNet Module Parameters</title>
564 <para>LNet kernel module (lnet) parameters specify how LNet is to be
565 configured to work with Lustre, including which NICs will be
566 configured to work with Lustre and the routing to be used with
568 <para>Parameters for LNet can be specified in the
569 <literal>/etc/modprobe.d/lustre.conf</literal> file. In some cases
570 the parameters may have been stored in
571 <literal>/etc/modprobe.conf</literal>, but this has been deprecated
572 since before RHEL5 and SLES10, and having a separate
573 <literal>/etc/modprobe.d/lustre.conf</literal> file simplifies
574 administration and distribution of the Lustre networking
575 configuration. This file contains one or more entries with the
577 <screen>options lnet <replaceable>parameter</replaceable>=<replaceable>value</replaceable></screen>
578 <para>To specify the network interfaces that are to be used for
579 Lustre, set either the <literal>networks</literal> parameter or the
580 <literal>ip2nets</literal> parameter (only one of these parameters can
581 be used at a time):</para>
584 <para><literal>networks</literal> - Specifies the networks to be used.</para>
587 <para><literal>ip2nets</literal> - Lists globally-available
588 networks, each with a range of IP addresses. LNet then identifies
589 locally-available networks through address list-matching
593 <para>See <xref linkend="dbdoclet.50438216_46279"/> and <xref linkend="dbdoclet.50438216_31414"/> for more details.</para>
594 <para>To set up routing between networks, use:</para>
597 <para><literal>routes</literal> - Lists networks and the NIDs of routers that forward to them.</para>
600 <para>See <xref linkend="dbdoclet.50438216_71227"/> for more details.</para>
601 <para>A <literal>router</literal> checker can be configured to enable
602 Lustre nodes to detect router health status, avoid routers that appear
603 dead, and reuse those that restore service after failures. See <xref
604 linkend="dbdoclet.50438216_35668"/> for more details.</para>
605 <para>For a complete reference to the LNet module parameters, see
606 <emphasis><xref linkend="configurationfilesmoduleparameters"/>LNet
607 Options</emphasis>.</para>
609 <para>We recommend that you use 'dotted-quad' notation for
610 IP addresses rather than host names to make it easier to read debug
611 logs and debug configurations with multiple interfaces.</para>
614 <title><indexterm><primary>LNet</primary><secondary>using
615 NID</secondary></indexterm>Using a Lustre Network Identifier (NID)
616 to Identify a Node</title>
617 <para>A Lustre network identifier (NID) is used to uniquely identify
618 a Lustre network endpoint by node ID and network type. The format of
620 <screen><replaceable>network_id</replaceable>@<replaceable>network_type</replaceable></screen>
621 <para>Examples are:</para>
622 <screen>10.67.73.200@tcp0
623 10.67.75.100@o2ib</screen>
624 <para>The first entry above identifies a TCP/IP node, while the
625 second entry identifies an InfiniBand node.</para>
626 <para>When a mount command is run on a client, the client uses the
627 NID of the MDS to retrieve configuration information. If an MDS has
628 more than one NID, the client should use the appropriate NID for its
629 local network.</para>
<para>To determine the appropriate NID to specify in
the mount command, use the <literal>lctl</literal> command. To
display MDS NIDs, run on the MDS:</para>
633 <screen>lctl list_nids</screen>
634 <para>To determine if a client can reach the MDS using a particular NID, run on the client:</para>
635 <screen>lctl which_nid <replaceable>MDS_NID</replaceable></screen>
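<para>Using the two example NIDs shown earlier in this section, the
check might be carried out as sketched below. Passing more than one NID
to <literal>lctl which_nid</literal> in a single invocation is an
assumption here; if your release does not accept a list, run the command
once per NID.</para>
<screen># on the MDS: list the NIDs configured on this node
lctl list_nids
# on the client: check which of the MDS NIDs can be used from here
lctl which_nid 10.67.73.200@tcp0 10.67.75.100@o2ib</screen>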
638 <section xml:id="dbdoclet.50438216_46279">
639 <title><indexterm><primary>LNet</primary><secondary>module parameters</secondary></indexterm>Setting the LNet Module networks Parameter</title>
640 <para>If a node has more than one network interface, you'll
641 typically want to dedicate a specific interface to Lustre. You can do
642 this by including an entry in the <literal>lustre.conf</literal> file
643 on the node that sets the LNet module <literal>networks</literal>
645 <screen>options lnet networks=<replaceable>comma-separated list of
646 networks</replaceable></screen>
647 <para>This example specifies that a Lustre node will use a TCP/IP
648 interface and an InfiniBand interface:</para>
649 <screen>options lnet networks=tcp0(eth0),o2ib(ib0)</screen>
650 <para>This example specifies that the Lustre node will use the TCP/IP interface <literal>eth1</literal>:</para>
651 <screen>options lnet networks=tcp0(eth1)</screen>
652 <para>Depending on the network design, it may be necessary to specify
653 explicit interfaces. To explicitly specify that interface
654 <literal>eth2</literal> be used for network <literal>tcp0</literal>
and <literal>eth3</literal> be used for <literal>tcp1</literal>, use
657 <screen>options lnet networks=tcp0(eth2),tcp1(eth3)</screen>
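<para>After editing <literal>lustre.conf</literal>, one way to confirm
that the <literal>networks</literal> setting was picked up is to load
LNet and list the local NIDs, as in the sketch below. It uses only the
<literal>lctl</literal> commands mentioned in this chapter and assumes
that LNet is not already loaded on the node.</para>
<screen># load LNet with the options from lustre.conf and bring the networks up
modprobe lnet
lctl network up
# each network named in the networks parameter should have a NID listed
lctl list_nids</screen>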
658 <para>When more than one interface is available during the network
659 setup, Lustre chooses the best route based on the hop count. Once the
660 network connection is established, Lustre expects the network to stay
661 connected. In a Lustre network, connections do not fail over to
662 another interface, even if multiple interfaces are available on the
665 <para>LNet lines in <literal>lustre.conf</literal> are only used by
666 the local node to determine what to call its interfaces. They are
667 not used for routing decisions.</para>
670 <title><indexterm><primary>configuring</primary><secondary>multihome</secondary></indexterm>Multihome Server Example</title>
<para>If a server with multiple IP addresses (multihome server) is
connected to a Lustre network, certain configuration settings are
required. An example illustrating these settings consists of a
network with the following nodes:</para>
677 <para> Server svr1 with three TCP NICs (<literal>eth0</literal>,
678 <literal>eth1</literal>, and <literal>eth2</literal>) and an
679 InfiniBand NIC.</para>
682 <para> Server svr2 with three TCP NICs (<literal>eth0</literal>,
683 <literal>eth1</literal>, and <literal>eth2</literal>) and an
684 InfiniBand NIC. Interface eth2 will not be used for Lustre
688 <para> TCP clients, each with a single TCP interface.</para>
<para> InfiniBand clients, each with a single InfiniBand
interface and a TCP/IP interface for administration.</para>
695 <para>To set the <literal>networks</literal> option for this example:</para>
698 <para> On each server, <literal>svr1</literal> and
699 <literal>svr2</literal>, include the following line in the
700 <literal>lustre.conf</literal> file:</para>
703 <screen>options lnet networks=tcp0(eth0),tcp1(eth1),o2ib</screen>
706 <para> For TCP-only clients, the first available non-loopback IP
707 interface is used for <literal>tcp0</literal>. Thus, TCP clients
708 with only one interface do not need to have options defined in
709 the <literal>lustre.conf</literal> file.</para>
712 <para> On the InfiniBand clients, include the following line in
713 the <literal>lustre.conf</literal> file:</para>
716 <screen>options lnet networks=o2ib</screen>
718 <para>By default, Lustre ignores the loopback
719 (<literal>lo0</literal>) interface. Lustre does not ignore IP
720 addresses aliased to the loopback. If you alias IP addresses to
721 the loopback interface, you must specify all Lustre networks using
722 the LNet networks parameter.</para>
725 <para>If the server has multiple interfaces on the same subnet,
726 the Linux kernel will send all traffic using the first configured
727 interface. This is a limitation of Linux, not Lustre. In this
728 case, network interface bonding should be used. For more
729 information about network interface bonding, see <xref
730 linkend="settingupbonding"/>.</para>
734 <section xml:id="dbdoclet.50438216_31414">
735 <title><indexterm><primary>LNet</primary><secondary>ip2nets</secondary></indexterm>Setting the LNet Module ip2nets Parameter</title>
736 <para>The <literal>ip2nets</literal> option is typically used when a
737 single, universal <literal>lustre.conf</literal> file is run on all
738 servers and clients. Each node identifies the locally available
739 networks based on the listed IP address patterns that match the
740 node's local IP addresses.</para>
741 <para>Note that the IP address patterns listed in the
742 <literal>ip2nets</literal> option are <emphasis>only</emphasis> used
743 to identify the networks that an individual node should instantiate.
744 They are <emphasis>not</emphasis> used by LNet for any other
745 communications purpose.</para>
746 <para>For the example below, the nodes in the network have these IP
750 <para> Server svr1: <literal>eth0</literal> IP address
<literal>192.168.0.2</literal>, IP over InfiniBand
752 (<literal>o2ib</literal>) address
753 <literal>132.6.1.2</literal>.</para>
756 <para> Server svr2: <literal>eth0</literal> IP address
<literal>192.168.0.4</literal>, IP over InfiniBand
758 (<literal>o2ib</literal>) address
759 <literal>132.6.1.4</literal>.</para>
<para> TCP clients have IP addresses
<literal>192.168.0.5-255</literal>.</para>
<para> InfiniBand clients have IP over InfiniBand
767 (<literal>o2ib</literal>) addresses <literal>132.6.[2-3].2, .4,
768 .6, .8</literal>.</para>
771 <para>The following entry is placed in the
772 <literal>lustre.conf</literal> file on each server and client:</para>
773 <screen>options lnet 'ip2nets="tcp0(eth0) 192.168.0.[2,4]; \
774 tcp0 192.168.0.*; o2ib0 132.6.[1-3].[2-8/2]"'</screen>
775 <para>Each entry in <literal>ip2nets</literal> is referred to as a 'rule'.</para>
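<para>To make the matching concrete, the listing below walks the example
addresses through the three rules in the entry above. It is a worked
illustration, not tool output; the two client addresses are
representative picks from the ranges given for this example.</para>
<screen># node        local address   first matching rule           network instantiated
svr1          192.168.0.2     tcp0(eth0) 192.168.0.[2,4]    tcp0 on eth0
svr1          132.6.1.2       o2ib0 132.6.[1-3].[2-8/2]     o2ib0
svr2          192.168.0.4     tcp0(eth0) 192.168.0.[2,4]    tcp0 on eth0
svr2          132.6.1.4       o2ib0 132.6.[1-3].[2-8/2]     o2ib0
TCP client    192.168.0.5     tcp0 192.168.0.*              tcp0
IB client     132.6.2.4       o2ib0 132.6.[1-3].[2-8/2]     o2ib0</screen>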
776 <para>The order of LNet entries is important when configuring servers.
777 If a server node can be reached using more than one network, the first
778 network specified in <literal>lustre.conf</literal> will be
780 <para>Because <literal>svr1</literal> and <literal>svr2</literal>
781 match the first rule, LNet uses <literal>eth0</literal> for
782 <literal>tcp0</literal> on those machines. (Although
783 <literal>svr1</literal> and <literal>svr2</literal> also match the
784 second rule, the first matching rule for a particular network is
786 <para>The <literal>[2-8/2]</literal> format indicates a range of 2-8
787 stepped by 2; that is 2,4,6,8. Thus, the clients at
788 <literal>132.6.3.5</literal> will not find a matching o2ib
790 <note condition='l2A'>
791 <para>Multi-rail deprecates the kernel parsing of ip2nets. ip2nets
792 patterns are matched in user space and translated into Network interfaces
793 to be added into the system.</para>
794 <para>The first interface that matches the IP pattern will be used when
795 adding a network interface.</para>
796 <para>If an interface is explicitly specified as well as a pattern, the
797 interface matched using the IP pattern will be sanitized against the
798 explicitly-defined interface.</para>
<para>For example, given the rule <literal>tcp(eth0) 192.168.*.3</literal>
on a system where <literal>eth0 == 192.158.19.3</literal> and
<literal>eth1 == 192.168.3.3</literal>, the configuration will fail
because the pattern contradicts the interface specified.</para>
803 <para>A clear warning will be displayed if inconsistent configuration is
<para>You can use the following command to configure ip2nets:</para>
806 <screen>lnetctl import < ip2nets.yaml</screen>
807 <para>For example:</para>
820 0: 192.168.*.*</screen>
823 <section xml:id="dbdoclet.50438216_71227">
824 <title><indexterm><primary>LNet</primary><secondary>routes</secondary></indexterm>Setting
825 the LNet Module routes Parameter</title>
<para>The LNet module routes parameter is used to identify routers in
a Lustre configuration. This parameter is set in
<literal>lustre.conf</literal> on each Lustre node.</para>
829 <para>Routes are typically set to connect to segregated subnetworks
830 or to cross connect two different types of networks such as tcp and
<para>The LNet routes parameter specifies a semicolon-separated list of
router definitions. Each route is defined as a network,
followed by a list of routers:</para>
835 <screen>routes=<replaceable>net_type router_NID(s)</replaceable></screen>
836 <para>This example specifies bi-directional routing in which TCP
837 clients can reach Lustre resources on the IB networks and IB servers
838 can access the TCP networks:</para>
839 <screen>options lnet 'ip2nets="tcp0 192.168.0.*; \
840 o2ib0(ib0) 132.6.1.[1-128]"' 'routes="tcp0 132.6.1.[1-8]@o2ib0; \
o2ib0 192.168.0.[1-8]@tcp0"'</screen>
842 <para>All LNet routers that bridge two networks are equivalent. They
843 are not configured as primary or secondary, and the load is balanced
844 across all available routers.</para>
845 <para>The number of LNet routers is not limited. Enough routers should
846 be used to handle the required file serving bandwidth plus a 25
847 percent margin for headroom.</para>
849 <title><indexterm><primary>LNet</primary><secondary>routing
850 example</secondary></indexterm>Routing Example</title>
<para>On the clients, place the following entry in the
<literal>lustre.conf</literal> file:</para>
<screen>options lnet networks="tcp" routes="o2ib0 192.168.0.[1-8]@tcp0"</screen>
<para>On the router nodes, use:</para>
<screen>options lnet networks="tcp,o2ib" forwarding=enabled</screen>
<para>On the MDS, use the reverse as shown below:</para>
<screen>options lnet networks="o2ib0" routes="tcp0 132.6.1.[1-8]@o2ib0"</screen>
858 <para>To start the routers, run:</para>
859 <screen>modprobe lnet
860 lctl network configure</screen>
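<para>Once LNet is running, the routed path can be checked from a
client with standard <literal>lctl</literal> subcommands. The
following is a minimal sketch. The server NID
<literal>132.6.1.10@o2ib0</literal> is a hypothetical address chosen
for illustration, so substitute the NID of a real server on the
remote network:</para>
<screen># Show the NIDs configured on this node
lctl list_nids
# Show the routes that LNet has loaded
lctl route_list
# Ping a node on the far side of the routers (hypothetical NID)
lctl ping 132.6.1.10@o2ib0</screen>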
863 <section xml:id="dbdoclet.50438216_10523">
864 <title><indexterm><primary>LNet</primary><secondary>testing</secondary></indexterm>Testing
865 the LNet Configuration</title>
866 <para>After configuring Lustre Networking, it is highly recommended
867 that you test your LNet configuration using the LNet Self-Test
868 provided with the Lustre software. For more information about using
869 LNet Self-Test, see <xref linkend="lnetselftest"/>.</para>
871 <section xml:id="dbdoclet.50438216_35668">
872 <title><indexterm><primary>LNet</primary><secondary>route
873 checker</secondary></indexterm>Configuring the Router Checker</title>
874 <para>In a Lustre configuration in which different types of networks,
such as a TCP/IP network and an InfiniBand network, are connected by
876 routers, a router checker can be run on the clients and servers in the
877 routed configuration to monitor the status of the routers. In a
878 multi-hop routing configuration, router checkers can be configured on
879 routers to monitor the health of their next-hop routers.</para>
880 <para>A router checker is configured by setting LNet parameters in
881 <literal>lustre.conf</literal> by including an entry in this
884 <replaceable>router_checker_parameter</replaceable>=<replaceable>value</replaceable></screen>
885 <para>The router checker parameters are:</para>
888 <para><literal>live_router_check_interval</literal> - Specifies a
889 time interval in seconds after which the router checker will ping
890 the live routers. The default value is 0, meaning no checking is
891 done. To set the value to 60, enter:</para>
892 <screen>options lnet live_router_check_interval=60</screen>
895 <para><literal>dead_router_check_interval</literal> - Specifies a
896 time interval in seconds after which the router checker will check
897 for dead routers. The default value is 0, meaning no checking is
898 done. To set the value to 60, enter:</para>
899 <screen>options lnet dead_router_check_interval=60</screen>
<para><literal>auto_down</literal> - Enables (1) or disables (0) the
automatic marking of router state as up or down. The default value
is 1. To disable router marking, enter:</para>
905 <screen>options lnet auto_down=0</screen>
<para><literal>router_ping_timeout</literal> - Specifies the
timeout, in seconds, for the router checker when it checks live or
dead routers. The router checker sends a ping message to each dead
or live router once every
<literal>dead_router_check_interval</literal> or
<literal>live_router_check_interval</literal> seconds, respectively.
The default value is 50. To set the value to 60, enter:</para>
914 <screen>options lnet router_ping_timeout=60</screen>
916 <para>The <literal>router_ping_timeout</literal> is consistent
917 with the default LND timeouts. You may have to increase it on very
918 large clusters if the LND timeout is also increased. For larger
919 clusters, we suggest increasing the check interval.</para>
<para><literal>check_routers_before_use</literal> - Specifies
that routers are to be checked before use. This parameter is off by
default. If it is set to on, the
<literal>dead_router_check_interval</literal> parameter must be
given a positive integer value.</para>
928 <screen>options lnet check_routers_before_use=on</screen>
931 <para>The router checker obtains the following information from each router:</para>
934 <para> Time the router was disabled</para>
937 <para> Elapsed disable time</para>
940 <para>If the router checker does not get a reply message from the
941 router within router_ping_timeout seconds, it considers the router to
943 <para>If a router is marked 'up' and responds to a ping, the
944 timeout is reset.</para>
945 <para>If 100 packets have been sent successfully through a router, the
946 sent-packets counter for that router will have a value of 100.</para>
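<para>The router checker parameters can also be combined on a single
<literal>options</literal> line. The following sketch simply reuses
the example values given for the individual parameters in this
section. They are illustrative, not tuning recommendations:</para>
<screen>options lnet live_router_check_interval=60 \
    dead_router_check_interval=60 router_ping_timeout=60 \
    check_routers_before_use=on</screen>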
948 <section xml:id="dbdoclet.50438216_15200">
949 <title><indexterm><primary>LNet</primary><secondary>best
950 practice</secondary></indexterm>Best Practices for LNet
952 <para>For the <literal>networks</literal>, <literal>ip2nets</literal>,
953 and <literal>routes</literal> options, follow these best practices to
954 avoid configuration errors.</para>
956 <title><indexterm><primary>LNet</primary><secondary>escaping commas
957 with quotes</secondary></indexterm>Escaping commas with
959 <para>Depending on the Linux distribution, commas may need to be
960 escaped using single or double quotes. In the extreme case, the
961 <literal>options</literal> entry would look like this:</para>
<para><screen>options lnet 'networks="tcp0,elan0"' 'routes="tcp [2,10]@elan0"'</screen></para>
965 <para>Added quotes may confuse some distributions. Messages such as
966 the following may indicate an issue related to added quotes:</para>
967 <para><screen>lnet: Unknown parameter 'networks'</screen></para>
968 <para>A <literal>'Refusing connection - no matching
969 NID'</literal> message generally points to an error in the LNet
970 module configuration.</para>
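<para>One way to see how the quoting was interpreted, offered as a
sketch rather than a required procedure, is to print the options
that <literal>modprobe</literal> has parsed and then, once LNet is
up, list the resulting NIDs:</para>
<screen># Print the lnet options as modprobe parsed them
modprobe -c | grep 'options lnet'
# After LNet is started, confirm the expected NIDs are present
lctl list_nids</screen>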
973 <title><indexterm><primary>LNet</primary><secondary>comments</secondary></indexterm>Including
975 <para><emphasis>Place the semicolon terminating a comment
976 immediately after the comment.</emphasis> LNet silently ignores
977 everything between the <literal>#</literal> character at the
978 beginning of the comment and the next semicolon.</para>
979 <para>In this <emphasis>incorrect</emphasis> example, LNet silently
980 ignores <literal>pt11 192.168.0.[92,96]</literal>, resulting in
981 these nodes not being properly initialized. No error message is
<screen>options lnet ip2nets="pt10 192.168.0.[89,93]; # comment with semicolon BEFORE comment \
pt11 192.168.0.[92,96];</screen>
985 <para>This <emphasis role="italic">correct</emphasis> example shows
986 the required syntax: </para>
987 <para><screen>options lnet ip2nets="pt10 192.168.0.[89,93] \
988 # comment with semicolon AFTER comment; \
989 pt11 192.168.0.[92,96] # comment</screen></para>
990 <para><emphasis role="italic">Do not add an excessive number of
991 comments.</emphasis> The Linux kernel limits the length of character
992 strings used in module options (usually to 1KB, but this may differ
993 between vendor kernels). If you exceed this limit, errors result and
994 the specified configuration may not be processed correctly.</para>