1. This version of the GM NAL requires an unreleased extension to the GM API to
map physical memory: gm_register_memory_ex_phys(). This allows it to avoid
ENOMEM problems associated with large contiguous buffer allocation.

2. ./configure --with-gm=<path-to-gm-source-tree> \
               [--with-gm-install=<path-to-gm-installation>]

If the sources do not support gm_register_memory_ex_phys(), configure flags
an error. In this case, apply the patch, then rebuild and re-install GM as
directed in the error message.

By default GM is installed in /opt/gm. If an alternate path was specified to
<GM-sources>/binary/GM_INSTALL, you should also specify --with-gm-install.

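As an illustration, if the GM sources were unpacked under /usr/src/gm and GM
was installed somewhere other than /opt/gm (both paths here are hypothetical),
the configure step might look like this:

```shell
# Hypothetical paths -- substitute your own GM source and install locations.
./configure --with-gm=/usr/src/gm \
            --with-gm-install=/usr/local/gm
```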
3. The GM timeout is 300 seconds; i.e. the network may not release resources
claimed by communications stalled on a crashing node for up to this long.
The default gmnal buffer tuning parameters (see (4) below) have been chosen
to minimize this problem and prevent Lustre from having to block for
resources. However, in some situations where all network buffers are busy,
the default Lustre timeout (various, scaled from the base timeout of 100
seconds) may be too small, and the only solution may be to increase the
Lustre timeout.

4. The gmnal has the following module parameters...

gmnal_port             The GM port that the NAL will use (default 4).
                       Change this if it conflicts with site usage.

gmnal_ntx              The number of "normal" transmit descriptors (default
                       32). When this pool is exhausted, threads sending
                       and receiving on the network block until in-progress
                       transmits have completed. Each descriptor consumes 1
                       GM_MTU sized buffer.

gmnal_ntx_nblk         The number of "reserved" transmit descriptors
                       (default 256). This pool is reserved for responses to
                       incoming communications that may not block. Increase
                       it only if console error messages indicate the pool
                       has been exhausted (LustreError: Can't get tx for
                       msg type...). Each descriptor consumes 1 GM_MTU sized
                       buffer.

gmnal_nlarge_tx_bufs   The number of 1MByte transmit buffers to reserve at
                       startup (default 32). This controls the number of
                       concurrent sends larger than GM_MTU. It can be
                       reduced to conserve memory, or increased for greater
                       large-message send concurrency.

gmnal_nrx_small        The number of GM_MTU sized receive buffers posted to
                       receive from the network (default 128). Increase this
                       if congestion is suspected; note, however, that the
                       total number of receives that can be posted at any
                       time is limited by the number of GM receive tokens
                       available. If there are too few, this and
                       gmnal_nrx_large are scaled back accordingly.

gmnal_nrx_large        The number of 1MByte receive buffers posted to
                       receive from the network (default 64). Increase this
                       if the number of OST threads is increased, but note
                       that the total number of receives that can be posted
                       at any time is limited by the number of GM receive
                       tokens available. If there are too few, this and
                       gmnal_nrx_small are scaled back accordingly.

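These parameters would normally be supplied when the gmnal module is loaded.
A minimal sketch, assuming the module is named gmnal; the values shown are
purely illustrative, not recommendations:

```shell
# Illustrative values only; see the parameter descriptions above before tuning.
modprobe gmnal gmnal_port=4 gmnal_nrx_small=256 gmnal_nrx_large=128
```

Equivalently, an "options gmnal ..." line can be added to the module
configuration file so the settings persist across loads.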
5. Network configuration for GM is done in an lmc script as follows...

GM2NID=${path-to-lustre-tree}/portals/utils/gmnalnid

${LMC} --node some_server --add net --nettype gm --nid `$GM2NID -n some_server`

${LMC} --node client --add net --nettype gm --nid '*'
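
Putting the pieces together, a complete fragment might look like the
following sketch; the node names and the location of the lmc utility are
assumptions, not fixed by this document:

```shell
#!/bin/sh
# Sketch of GM network configuration via lmc; adjust paths and node
# names for your site.
LMC=${path-to-lustre-tree}/utils/lmc                    # assumed lmc location
GM2NID=${path-to-lustre-tree}/portals/utils/gmnalnid

# The server's NID is resolved from its GM host name.
${LMC} --node some_server --add net --nettype gm --nid `$GM2NID -n some_server`

# Clients may use a wildcard NID.
${LMC} --node client --add net --nettype gm --nid '*'
```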