1. This version of the GM NAL requires an unreleased extension to the GM API
   to map physical memory: gm_register_memory_ex_phys().  This allows it to
   avoid ENOMEM problems associated with large contiguous buffer allocation.

2. ./configure --with-gm=<path-to-GM-source-tree> \
       [--with-gm-install=<path-to-GM-installation>]

   If the GM sources do not support gm_register_memory_ex_phys(), configure
   flags an error.  In this case you should apply the patch, then rebuild and
   re-install GM as directed in the error message.

   By default GM is installed in /opt/gm.  If an alternate path was specified
   to <path-to-GM-source-tree>/binary/GM_INSTALL, you should also specify
   --with-gm-install with the same path.

3. The GM timeout is 300 seconds; i.e. the network may not release resources
   claimed by communications stalled with a crashing node for this time.  The
   default gmnal buffer tuning parameters (see (4) below) have been chosen to
   minimize this problem and to prevent Lustre having to block for resources.
   However in some situations, where all network buffers are busy, the default
   Lustre timeout (various, scaled from the base timeout of 100 seconds) may
   be too small, and the only solution may be to increase the Lustre timeout
   dramatically.

4. The gmnal has the following module parameters (a sketch of setting them at
   module load time follows item 5)...

   gmnal_port            The GM port that the NAL will use (default 4).
                         Change this if it conflicts with site usage.

   gmnal_ntx             The number of "normal" transmit descriptors
                         (default 32).  When this pool is exhausted, threads
                         sending and receiving on the network block until
                         in-progress transmits have completed.  Each
                         descriptor consumes 1 GM_MTU sized buffer.

   gmnal_ntx_nblk        The number of "reserved" transmit descriptors
                         (default 256).  This pool is reserved for responses
                         to incoming communications that may not block.
                         Increase it only if console error messages indicate
                         the pool has been exhausted (LustreError: Can't get
                         tx for msg type...).  Each descriptor consumes
                         1 GM_MTU sized buffer.

   gmnal_nlarge_tx_bufs  The number of 1MByte transmit buffers to reserve at
                         startup (default 32).  This controls the number of
                         concurrent sends larger than GM_MTU.  It can be
                         reduced to conserve memory, or increased to raise
                         large-message sending concurrency.

   gmnal_nrx_small       The number of GM_MTU sized receive buffers posted to
                         receive from the network (default 128).  Increase
                         this if congestion is suspected; note, however, that
                         the total number of receives that can be posted at
                         any time is limited by the number of GM receive
                         tokens available.  If there are too few, this and
                         gmnal_nrx_large are scaled back accordingly.

   gmnal_nrx_large       The number of 1MByte receive buffers posted to
                         receive from the network (default 64).  Increase
                         this if the number of OST threads is increased, but
                         note that the total number of receives that can be
                         posted at any time is limited by the number of GM
                         receive tokens available.  If there are too few,
                         this and gmnal_nrx_small are scaled back accordingly.

5. Network configuration for GM is done in an lmc script as follows...

   GM2NID=${path-to-lustre-tree}/portals/utils/gmnalnid

   ${LMC} --node some_server --add net --nettype gm \
          --nid `$GM2NID -n some_server`

   ${LMC} --node client --add net --nettype gm --nid '*'
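
   As a minimal sketch of setting the parameters in (4), assuming the gmnal
   is built as a loadable kernel module named "kgmnal" (the actual module
   name depends on your build) and accepts these parameters on the module
   command line:

      # "kgmnal" is an assumed name; substitute the module your build produces
      modprobe kgmnal gmnal_port=4 gmnal_ntx=64 gmnal_ntx_nblk=256

      # or persistently, via an options line in /etc/modules.conf:
      options kgmnal gmnal_port=4 gmnal_nrx_small=256 gmnal_nrx_large=64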