From a91a21ad3f10c834eb788ced7724cf4986949fa3 Mon Sep 17 00:00:00 2001 From: Richard Henwood Date: Wed, 18 May 2011 14:30:44 -0500 Subject: [PATCH] FIX: xrefs --- ConfigurationFilesModuleParameters.xml | 4 ++-- ConfiguringFailover.xml | 2 +- ConfiguringLNET.xml | 2 +- ConfiguringLustre.xml | 24 +++++++++---------- IV_LustreTuning.xml | 2 +- InstallingLustre.xml | 2 +- LustreMaintenance.xml | 44 ++++++++++++++++++++++------------ LustreProc.xml | 2 +- LustreRecovery.xml | 2 -- ManagingLNET.xml | 8 +++---- SystemConfigurationUtilities.xml | 4 ++-- TroubleShootingRecovery.xml | 2 -- UnderstandingFailover.xml | 2 +- UnderstandingLustre.xml | 6 ++--- 14 files changed, 58 insertions(+), 48 deletions(-) diff --git a/ConfigurationFilesModuleParameters.xml b/ConfigurationFilesModuleParameters.xml index efd31aa..80e1cd0 100644 --- a/ConfigurationFilesModuleParameters.xml +++ b/ConfigurationFilesModuleParameters.xml @@ -1,7 +1,7 @@ - + - Configuration Files and Module Parameters + Configuration Files and Module Parameters This section describes configuration files and module parameters and includes the following sections: diff --git a/ConfiguringFailover.xml b/ConfiguringFailover.xml index 9933b56..61bee7f 100644 --- a/ConfiguringFailover.xml +++ b/ConfiguringFailover.xml @@ -1,7 +1,7 @@ - Configuring Lustre Failover + Configuring Lustre Failover This chapter describes how to configure Lustre failover using the Heartbeat cluster infrastructure daemon. It includes: diff --git a/ConfiguringLNET.xml b/ConfiguringLNET.xml index 18e924c..1e477d0 100644 --- a/ConfiguringLNET.xml +++ b/ConfiguringLNET.xml @@ -141,7 +141,7 @@ By default, Lustre ignores the loopback (lo0) interface. Lustre does not ignore IP addresses aliased to the loopback. If you alias IP addresses to the loopback interface, you must specify all Lustre networks using the LNET networks parameter. - If the server has multiple interfaces on the same subnet, the Linux kernel will send all traffic using the first configured interface. This is a limitation of Linux, not Lustre. In this case, network interface bonding should be used. For more information about network interface bonding, see . + If the server has multiple interfaces on the same subnet, the Linux kernel will send all traffic using the first configured interface. This is a limitation of Linux, not Lustre. In this case, network interface bonding should be used. For more information about network interface bonding, see . diff --git a/ConfiguringLustre.xml b/ConfiguringLustre.xml index b06a684..572cdf5 100644 --- a/ConfiguringLustre.xml +++ b/ConfiguringLustre.xml @@ -1,7 +1,7 @@ - + - Configuring Lustre + Configuring Lustre @@ -41,7 +41,7 @@ For information about configuring LNET, see . For information about testing LNET, see . - Run the benchmark script sgpdd_survey to determine baseline performance of your hardware. Benchmarking your hardware will simplify debugging performance issues that are unrelated to Lustre and ensure you are getting the best possible performance with your installation. For information about running sgpdd_survey, see . + Run the benchmark script sgpdd_survey to determine baseline performance of your hardware. Benchmarking your hardware will simplify debugging performance issues that are unrelated to Lustre and ensure you are getting the best possible performance with your installation. For information about running sgpdd_survey, see . @@ -380,10 +380,10 @@ oup upcall set to /usr/sbin/l_getgroups - 4. Create and mount ost2. + Create and mount ost2. - a. 
Create ost2. On oss2 node, run: [root@oss2 /]# mkfs.lustre --ost --fsname=temp --mgsnode=10.2.0.1@tcp0 /dev\ /sdd @@ -411,7 +411,7 @@ oup upcall set to /usr/sbin/l_getgroups - b. Mount ost2 on the OSS on which it was created. On oss2 node, run: + Mount ost2 on the OSS on which it was created. On oss2 node, run: [root@oss2 /] mount -t lustre /dev/sdd /mnt/ost2 The command generates this output: @@ -428,7 +428,7 @@ oup upcall set to /usr/sbin/l_getgroups - 5. Mount the Lustre file system on the client. On the client node, run: + Mount the Lustre file system on the client. On the client node, run: [root@client1 /] mount -t lustre 10.2.0.1@tcp0:/temp /lustre This command generates this output: @@ -436,10 +436,10 @@ oup upcall set to /usr/sbin/l_getgroups - 6. Verify that the file system started and is working by running the df, dd and ls commands on the client node. + Verify that the file system started and is working by running the df, dd, and ls commands on the client node. - a. Run the lfsdf -h command: + Run the lfs df -h command: [root@client1 /] lfs df -h The lfs df -h command lists space usage per OST and the MDT in human-readable format. This command generates output similar to this: @@ -457,7 +457,7 @@ oup upcall set to /usr/sbin/l_getgroups - b. Run the lfsdf-ih command. + Run the lfs df -ih command. [root@client1 /] lfs df -ih The lfs df -ih command lists inode usage per OST and the MDT. This command generates output similar to this: @@ -506,7 +506,7 @@ oup upcall set to /usr/sbin/l_getgroups This section describes how to scale the Lustre file system or make configuration changes using the Lustre configuration utilities.
<anchor xml:id="dbdoclet.50438267_pgfId-1292441" xreflabel=""/>10.2.1 Scaling the <anchor xml:id="dbdoclet.50438267_marker-1292440" xreflabel=""/>Lustre File System - A Lustre file system can be scaled by adding OSTs or clients. For instructions on creating additional OSTs repeat Step 3 and Step 4 above. For mounting additional clients, repeat Step 5 for each client. + A Lustre file system can be scaled by adding OSTs or clients. For instructions on creating additional OSTs repeat Step 3 and Step 4 above. For mounting additional clients, repeat Step 5 for each client.
<anchor xml:id="dbdoclet.50438267_pgfId-1292798" xreflabel=""/>10.2.2 <anchor xml:id="dbdoclet.50438267_50212" xreflabel=""/>Changing Striping Defaults @@ -541,7 +541,7 @@ oup upcall set to /usr/sbin/l_getgroups - Use the lfs setstripe command described in to change the file layout configuration. + Use the lfs setstripe command described in to change the file layout configuration.
<anchor xml:id="dbdoclet.50438267_pgfId-1292908" xreflabel=""/>10.2.3 Using the Lustre Configuration Utilities diff --git a/IV_LustreTuning.xml b/IV_LustreTuning.xml index 84f8864..a2bc525 100644 --- a/IV_LustreTuning.xml +++ b/IV_LustreTuning.xml @@ -14,7 +14,7 @@ - + diff --git a/InstallingLustre.xml b/InstallingLustre.xml index 7c84dcb..a0294be 100644 --- a/InstallingLustre.xml +++ b/InstallingLustre.xml @@ -1,7 +1,7 @@ - Installing the Lustre Software + Installing the Lustre Software This chapter describes how to install the Lustre software. It includes: diff --git a/LustreMaintenance.xml b/LustreMaintenance.xml index ba8589b..273afc2 100644 --- a/LustreMaintenance.xml +++ b/LustreMaintenance.xml @@ -357,17 +357,17 @@ The list of files that need to be restored from backup is stored in /tmp/files_t -4. Deactivate the OST. +Deactivate the OST. -a. To temporarily disable the deactivated OST, enter: +To temporarily disable the deactivated OST, enter: [client]# lctl set_param osc.<fsname>-<OST name>-*.active=0 If there is expected to be a replacement OST in some short time (a few days), the OST can temporarily be deactivated on the clients: -Note - This setting is only temporary and will be reset if the clients or MDS are rebooted. It needs to be run on all clients. + This setting is only temporary and will be reset if the clients or MDS are rebooted. It needs to be run on all clients. b. To permanently disable the deactivated OST, enter: @@ -378,8 +378,12 @@ b. To permanently disable the deactivated OST, enter: If there is not expected to be a replacement for this OST in the near future, permanently deactivate the OST on all clients and the MDS: -Note - A removed OST still appears in the file system; do not create a new OST with the same name. -14.7.2 Backing Up OST Configuration Files +A removed OST still appears in the file system; do not create a new OST with the same name. + + +
+
+ 14.7.2 Backing Up OST Configuration Files If the OST device is still accessible, then the Lustre configuration files on the OST should be backed up and saved for future use to avoid difficulties when a replacement OST is returned to service. These files rarely change, so they can and should be backed up while the OST is functional and accessible. If the deactivated OST is still available to mount (i.e., it has not permanently failed and is not unmountable due to severe corruption), an effort should be made to preserve these files. @@ -407,7 +411,9 @@ CONFIGS/ O/0/LAST_ID -14.7.3 Restoring OST Configuration Files
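As a sketch of the backup step described above, assuming an ldiskfs-based OST on a hypothetical device /dev/sdd (the mount point and archive name are also hypothetical; last_rcvd is assumed in addition to the CONFIGS/ and O/0/LAST_ID entries visible in the partial listing above):
[oss2]# mkdir -p /mnt/ost
[oss2]# mount -t ldiskfs /dev/sdd /mnt/ost          # mount the OST backing device directly
[oss2]# tar cvf /root/ost2-conf.tar -C /mnt/ost last_rcvd CONFIGS/ O/0/LAST_ID
[oss2]# umount /mnt/ost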
+
+ 14.7.3 Restoring OST Configuration Files If the original OST is still available, it is best to follow the OST backup and restore procedure given in either Backing Up and Restoring an MDS or OST (Device Level), or Making a File-Level Backup of an OST File System and Restoring a File-Level Backup. @@ -461,7 +467,9 @@ seek=5 skip=5 -14.7.4 Returning a Deactivated OST to Service +
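Correspondingly, a minimal sketch of restoring such a backup onto the replacement OST before it is returned to service, using the same hypothetical device and archive as in the backup sketch:
[oss2]# mount -t ldiskfs /dev/sdd /mnt/ost
[oss2]# tar xvf /root/ost2-conf.tar -C /mnt/ost     # restore the saved configuration files
[oss2]# umount /mnt/ost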
+
+ 14.7.4 Returning a Deactivated OST to Service If the OST was permanently deactivated, it needs to be reactivated in the MGS configuration. If the OST was temporarily deactivated, it needs to be reactivated on the MDS and clients. Run: [client]# lctl set_param osc.<fsname>-<OST name>-*.active=1 -14.8 Aborting Recovery
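A sketch of the reactivation commands, again using the hypothetical temp-OST0002: a permanently deactivated OST is re-enabled in the MGS configuration with conf_param, while a temporarily deactivated one is re-enabled with set_param on the MDS and each client.
[mgs]# lctl conf_param temp-OST0002.osc.active=1    # undo a permanent deactivation
[client]# lctl set_param osc.temp-OST0002-*.active=1  # undo a temporary deactivation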
+ +
+ 14.8 Aborting Recovery You can abort recovery with either the lctl utility or by mounting the target with the abort_recov option (mount -o abort_recov). When starting a target, run: @@ -485,7 +496,10 @@ $ mount -t lustre -L <MDT name> -o abort_recov <mount_point> Note - The recovery process is blocked until all OSTs are available. -14.9 Determining Which Machine is Serving an OST + +
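The lctl form of aborting recovery mentioned above is not shown in this hunk; a hedged sketch, assuming the target's local device number is first looked up with lctl dl:
[mds]# lctl dl                          # list local devices and note the MDT device number
[mds]# lctl --device <devno> abort_recovery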
+
+ 14.9 Determining Which Machine is Serving an OST In the course of administering a Lustre file system, you may need to determine which machine is serving a specific OST. It is not as simple as identifying the machine’s IP address, because IP is only one of several networking protocols that Lustre uses; LNET identifies nodes by NIDs rather than by IP addresses. @@ -513,7 +527,9 @@ osc.lustre-OST0003-osc-f1579000.ost_conn_uuid=192.168.20.1@tcp osc.lustre-OST0004-osc-f1579000.ost_conn_uuid=192.168.20.1@tcp -14.10 Changing the Address of a Failover Node
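The query that produces output like the ost_conn_uuid lines above takes this form (a sketch; the file system name lustre and the wildcard are illustrative):
[client]# lctl get_param osc.lustre-OST*.ost_conn_uuid   # report the NID serving each OST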
+
+ 14.10 Changing the Address of a Failover Node To change the address of a failover node (e.g., to use node X instead of node Y), run this command on the OSS/OST partition: tunefs.lustre --erase-params --failnode=<NID> <device>
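For example, with a hypothetical failover NID and OST device (note that --erase-params clears all previously stored parameters before the new failnode is set):
[oss]# tunefs.lustre --erase-params --failnode=192.168.20.2@tcp0 /dev/sdd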
- - -
+ + +
diff --git a/LustreProc.xml b/LustreProc.xml index 8895f6d..1413801 100644 --- a/LustreProc.xml +++ b/LustreProc.xml @@ -1230,7 +1230,7 @@ daa0/stats
<anchor xml:id="dbdoclet.50438271_pgfId-1290921" xreflabel=""/>31.3.1.1 Interpreting OST Statistics - See also (llobdstat) and (CollectL). + See also (llobdstat) and (CollectL). The OST .../stats files can be used to track client statistics (client activity) for each OST. It is possible to get a periodic dump of values from these file (for example, every 10 seconds), that show the RPC rates (similar to iostat) by using the llstat.pl tool: # llstat /proc/fs/lustre/osc/lustre-OST0000-osc/stats diff --git a/LustreRecovery.xml b/LustreRecovery.xml index 24d4d08..3dfadc2 100644 --- a/LustreRecovery.xml +++ b/LustreRecovery.xml @@ -26,8 +26,6 @@ -Usually the Lustre recovery process is transparent. For information about troubleshooting recovery when something goes wrong, see . -
30.1 Recovery Overview Lustre's recovery feature is responsible for dealing with node or network failure and returning the cluster to a consistent, performant state. Because Lustre allows servers to perform asynchronous update operations to the on-disk file system (i.e., the server can reply without waiting for the update to synchronously commit to disk), the clients may have state in memory that is newer than what the server can recover from disk after a crash. diff --git a/ManagingLNET.xml b/ManagingLNET.xml index eb22139..ab12e85 100644 --- a/ManagingLNET.xml +++ b/ManagingLNET.xml @@ -6,16 +6,16 @@ This chapter describes some tools for managing Lustre Networking (LNET) and includes the following sections: - Updating the Health Status of a Peer or Router + Updating the Health Status of a Peer or Router - Starting and Stopping LNET + Starting and Stopping LNET - Multi-Rail Configurations with LNET + Multi-Rail Configurations with LNET - Load Balancing with InfiniBand + Load Balancing with InfiniBand diff --git a/SystemConfigurationUtilities.xml b/SystemConfigurationUtilities.xml index 41b370d..4d7e2b9 100644 --- a/SystemConfigurationUtilities.xml +++ b/SystemConfigurationUtilities.xml @@ -719,7 +719,7 @@ unch: 18
<anchor xml:id="dbdoclet.50438219_pgfId-1318212" xreflabel=""/>See Also - +
@@ -1951,7 +1951,7 @@ tests 5 times each
<anchor xml:id="dbdoclet.50438219_pgfId-1294873" xreflabel=""/>lr_reader The lr_reader utility translates a last received (last_rcvd) file into human-readable form. - The following utilites are part of the Lustre I/O kit. For more information, see . + The following utilites are part of the Lustre I/O kit. For more information, see .
<anchor xml:id="dbdoclet.50438219_pgfId-1318396" xreflabel=""/>sgpdd_survey diff --git a/TroubleShootingRecovery.xml b/TroubleShootingRecovery.xml index 7bffb61..be264fa 100644 --- a/TroubleShootingRecovery.xml +++ b/TroubleShootingRecovery.xml @@ -19,8 +19,6 @@ -For a description of how recovery is implemented in Lustre, see . -
27.1 Recovering from Errors or <anchor xml:id="dbdoclet.50438225_marker-1292184" xreflabel=""/>Corruption on a Backing File System When an OSS, MDS, or MGS server crash occurs, it is not necessary to run e2fsck on the file system. ldiskfs journaling ensures that the file system remains coherent. The backing file systems are never accessed directly from the client, so client crashes are not relevant. diff --git a/UnderstandingFailover.xml b/UnderstandingFailover.xml index 235826e..270445e 100644 --- a/UnderstandingFailover.xml +++ b/UnderstandingFailover.xml @@ -40,7 +40,7 @@ Health monitoring - Verifies the availability of hardware and network resources and responds to health indications provided by Lustre. -These capabilities can be provided by a variety of software and/or hardware solutions. For more information about using power management software or hardware and high availability (HA) software with Lustre, see . +These capabilities can be provided by a variety of software and/or hardware solutions. For more information about using power management software or hardware and high availability (HA) software with Lustre, see . HA software is responsible for detecting failure of the primary Lustre server node and controlling the failover. Lustre works with any HA software that includes resource (I/O) fencing. For proper resource fencing, the HA software must be able to completely power off the failed server or disconnect it from the shared storage device. If two active nodes have access to the same storage device, data may be severely corrupted.
diff --git a/UnderstandingLustre.xml b/UnderstandingLustre.xml index c2d0420..f806d40 100644 --- a/UnderstandingLustre.xml +++ b/UnderstandingLustre.xml @@ -220,15 +220,15 @@ - For additional hardware requirements and considerations, see . + For additional hardware requirements and considerations, see .
<anchor xml:id="dbdoclet.50438250_pgfId-1295546" xreflabel=""/>1.2.3 Lustre Networking (LNET) - Lustre Networking (LNET) is a custom networking API that provides the communication infrastructure that handles metadata and file I/O data for the Lustre file system servers and clients. For more information about LNET, see . + Lustre Networking (LNET) is a custom networking API that provides the communication infrastructure that handles metadata and file I/O data for the Lustre file system servers and clients. For more information about LNET, see .
<anchor xml:id="dbdoclet.50438250_pgfId-1293940" xreflabel=""/>1.2.4 Lustre Cluster - At scale, the Lustre cluster can include hundreds of OSSs and thousands of clients (see ). More than one type of network can be used in a Lustre cluster. Shared storage between OSSs enables failover capability. For more details about OSS failover, see . + At scale, the Lustre cluster can include hundreds of OSSs and thousands of clients (see ). More than one type of network can be used in a Lustre cluster. Shared storage between OSSs enables failover capability. For more details about OSS failover, see .
Lustre cluster at scale -- 1.8.3.1