From: Richard Henwood
Date: Wed, 18 May 2011 16:33:32 +0000 (-0500)
Subject: FIX: xrefs and tidying
X-Git-Tag: workingxslt~12
X-Git-Url: https://git.whamcloud.com/?a=commitdiff_plain;h=d790d2972f9dd3c9437cd42db3955b055af11fcf;p=doc%2Fmanual.git

FIX: xrefs and tidying
---

diff --git a/LustreTroubleshooting.xml b/LustreTroubleshooting.xml
index 57e7a5e..2baaf7d 100644
--- a/LustreTroubleshooting.xml
+++ b/LustreTroubleshooting.xml
@@ -1,32 +1,25 @@

Lustre Troubleshooting

This chapter provides information on troubleshooting Lustre, submitting a Lustre bug, and Lustre performance tips. It includes the following sections:

* Lustre Error Messages
* Reporting a Lustre Bug
* Common Lustre Problems
26.1 Lustre Error Messages

Several resources are available to help troubleshoot Lustre. This section describes error numbers, error messages and logs.
26.1.1 Error Numbers

@@ -110,41 +103,26 @@

* What the problem is
* Which process ID had trouble
* Which server node it was communicating with, and so on

Lustre logs are dumped to the path specified in /proc/sys/lnet/debug_path. Collect the first group of messages related to a problem, and any messages that precede "LBUG" or "assertion failure" errors. Messages that mention server nodes (OST or MDS) are specific to that server; you must collect similar messages from the relevant server console logs.

Another Lustre debug log holds information about Lustre activity for a short period of time; how long depends on how heavily the processes on the node use Lustre. To extract the debug log, run the following command on each of the nodes:

$ lctl dk <filename>

Note: LBUG freezes the thread to allow capture of the panic stack. A system reboot is needed to clear the thread.
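For example, a minimal debug-log collection sketch based on the commands above (the debug mask value and output path are illustrative, and lctl set_param availability depends on your Lustre release):

# Enable verbose debugging, reproduce the problem, then dump the kernel
# debug buffer to a file and search it for fatal events:
lctl set_param debug=-1                    # -1 turns on all debug flags
lctl dk /tmp/lustre-debug.$(hostname)      # "dk" dumps and clears the debug log
grep -iE 'lbug|assertion' /tmp/lustre-debug.$(hostname)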
26.2 Reporting a Lustre Bug

If, after troubleshooting your Lustre system, you cannot resolve the problem, consider reporting a Lustre bug. The process for reporting a bug is described in the Lustre wiki topic Reporting Bugs. You can also post a question to the lustre-discuss mailing list or search the lustre-discuss archives for information about your issue.

A Lustre diagnostics tool is available for download at: http://downloads.lustre.org/public/tools/lustre-diagnostics/

@@ -154,8 +132,8 @@

Output is sent directly to the terminal. Use normal file redirection to send the output to a file, and then manually attach the file to the bug you are submitting.
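For example (a sketch only; the command name lustre-diagnostics is assumed from the download location above):

# Capture the report, including any errors, in a file you can attach to a bug:
lustre-diagnostics > /tmp/lustre-diag.$(hostname).txt 2>&1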
26.3 Common Lustre Problems

This section describes how to address common issues encountered with Lustre.
26.3.1 OST Object is Missing or Damaged

@@ -177,27 +155,19 @@

* Examine the consoles of all servers for any error indications.
* Examine the syslogs of all servers for any LustreErrors or LBUGs (a search sketch follows below).
* Check the health of your system hardware and network. (Are the disks working as expected? Is the network dropping packets?)
* Consider what was happening on the cluster at the time. Does this relate to a specific user workload or a system load condition? Is the condition reproducible? Does it happen at a specific time (day, week or month)?

To recover from this problem, you must restart Lustre services using these file systems. There is no other way to know that the I/O made it to disk, and the state of the cache may be inconsistent with what is on disk.
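A minimal sketch of the first two checks above (log file locations vary by distribution):

# Search the in-memory kernel log and the persistent syslog on each server:
dmesg | grep -iE 'lustreerror|lbug'
grep -E 'LustreError|LBUG' /var/log/messages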
@@ -213,16 +183,8 @@

The OST device number or device name is generated by the lctl dl command. The deactivate command prevents clients from creating new objects on the specified OST, although you can still access the OST for reading.

Note: If the OST later becomes available, it needs to be reactivated. Run:

# lctl --device <OST device name or number> activate

3. Determine all files that are striped over the missing OST. Run:

# lfs getstripe -r -O {OST_UUID} /mountpoint

5. You can delete these files with the unlink or munlink command:

# unlink|munlink filename {filename ...}

Note: There is no functional difference between the unlink and munlink commands. The unlink command is for newer Linux distributions. You can run munlink if unlink is not available. When you run the unlink or munlink command, the file on the MDS is permanently removed.

6. If you need to know, specifically, which parts of the file are missing data, you first need to determine the file layout (striping pattern), which includes the index of the missing OST. Run:

# lfs getstripe -v {filename}

@@ -271,16 +224,8 @@

obdid 3438673 last_id 3478673"

To recover from this situation, determine and set a reasonable LAST_ID value.

Note: The file system must be stopped on all servers before performing this procedure.

For hex <-> decimal translations:

Use GDB:

(gdb) p /x 15028

@@ -314,33 +259,43 @@

LAST_ID"

If the OST LAST_ID value matches that for the objects existing on the OST, then it is possible the lov_objid file on the MDS is incorrect. Delete the lov_objid file on the MDS and it will be re-created from the LAST_ID on the OSTs.

If you determine the LAST_ID file on the OST is incorrect (that is, it does not match what objects exist, or does not match the MDS lov_objid value), decide on a proper value for LAST_ID. Once you have decided on a proper value for LAST_ID, use this repair procedure:

1. Access: mount -t ldiskfs /dev/{ostdev} /mnt/ost
2. Check the current value: od -Ax -td8 /mnt/ost/O/0/LAST_ID
3. Be very safe; only work on backups: cp /mnt/ost/O/0/LAST_ID /tmp/LAST_ID
4. Convert binary to text: xxd /tmp/LAST_ID /tmp/LAST_ID.asc
5. Fix the value: vi /tmp/LAST_ID.asc
6. Convert back to binary: xxd -r /tmp/LAST_ID.asc /tmp/LAST_ID.new
7. Verify: od -Ax -td8 /tmp/LAST_ID.new
8. Replace: cp /tmp/LAST_ID.new /mnt/ost/O/0/LAST_ID
9. Clean up: umount /mnt/ost
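If GDB is not at hand, the hex <-> decimal translation shown earlier can also be done with standard shell tools (15028 is the example value from above):

printf '%x\n' 15028    # decimal to hex: prints 3ab4
echo $((0x3ab4))       # hex to decimal: prints 15028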
26.3.5 Handling/Debugging "Bind: Address already in use" Error

@@ -349,35 +304,21 @@

* Start Lustre before starting any service that uses sunrpc.
* Use a port other than 988 for Lustre. This is configured in /etc/modprobe.conf as an option to the LNET module, for example: options lnet accept_port=989
* Add modprobe ptlrpc to your system startup scripts before the service that uses sunrpc. This causes Lustre to bind to port 988 and sunrpc to select a different port. (A sketch of these two workarounds follows below.)

Note: You can also use the sysctl command to prevent the NFS client from grabbing the Lustre service port. However, this is only a partial workaround, as other user-space RPC servers can still grab the port.
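As a concrete sketch of the second and third workarounds (port 989 and the rc.local location are illustrative assumptions):

# In /etc/modprobe.conf, move LNET off the contested port:
#   options lnet accept_port=989
# Or, in /etc/rc.local, load ptlrpc before any sunrpc service so that
# Lustre binds to port 988 first:
modprobe ptlrpc
service nfs start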
26.3.6 Handling/Debugging Error "-28"

@@ -388,15 +329,11 @@

* Expand the disk space on the OST.
* Copy or stripe the file to a less full OST.

A Linux error -28 (ENOSPC) that occurs when a new file is being created may indicate that the MDS has run out of inodes and needs to be made larger. Newly created files are not written to full OSTs, while existing files continue to reside on the OST where they were initially created. To view inode information on the MDS, enter:

lfs df -i

@@ -409,16 +346,8 @@

grep '[0-9]' /proc/fs/lustre/mdc/*/kbytes{free,avail,total}
grep '[0-9]' /proc/fs/lustre/mdc/*/files{free,total}

Note: You can find other numeric error codes, along with a short name and text description, in /usr/include/asm/errno.h.
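For example, to see whether inodes rather than blocks are exhausted (the mount point is illustrative):

lfs df -i /mnt/lustre    # per-target inode totals and free counts
lfs df -h /mnt/lustre    # per-target block usage, for comparison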
26.3.7 Triggering Watchdog for PID NNN

@@ -463,15 +392,11 @@

* You are using a disk device that claims to have data written to disk before it actually does, as in the case of a device with a large cache. If that disk device crashes or loses power in a way that causes the loss of the cache, there can be a loss of transactions that you believe are committed. This is a very serious event, and you should run e2fsck against that storage before restarting Lustre (see the sketch after this list).
* Per the Lustre requirement, the shared storage used for failover must be completely cache-coherent. This ensures that if one server takes over for another, it sees the most up-to-date and accurate copy of the data. If, during server failover, the shared storage does not provide cache coherency between all of its ports, then Lustre can produce an error.

If you know the exact reason for the error, then it is safe to proceed with no further action. If you do not know the reason, then this is a serious issue and you should explore it with your disk vendor. If the error occurs during failover, examine your disk cache settings. If it occurs after a restart without failover, try to determine how the disk can report that a write succeeded and then lose the data (device corruption or disk errors).

@@ -486,21 +411,15 @@

* Each client needs to take an EOF lock on all the OSTs, as it is difficult to know which OST holds the end of the file until you check all the OSTs. Because all the clients are appending to the same file with O_APPEND, there is significant locking overhead.
* The second client cannot get all of its locks until the first client finishes writing; this lock acquisition serializes all writes from the clients.
* To avoid deadlocks, these locks are taken in a known, consistent order. Because a client cannot know which OST holds the next piece of the file until it has locks on all OSTs, all of these locks are needed for a striped file.
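A sketch of the e2fsck check recommended in the first item above (the device name is illustrative; run it only while Lustre is stopped on that server):

# -f forces a full check even if the file system looks clean; -p repairs
# safely fixable problems automatically:
e2fsck -f -p /dev/sdc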
@@ -515,21 +434,15 @@

* kernel "Out of memory" and/or "oom-killer" messages
* Lustre "kmalloc of 'mmm' (NNNN bytes) failed..." messages
* Lustre or kernel stack traces showing processes stuck in "try_to_free_pages" (a search sketch follows below)

For information on determining the MDS memory and OSS memory requirements, see Determining Memory Requirements.
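A quick way to search a server for the indicators listed above (log locations vary by distribution):

dmesg | grep -iE 'out of memory|oom-killer|try_to_free_pages'
grep -i 'kmalloc' /var/log/messages | grep -i failed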
@@ -538,6 +451,5 @@

Some SCSI drivers default to a maximum I/O size that is too small for good Lustre performance. We have fixed quite a few drivers, but you may still find that some drivers give unsatisfactory performance with Lustre. Because the default value is hard-coded, you need to recompile the driver to change its default; other drivers may simply have the wrong default set.

If you suspect bad I/O performance and an analysis of Lustre statistics indicates that I/O is not 1 MB, check /sys/block/<device>/queue/max_sectors_kb. If the max_sectors_kb value is less than 1024, set it to at least 1024 to improve performance. If changing max_sectors_kb does not change the I/O size as reported by Lustre, you may want to examine the SCSI driver code.
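For example (sdb is an illustrative device name; the setting does not survive a reboot, so it is typically reapplied from a startup script):

cat /sys/block/sdb/queue/max_sectors_kb            # check the current limit
echo 1024 > /sys/block/sdb/queue/max_sectors_kb    # allow 1 MB requests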