LUDOC-338 proc: use lctl {get,set}_param
[doc/manual.git] / TroubleShootingRecovery.xml
1 <?xml version='1.0' encoding='utf-8'?>
2 <chapter xmlns="http://docbook.org/ns/docbook"
3 xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US"
4 xml:id="troubleshootingrecovery">
5   <title xml:id="troubleshootingrecovery.title">Troubleshooting
6   Recovery</title>
7   <para>This chapter describes what to do if something goes wrong during
8   recovery. It describes:</para>
9   <itemizedlist>
10     <listitem>
11       <para>
12         <xref linkend="dbdoclet.50438225_71141" />
13       </para>
14     </listitem>
15     <listitem>
16       <para>
17         <xref linkend="dbdoclet.50438225_37365" />
18       </para>
19     </listitem>
20     <listitem>
21       <para>
22         <xref linkend="dbdoclet.50438225_12316" />
23       </para>
24     </listitem>
25     <listitem>
26       <para>
27         <xref linkend="dbdoclet.lfsckadmin" />
28       </para>
29     </listitem>
30   </itemizedlist>
31   <section xml:id="dbdoclet.50438225_71141">
32     <title>
33     <indexterm>
34       <primary>recovery</primary>
35       <secondary>corruption of backing ldiskfs file system</secondary>
36     </indexterm>Recovering from Errors or Corruption on a Backing ldiskfs File
37     System</title>
38     <para>When an OSS, MDS, or MGS server crash occurs, it is not necessary to
39     run e2fsck on the file system.
40     <literal>ldiskfs</literal> journaling ensures that the file system remains
41     consistent over a system crash. The backing file systems are never accessed
42     directly from the client, so client crashes are not relevant for server
43     file system consistency.</para>
44     <para>The only time it is REQUIRED that
45     <literal>e2fsck</literal> be run on a device is when an event causes
46     problems that ldiskfs journaling is unable to handle, such as a hardware
47     device failure or I/O error. If the ldiskfs kernel code detects corruption
48     on the disk, it mounts the file system as read-only to prevent further
49     corruption, but still allows read access to the device. This appears as
50     error "-30" (
51     <literal>EROFS</literal>) in the syslogs on the server, e.g.:</para>
52     <screen>Dec 29 14:11:32 mookie kernel: LDISKFS-fs error (device sdz):
53             ldiskfs_lookup: unlinked inode 5384166 in dir #145170469
54 Dec 29 14:11:32 mookie kernel: Remounting filesystem read-only </screen>
55     <para>In such a situation, it is normally only necessary to run e2fsck
56     on the bad device before placing the device back into service.</para>
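<para>As a minimal sketch, the helper functions below scan kernel log text for the two messages shown in the sample above. The function names are hypothetical, and the patterns assume the message wording from that sample; other kernel versions may word these messages differently.</para>

```shell
# Print each device named in an "LDISKFS-fs error (device X)" line on stdin.
# The pattern is based on the sample log above (assumption).
ldiskfs_error_devices() {
    sed -n 's/.*LDISKFS-fs error (device \([^)]*\)).*/\1/p' | sort -u
}

# Succeed (exit status 0) if stdin contains a read-only remount message.
remounted_read_only() {
    grep -q 'Remounting filesystem read-only'
}
```

<para>For example, <literal>dmesg | ldiskfs_error_devices</literal> would list the affected devices, which are the candidates for an e2fsck run.</para>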
57     <para>In the vast majority of cases, the Lustre software can cope with any
58     inconsistencies found on the disk and between other devices in the file
59     system.</para>
60     <note>
61       <para>The legacy offline-LFSCK tool included with e2fsprogs is rarely
62       required for Lustre file system operation. offline-LFSCK is not to be
63       confused with the LFSCK tool, which is part of Lustre and provides online
64       consistency checking.</para>
65     </note>
66     <para>For problem analysis, it is strongly recommended that
67     <literal>e2fsck</literal> be run under a logger, like script, to record all
68     of the output and changes that are made to the file system in case this
69     information is needed later.</para>
70     <para>If time permits, it is also a good idea to first run
71     <literal>e2fsck</literal> in non-fixing mode (-n option) to assess the type
72     and extent of damage to the file system. The drawback is that in this mode,
73     <literal>e2fsck</literal> does not recover the file system journal, so there
74     may appear to be file system corruption when none really exists.</para>
75     <para>To address concerns about whether corruption is real or only due to
76     the journal not being replayed, you can briefly mount and unmount the
77     <literal>ldiskfs</literal> file system directly on the node with the Lustre
78     file system stopped, using a command similar to:</para>
79     <screen>mount -t ldiskfs /dev/{ostdev} /mnt/ost; umount /mnt/ost</screen>
80     <para>This causes the journal to be recovered.</para>
81     <para>The
82     <literal>e2fsck</literal> utility works well when fixing file system
83     corruption (better than similar file system recovery tools and a primary
84     reason why
85     <literal>ldiskfs</literal> was chosen over other file systems). However, it
86     is often useful to identify the type of damage that has occurred so an
87     <literal>ldiskfs</literal> expert can make intelligent decisions about what
88     needs fixing, rather than leaving every decision to
89     <literal>e2fsck</literal>.</para>
90     <screen>root# {stop lustre services for this device, if running}
91 root# script /tmp/e2fsck.sda
92 Script started, file is /tmp/e2fsck.sda
93 root# mount -t ldiskfs /dev/sda /mnt/ost
94 root# umount /mnt/ost
95 root# e2fsck -fn /dev/sda   # don't fix file system, just check for corruption
96 :
97 [e2fsck output]
98 :
99 root# e2fsck -fp /dev/sda   # fix errors with prudent answers (usually <literal>yes</literal>)</screen>
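<para>When the check is scripted, the exit status of <literal>e2fsck</literal> can be recorded alongside its output. The sketch below decodes that status using the flag values listed in the e2fsck(8) manual page. The function name is hypothetical, and the short descriptions are paraphrased from the manual page.</para>

```shell
# Describe an e2fsck exit status, which is a sum of the flag values
# documented in the e2fsck(8) manual page.
e2fsck_status_summary() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    summary=""
    for pair in "1:errors corrected" "2:errors corrected, reboot advised" \
                "4:errors left uncorrected" "8:operational error" \
                "16:usage or syntax error" "32:cancelled by user" \
                "128:shared library error"; do
        bit=${pair%%:*}
        text=${pair#*:}
        # Integer division and modulo test whether this flag is set.
        if [ $(( (code / bit) % 2 )) -eq 1 ]; then
            summary="${summary:+$summary; }$text"
        fi
    done
    echo "$summary"
}
```

<para>After a run in non-fixing mode, <literal>e2fsck_status_summary $?</literal> reporting uncorrected errors indicates that problems were found and a fixing pass is still needed.</para>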
100   </section>
101   <section xml:id="dbdoclet.50438225_37365">
102     <title>
103     <indexterm>
104       <primary>recovery</primary>
105       <secondary>corruption of Lustre file system</secondary>
106     </indexterm>Recovering from Corruption in the Lustre File System</title>
107     <para>In cases where an ldiskfs MDT or OST becomes corrupt, you need to run
108     e2fsck to correct the local filesystem consistency, then use
109     <literal>LFSCK</literal> to run a distributed check on the file system to
110     resolve any inconsistencies between the MDTs and OSTs, or among MDTs.</para>
111     <orderedlist>
112       <listitem>
113         <para>Stop the Lustre file system.</para>
114       </listitem>
115       <listitem>
116         <para>Run
117         <literal>e2fsck -f</literal> on the individual MDT/OST that had
118         problems to fix any local file system damage.</para>
119         <para>We recommend running
120         <literal>e2fsck</literal> under script, to create a log of changes made
121         to the file system in case it is needed later. After
122         <literal>e2fsck</literal> is run, bring up the file system, if
123         necessary, to reduce the outage window.</para>
124       </listitem>
125     </orderedlist>
126     <section xml:id="dbdoclet.50438225_13916">
127       <title>
128       <indexterm>
129         <primary>recovery</primary>
130         <secondary>orphaned objects</secondary>
131       </indexterm>Working with Orphaned Objects</title>
132       <para>The simplest problem to resolve is that of orphaned objects. When
133       the LFSCK layout check is run, these objects are linked to new files and
134       put into
135       <literal>.lustre/lost+found/MDT<replaceable>xxxx</replaceable></literal>
136       in the Lustre file system, where
137       <replaceable>xxxx</replaceable> is the index of the MDT on which the orphan
138       was found. There they can be examined and saved or deleted as necessary.</para>
139       <para condition='l27'>With Lustre version 2.7 and later, LFSCK will
140        identify and process orphan objects found on MDTs as well.</para>
141     </section>
142   </section>
143   <section xml:id="dbdoclet.50438225_12316">
144     <title>
145     <indexterm>
146       <primary>recovery</primary>
147       <secondary>unavailable OST</secondary>
148     </indexterm>Recovering from an Unavailable OST</title>
149     <para>One problem encountered in a Lustre file system environment is when
150     an OST becomes unavailable due to a network partition, OSS node crash, etc.
151     When this happens, the OST's clients pause and wait for the OST to become
152     available again, either on the primary OSS or a failover OSS. When the OST
153     comes back online, the Lustre file system starts a recovery process to
154     enable clients to reconnect to the OST. Lustre servers put a limit on the
155     time they will wait in recovery for clients to reconnect.</para>
156     <para>During recovery, clients reconnect and replay their requests
157     serially, in the same order they were done originally. Until a client
158     receives a confirmation that a given transaction has been written to stable
159     storage, the client holds on to the transaction, in case it needs to be
160     replayed. Periodically, a progress message prints to the log, stating
161     how_many/expected clients have reconnected. If the recovery is aborted,
162     this log shows how many clients managed to reconnect. When all clients have
163     completed recovery, or if the recovery timeout is reached, the recovery
164     period ends and the OST resumes normal request processing.</para>
165     <para>If some clients fail to replay their requests during the recovery
166     period, this will not stop the recovery from completing. You may have a
167     situation where the OST recovers, but some clients are not able to
168     participate in recovery (e.g. network problems or client failure), so they
169     are evicted and their requests are not replayed. This would result in any
170     operations on the evicted clients failing, including in-progress writes,
171     which would cause cached writes to be lost. This is a normal outcome; the
172     recovery cannot wait indefinitely, or the file system would be hung any
173     time a client failed. The lost transactions are an unfortunate result of
174     the recovery process.</para>
175     <note>
176       <para>The failure of client recovery does not indicate or lead to
177       filesystem corruption. This is a normal event that is handled by the MDT
178       and OST, and should not result in any inconsistencies between
179       servers.</para>
180     </note>
181     <note>
182       <para>The version-based recovery (VBR) feature enables a failed client to
183       be "skipped", so the remaining clients can replay their requests, which
184       improves the chance of a successful recovery from a downed OST. For more
185       information about the VBR feature, see
186       <xref linkend="lustrerecovery" /> (Version-based Recovery).</para>
187     </note>
188   </section>
189   <section xml:id="dbdoclet.lfsckadmin" condition='l23'>
190     <title>
191     <indexterm>
192       <primary>recovery</primary>
193       <secondary>oiscrub</secondary>
194     </indexterm>
195     <indexterm>
196       <primary>recovery</primary>
197       <secondary>LFSCK</secondary>
198     </indexterm>Checking the file system with LFSCK</title>
199     <para condition='l23'>LFSCK is an administrative tool introduced in Lustre
200     software release 2.3 for checking and repair of the attributes specific to a
201     mounted Lustre file system. It is similar in concept to an offline fsck repair
202     tool for a local filesystem, but LFSCK is implemented to run as part of the
203     Lustre file system while the file system is mounted and in use. This allows
204     consistency checking and repair by the Lustre software without unnecessary
205     downtime, and LFSCK can be run on the largest Lustre file systems with
206     negligible disruption to normal operations.</para>
207     <para condition='l23'>Since Lustre software release 2.3, LFSCK can verify
208     and repair the Object Index (OI) Table, an internal table that maps
209     Lustre File Identifiers (FIDs) to MDT internal ldiskfs inode numbers.
210     An OI Scrub traverses the OI
211     Table and makes corrections where necessary. An OI Scrub is required after
212     restoring from a file-level MDT backup (
213     <xref linkend="dbdoclet.50438207_71633" />), or in case the OI Table is
214     otherwise corrupted. Later phases of LFSCK will add further checks to the
215     Lustre distributed file system state.</para>
216     <para condition='l24'>In Lustre software release 2.4, LFSCK namespace
217     scanning can verify and repair the directory FID-in-Dirent and LinkEA
218     consistency.</para>
219     <para condition='l26'>In Lustre software release 2.6, LFSCK layout scanning
220     can verify and repair MDT-OST file layout inconsistencies. File layout
221     inconsistencies between MDT-objects and OST-objects that are checked and
222     corrected include dangling reference, unreferenced OST-objects, mismatched
223     references and multiple references.</para>
224     <para condition='l27'>In Lustre software release 2.7, LFSCK layout scanning
225     is enhanced to verify and repair inconsistencies between multiple
226     MDTs.</para>
227     <para>Control and monitoring of LFSCK is through <literal>lctl</literal>
228     commands and the <literal>/proc</literal> file system interfaces. LFSCK
229     supports three types of interface: switch interface, status interface, and
230     adjustment interface. These interfaces are detailed below.</para>
231     <section>
232       <title>LFSCK switch interface</title>
233       <section>
234         <title>Manually Starting LFSCK</title>
235         <section>
236           <title>Description</title>
237           <para>LFSCK can be started after the MDT is mounted using the
238           <literal>lctl lfsck_start</literal> command.</para>
239         </section>
240         <section>
241           <title>Usage</title>
242 <screen>lctl lfsck_start &lt;-M | --device <replaceable>[MDT,OST]_device</replaceable>&gt; \
243                     [-A | --all] \
244                     [-c | --create_ostobj <replaceable>on | off</replaceable>] \
245                     [-C | --create_mdtobj <replaceable>on | off</replaceable>] \
246                     [-e | --error <replaceable>{continue | abort}</replaceable>] \
247                     [-h | --help] \
248                     [-n | --dryrun <replaceable>on | off</replaceable>] \
249                     [-o | --orphan] \
250                     [-r | --reset] \
251                     [-s | --speed <replaceable>ops_per_sec_limit</replaceable>] \
252                     [-t | --type <replaceable>check_type[,check_type...]</replaceable>] \
253                     [-w | --window_size <replaceable>size</replaceable>]</screen>
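<para>As an illustration of the synopsis above, the following sketch composes an <literal>lctl lfsck_start</literal> command line and prints it without running it. The function name and the target name <literal>testfs-MDT0000</literal> are hypothetical; only flags that appear in the synopsis are emitted.</para>

```shell
# Compose (but do not run) an lctl lfsck_start command line.
# Arguments: device, optional comma-separated check types,
# optional speed limit in objects per second.
lfsck_start_cmd() {
    device=$1
    types=${2:-}
    speed=${3:-}
    cmd="lctl lfsck_start -M $device"
    if [ -n "$types" ]; then cmd="$cmd -t $types"; fi
    if [ -n "$speed" ]; then cmd="$cmd -s $speed"; fi
    printf '%s\n' "$cmd"
}

# testfs-MDT0000 is a hypothetical target name.
lfsck_start_cmd testfs-MDT0000 namespace,layout 100
```

<para>Printing the command first makes it easy to review the chosen target and check types before running it on a production file system.</para>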
254         </section>
255         <section>
256           <title>Options</title>
257           <para>The various
258           <literal>lfsck_start</literal> options are listed and described below.
259           For a complete list of available options, type
260           <literal>lctl lfsck_start -h</literal>.</para>
261           <informaltable frame="all">
262             <tgroup cols="2">
263               <colspec colname="c1" colwidth="3*" />
264               <colspec colname="c2" colwidth="7*" />
265               <thead>
266                 <row>
267                   <entry>
268                     <para>
269                       <emphasis role="bold">Option</emphasis>
270                     </para>
271                   </entry>
272                   <entry>
273                     <para>
274                       <emphasis role="bold">Description</emphasis>
275                     </para>
276                   </entry>
277                 </row>
278               </thead>
279               <tbody>
280                 <row>
281                   <entry>
282                     <para>
283                       <literal>-M | --device</literal>
284                     </para>
285                   </entry>
286                   <entry>
287                     <para>The MDT or OST target to start LFSCK on.</para>
288                   </entry>
289                 </row>
290                 <row>
291                   <entry>
292                     <para>
293                       <literal>-A | --all</literal>
294                     </para>
295                   </entry>
296                   <entry>
297                     <para condition='l26'>Start LFSCK on all
298                     targets on all servers simultaneously.
299                     By default, both layout and namespace
300                     consistency checking and repair are started.</para>
301                   </entry>
302                 </row>
303                 <row>
304                   <entry>
305                     <para>
306                       <literal>-c | --create_ostobj</literal>
307                     </para>
308                   </entry>
309                   <entry>
310                     <para condition='l26'>Create the lost OST-object for a
311                     dangling LOV EA:
312                     <literal>off</literal> (default) or
313                     <literal>on</literal>. If not specified, the default
314                     behavior is to keep the dangling LOV EA in place without
315                     creating the lost OST-object.</para>
316                   </entry>
317                 </row>
318                 <row>
319                   <entry>
320                     <para>
321                       <literal>-C | --create_mdtobj</literal>
322                     </para>
323                   </entry>
324                   <entry>
325                     <para condition='l27'>Create the lost MDT-object for a
326                     dangling name entry:
327                     <literal>off</literal> (default) or
328                     <literal>on</literal>. If not specified, the default
329                     behavior is to keep the dangling name entry in place without
330                     creating the lost MDT-object.</para>
331                   </entry>
332                 </row>
333                 <row>
334                   <entry>
335                     <para>
336                       <literal>-e | --error</literal>
337                     </para>
338                   </entry>
339                   <entry>
340                     <para>Error handling:
341                     <literal>continue</literal> (default) or
342                     <literal>abort</literal>. Specifies whether LFSCK
343                     stops if it fails to repair something. If it is not
344                     specified, the saved value (when resuming from checkpoint)
345                     will be used if present. This option cannot be changed
346                     while LFSCK is running.</para>
347                   </entry>
348                 </row>
349                 <row>
350                   <entry>
351                     <para>
352                       <literal>-h | --help</literal>
353                     </para>
354                   </entry>
355                   <entry>
356                     <para>Display help information.</para>
357                   </entry>
358                 </row>
359                 <row>
360                   <entry>
361                     <para>
362                       <literal>-n | --dryrun</literal>
363                     </para>
364                   </entry>
365                   <entry>
366                     <para>Perform a trial run without making any changes:
367                     <literal>off</literal> (default) or
368                     <literal>on</literal>.</para>
369                   </entry>
370                 </row>
371                 <row>
372                   <entry>
373                     <para>
374                       <literal>-o | --orphan</literal>
375                     </para>
376                   </entry>
377                   <entry>
378                     <para condition='l26'>Repair orphan OST-objects for layout
379                     LFSCK.</para>
380                   </entry>
381                 </row>
382                 <row>
383                   <entry>
384                     <para>
385                       <literal>-r | --reset</literal>
386                     </para>
387                   </entry>
388                   <entry>
389                     <para>Reset the start position for the object iteration to
390                     the beginning for the specified MDT. By default the
391                     iterator will resume scanning from the last checkpoint
392                     (saved periodically by LFSCK) provided it is
393                     available.</para>
394                   </entry>
395                 </row>
396                 <row>
397                   <entry>
398                     <para>
399                       <literal>-s | --speed</literal>
400                     </para>
401                   </entry>
402                   <entry>
403                     <para>Set the upper speed limit of LFSCK processing in
404                     objects per second. If it is not specified, the saved value
405                     (when resuming from checkpoint) or default value of 0 (0 =
406                     run as fast as possible) is used. Speed can be adjusted
407                     while LFSCK is running with the adjustment
408                     interface.</para>
409                   </entry>
410                 </row>
411                 <row>
412                   <entry>
413                     <para>
414                       <literal>-t | --type</literal>
415                     </para>
416                   </entry>
417                   <entry>
418                     <para>The type of checking/repairing that should be
419                     performed. The new LFSCK framework provides a single
420                     interface for a variety of system consistency
421                     checking/repairing operations, including:</para>
422                     <para condition='l24'>
423                     <literal>namespace</literal>: check and repair
424                     FID-in-Dirent and LinkEA consistency.</para>
425                     <para condition='l27'>Lustre software release 2.7 enhances
426                     namespace consistency verification under DNE mode.</para>
427                     <para condition='l26'>
428                     <literal>layout</literal>: check and repair MDT-OST
429                     inconsistency.</para>
430                     <para>If no type is specified, the LFSCK component(s)
431                     that ran last time and did not finish, or the component(s)
432                     corresponding to some known system inconsistency, will be
433                     started. Any time LFSCK is triggered, the OI scrub
434                     runs automatically, so there is no need to specify
435                     OI_scrub in that case.</para>
436                   </entry>
437                 </row>
438                 <row>
439                   <entry>
440                     <para>
441                       <literal>-w | --window_size</literal>
442                     </para>
443                   </entry>
444                   <entry>
445                     <para condition='l26'>The window size for the async request
446                     pipeline. The input and output of the LFSCK async request
447                     pipeline may have quite different processing speeds, so
448                     enough requests may accumulate in the pipeline to cause abnormal
449                     memory/network pressure. If not specified, the default
450                     window size for the async request pipeline is 1024.</para>
451                   </entry>
452                 </row>
453               </tbody>
454             </tgroup>
455           </informaltable>
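<para>A wrapper script might check option values against the table above before calling <literal>lctl</literal>. The following is a minimal sketch of such a check. The function name is hypothetical, and it assumes that the speed limit and window size are non-negative integers, which the table implies but does not state outright.</para>

```shell
# Return 0 if the value is acceptable for the given lfsck_start option
# according to the options table; return 1 otherwise.
lfsck_option_ok() {
    opt=$1
    val=$2
    case "$opt" in
        -e|--error)
            case "$val" in continue|abort) return 0 ;; esac ;;
        -n|--dryrun|-c|--create_ostobj|-C|--create_mdtobj)
            case "$val" in on|off) return 0 ;; esac ;;
        -s|--speed|-w|--window_size)
            # Assumption: these take non-negative integers.
            case "$val" in
                ''|*[!0-9]*) return 1 ;;
                *) return 0 ;;
            esac ;;
    esac
    return 1
}
```

<para>For instance, <literal>lfsck_option_ok --error abort</literal> succeeds, while <literal>lfsck_option_ok -n maybe</literal> fails.</para>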
456         </section>
457       </section>
458       <section>
459         <title>Manually Stopping LFSCK</title>
460         <section>
461           <title>Description</title>
462           <para>To stop LFSCK when the MDT is mounted, use the
463           <literal>lctl lfsck_stop</literal> command.</para>
464         </section>
465         <section>
466           <title>Usage</title>
467 <screen>lctl lfsck_stop &lt;-M | --device <replaceable>[MDT,OST]_device</replaceable>&gt; \
468                     [-A | --all] \
469                     [-h | --help]</screen>
470         </section>
471         <section>
472           <title>Options</title>
473           <para>The various
474           <literal>lfsck_stop</literal> options are listed and described below.
475           For a complete list of available options, type
476           <literal>lctl lfsck_stop -h</literal>.</para>
477           <informaltable frame="all">
478             <tgroup cols="2">
479               <colspec colname="c1" colwidth="3*" />
480               <colspec colname="c2" colwidth="7*" />
481               <thead>
482                 <row>
483                   <entry>
484                     <para>
485                       <emphasis role="bold">Option</emphasis>
486                     </para>
487                   </entry>
488                   <entry>
489                     <para>
490                       <emphasis role="bold">Description</emphasis>
491                     </para>
492                   </entry>
493                 </row>
494               </thead>
495               <tbody>
496                 <row>
497                   <entry>
498                     <para>
499                       <literal>-M | --device</literal>
500                     </para>
501                   </entry>
502                   <entry>
503                     <para>The MDT or OST target to stop LFSCK on.</para>
504                   </entry>
505                 </row>
506                 <row>
507                   <entry>
508                     <para>
509                       <literal>-A | --all</literal>
510                     </para>
511                   </entry>
512                   <entry>
513                     <para>Stop LFSCK on all targets on all servers
514                     simultaneously.</para>
515                   </entry>
516                 </row>
517                 <row>
518                   <entry>
519                     <para>
520                       <literal>-h | --help</literal>
521                     </para>
522                   </entry>
523                   <entry>
524                     <para>Display help information.</para>
525                   </entry>
526                 </row>
527               </tbody>
528             </tgroup>
529           </informaltable>
530         </section>
531       </section>
532     </section>
533     <section>
534       <title>LFSCK status interface</title>
535       <section>
536         <title>LFSCK status of OI Scrub via
537         <literal>procfs</literal></title>
538         <section>
539           <title>Description</title>
540           <para>For each LFSCK component there is a dedicated procfs interface
541           to trace the corresponding LFSCK component status. For OI Scrub, the
542           interface is the OSD layer procfs interface, named
543           <literal>oi_scrub</literal>. To display OI Scrub status, the standard
544           <literal>lctl get_param</literal> command is used as shown in the
545           usage below.</para>
546         </section>
547         <section>
548           <title>Usage</title>
549           <screen>lctl get_param -n osd-ldiskfs.<replaceable>FSNAME</replaceable>-[<replaceable>MDT_target|OST_target</replaceable>].oi_scrub</screen>
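<para>For monitoring scripts, the parameter name can be built from the file system and target names, and the status field extracted from the output. This is a sketch only: the function names and the <literal>testfs</literal> and <literal>MDT0000</literal> names are hypothetical, and the parser assumes the output contains a line of the form <literal>status: value</literal>, which should be checked against the actual output on your release.</para>

```shell
# Build the oi_scrub parameter name following the usage line above.
oi_scrub_param() {
    fsname=$1
    target=$2
    printf 'osd-ldiskfs.%s-%s.oi_scrub\n' "$fsname" "$target"
}

# Read oi_scrub output on stdin and print the value of the status field.
# Assumes a "status: value" line (assumption, not confirmed by this text).
oi_scrub_status() {
    awk -F': *' '$1 == "status" { print $2; exit }'
}
```

<para>A typical use would be <literal>lctl get_param -n "$(oi_scrub_param testfs MDT0000)" | oi_scrub_status</literal>, run on the server that hosts the target.</para>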
550         </section>
551         <section>
552           <title>Output</title>
553           <informaltable frame="all">
554             <tgroup cols="2">
555               <colspec colname="c1" colwidth="3*" />
556               <colspec colname="c2" colwidth="7*" />
557               <thead>
558                 <row>
559                   <entry>
560                     <para>
561                       <emphasis role="bold">Information</emphasis>
562                     </para>
563                   </entry>
564                   <entry>
565                     <para>
566                       <emphasis role="bold">Detail</emphasis>
567                     </para>
568                   </entry>
569                 </row>
570               </thead>
571               <tbody>
572                 <row>
573                   <entry>
574                     <para>General Information</para>
575                   </entry>
576                   <entry>
577                     <itemizedlist>
578                       <listitem>
579                         <para>Name: OI_scrub.</para>
580                       </listitem>
581                       <listitem>
582                         <para>OI scrub magic id (an identifier unique to OI
583                         scrub).</para>
584                       </listitem>
585                       <listitem>
586                         <para>OI files count.</para>
587                       </listitem>
588                       <listitem>
589                         <para>Status: one of
590                         <literal>init</literal>,
591                         <literal>scanning</literal>,
592                         <literal>completed</literal>,
593                         <literal>failed</literal>,
594                         <literal>stopped</literal>,
595                         <literal>paused</literal>, or
596                         <literal>crashed</literal>.</para>
597                       </listitem>
598                       <listitem>
599                         <para>Flags: including
600                         <literal>recreated</literal> (OI file(s) is/are
601                         removed/recreated),
602                         <literal>inconsistent</literal> (restored from
603                         file-level backup),
604                         <literal>auto</literal> (triggered by non-UI mechanism),
605                         and
606                         <literal>upgrade</literal> (from Lustre software release
607                         1.8 IGIF format).</para>
608                       </listitem>
609                       <listitem>
610                         <para>Parameters: OI scrub parameters, like
611                         <literal>failout</literal>.</para>
612                       </listitem>
613                       <listitem>
614                         <para>Time Since Last Completed.</para>
615                       </listitem>
616                       <listitem>
617                         <para>Time Since Latest Start.</para>
618                       </listitem>
619                       <listitem>
620                         <para>Time Since Last Checkpoint.</para>
621                       </listitem>
622                       <listitem>
623                         <para>Latest Start Position: the position from which
624                         the latest scrub started.</para>
625                       </listitem>
626                       <listitem>
627                         <para>Last Checkpoint Position.</para>
628                       </listitem>
629                       <listitem>
630                         <para>First Failure Position: the position for the
631                         first object to be repaired.</para>
632                       </listitem>
633                       <listitem>
634                         <para>Current Position.</para>
635                       </listitem>
636                     </itemizedlist>
637                   </entry>
638                 </row>
639                 <row>
640                   <entry>
641                     <para>Statistics</para>
642                   </entry>
643                   <entry>
644                     <itemizedlist>
645                       <listitem>
646                         <para>
647                         <literal>Checked</literal> total number of objects
648                         scanned.</para>
649                       </listitem>
650                       <listitem>
651                         <para>
652                         <literal>Updated</literal> total number of objects
653                         repaired.</para>
654                       </listitem>
655                       <listitem>
656                         <para>
657                         <literal>Failed</literal> total number of objects that
658                         failed to be repaired.</para>
659                       </listitem>
660                       <listitem>
661                         <para>
662                         <literal>No Scrub</literal> total number of objects
663                         marked
664                         <literal>LDISKFS_STATE_LUSTRE_NOSCRUB</literal> and
665                         skipped.</para>
666                       </listitem>
667                       <listitem>
668                         <para>
669                         <literal>IGIF</literal> total number of IGIF objects
670                         scanned.</para>
671                       </listitem>
672                       <listitem>
673                         <para>
674                         <literal>Prior Updated</literal> total number of
675                         objects whose repair was triggered by a parallel
676                         RPC.</para>
677                       </listitem>
678                       <listitem>
679                         <para>
680                         <literal>Success Count</literal> total number of
681                         completed OI_scrub runs on the target.</para>
682                       </listitem>
683                       <listitem>
684                         <para>
685                         <literal>Run Time</literal> how long the scrub has run,
686                         counted from the start of scanning at the beginning of
687                         the specified MDT target and excluding the
688                         paused/failure time between checkpoints.</para>
689                       </listitem>
690                       <listitem>
691                         <para>
692                         <literal>Average Speed</literal> calculated by dividing
693                         <literal>Checked</literal> by
694                         <literal>run_time</literal>.</para>
695                       </listitem>
696                       <listitem>
697                         <para>
698                         <literal>Real-Time Speed</literal> the speed since the
699                         last checkpoint if the OI_scrub is running.</para>
700                       </listitem>
701                       <listitem>
702                         <para>
703                         <literal>Scanned</literal> total number of objects under
704                         /lost+found that have been scanned.</para>
705                       </listitem>
706                       <listitem>
707                         <para>
708                         <literal>Repaired</literal> total number of objects
709                         under /lost+found that have been recovered.</para>
710                       </listitem>
711                       <listitem>
712                         <para>
713                         <literal>Failed</literal> total number of objects under
714                         /lost+found that failed to be scanned or
715                         recovered.</para>
716                       </listitem>
717                     </itemizedlist>
718                   </entry>
719                 </row>
720               </tbody>
721             </tgroup>
722           </informaltable>
723         </section>
724       </section>
725       <section condition='l24'>
726         <title>LFSCK status of namespace via
727         <literal>procfs</literal></title>
728         <section>
729           <title>Description</title>
730           <para>The
731           <literal>namespace</literal> component is responsible for checks
732           described in <xref linkend="dbdoclet.lfsckadmin" />. The
733           <literal>procfs</literal> interface for this component is in the
734           MDD layer, named
735           <literal>lfsck_namespace</literal>. To show the status of this
736           component,
737           <literal>lctl get_param</literal> should be used as described in the
738           usage below.</para>
739           <para>The LFSCK namespace status output refers to phase 1 and phase 2.
740           Phase 1 is when the LFSCK main engine, which runs on each MDT,
741           linearly scans its local device, guaranteeing that all local objects
742           are checked.  However, there are certain cases in which LFSCK cannot
743           know whether an object is consistent or cannot repair an inconsistency
744           until the phase 1 scanning is completed. During phase 2 of the
745           namespace check, objects with multiple hard-links, objects with remote
746           parents, and other objects which couldn't be verified during phase 1
747           will be checked.</para>
748         </section>
749         <section>
750           <title>Usage</title>
751           <screen>lctl get_param -n mdd.<replaceable>FSNAME</replaceable>-<replaceable>MDT_target</replaceable>.lfsck_namespace</screen>
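          <para>As a minimal sketch of reading one field from this output, the
          helper below prints the value of the status line. It assumes the
          status is reported on a line of the form
          <literal>status: VALUE</literal>; the function name
          <literal>lfsck_status</literal> and the target name
          <literal>testfs-MDT0000</literal> are hypothetical.</para>
          <screen># Print the value that follows "status:" in LFSCK output read from stdin.
# Assumes a single "status: VALUE" line; prints nothing if none is found.
lfsck_status() {
    awk '$1 == "status:" { print $2; exit }'
}

# Hypothetical usage on an MDS, with testfs-MDT0000 as an example target:
#   lctl get_param -n mdd.testfs-MDT0000.lfsck_namespace | lfsck_status</screen>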
752         </section>
753         <section>
754           <title>Output</title>
755           <informaltable frame="all">
756             <tgroup cols="2">
757               <colspec colname="c1" colwidth="3*" />
758               <colspec colname="c2" colwidth="7*" />
759               <thead>
760                 <row>
761                   <entry>
762                     <para>
763                       <emphasis role="bold">Information</emphasis>
764                     </para>
765                   </entry>
766                   <entry>
767                     <para>
768                       <emphasis role="bold">Detail</emphasis>
769                     </para>
770                   </entry>
771                 </row>
772               </thead>
773               <tbody>
774                 <row>
775                   <entry>
776                     <para>General Information</para>
777                   </entry>
778                   <entry>
779                     <itemizedlist>
780                       <listitem>
781                         <para>Name:
782                         <literal>lfsck_namespace</literal></para>
783                       </listitem>
784                       <listitem>
785                         <para>LFSCK namespace magic.</para>
786                       </listitem>
787                       <listitem>
788                         <para>LFSCK namespace version.</para>
789                       </listitem>
790                       <listitem>
791                         <para>Status: one of
792                         <literal>init</literal>,
793                         <literal>scanning-phase1</literal>,
794                         <literal>scanning-phase2</literal>,
795                         <literal>completed</literal>,
796                         <literal>failed</literal>,
797                         <literal>stopped</literal>,
798                         <literal>paused</literal>,
799                         <literal>partial</literal>,
800                         <literal>co-failed</literal>,
801                         <literal>co-stopped</literal> or
802                         <literal>co-paused</literal>.</para>
803                       </listitem>
804                       <listitem>
805                         <para>Flags: including
806                         <literal>scanned-once</literal> (the first cycle
807                         scanning has been completed),
808                         <literal>inconsistent</literal> (one or more
809                         inconsistent FID-in-Dirent or LinkEA entries have
810                         been discovered), and
811                         <literal>upgrade</literal> (from Lustre software release
812                         1.8 IGIF format).</para>
813                       </listitem>
814                       <listitem>
815                         <para>Parameters: including
816                         <literal>dryrun</literal>,
817                         <literal>all_targets</literal>,
818                         <literal>failout</literal>,
819                         <literal>broadcast</literal>,
820                         <literal>orphan</literal>,
821                         <literal>create_ostobj</literal> and
822                         <literal>create_mdtobj</literal>.</para>
823                       </listitem>
824                       <listitem>
825                         <para>Time Since Last Completed.</para>
826                       </listitem>
827                       <listitem>
828                         <para>Time Since Latest Start.</para>
829                       </listitem>
830                       <listitem>
831                         <para>Time Since Last Checkpoint.</para>
832                       </listitem>
833                       <listitem>
834                         <para>Latest Start Position: the position at which
835                         checking most recently began.</para>
836                       </listitem>
837                       <listitem>
838                         <para>Last Checkpoint Position.</para>
839                       </listitem>
840                       <listitem>
841                         <para>First Failure Position: the position for the
842                         first object to be repaired.</para>
843                       </listitem>
844                       <listitem>
845                         <para>Current Position.</para>
846                       </listitem>
847                     </itemizedlist>
848                   </entry>
849                 </row>
850                 <row>
851                   <entry>
852                     <para>Statistics</para>
853                   </entry>
854                   <entry>
855                     <itemizedlist>
856                       <listitem>
857                         <para>
858                         <literal>Checked Phase1</literal> total number of
859                         objects scanned during
860                         <literal>scanning-phase1</literal>.</para>
861                       </listitem>
862                       <listitem>
863                         <para>
864                         <literal>Checked Phase2</literal> total number of
865                         objects scanned during
866                         <literal>scanning-phase2</literal>.</para>
867                       </listitem>
868                       <listitem>
869                         <para>
870                         <literal>Updated Phase1</literal> total number of
871                         objects repaired during
872                         <literal>scanning-phase1</literal>.</para>
873                       </listitem>
874                       <listitem>
875                         <para>
876                         <literal>Updated Phase2</literal> total number of
877                         objects repaired during
878                         <literal>scanning-phase2</literal>.</para>
879                       </listitem>
880                       <listitem>
881                         <para>
882                         <literal>Failed Phase1</literal> total number of objects
883                         that failed to be repaired during
884                         <literal>scanning-phase1</literal>.</para>
885                       </listitem>
886                       <listitem>
887                         <para>
888                         <literal>Failed Phase2</literal> total number of objects
889                         that failed to be repaired during
890                         <literal>scanning-phase2</literal>.</para>
891                       </listitem>
892                       <listitem>
893                         <para>
894                         <literal>directories</literal> total number of
895                         directories scanned.</para>
896                       </listitem>
897                       <listitem>
898                         <para>
899                         <literal>multiple_linked_checked</literal> total number
900                         of multiple-linked objects that have been
901                         scanned.</para>
902                       </listitem>
903                       <listitem>
904                         <para>
905                         <literal>dirent_repaired</literal> total number of
906                         FID-in-dirent entries that have been repaired.</para>
907                       </listitem>
908                       <listitem>
909                         <para>
910                         <literal>linkea_repaired</literal> total number of
911                         linkEA entries that have been repaired.</para>
912                       </listitem>
913                       <listitem>
914                         <para>
915                         <literal>unknown_inconsistency</literal> total number of
916                         undefined inconsistencies found in
917                         scanning-phase2.</para>
918                       </listitem>
919                       <listitem>
920                         <para>
921                         <literal>unmatched_pairs_repaired</literal> total number
922                         of unmatched pairs that have been repaired.</para>
923                       </listitem>
924                       <listitem>
925                         <para>
926                         <literal>dangling_repaired</literal> total number of
927                         dangling name entries that have been
928                         found/repaired.</para>
929                       </listitem>
930                       <listitem>
931                         <para>
932                         <literal>multi_referenced_repaired</literal> total
933                         number of multiple referenced name entries that have
934                         been found/repaired.</para>
935                       </listitem>
936                       <listitem>
937                         <para>
938                         <literal>bad_file_type_repaired</literal> total number
939                         of name entries with bad file type that have been
940                         repaired.</para>
941                       </listitem>
942                       <listitem>
943                         <para>
944                         <literal>lost_dirent_repaired</literal> total number of
945                         lost name entries that have been re-inserted.</para>
946                       </listitem>
947                       <listitem>
948                         <para>
949                         <literal>striped_dirs_scanned</literal> total number of
950                         striped directories (master) that have been
951                         scanned.</para>
952                       </listitem>
953                       <listitem>
954                         <para>
955                         <literal>striped_dirs_repaired</literal> total number of
956                         striped directories (master) that have been
957                         repaired.</para>
958                       </listitem>
959                       <listitem>
960                         <para>
961                         <literal>striped_dirs_failed</literal> total number of
962                         striped directories (master) that have failed to be
963                         verified.</para>
964                       </listitem>
965                       <listitem>
966                         <para>
967                         <literal>striped_dirs_disabled</literal> total number of
968                         striped directories (master) that have been
969                         disabled.</para>
970                       </listitem>
971                       <listitem>
972                         <para>
973                         <literal>striped_dirs_skipped</literal> total number of
974                         striped directories (master) that have been skipped
975                         (for shards verification) because of lost master LMV
976                         EA.</para>
977                       </listitem>
978                       <listitem>
979                         <para>
980                         <literal>striped_shards_scanned</literal> total number
981                         of striped directory shards (slave) that have been
982                         scanned.</para>
983                       </listitem>
984                       <listitem>
985                         <para>
986                         <literal>striped_shards_repaired</literal> total number
987                         of striped directory shards (slave) that have been
988                         repaired.</para>
989                       </listitem>
990                       <listitem>
991                         <para>
992                         <literal>striped_shards_failed</literal> total number of
993                         striped directory shards (slave) that have failed to be
994                         verified.</para>
995                       </listitem>
996                       <listitem>
997                         <para>
998                         <literal>striped_shards_skipped</literal> total number
999                         of striped directory shards (slave) that have been
1000                         skipped (for name hash verification) because LFSCK does
1001                         not know whether the slave LMV EA is valid or
1002                         not.</para>
1003                       </listitem>
1004                       <listitem>
1005                         <para>
1006                         <literal>name_hash_repaired</literal> total number of
1007                         name entries under striped directory with bad name hash
1008                         that have been repaired.</para>
1009                       </listitem>
1010                       <listitem>
1011                         <para>
1012                         <literal>nlinks_repaired</literal> total number of
1013                         objects with nlink fixed.</para>
1014                       </listitem>
1015                       <listitem>
1016                         <para>
1017                         <literal>mul_linked_repaired</literal> total number of
1018                         multiple-linked objects that have been repaired.</para>
1019                       </listitem>
1020                       <listitem>
1021                         <para>
1022                         <literal>local_lost_found_scanned</literal> total number
1023                         of objects under /lost+found that have been
1024                         scanned.</para>
1025                       </listitem>
1026                       <listitem>
1027                         <para>
1028                         <literal>local_lost_found_moved</literal> total number
1029                         of objects under /lost+found that have been moved to a
1030                         namespace-visible directory.</para>
1031                       </listitem>
1032                       <listitem>
1033                         <para>
1034                         <literal>local_lost_found_skipped</literal> total number
1035                         of objects under /lost+found that have been
1036                         skipped.</para>
1037                       </listitem>
1038                       <listitem>
1039                         <para>
1040                         <literal>local_lost_found_failed</literal> total number
1041                         of objects under /lost+found that have failed to be
1042                         processed.</para>
1043                       </listitem>
1044                       <listitem>
1045                         <para>
1046                         <literal>Success Count</literal> the total number of
1047                         completed LFSCK runs on the target.</para>
1048                       </listitem>
1049                       <listitem>
1050                         <para>
1051                         <literal>Run Time Phase1</literal> the duration of the
1052                         LFSCK run during
1053                         <literal>scanning-phase1</literal>, excluding the time
1054                         spent paused between checkpoints.</para>
1055                       </listitem>
1056                       <listitem>
1057                         <para>
1058                         <literal>Run Time Phase2</literal> the duration of the
1059                         LFSCK run during
1060                         <literal>scanning-phase2</literal>, excluding the time
1061                         spent paused between checkpoints.</para>
1062                       </listitem>
1063                       <listitem>
1064                         <para>
1065                         <literal>Average Speed Phase1</literal> calculated by
1066                         dividing
1067                         <literal>checked_phase1</literal> by
1068                         <literal>run_time_phase1</literal>.</para>
1069                       </listitem>
1070                       <listitem>
1071                         <para>
1072                         <literal>Average Speed Phase2</literal> calculated by
1073                         dividing
1074                         <literal>checked_phase2</literal> by
1075                         <literal>run_time_phase2</literal>.</para>
1076                       </listitem>
1077                       <listitem>
1078                         <para>
1079                         <literal>Real-Time Speed Phase1</literal> the speed
1080                         since the last checkpoint if the LFSCK is running
1081                         <literal>scanning-phase1</literal>.</para>
1082                       </listitem>
1083                       <listitem>
1084                         <para>
1085                         <literal>Real-Time Speed Phase2</literal> the speed
1086                         since the last checkpoint if the LFSCK is running
1087                         <literal>scanning-phase2</literal>.</para>
1088                       </listitem>
1089                     </itemizedlist>
1090                   </entry>
1091                 </row>
1092               </tbody>
1093             </tgroup>
1094           </informaltable>
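          <para>The average speed figures in the table above are a simple
          ratio. As a worked example with made-up numbers, 120000 objects
          checked in 60 seconds of phase 1 gives 120000 / 60 = 2000 objects
          per second. The sketch below performs the same division; the
          function name <literal>avg_speed</literal> is hypothetical, and
          printing 0 for a zero run time is an assumption of this sketch
          rather than documented LFSCK behavior.</para>
          <screen># avg_speed CHECKED RUN_TIME_SECONDS
# Prints CHECKED / RUN_TIME_SECONDS truncated to an integer,
# or 0 when the run time is 0, to avoid dividing by zero.
avg_speed() {
    awk -v c="$1" -v t="$2" 'BEGIN { if (t == 0) print 0; else print int(c / t) }'
}

# Example with the made-up numbers from the text above.
avg_speed 120000 60</screen>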
1095         </section>
1096       </section>
1097       <section condition='l26'>
1098         <title>LFSCK status of layout via
1099         <literal>procfs</literal></title>
1100         <section>
1101           <title>Description</title>
1102           <para>The
1103           <literal>layout</literal> component is responsible for checking and
1104           repairing MDT-OST inconsistency. The
1105           <literal>procfs</literal> interface for this component is in the MDD
1106           layer, named
1107           <literal>lfsck_layout</literal>, and in the OBD layer, named
1108           <literal>lfsck_layout</literal>. To show the status of this component,
1109           <literal>lctl get_param</literal> should be used as described in the
1110           usage below.</para>
1111           <para>The LFSCK layout status output refers to phase 1 and phase 2.
1112           Phase 1 is when the LFSCK main engine, which runs on each MDT/OST,
1113           linearly scans its local device, guaranteeing that all local objects
1114           are checked. During phase 1 of layout LFSCK, the OST-objects which are
1115           not referenced by any MDT-object are recorded in a bitmap. During
1116           phase 2 of the layout check, the OST-objects in the bitmap will be
1117           re-scanned to check whether they are really orphan objects.</para>
1118         </section>
1119         <section>
1120           <title>Usage</title>
1121           <screen>lctl get_param -n mdd.<replaceable>FSNAME</replaceable>-<replaceable>MDT_target</replaceable>.lfsck_layout
1124 lctl get_param -n obdfilter.<replaceable>FSNAME</replaceable>-<replaceable>OST_target</replaceable>.lfsck_layout</screen>
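          <para>A minimal sketch of composing both parameter names for a
          concrete file system follows. The file system name
          <literal>testfs</literal>, the target indices, and the function name
          <literal>lfsck_layout_param</literal> are hypothetical; the
          <literal>mdd</literal> query is run on the MDS and the
          <literal>obdfilter</literal> query on the OSS serving those
          targets.</para>
          <screen># lfsck_layout_param LAYER TARGET
# Prints the parameter name LAYER.TARGET.lfsck_layout.
lfsck_layout_param() {
    printf '%s.%s.lfsck_layout\n' "$1" "$2"
}

# Hypothetical usage:
#   lctl get_param -n "$(lfsck_layout_param mdd testfs-MDT0000)"
#   lctl get_param -n "$(lfsck_layout_param obdfilter testfs-OST0000)"</screen>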
1127         </section>
1128         <section>
1129           <title>Output</title>
1130           <informaltable frame="all">
1131             <tgroup cols="2">
1132               <colspec colname="c1" colwidth="3*" />
1133               <colspec colname="c2" colwidth="7*" />
1134               <thead>
1135                 <row>
1136                   <entry>
1137                     <para>
1138                       <emphasis role="bold">Information</emphasis>
1139                     </para>
1140                   </entry>
1141                   <entry>
1142                     <para>
1143                       <emphasis role="bold">Detail</emphasis>
1144                     </para>
1145                   </entry>
1146                 </row>
1147               </thead>
1148               <tbody>
1149                 <row>
1150                   <entry>
1151                     <para>General Information</para>
1152                   </entry>
1153                   <entry>
1154                     <itemizedlist>
1155                       <listitem>
1156                         <para>Name:
1157                         <literal>lfsck_layout</literal></para>
1158                       </listitem>
1159                       <listitem>
1160                         <para>LFSCK layout magic.</para>
1161                       </listitem>
1162                       <listitem>
1163                         <para>LFSCK layout version.</para>
1164                       </listitem>
1165                       <listitem>
1166                         <para>Status: one of
1167                         <literal>init</literal>,
1168                         <literal>scanning-phase1</literal>,
1169                         <literal>scanning-phase2</literal>,
1170                         <literal>completed</literal>,
1171                         <literal>failed</literal>,
1172                         <literal>stopped</literal>,
1173                         <literal>paused</literal>,
1174                         <literal>crashed</literal>,
1175                         <literal>partial</literal>,
1176                         <literal>co-failed</literal>,
1177                         <literal>co-stopped</literal>, or
1178                         <literal>co-paused</literal>.</para>
1179                       </listitem>
1180                       <listitem>
1181                         <para>Flags: including
1182                         <literal>scanned-once</literal> (the first cycle
1183                         scanning has been completed),
1184                         <literal>inconsistent</literal> (one or more MDT-OST
1185                         inconsistencies have been discovered),
1186                         <literal>incomplete</literal> (some MDT or OST did not
1187                         participate in the LFSCK or failed to finish the LFSCK),
1188                         or
1189                         <literal>crashed_lastid</literal> (the lastid files on
1190                         the OST crashed and need to be rebuilt).</para>
1191                       </listitem>
1192                       <listitem>
1193                         <para>Parameters: including
1194                         <literal>dryrun</literal>,
1195                         <literal>all_targets</literal> and
1196                         <literal>failout</literal>.</para>
1197                       </listitem>
1198                       <listitem>
1199                         <para>Time Since Last Completed.</para>
1200                       </listitem>
1201                       <listitem>
1202                         <para>Time Since Latest Start.</para>
1203                       </listitem>
1204                       <listitem>
1205                         <para>Time Since Last Checkpoint.</para>
1206                       </listitem>
1207                       <listitem>
1208                         <para>Latest Start Position: the position at which
1209                         checking most recently began.</para>
1210                       </listitem>
1211                       <listitem>
1212                         <para>Last Checkpoint Position.</para>
1213                       </listitem>
1214                       <listitem>
1215                         <para>First Failure Position: the position for the
1216                         first object to be repaired.</para>
1217                       </listitem>
1218                       <listitem>
1219                         <para>Current Position.</para>
1220                       </listitem>
1221                     </itemizedlist>
1222                   </entry>
1223                 </row>
1224                 <row>
1225                   <entry>
1226                     <para>Statistics</para>
1227                   </entry>
1228                   <entry>
1229                     <itemizedlist>
1230                       <listitem>
1231                         <para>
1232                         <literal>Success Count:</literal> the total number of
1233                         completed LFSCK runs on the target.</para>
1234                       </listitem>
1235                       <listitem>
1236                         <para>
1237                         <literal>Repaired Dangling:</literal> total number of
1238                         MDT-objects with dangling references that have been
1239                         repaired during scanning-phase1.</para>
1240                       </listitem>
1241                       <listitem>
1242                         <para>
1243                         <literal>Repaired Unmatched Pairs</literal> total number
1244                         of unmatched MDT and OST-object pairs that have been
1245                         repaired during scanning-phase1.</para>
1246                       </listitem>
1247                       <listitem>
1248                         <para>
1249                         <literal>Repaired Multiple Referenced</literal> total
1250                         number of OST-objects with multiple references that have
1251                         been repaired during scanning-phase1.</para>
1252                       </listitem>
1253                       <listitem>
1254                         <para>
1255                         <literal>Repaired Orphan</literal> total number of
1256                         orphan OST-objects that have been repaired during
1257                         scanning-phase2.</para>
1258                       </listitem>
1259                       <listitem>
1260                         <para>
1261                         <literal>Repaired Inconsistent Owner</literal> total
1262                         number of OST-objects with incorrect owner information
1263                         that have been repaired during scanning-phase1.</para>
1264                       </listitem>
1265                       <listitem>
1266                         <para>
1267                         <literal>Repaired Others</literal> total number of other
1268                         inconsistencies repaired during the scanning phases.</para>
1269                       </listitem>
1270                       <listitem>
1271                         <para>
1272                         <literal>Skipped</literal> total number of skipped
1273                         objects.</para>
1274                       </listitem>
1275                       <listitem>
1276                         <para>
1277                         <literal>Failed Phase1</literal> total number of objects
1278                         that failed to be repaired during
1279                         <literal>scanning-phase1</literal>.</para>
1280                       </listitem>
1281                       <listitem>
1282                         <para>
1283                         <literal>Failed Phase2</literal> total number of objects
1284                         that failed to be repaired during
1285                         <literal>scanning-phase2</literal>.</para>
1286                       </listitem>
1287                       <listitem>
1288                         <para>
1289                         <literal>Checked Phase1</literal> total number of
1290                         objects scanned during
1291                         <literal>scanning-phase1</literal>.</para>
1292                       </listitem>
1293                       <listitem>
1294                         <para>
1295                         <literal>Checked Phase2</literal> total number of
1296                         objects scanned during
1297                         <literal>scanning-phase2</literal>.</para>
1298                       </listitem>
1299                       <listitem>
1300                         <para>
1301                         <literal>Run Time Phase1</literal> the duration of the
1302                         LFSCK run during
1303                         <literal>scanning-phase1</literal>, excluding the time
1304                         spent paused between checkpoints.</para>
1305                       </listitem>
1306                       <listitem>
1307                         <para>
1308                         <literal>Run Time Phase2</literal> the duration of the
1309                         LFSCK run during
1310                         <literal>scanning-phase2</literal>, excluding the time
1311                         spent paused between checkpoints.</para>
1312                       </listitem>
1313                       <listitem>
1314                         <para>
1315                         <literal>Average Speed Phase1</literal> calculated by
1316                         dividing
1317                         <literal>checked_phase1</literal> by
1318                         <literal>run_time_phase1</literal>.</para>
1319                       </listitem>
1320                       <listitem>
1321                         <para>
1322                         <literal>Average Speed Phase2</literal> calculated by
1323                         dividing
1324                         <literal>checked_phase2</literal> by
1325                         <literal>run_time_phase2</literal>.</para>
1326                       </listitem>
1327                       <listitem>
1328                         <para>
1329                         <literal>Real-Time Speed Phase1</literal> the speed
1330                         since the last checkpoint if the LFSCK is running
1331                         <literal>scanning-phase1</literal>.</para>
1332                       </listitem>
1333                       <listitem>
1334                         <para>
1335                         <literal>Real-Time Speed Phase2</literal> the speed
1336                         since the last checkpoint if the LFSCK is running
1337                         <literal>scanning-phase2</literal>.</para>
1338                       </listitem>
1339                     </itemizedlist>
1340                   </entry>
1341                 </row>
1342               </tbody>
1343             </tgroup>
1344           </informaltable>
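          <para>As a worked illustration of the average speed calculation
          described above, take hypothetical values (not from a real run) of
          1200000 objects for
          <literal>checked_phase1</literal> and 600 for
          <literal>run_time_phase1</literal>, assuming the run time is
          reported in seconds:</para>
          <screen>average speed phase1 = checked_phase1 / run_time_phase1
                     = 1200000 objects / 600 seconds
                     = 2000 objects per second</screen>
          <para>The phase 2 average is obtained the same way from
          <literal>checked_phase2</literal> and
          <literal>run_time_phase2</literal>.</para>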
1345         </section>
1346       </section>
1347     </section>
1348     <section>
1349       <title>LFSCK adjustment interface</title>
1350       <section condition='l26'>
1351         <title>Rate control</title>
1352         <section>
1353           <title>Description</title>
1354           <para>The LFSCK upper speed limit can be changed using
1355           <literal>lctl set_param</literal> as shown in the usage below.</para>
1356         </section>
1357         <section>
1358           <title>Usage</title>
1359           <screen>lctl set_param mdd.${FSNAME}-${MDT_target}.lfsck_speed_limit=<replaceable>N</replaceable>
1361 lctl set_param obdfilter.${FSNAME}-${OST_target}.lfsck_speed_limit=<replaceable>N</replaceable></screen>
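          <para>As a minimal sketch of the usage above, assume a hypothetical
          file system named
          <literal>testfs</literal> with an MDT target
          <literal>MDT0000</literal> and an OST target
          <literal>OST0000</literal>. These names and the value 1000 are
          illustrative assumptions only. The commands limit LFSCK to 1000
          objects per second on each target, read the MDT setting back with
          <literal>lctl get_param</literal>, and then remove the limit
          again:</para>
          <screen># On the MDS that serves testfs-MDT0000 (hypothetical target name)
lctl set_param mdd.testfs-MDT0000.lfsck_speed_limit=1000
lctl get_param mdd.testfs-MDT0000.lfsck_speed_limit

# On the OSS that serves testfs-OST0000 (hypothetical target name)
lctl set_param obdfilter.testfs-OST0000.lfsck_speed_limit=1000

# A value of 0 removes the limit so that LFSCK runs at maximum speed
lctl set_param mdd.testfs-MDT0000.lfsck_speed_limit=0</screen>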
1363         </section>
1364         <section>
1365           <title>Values</title>
1366           <informaltable frame="all">
1367             <tgroup cols="2">
1368               <colspec colname="c1" colwidth="3*" />
1369               <colspec colname="c2" colwidth="7*" />
1370               <tbody>
1371                 <row>
1372                   <entry>
1373                     <para>0</para>
1374                   </entry>
1375                   <entry>
1376                     <para>No speed limit (run at maximum speed).</para>
1377                   </entry>
1378                 </row>
1379                 <row>
1380                   <entry>
1381                     <para>positive integer</para>
1382                   </entry>
1383                   <entry>
1384                     <para>Maximum number of objects to scan per second.</para>
1385                   </entry>
1386                 </row>
1387               </tbody>
1388             </tgroup>
1389           </informaltable>
1390         </section>
1391       </section>
1392       <section xml:id="dbdoclet.lfsck_auto_scrub">
1393         <title>Auto scrub</title>
1394         <section>
1395           <title>Description</title>
1396           <para>The
1397           <literal>auto_scrub</literal> parameter controls whether OI scrub will
1398           be triggered when an inconsistency is detected during OI lookup. It
1399           can be set as described in the usage and values sections
1400           below.</para>
1401           <para>There is also a
1402           <literal>noscrub</literal> mount option (see
1403           <xref linkend="dbdoclet.50438219_12635" />) which can be used to
1404           disable automatic OI scrub upon detection of a file-level backup at
1405           mount time. If the
1406           <literal>noscrub</literal> mount option is specified,
1407           <literal>auto_scrub</literal> will also be disabled, so OI scrub will
1408           not be triggered when an OI inconsistency is detected. Auto scrub can
1409           be re-enabled after the mount using the command shown in the usage.
1410           Manually starting LFSCK after mounting provides finer control over
1411           the starting conditions.</para>
1412         </section>
1413         <section>
1414           <title>Usage</title>
1415           <screen>lctl set_param osd_ldiskfs.${FSNAME}-${MDT_target}.auto_scrub=<replaceable>N</replaceable></screen>
1416           <para>where
1417           <replaceable>N</replaceable> is an integer as described below.</para>
1418           <note condition='l25'><para>Lustre software 2.5 and later supports the
1419           <literal>-P</literal> option, which makes the
1420           <literal>set_param</literal> setting permanent.</para></note>
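          <para>As a minimal sketch, assume a hypothetical file system named
          <literal>testfs</literal> whose MDT target is
          <literal>MDT0000</literal>. Both names are illustrative assumptions,
          and the parameter path is copied as written in the usage line above.
          The commands re-enable auto scrub, for example after mounting with
          <literal>noscrub</literal>, and read the setting back with
          <literal>lctl get_param</literal>:</para>
          <screen># On the MDS that serves testfs-MDT0000 (hypothetical target name)
lctl set_param osd_ldiskfs.testfs-MDT0000.auto_scrub=1
lctl get_param osd_ldiskfs.testfs-MDT0000.auto_scrub</screen>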
1421         </section>
1422         <section>
1423           <title>Values</title>
1424           <informaltable frame="all">
1425             <tgroup cols="2">
1426               <colspec colname="c1" colwidth="3*" />
1427               <colspec colname="c2" colwidth="7*" />
1428               <tbody>
1429                 <row>
1430                   <entry>
1431                     <para>0</para>
1432                   </entry>
1433                   <entry>
1434                     <para>Do not start OI Scrub automatically.</para>
1435                   </entry>
1436                 </row>
1437                 <row>
1438                   <entry>
1439                     <para>positive integer</para>
1440                   </entry>
1441                   <entry>
1442                     <para>Automatically start OI Scrub if an inconsistency is
1443                     detected during OI lookup.</para>
1444                   </entry>
1445                 </row>
1446               </tbody>
1447             </tgroup>
1448           </informaltable>
1449         </section>
1450       </section>
1451     </section>
1452   </section>
1453 </chapter>