1 .TH lctl 8 "2017 Jan 12" Lustre "configuration utilities"
3 lctl \- Low level Lustre filesystem configuration utility
8 .B lctl --device <devno> <command [args]>
12 .B lctl --list-commands
16 is used to directly control Lustre via an ioctl interface, allowing
17 various configuration, maintenance, and debugging features to be accessed.
can be invoked in interactive mode by issuing the lctl command. After that, commands are issued as shown below. The most common lctl commands are
32 To get a complete listing of available commands, type
34 at the lctl prompt. To get basic help on the meaning and syntax of a
38 . Command completion is activated with the TAB key, and command history is available via the up- and down-arrow keys.
For non-interactive use, use the second invocation, which runs the given command after connecting to the device.
42 .SS Network Configuration
44 .BR network " <" up / down >|< tcp / o2ib >
45 Start or stop LNET, or select a network type for other
50 Print all Network Identifiers on the local node. LNET must be running.
52 .BI which_nid " <nidlist>"
53 From a list of nids for a remote node, show which interface communication
56 .BI replace_nids " <devicename> <nid1>[,nid2,nid3 ...]"
Replace the LNET Network Identifiers for a given device,
as when the server's IP address has changed.
This command must be run on the MGS node.
Only the MGS service should be running (the command returns an error
in any other case). To start only the MGS service:
mount -t lustre <MDT partition> -o nosvc <mount point>
Note that the replace_nids command skips any invalidated records in the configuration log.
The previous log is backed up with the suffix '.bak'.
.BI ping " <nid> timeout"
Check LNET connectivity via an LNET ping. This will use the fabric
appropriate to the specified NID. By default lctl will attempt to
reach the remote node for up to 120 seconds and then time out. To disable
the timeout, specify a negative timeout value.
73 Print the network interface information for a given
78 Print the known peers for a given
83 Print all the connected remote NIDs for a given
88 Print the complete routing table.
92 .BI device " <devname> "
93 This will select the specified OBD device. All other commands depend on the device being set.
96 Show all the local Lustre OBDs. AKA
101 .BI list_param " [-F|-R] <param_search ...>"
102 List the Lustre or LNet parameter name
104 Add '/', '@' or '=' for dirs, symlinks and writeable files, respectively.
107 Recursively list all parameters under the specified parameter search string. If
109 is unspecified, all the parameters will be shown.
114 # lctl list_param ost.*
121 # lctl list_param -F ost.* debug
130 # lctl list_param -R mdt
136 mdt.lustre-MDT0000.capa
138 mdt.lustre-MDT0000.capa_count
140 mdt.lustre-MDT0000.capa_key_timeout
142 mdt.lustre-MDT0000.capa_timeout
144 mdt.lustre-MDT0000.commit_on_sharing
146 mdt.lustre-MDT0000.evict_client
150 .BI get_param " [-F|-n|-N|-R] <parameter ...>"
Get the value of a Lustre or LNET parameter.
When -N is specified, add '/', '@' or '=' for directories, symlinks and writeable files, respectively.
158 Print only the value and not parameter name.
161 Print only matched parameter names and not the values. (Especially useful when using patterns.)
164 Print all of the parameter names below the specified name.
169 # lctl get_param ost.*
176 # lctl get_param -n debug timeout
178 super warning dlmtrace error emerg ha rpctrace vfstrace config console
183 # lctl get_param -N ost.* debug
191 lctl "get_param -NF" is equivalent to "list_param -F".
193 .BI set_param " [-n] [-P] [-d] <parameter=value ...>"
Set the value of a Lustre or LNET parameter.
197 Disable printing of the key name when printing values.
200 Set the parameter permanently, filesystem-wide.
These parameters are only visible to clients running 2.5.0 or later; older clients will not see them.
Remove the permanent setting (only with the -P option).
209 # lctl set_param fail_loc=0 timeout=20
216 # lctl set_param -n fail_loc=0 timeout=20
223 # lctl set_param -P osc.*.max_dirty_mb=32
226 .BI conf_param " [-d] <device|fsname>.<parameter>=<value>"
227 Set a permanent configuration parameter for any device via the MGS. This
228 command must be run on the MGS node.
230 .B -d <device|fsname>.<parameter>
231 Delete a parameter setting (use the default value at the next restart). A null value for <value> also deletes the parameter setting.
235 All of the writable parameters under
238 .I lctl list_param -F osc.*.* | grep =
239 ) can be permanently set using
241 , but the format is slightly different. For conf_param, the device is specified first, then the obdtype. (See examples below.) Wildcards are not supported.
Additionally, failover nodes may be added (or removed), and some system-wide parameters may be set as well (sys.at_max, sys.at_min, sys.at_extra, sys.at_early_margin, sys.at_history, sys.timeout, sys.ldlm_timeout). <device> is ignored for system-wide parameters.
247 # lctl conf_param testfs.sys.at_max=1200
249 # lctl conf_param testfs.llite.max_read_ahead_mb=16
251 # lctl conf_param testfs-MDT0000.lov.stripesize=2M
253 # lctl conf_param lustre-OST0001.osc.active=0
255 # lctl conf_param testfs-OST0000.osc.max_dirty_mb=29.15
257 # lctl conf_param testfs-OST0000.ost.client_cache_seconds=15
259 # lctl conf_param testfs-OST0000.failover.node=1.2.3.4@tcp1
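The device-first ordering used by conf_param in the examples above can be sketched with a small shell helper. This is only an illustration: the function name conf_param_arg is hypothetical and not part of lctl, and the sketch prints the resulting command instead of running it, because conf_param must be run on the MGS node.

```shell
#!/bin/sh
# conf_param_arg <device|fsname> <obdtype> <parameter> <value>
# Hypothetical helper (not part of lctl): assemble a conf_param argument
# with the device first and the obdtype second.
conf_param_arg() {
    printf '%s.%s.%s=%s\n' "$1" "$2" "$3" "$4"
}

# Print the command rather than executing it.
echo "lctl conf_param $(conf_param_arg testfs-OST0000 osc max_dirty_mb 32)"
```

The same helper covers the system-wide form, for example conf_param_arg testfs sys at_max 1200.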
262 Reactivate an import after deactivating, below. This setting is only effective until the next restart (see
267 Deactivate an import, in particular meaning do not assign new file stripes
268 to an OSC. This command should be used on the OSC in the MDT LOV
269 corresponding to a failed OST device, to prevent further attempts at
270 communication with the failed OST.
273 Abort the recovery process on a restarting MDT or OST device
277 .BI changelog_register " [-n]"
278 Register a new changelog user for a particular device. Changelog entries
279 will not be purged beyond any registered users' set point. (See lfs changelog_clear.)
282 Print only the ID of the newly registered user.
284 .BI changelog_deregister " <id>"
285 Unregister an existing changelog user. If the user's "clear" record number
286 is the minimum for the device, changelog records will be purged until the
290 An identity mapping feature that facilitates mapping of client UIDs and GIDs to
291 local file system UIDs and GIDs, while maintaining POSIX ownership, permissions,
294 While the nodemap feature is enabled, all client file system access is subject
295 to the nodemap identity mapping policy, which consists of the 'default' catchall
296 nodemap, and any user-defined nodemaps. The 'default' nodemap maps all client
297 identities to 99:99 (nobody:nobody). Administrators can define nodemaps for a
298 range of client NIDs which map identities, and these nodemaps can be flagged as
299 'trusted' so identities are accepted without translation, as well as flagged
300 as 'admin' meaning that root is not squashed for these nodes.
302 Note: In the current phase of implementation, to use the nodemap functionality
303 you only need to enable and define nodemaps on the MDS. The MDSes must also be
304 in a nodemap with the admin and trusted flags set. To use quotas with nodemaps,
305 you must also use set_param to enable and define nodemaps on the OSS (matching
306 what is defined on the MDS). Nodemaps do not currently persist, unless you
307 define them with set_param and use the -P flag. Note that there is a hard limit
308 to the number of changes you can persist over the lifetime of the file system.
313 \fBlctl-nodemap-activate\fR(8)
315 Activate/deactivate the nodemap feature.
318 \fBlctl-nodemap-add\fR(8)
320 Add a new nodemap, to which NID ranges, identities, and properties can be added.
323 \fBlctl-nodemap-del\fR(8)
325 Delete an existing nodemap.
328 \fBlctl-nodemap-add-range\fR(8)
330 Define a range of NIDs for a nodemap.
333 \fBlctl-nodemap-del-range\fR(8)
335 Delete an existing NID range from a nodemap.
338 \fBlctl-nodemap-add-idmap\fR(8)
340 Add a UID or GID mapping to a nodemap.
343 \fBlctl-nodemap-del-idmap\fR(8)
345 Delete an existing UID or GID mapping from a nodemap.
348 \fBlctl-nodemap-modify\fR(8)
350 Modify a nodemap property.
An on-line Lustre consistency check and repair tool. It completely
replaces the old lfsck tool and verifies many kinds of Lustre inconsistency,
including: corrupted or lost OI mapping, corrupted or lost link EA, corrupted
or lost FID in name entry, dangling name entry, multiply referenced name entry,
unmatched MDT-object and name entry pairs, orphan MDT-object, incorrect
MDT-object links count, corrupted namespace, corrupted or lost lov EA, lost
OST-object, multiply referenced OST-object, unmatched MDT-object and OST-object
pairs, orphan OST-object, and so on.
366 \fBlctl-lfsck-start\fR(8)
368 Start LFSCK on the specified MDT or OST device with specified parameters.
371 \fBlctl-lfsck-stop\fR(8)
373 Stop LFSCK on the specified MDT or OST device.
376 \fBlctl-lfsck-query\fR(8)
378 Get the LFSCK global status via the specified MDT device.
A set of tools for managing the write (modify) barrier on all MDTs.
383 .B barrier_freeze \fR<fsname> [timeout]
Set the write barrier on all MDTs. The barrier_freeze command will not return
until the barrier is set (frozen) or has failed. With the write barrier set,
any subsequent metadata modification will be blocked until the barrier is
thawed or expires. The barrier lifetime starts when barrier_freeze is
triggered and ends when the barrier is thawed. To avoid the system being
frozen for a very long time if barrier_thaw is missed or fails,
you can specify the lifetime in seconds via the 'timeout' parameter; the
default value is 60 (seconds). If the barrier is not thawed before then,
it expires automatically.
A barrier_freeze can only succeed when all registered MDTs are available.
If an MDT has ever registered but has since gone offline, then barrier_freeze
will fail. To check and update the current status of the MDTs, see the command
399 .B barrier_thaw \fR<fsname>
Reset the write barrier on all MDTs. With the write barrier thawed, all metadata
modifications blocked by the earlier barrier_freeze will be handled normally.
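A freeze, work, thaw sequence built from barrier_freeze and barrier_thaw might look like the sketch below. The wrapper name with_barrier and the LCTL variable are assumptions introduced for illustration; LCTL defaults to "echo lctl" so the commands are only printed, and it would need to be set to lctl on a real system to take effect.

```shell
#!/bin/sh
# LCTL is an assumption for illustration: by default the lctl commands
# are only echoed. Set LCTL=lctl to actually execute them.
LCTL=${LCTL:-"echo lctl"}

# with_barrier <fsname> <timeout> <command...>
# Freeze metadata writes, run the command, then thaw regardless of the
# command's result. Returns the command's exit status.
with_barrier() {
    fsname=$1; timeout=$2; shift 2
    $LCTL barrier_freeze "$fsname" "$timeout" || return 1
    "$@"
    rc=$?
    $LCTL barrier_thaw "$fsname"
    return $rc
}

with_barrier testfs 30 echo "work that needs quiesced metadata runs here"
```

Thawing explicitly after the work keeps the frozen window short; the timeout remains as a safety net if the thaw step is never reached.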
404 .B barrier_stat \fR<fsname>
Query the write barrier status. The possible statuses and their meanings are:
'init': no barrier has ever been set on the system
'freezing_p1': in the first stage of setting the write barrier
'freezing_p2': in the second stage of setting the write barrier
'frozen': the write barrier has been set successfully
'thawing': the write barrier is being thawed
'thawed': the write barrier has been thawed
'failed': failed to set the write barrier
'expired': the write barrier has expired
'rescan': scanning the status of the MDTs, see the command barrier_rescan
'unknown': other cases
If the barrier is in 'freezing_p1', 'freezing_p2' or 'frozen' status, then
the remaining lifetime is also returned.
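A script that consumes barrier_stat output may want to know whether a remaining lifetime accompanies the status. The following sketch encodes only the rule stated above; the function name barrier_reports_lifetime is hypothetical, and the exact output format of barrier_stat is not assumed here.

```shell
#!/bin/sh
# barrier_reports_lifetime <status>
# Succeeds for the statuses that also report the remaining lifetime.
# The function name is hypothetical; only the status strings come from
# the list above.
barrier_reports_lifetime() {
    case $1 in
        freezing_p1|freezing_p2|frozen) return 0 ;;
        *) return 1 ;;
    esac
}

for s in init frozen thawed expired; do
    if barrier_reports_lifetime "$s"; then
        echo "$s: remaining lifetime is reported"
    else
        echo "$s: no lifetime reported"
    fi
done
```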
423 .B barrier_rescan \fR<fsname> [timeout]
Scan the system to check which MDTs are active. The status of the MDTs is
required because a barrier_freeze will be unsuccessful if any of the MDTs
are permanently offline. During barrier_rescan, the MDT status is updated.
If an MDT does not respond to the barrier_rescan within the given "timeout"
seconds (where the default value is 60 seconds), then it will be marked
as unavailable or inactive.
A set of snapshot tools based on the ZFS backend. The tools load the system
configuration from the file /etc/ldev.conf on the MGS, and call the related ZFS
commands to maintain Lustre snapshot pieces on all targets (MGS/MDT/OST).
The configuration file /etc/ldev.conf is not only used for snapshots, but also
for other purposes. The format is:
438 <host> foreign/- <label> <device> [journal-path]/- [raidtab]
440 The format of <label> is:
441 fsname-<role><index> or <role><index>
443 The format of <device> is:
444 [md|zfs:][pool_dir/]<pool>/<filesystem>
446 Snapshot only uses the fields <host>, <label> and <device>.
453 host-mdt1 - myfs-MDT0000 zfs:/tmp/myfs-mdt1/mdt1
454 host-mdt2 - myfs-MDT0001 zfs:myfs-mdt2/mdt2
455 host-ost1 - OST0000 zfs:/tmp/myfs-ost1/ost1
456 host-ost2 - OST0001 zfs:myfs-ost2/ost2
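Since the snapshot tools only use the <host>, <label> and <device> fields, those three columns can be pulled out of ldev.conf-style lines as sketched below. The function name ldev_snapshot_fields is hypothetical, and skipping blank lines and lines starting with '#' is an assumption about the file rather than something stated above.

```shell
#!/bin/sh
# ldev_snapshot_fields: read ldev.conf-style lines on stdin and print the
# three fields the snapshot tools use: <host> <label> <device>
# (columns 1, 3 and 4 of the format shown above).
# Blank lines and '#' comment lines are skipped (an assumption).
ldev_snapshot_fields() {
    awk '!/^[ \t]*(#|$)/ { print $1, $3, $4 }'
}

ldev_snapshot_fields <<'EOF'
host-mdt1 - myfs-MDT0000 zfs:/tmp/myfs-mdt1/mdt1
host-ost1 - OST0000 zfs:/tmp/myfs-ost1/ost1
EOF
```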
For the old snapshot tools, the configuration is in /etc/lsnapshot/${fsname}.conf.
The format is as follows (one target per line):
460 <host> <pool_dir> <pool> <local_filesystem> <role(,s)> <index>
465 # cat /etc/lsnapshot/testfs.conf
467 VM6_1 /tmp testfs-mdt1 mdt1 MGS,MDT 0
468 VM6_2 /tmp testfs-mdt2 mdt2 MDT 1
469 VM6_3 /tmp testfs-ost1 ost1 OST 0
470 VM6_3 /tmp testfs-ost2 ost2 OST 1
471 VM6_4 /tmp testfs-ost3 ost3 OST 2
472 VM6_4 /tmp testfs-ost4 ost4 OST 3
475 .B snapshot_create \fR[-b | --barrier [on | off]] [-c | --comment comment]
476 \fR<-F | --fsname fsname> [-h | --help] <-n | --name ssname>
477 \fR[-r | --rsh remote_shell] [-t | --timeout timeout]
Create a snapshot with the given name.
481 -b, --barrier [on | off]
Set the write barrier on all MDTs before creating the snapshot. The default
is 'on'. If you are confident about the system consistency, or you do not care
about the system consistency when creating the snapshot, then you can specify
barrier 'off', which shortens the time needed to create the snapshot. If the
barrier is 'on', then the timeout of the barrier can be specified via the '-t'
option as described below.
489 -c, --comment <comment>
Add an optional comment to the snapshot_create request. The comment can include
anything that describes what the snapshot is for or serves as a reminder. The
comment can be shown via snapshot_list.
498 For help information.
The snapshot's name must be specified. It follows the general ZFS snapshot name
rules: for example, the maximum length is 256 bytes, and it cannot conflict with the reserved
505 -r, --rsh <remote_shell>
506 Specify a shell to communicate with remote targets. The default value is 'ssh'.
507 It is the system admin's duty to guarantee that the specified 'remote_shell'
508 works well among targets without password authentication.
510 -t, --timeout <timeout>
If the write barrier is 'on', then 'timeout' specifies the write barrier's
lifetime in seconds. The default value is 60 (seconds).
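The interaction between the barrier and timeout options of snapshot_create can be sketched as a small wrapper. The wrapper name snapshot_create_cmd and the LCTL variable are hypothetical; LCTL defaults to "echo lctl" so the commands are only printed. The option spelling follows the synopsis above as written and has not been checked against a running system.

```shell
#!/bin/sh
# LCTL is an assumption for illustration: by default the lctl commands
# are only echoed. Set LCTL=lctl to actually execute them.
LCTL=${LCTL:-"echo lctl"}

# snapshot_create_cmd <fsname> <ssname> [on|off] [timeout]
# The timeout is passed only when the barrier is 'on', since it
# describes the barrier's lifetime.
snapshot_create_cmd() {
    fsname=$1; ssname=$2; barrier=${3:-on}; timeout=${4:-60}
    if [ "$barrier" = on ]; then
        $LCTL snapshot_create -F "$fsname" -n "$ssname" -b on -t "$timeout"
    else
        $LCTL snapshot_create -F "$fsname" -n "$ssname" -b off
    fi
}

snapshot_create_cmd testfs nightly
snapshot_create_cmd testfs quick off
```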
514 .B snapshot_destroy \fR[-f | --force] <-F | --fsname fsname> [-h | --help]
515 \fR<-n | --name ssname> [-r | --rsh remote_shell]
517 Destroy the specified snapshot.
Destroy the specified snapshot by force. If the snapshot is mounted, it will be
unmounted first, then destroyed. Even if some pieces of the snapshot are lost
or broken for some reason, the remaining parts of the snapshot can still be
destroyed with this option specified.
529 For help information.
The name of the snapshot to be destroyed must be specified.
534 -r, --rsh <remote_shell>
535 Specify a shell to communicate with remote targets. The default value is 'ssh'.
536 It is the system admin's duty to guarantee that the specified 'remote_shell'
537 works well among targets without password authentication.
539 .B snapshot_modify \fR[-c | --comment comment] <-F | --fsname fsname>
540 \fR[-h | --help] <-n | --name ssname> [-N | --new new_ssname]
541 \fR[-r | --rsh remote_shell]
543 Modify the specified snapshot.
545 -c, --comment <comment>
Add a comment (if none was specified at snapshot_create time) or change the
comment for the given snapshot.
553 For help information.
The name of the snapshot to be modified must be specified.
558 -N, --new <new_ssname>
Rename the snapshot to the new name. It follows the general ZFS snapshot name
rules: for example, the maximum length is 256 bytes, and it cannot conflict with the reserved
563 -r, --rsh <remote_shell>
564 Specify a shell to communicate with remote targets. The default value is 'ssh'.
565 It is the system admin's duty to guarantee that the specified 'remote_shell'
566 works well among targets without password authentication.
568 .B snapshot_list \fR[-d | --detail] <-F | --fsname fsname> [-h | --help]
569 \fR[-n | --name ssname] [-r | --rsh remote_shell]
571 Query the snapshot information, such as fsname of the snapshot, comment,
572 create time, the latest modification time, whether mounted or not, and so on.
List all the information available for each piece of the snapshot on each
target. Usually, the information for each piece of the snapshot is the same
unless an error occurred during a snapshot operation, such as a partial
modification or mount. This option makes it possible to check for such issues.
584 For help information.
The name of the snapshot to be queried. If no name is specified, then all the
snapshots belonging to the current Lustre filesystem will be listed.
590 -r, --rsh <remote_shell>
591 Specify a shell to communicate with remote targets. The default value is 'ssh'.
592 It is the system admin's duty to guarantee that the specified 'remote_shell'
593 works well among targets without password authentication.
595 .B snapshot_mount \fR<-F | --fsname fsname> [-h | --help] <-n | --name ssname>
596 \fR[-r | --rsh remote_shell]
Mount the specified snapshot on the servers as a read-only Lustre
filesystem. While the snapshot is mounted, it cannot be renamed. It is
the user's duty to mount a client (which must be read-only, "-o ro") to the
NOTE: the snapshot has its own fsname that is different from the original
filesystem fsname; it can be queried via snapshot_list.
609 For help information.
The name of the snapshot to be mounted must be specified.
614 -r, --rsh <remote_shell>
615 Specify a shell to communicate with remote targets. The default value is 'ssh'.
616 It is the system admin's duty to guarantee that the specified 'remote_shell'
617 works well among targets without password authentication.
619 .B snapshot_umount \fR<-F | --fsname fsname> [-h | --help] <-n | --name ssname>
620 \fR[-r | --rsh remote_shell]
622 Umount the specified snapshot.
628 For help information.
The name of the snapshot to be unmounted must be specified.
633 -r, --rsh <remote_shell>
634 Specify a shell to communicate with remote targets. The default value is 'ssh'.
635 It is the system admin's duty to guarantee that the specified 'remote_shell'
636 works well among targets without password authentication.
640 Start and stop the debug daemon, and control the output filename and size.
642 .BI debug_kernel " [file] [raw]"
643 Dump the kernel debug buffer to stdout or file.
645 .BI debug_file " <input> [output]"
646 Convert kernel-dumped debug log from binary to plain text format.
649 Clear the kernel debug buffer.
652 Insert marker text in the kernel debug buffer.
654 .BI filter " <subsystem id/debug mask>"
655 Filter kernel debug messages by subsystem or mask.
657 .BI show " <subsystem id/debug mask>"
Show a specific type of message.
660 .BI debug_list " <subs/types>"
List all the subsystems and debug types.
663 .BI modules " <path>"
664 Provide gdb-friendly module information.
667 The following options can be used to invoke lctl.
670 The device to be used for the operation. This can be specified by name or
674 .B --ignore_errors | ignore_errors
675 Ignore errors during script processing
677 .B lustre_build_version
678 Output the build version of the Lustre kernel modules
681 Output the build version of the lctl utility
684 Output a list of the commands supported by the lctl utility
687 Provides brief help on the various arguments
690 Quit the interactive lctl session
696 0 UP mgc MGC192.168.0.20@tcp bfbb24e3-7deb-2ffa-eab0-44dffe00f692 5
697 1 UP ost OSS OSS_uuid 3
698 2 UP obdfilter testfs-OST0000 testfs-OST0000_UUID 3
701 Debug log: 87 lines, 87 kept, 0 dropped.
713 .BR mount.lustre (8),
715 .BR lctl-lfsck-start (8),
716 .BR lctl-lfsck-stop (8),
717 .BR lctl-lfsck-query (8),
718 .BR lctl-llog_catlist (8),
719 .BR lctl-llog_info (8),
720 .BR lctl-llog_print (8),
721 .BR lctl-network (8),
722 .BR lctl-nodemap-activate (8),
723 .BR lctl-nodemap-add-idmap (8),
724 .BR lctl-nodemap-add-range (8),
725 .BR lctl-nodemap-add (8),
726 .BR lctl-nodemap-del-idmap (8),
727 .BR lctl-nodemap-del-range (8),
728 .BR lctl-nodemap-del (8),
729 .BR lctl-nodemap-modify (8),