<?xml version='1.0' encoding='UTF-8'?>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US"
xml:id="zfssnapshots" condition='l2A'>
<title xml:id="zfssnapshots.title">Lustre ZFS Snapshots</title>
<para>This chapter describes the ZFS Snapshot feature in Lustre and
contains the following sections:</para>
<itemizedlist>
<listitem><para><xref linkend="dbdoclet.zfssnapshotIntro"/></para></listitem>
<listitem><para><xref linkend="dbdoclet.zfssnapshotConfig"/></para></listitem>
<listitem><para><xref linkend="dbdoclet.zfssnapshotOps"/></para></listitem>
<listitem><para><xref linkend="dbdoclet.zfssnapshotBarrier"/></para></listitem>
<listitem><para><xref linkend="dbdoclet.zfssnapshotLogs"/></para></listitem>
<listitem><para><xref linkend="dbdoclet.zfssnapshotLustreLogs"/></para></listitem>
</itemizedlist>
<section xml:id="dbdoclet.zfssnapshotIntro">
<title><indexterm><primary>Introduction</primary>
</indexterm>Introduction</title>
<para>Snapshots provide fast recovery of files from a previously created
checkpoint without recourse to an offline backup or remote replica.
Snapshots also provide a means to version-control storage, and can be used
to recover lost files or previous versions of files.</para>
<para>Filesystem snapshots are intended to be mounted on user-accessible
nodes, such as login nodes, so that users can restore files (e.g. after an
accidental delete or overwrite) without administrator intervention. To
reduce overhead on login nodes when the snapshots are not in use, snapshot
filesystems can be mounted on demand via automount when users access them,
rather than keeping all snapshots mounted.</para>
<para>Recovering lost files from a snapshot is usually considerably
faster than restoring from an offline backup or remote replica. However,
snapshots do not improve storage reliability and are just as exposed to
hardware failure as any other storage volume.</para>
<section xml:id="dbdoclet.zfssnapshotsReq">
<title><indexterm><primary>Introduction</primary>
<secondary>Requirements</secondary></indexterm>Requirements</title>
<para>All Lustre server targets must be ZFS file systems running
Lustre version 2.10 or later. In addition, the MGS must be able to
communicate via ssh or another remote access protocol, without
password authentication, with all other servers.</para>
<para>The feature is enabled by default and cannot be disabled. Snapshots
are managed through <literal>lctl</literal> commands on the MGS.</para>
<para>Lustre snapshots are based on copy-on-write: a snapshot and the file
system may share a single copy of the data until a file is changed on
the file system. A snapshot prevents the space of deleted or
overwritten files from being released until every snapshot
referencing those files is deleted. The file system administrator
therefore needs to establish a snapshot create/backup/remove policy
appropriate to their system's actual size and usage.</para>
</section>
</section>
<section xml:id="dbdoclet.zfssnapshotConfig">
<title><indexterm><primary>feature overview</primary>
<secondary>configuration</secondary></indexterm>Configuration</title>
<para>The snapshot tool loads the system configuration from the
<literal>/etc/ldev.conf</literal> file on the MGS and calls the related
ZFS commands to maintain the Lustre snapshot pieces on all targets
(MGS/MDT/OST). Note that the <literal>/etc/ldev.conf</literal>
file is used for other purposes as well.</para>
<para>The format of the file is:</para>
<screen>&lt;host&gt; foreign/- &lt;label&gt; &lt;device&gt; [journal-path]/- [raidtab]</screen>
<para>The format of <literal>&lt;label&gt;</literal> is:</para>
<screen>fsname-&lt;role&gt;&lt;index&gt; or &lt;role&gt;&lt;index&gt;</screen>
<para>The format of <literal>&lt;device&gt;</literal> is:</para>
<screen>[md|zfs:][pool_dir/]&lt;pool&gt;/&lt;filesystem&gt;</screen>
<para>Snapshot only uses the fields &lt;host&gt;, &lt;label&gt; and
&lt;device&gt;.</para>
<para>For example:</para>
<screen>mgs# cat /etc/ldev.conf
host-mdt1 - myfs-MDT0000 zfs:/tmp/myfs-mdt1/mdt1
host-mdt2 - myfs-MDT0001 zfs:myfs-mdt2/mdt2
host-ost1 - OST0000 zfs:/tmp/myfs-ost1/ost1
host-ost2 - OST0001 zfs:myfs-ost2/ost2</screen>
<para>The configuration file is edited manually.</para>
<para>Once the configuration file reflects the current file system setup,
you are ready to create a file system snapshot.</para>
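<para>Before creating the first snapshot, it can be useful to confirm that
the MGS can reach each listed host without a password and that each ZFS
dataset exists. The following is an illustrative check only; the host and
dataset names are taken from the example configuration above and should be
replaced with those from your own <literal>/etc/ldev.conf</literal>:</para>
<screen>mgs# for h in host-mdt1 host-mdt2 host-ost1 host-ost2; do
    ssh $h true || echo "$h: passwordless ssh failed"
done
mgs# ssh host-mdt1 zfs list myfs-mdt1/mdt1
mgs# ssh host-ost2 zfs list myfs-ost2/ost2</screen>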
</section>
<section xml:id="dbdoclet.zfssnapshotOps">
<title><indexterm><primary>operations</primary>
</indexterm>Snapshot Operations</title>
<section xml:id="dbdoclet.zfssnapshotCreate">
<title><indexterm><primary>operations</primary>
<secondary>create</secondary></indexterm>Creating a Snapshot</title>
<para>To create a snapshot of an existing Lustre file system, run the
following <literal>lctl</literal> command on the MGS:</para>
<screen>lctl snapshot_create [-b | --barrier [on | off]] [-c | --comment
comment] &lt;-F | --fsname fsname&gt; [-h | --help] &lt;-n | --name ssname&gt;
[-r | --rsh remote_shell] [-t | --timeout timeout]</screen>
<informaltable frame="all">
<tgroup cols="2">
<colspec colname="c1" colwidth="50*"/>
<colspec colname="c2" colwidth="50*"/>
<thead>
<row>
<entry><para><emphasis role="bold">Option</emphasis></para></entry>
<entry><para><emphasis role="bold">Description</emphasis></para></entry>
</row>
</thead>
<tbody>
<row>
<entry><para><literal>-b</literal></para></entry>
<entry><para>set a write barrier before creating the snapshot. The
default value is 'on'.</para></entry>
</row>
<row>
<entry><para><literal>-c</literal></para></entry>
<entry><para>a description of the purpose of the snapshot</para></entry>
</row>
<row>
<entry><para><literal>-F</literal></para></entry>
<entry><para>the filesystem name</para></entry>
</row>
<row>
<entry><para><literal>-h</literal></para></entry>
<entry><para>help information</para></entry>
</row>
<row>
<entry><para><literal>-n</literal></para></entry>
<entry><para>the name of the snapshot</para></entry>
</row>
<row>
<entry><para><literal>-r</literal></para></entry>
<entry><para>the remote shell used for communication with the
remote targets. The default value is 'ssh'.</para></entry>
</row>
<row>
<entry><para><literal>-t</literal></para></entry>
<entry><para>the lifetime (in seconds) of the write barrier. The
default value is 30 seconds.</para></entry>
</row>
</tbody>
</tgroup>
</informaltable>
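<para>For example, to create a snapshot of the
<replaceable>myfs</replaceable> file system with a write barrier and a
short comment (the snapshot name and comment text here are
illustrative):</para>
<screen>mgs# lctl snapshot_create -F myfs -n snapshot_20170602 \
     -c "nightly snapshot before maintenance"</screen>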
</section>
<section xml:id="dbdoclet.zfssnapshotDelete">
<title><indexterm><primary>operations</primary>
<secondary>delete</secondary></indexterm>Delete a Snapshot</title>
<para>To delete an existing snapshot, run the following
<literal>lctl</literal> command on the MGS:</para>
<screen>lctl snapshot_destroy [-f | --force] &lt;-F | --fsname fsname&gt;
&lt;-n | --name ssname&gt; [-r | --rsh remote_shell]</screen>
<informaltable frame="all">
<tgroup cols="2">
<colspec colname="c1" colwidth="50*"/>
<colspec colname="c2" colwidth="50*"/>
<thead>
<row>
<entry><para><emphasis role="bold">Option</emphasis></para></entry>
<entry><para><emphasis role="bold">Description</emphasis></para></entry>
</row>
</thead>
<tbody>
<row>
<entry><para><literal>-f</literal></para></entry>
<entry><para>destroy the snapshot by force</para></entry>
</row>
<row>
<entry><para><literal>-F</literal></para></entry>
<entry><para>the filesystem name</para></entry>
</row>
<row>
<entry><para><literal>-h</literal></para></entry>
<entry><para>help information</para></entry>
</row>
<row>
<entry><para><literal>-n</literal></para></entry>
<entry><para>the name of the snapshot</para></entry>
</row>
<row>
<entry><para><literal>-r</literal></para></entry>
<entry><para>the remote shell used for communication with the
remote targets. The default value is 'ssh'.</para></entry>
</row>
</tbody>
</tgroup>
</informaltable>
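<para>For example, to destroy a snapshot named
<replaceable>snapshot_20170602</replaceable> of the filesystem
<replaceable>myfs</replaceable> (the names are illustrative):</para>
<screen>mgs# lctl snapshot_destroy -F myfs -n snapshot_20170602</screen>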
</section>
<section xml:id="dbdoclet.zfssnapshotMount">
<title><indexterm><primary>operations</primary>
<secondary>mount</secondary></indexterm>Mounting a Snapshot</title>
<para>Snapshots are treated as separate file systems and can be mounted on
Lustre clients. The snapshot file system must be mounted as a
read-only file system with the <literal>-o ro</literal> option.
If the <literal>mount</literal> command does not include the read-only
option, the mount will fail.</para>
<note><para>Before a snapshot can be mounted on a client, the snapshot
must first be mounted on the servers using the <literal>lctl</literal>
utility.</para></note>
<para>To mount a snapshot on the server, run the following
<literal>lctl</literal> command on the MGS:</para>
<screen>lctl snapshot_mount &lt;-F | --fsname fsname&gt; [-h | --help]
&lt;-n | --name ssname&gt; [-r | --rsh remote_shell]</screen>
<informaltable frame="all">
<tgroup cols="2">
<colspec colname="c1" colwidth="50*"/>
<colspec colname="c2" colwidth="50*"/>
<thead>
<row>
<entry><para><emphasis role="bold">Option</emphasis></para></entry>
<entry><para><emphasis role="bold">Description</emphasis></para></entry>
</row>
</thead>
<tbody>
<row>
<entry><para><literal>-F</literal></para></entry>
<entry><para>the filesystem name</para></entry>
</row>
<row>
<entry><para><literal>-h</literal></para></entry>
<entry><para>help information</para></entry>
</row>
<row>
<entry><para><literal>-n</literal></para></entry>
<entry><para>the name of the snapshot</para></entry>
</row>
<row>
<entry><para><literal>-r</literal></para></entry>
<entry><para>the remote shell used for communication with the
remote targets. The default value is 'ssh'.</para></entry>
</row>
</tbody>
</tgroup>
</informaltable>
<para>For example, to mount a snapshot named
<replaceable>snapshot_20170602</replaceable> of a filesystem named
<replaceable>myfs</replaceable> on the servers:</para>
<screen>mgs# lctl snapshot_mount -F myfs -n snapshot_20170602</screen>
<para>After the snapshot is mounted on the servers, clients can mount it
as a read-only filesystem. First, use
<literal>lctl snapshot_list</literal> to get the fsname of the snapshot
itself:</para>
<screen>ss_fsname=$(lctl snapshot_list -F myfs -n snapshot_20170602 |
awk '/^snapshot_fsname/ { print $2 }')</screen>
<para>Finally, mount the snapshot on the client:</para>
<screen>mount -t lustre -o ro $MGS_nid:/$ss_fsname $local_mount_point</screen>
</section>
<section xml:id="dbdoclet.zfssnapshotUnmount">
<title><indexterm><primary>operations</primary>
<secondary>unmount</secondary></indexterm>Unmounting a Snapshot</title>
<para>To unmount a snapshot from the servers, first unmount the snapshot
file system from all clients, using the standard <literal>umount</literal>
command on each client. For example, to unmount the snapshot file system
named <replaceable>snapshot_20170602</replaceable>, run the following
command on each client that has it mounted:</para>
<screen>client# umount $local_mount_point</screen>
<para>After all clients have unmounted the snapshot file system, run the
following <literal>lctl</literal> command on a server node where the
snapshot is mounted:</para>
<screen>lctl snapshot_umount &lt;-F | --fsname fsname&gt; [-h | --help]
&lt;-n | --name ssname&gt; [-r | --rsh remote_shell]</screen>
<informaltable frame="all">
<tgroup cols="2">
<colspec colname="c1" colwidth="50*"/>
<colspec colname="c2" colwidth="50*"/>
<thead>
<row>
<entry><para><emphasis role="bold">Option</emphasis></para></entry>
<entry><para><emphasis role="bold">Description</emphasis></para></entry>
</row>
</thead>
<tbody>
<row>
<entry><para><literal>-F</literal></para></entry>
<entry><para>the filesystem name</para></entry>
</row>
<row>
<entry><para><literal>-h</literal></para></entry>
<entry><para>help information</para></entry>
</row>
<row>
<entry><para><literal>-n</literal></para></entry>
<entry><para>the name of the snapshot</para></entry>
</row>
<row>
<entry><para><literal>-r</literal></para></entry>
<entry><para>the remote shell used for communication with the
remote targets. The default value is 'ssh'.</para></entry>
</row>
</tbody>
</tgroup>
</informaltable>
<para>For example:</para>
<screen>mgs# lctl snapshot_umount -F myfs -n snapshot_20170602</screen>
</section>
<section xml:id="dbdoclet.zfssnapshotList">
<title><indexterm><primary>operations</primary>
<secondary>list</secondary></indexterm>List Snapshots</title>
<para>To list the available snapshots for a given file system, use the
following <literal>lctl</literal> command on the MGS:</para>
<screen>lctl snapshot_list [-d | --detail] &lt;-F | --fsname fsname&gt;
[-h | --help] [-n | --name ssname] [-r | --rsh remote_shell]</screen>
<informaltable frame="all">
<tgroup cols="2">
<colspec colname="c1" colwidth="50*"/>
<colspec colname="c2" colwidth="50*"/>
<thead>
<row>
<entry><para><emphasis role="bold">Option</emphasis></para></entry>
<entry><para><emphasis role="bold">Description</emphasis></para></entry>
</row>
</thead>
<tbody>
<row>
<entry><para><literal>-d</literal></para></entry>
<entry><para>list every piece of the specified snapshot</para></entry>
</row>
<row>
<entry><para><literal>-F</literal></para></entry>
<entry><para>the filesystem name</para></entry>
</row>
<row>
<entry><para><literal>-h</literal></para></entry>
<entry><para>help information</para></entry>
</row>
<row>
<entry><para><literal>-n</literal></para></entry>
<entry><para>the snapshot's name. If the snapshot name is not
supplied, all snapshots for this file system will be
listed.</para></entry>
</row>
<row>
<entry><para><literal>-r</literal></para></entry>
<entry><para>the remote shell used for communication with the
remote targets. The default value is 'ssh'.</para></entry>
</row>
</tbody>
</tgroup>
</informaltable>
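<para>For example, to list all snapshots of the
<replaceable>myfs</replaceable> filesystem, and then show the details of
one particular snapshot (the snapshot name is illustrative):</para>
<screen>mgs# lctl snapshot_list -F myfs
mgs# lctl snapshot_list -F myfs -n snapshot_20170602 -d</screen>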
</section>
<section xml:id="dbdoclet.zfssnapshotModify">
<title><indexterm><primary>operations</primary>
<secondary>modify</secondary></indexterm>Modify Snapshot Attributes</title>
<para>Currently, a Lustre snapshot has five user-visible attributes:
snapshot name, snapshot comment, create time, modification time, and
snapshot file system name. Of these, the first two can be
modified. Renaming follows the general ZFS snapshot naming rules; for
example, the maximum name length is 256 bytes and the name must not
conflict with reserved names.</para>
<para>To modify a snapshot's attributes, use the following
<literal>lctl</literal> command on the MGS:</para>
<screen>lctl snapshot_modify [-c | --comment comment]
&lt;-F | --fsname fsname&gt; [-h | --help] &lt;-n | --name ssname&gt;
[-N | --new new_ssname] [-r | --rsh remote_shell]</screen>
<informaltable frame="all">
<tgroup cols="2">
<colspec colname="c1" colwidth="50*"/>
<colspec colname="c2" colwidth="50*"/>
<thead>
<row>
<entry><para><emphasis role="bold">Option</emphasis></para></entry>
<entry><para><emphasis role="bold">Description</emphasis></para></entry>
</row>
</thead>
<tbody>
<row>
<entry><para><literal>-c</literal></para></entry>
<entry><para>update the snapshot's comment</para></entry>
</row>
<row>
<entry><para><literal>-F</literal></para></entry>
<entry><para>the filesystem name</para></entry>
</row>
<row>
<entry><para><literal>-h</literal></para></entry>
<entry><para>help information</para></entry>
</row>
<row>
<entry><para><literal>-n</literal></para></entry>
<entry><para>the snapshot's name</para></entry>
</row>
<row>
<entry><para><literal>-N</literal></para></entry>
<entry><para>rename the snapshot to
<replaceable>new_ssname</replaceable></para></entry>
</row>
<row>
<entry><para><literal>-r</literal></para></entry>
<entry><para>the remote shell used for communication with the
remote targets. The default value is 'ssh'.</para></entry>
</row>
</tbody>
</tgroup>
</informaltable>
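<para>For example, to rename a snapshot and update its comment at the same
time (the old name, new name and comment text are illustrative):</para>
<screen>mgs# lctl snapshot_modify -F myfs -n snapshot_20170602 \
     -N snapshot_20170602_verified -c "verified after upgrade"</screen>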
</section>
</section>
<section xml:id="dbdoclet.zfssnapshotBarrier">
<title><indexterm><primary>barrier</primary>
</indexterm>Global Write Barriers</title>
<para>Snapshots are not atomic across multiple MDTs and OSTs, so if there
is activity on the file system while a snapshot is being
taken, there may be user-visible namespace inconsistencies for files
created or destroyed in the interval between the MDT and OST snapshots.
To create a consistent snapshot of the file system, a global write
barrier can be set to "freeze" the system. Once the barrier is set, all
metadata modifications are blocked until the write barrier is explicitly
removed ("thawed") or expires. The user can set a timeout parameter on a
global barrier or remove the barrier explicitly. The default
timeout period is 30 seconds.</para>
<para>It is important to note that snapshots are usable without the global
barrier. If the barrier is not used, only files that are being modified by
clients (write, create, unlink) at the time of the snapshot may be
inconsistent as noted above. Files not currently being modified remain
usable even without the barrier.</para>
<para>The snapshot create command sets the write barrier internally
when requested with the <literal>-b</literal> option to
<literal>lctl snapshot_create</literal>. Explicit use of the barrier
commands is therefore not required when using snapshots, but they are
included here as a way to quiesce the file system before a snapshot is
created.</para>
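<para>For example, to create a snapshot with the internal write barrier
enabled and a longer barrier lifetime than the 30-second default (the
snapshot name and the 60-second timeout are illustrative):</para>
<screen>mgs# lctl snapshot_create -F myfs -n snapshot_20170602 -b on -t 60</screen>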
<section xml:id="dbdoclet.zfssnapshotBarrierImpose">
<title><indexterm><primary>barrier</primary>
<secondary>impose</secondary></indexterm>Impose Barrier</title>
<para>To impose a global write barrier, run the
<literal>lctl barrier_freeze</literal> command on the MGS:</para>
<screen>lctl barrier_freeze &lt;fsname&gt; [timeout]</screen>
<para>The default timeout is 30 seconds.</para>
<para>For example, to freeze the filesystem
<replaceable>testfs</replaceable> for <literal>15</literal>
seconds:</para>
<screen>mgs# lctl barrier_freeze testfs 15</screen>
<para>If the command is successful, there will be no output from
the command. Otherwise, an error message will be printed.</para>
</section>
<section xml:id="dbdoclet.zfssnapshotBarrierRemove">
<title><indexterm><primary>barrier</primary>
<secondary>remove</secondary></indexterm>Remove Barrier</title>
<para>To remove a global write barrier, run the
<literal>lctl barrier_thaw</literal> command on the MGS:</para>
<screen>lctl barrier_thaw &lt;fsname&gt;</screen>
<para>For example, to thaw the write barrier for the filesystem
<replaceable>testfs</replaceable>:</para>
<screen>mgs# lctl barrier_thaw testfs</screen>
<para>If the command is successful, there will be no output from
the command. Otherwise, an error message will be printed.</para>
</section>
<section xml:id="dbdoclet.zfssnapshotBarrierQuery">
<title><indexterm><primary>barrier</primary>
<secondary>query</secondary></indexterm>Query Barrier</title>
<para>To see how much time is left on a global write barrier, run the
<literal>lctl barrier_stat</literal> command on the MGS:</para>
<screen>lctl barrier_stat &lt;fsname&gt;</screen>
<para>For example, to query the write barrier status for the filesystem
<replaceable>testfs</replaceable>:</para>
<screen>mgs# lctl barrier_stat testfs
The barrier for testfs is in 'frozen'
The barrier will be expired after 7 seconds</screen>
<para>If the command is successful, a status from the table below
will be printed. Otherwise, an error message will be printed.</para>
<para>The possible statuses of the write barrier and their meanings
are as follows:</para>
<table frame="all" xml:id="writebarrierstatus.tab1">
<title>Write Barrier Status</title>
<tgroup cols="2">
<colspec colname="c1" colwidth="50*"/>
<colspec colname="c2" colwidth="50*"/>
<thead>
<row>
<entry><para><emphasis role="bold">Status</emphasis></para></entry>
<entry><para><emphasis role="bold">Meaning</emphasis></para></entry>
</row>
</thead>
<tbody>
<row>
<entry><para><literal>init</literal></para></entry>
<entry><para>the barrier has never been set on the system</para></entry>
</row>
<row>
<entry><para><literal>freezing_p1</literal></para></entry>
<entry><para>in the first stage of setting the write
barrier</para></entry>
</row>
<row>
<entry><para><literal>freezing_p2</literal></para></entry>
<entry><para>in the second stage of setting the write
barrier</para></entry>
</row>
<row>
<entry><para><literal>frozen</literal></para></entry>
<entry><para>the write barrier has been set successfully</para></entry>
</row>
<row>
<entry><para><literal>thawing</literal></para></entry>
<entry><para>in the process of thawing the write barrier</para></entry>
</row>
<row>
<entry><para><literal>thawed</literal></para></entry>
<entry><para>the write barrier has been thawed</para></entry>
</row>
<row>
<entry><para><literal>failed</literal></para></entry>
<entry><para>failed to set the write barrier</para></entry>
</row>
<row>
<entry><para><literal>expired</literal></para></entry>
<entry><para>the write barrier has expired</para></entry>
</row>
<row>
<entry><para><literal>rescan</literal></para></entry>
<entry><para>in the process of scanning the MDT status; see the command
<literal>barrier_rescan</literal></para></entry>
</row>
<row>
<entry><para><literal>unknown</literal></para></entry>
<entry><para>other cases</para></entry>
</row>
</tbody>
</tgroup>
</table>
<para>If the barrier is in '<literal>freezing_p1</literal>',
'<literal>freezing_p2</literal>' or '<literal>frozen</literal>'
status, the remaining lifetime will also be returned.</para>
</section>
<section xml:id="dbdoclet.zfssnapshotBarrierRescan">
<title><indexterm><primary>barrier</primary>
<secondary>rescan</secondary></indexterm>Rescan Barrier</title>
<para>To rescan a global write barrier and check which MDTs are
active, run the <literal>lctl barrier_rescan</literal> command on the
MGS:</para>
<screen>lctl barrier_rescan &lt;fsname&gt; [timeout]</screen>
<para>The default timeout is 30 seconds.</para>
<para>For example, to rescan the barrier for the filesystem
<replaceable>testfs</replaceable>:</para>
<screen>mgs# lctl barrier_rescan testfs
1 of 4 MDT(s) in the filesystem testfs are inactive</screen>
<para>If the command is successful, the number of inactive MDTs out of
the total number of MDTs will be reported. Otherwise, an
error message will be printed.</para>
</section>
</section>
<section xml:id="dbdoclet.zfssnapshotLogs">
<title><indexterm><primary>logs</primary>
</indexterm>Snapshot Logs</title>
<para>A log of all snapshot activity is kept in the
<literal>/var/log/lsnapshot.log</literal> file. This file records when a
snapshot was created, when an attribute was changed, when a snapshot was
mounted, and other snapshot information.</para>
<para>The following is a sample of
<literal>/var/log/lsnapshot.log</literal>:</para>
<screen>Mon Mar 21 19:43:06 2016
(15826:jt_snapshot_create:1138:scratch:ssh): Create snapshot lss_0_0
successfully with comment &lt;(null)&gt;, barrier &lt;enable&gt;, timeout &lt;30&gt;
Mon Mar 21 19:43:11 2016 (13030:jt_snapshot_create:1138:scratch:ssh):
Create snapshot lss_0_1 successfully with comment &lt;(null)&gt;, barrier
&lt;disable&gt;, timeout &lt;-1&gt;
Mon Mar 21 19:44:38 2016 (17161:jt_snapshot_mount:2013:scratch:ssh):
The snapshot lss_1a_0 is mounted
Mon Mar 21 19:44:46 2016
(17662:jt_snapshot_umount:2167:scratch:ssh): the snapshot lss_1a_0
is unmounted
Mon Mar 21 19:47:12 2016
(20897:jt_snapshot_destroy:1312:scratch:ssh): Destroy snapshot
lss_2_0 successfully with force &lt;disable&gt;</screen>
</section>
<section xml:id="dbdoclet.zfssnapshotLustreLogs">
<title><indexterm><primary>configlogs</primary>
</indexterm>Lustre Configuration Logs</title>
<para>A snapshot is independent of the original file system from which it
is derived and is treated as a new file system name that can be mounted
by Lustre client nodes. The file system name is part of the configuration
log names and exists in configuration log entries. Two commands exist to
manipulate configuration logs: <literal>lctl fork_lcfg</literal> and
<literal>lctl erase_lcfg</literal>.</para>
<para>The snapshot commands use the configuration log functionality
internally when needed, so explicit use of these commands is not required
for snapshots. The following configuration log commands are independent
of snapshots and can be used on their own.</para>
<para>To fork a configuration log, run the following
<literal>lctl</literal> command on the MGS:</para>
<screen>lctl fork_lcfg &lt;fsname&gt; &lt;newname&gt;</screen>
<para>To erase a configuration log, run the following
<literal>lctl</literal> command on the MGS:</para>
<screen>lctl erase_lcfg &lt;fsname&gt;</screen>
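<para>For example, to fork the configuration logs of the
<replaceable>myfs</replaceable> filesystem under a new name, and later
erase the forked copy (the name <replaceable>myfs_copy</replaceable> is
illustrative):</para>
<screen>mgs# lctl fork_lcfg myfs myfs_copy
mgs# lctl erase_lcfg myfs_copy</screen>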
</section>
</chapter>
<!-- vim:expandtab:shiftwidth=2:tabstop=8: -->