1 <?xml version='1.0' encoding='UTF-8'?><chapter xmlns="http://docbook.org/ns/docbook" xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US" xml:id="zfssnapshots">
2 <title xml:id="zfssnapshots.title">Lustre ZFS Snapshots</title>
3 <para>This chapter describes ZFS snapshot support in Lustre and
4 contains the following sections:</para>
7 <para><xref linkend="dbdoclet.zfssnapshotIntro"/></para>
10 <para><xref linkend="dbdoclet.zfssnapshotConfig"/></para>
13 <para><xref linkend="dbdoclet.zfssnapshotOps"/></para>
16 <para><xref linkend="dbdoclet.zfssnapshotBarrier"/></para>
19 <para><xref linkend="dbdoclet.zfssnapshotLogs"/></para>
22 <para><xref linkend="dbdoclet.zfssnapshotLustreLogs"/></para>
25 <section xml:id="dbdoclet.zfssnapshotIntro">
26 <title><indexterm><primary>Introduction</primary>
27 </indexterm>Introduction</title>
28 <para>Snapshots provide fast recovery of files from a previously created
29 checkpoint without recourse to an offline backup or remote replica.
30 Snapshots also provide a means to version-control storage, and can be used
31 to recover lost files or previous versions of files.</para>
32 <para>Filesystem snapshots are intended to be mounted on user-accessible
33 nodes, such as login nodes, so that users can restore files (e.g. after
34 accidental delete or overwrite) without administrator intervention. It
35 would be possible to mount the snapshot filesystem(s) via automount when
36 users access them, rather than mounting all snapshots, to reduce overhead
37 on login nodes when the snapshots are not in use.</para>
38 <para>Recovery of lost files from a snapshot is usually considerably
39 faster than from any offline backup or remote replica. However, note that
40 snapshots do not improve storage reliability and are just as exposed to
41 hardware failure as any other storage volume.</para>
42 <section xml:id="dbdoclet.zfssnapshotsReq">
43 <title><indexterm><primary>Introduction</primary>
44 <secondary>Requirements</secondary></indexterm>Requirements
46 <para>All Lustre server targets must be ZFS file systems running
47 Lustre version 2.10 or later. In addition, the MGS must be able to
48 communicate with all other servers via ssh or another remote access
49 protocol, using password-less (e.g. key-based) authentication.</para>
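<para>For example, one quick way to confirm password-less access from the
MGS (a sketch; the hostname is hypothetical and matches the sample
<literal>/etc/ldev.conf</literal> shown later in this chapter) is to run a
trivial remote command and verify that it completes without prompting for
a password:</para>
<screen>mgs# ssh host-mdt1 true</screen>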
50 <para>The feature is enabled by default and cannot be disabled. The
51 management of snapshots is done through <literal>lctl</literal>
52 commands on the MGS.</para>
53 <para>Lustre snapshots are based on copy-on-write; a snapshot and the
54 file system may share a single copy of the data until a file is changed
55 on the file system. A snapshot will prevent the space of deleted or
56 overwritten files from being released until the snapshot(s)
57 referencing those files are deleted. The file system administrator
58 needs to establish a snapshot create/backup/remove policy according to
59 their system’s actual size and usage.</para>
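<para>As an illustration only (the filesystem name, snapshot naming
scheme, and schedule are hypothetical), a simple create/remove policy
could be driven from cron on the MGS, creating a dated snapshot each
night and destroying it after a week. Note that cron requires percent
signs to be escaped:</para>
<screen># /etc/cron.d/lustre-snapshots (example policy; adjust to local needs)
30 0 * * * root lctl snapshot_create -F myfs -n nightly_$(date +\%Y\%m\%d)
45 0 * * * root lctl snapshot_destroy -F myfs -n nightly_$(date -d '7 days ago' +\%Y\%m\%d)</screen>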
62 <section xml:id="dbdoclet.zfssnapshotConfig">
63 <title><indexterm><primary>feature overview</primary>
64 <secondary>configuration</secondary></indexterm>Configuration
66 <para>The snapshot tool loads system configuration from the
67 <literal>/etc/ldev.conf</literal> file on the MGS and calls related
68 ZFS commands to maintain the Lustre snapshot pieces on all targets
69 (MGS/MDT/OST). Please note that the <literal>/etc/ldev.conf</literal>
70 file is used for other purposes as well.</para>
71 <para>The format of the file is:</para>
72 <screen>&lt;host&gt; foreign/- &lt;label&gt; &lt;device&gt; [journal-path]/- [raidtab]</screen>
73 <para>The format of <literal>&lt;label&gt;</literal> is:</para>
74 <screen>fsname-&lt;role&gt;&lt;index&gt; or &lt;role&gt;&lt;index&gt;</screen>
75 <para>The format of <literal>&lt;device&gt;</literal> is:</para>
76 <screen>[md|zfs:][pool_dir/]&lt;pool&gt;/&lt;filesystem&gt;</screen>
77 <para>Snapshot only uses the fields &lt;host&gt;, &lt;label&gt; and
78 &lt;device&gt;.</para>
80 <screen>mgs# cat /etc/ldev.conf
81 host-mdt1 - myfs-MDT0000 zfs:/tmp/myfs-mdt1/mdt1
82 host-mdt2 - myfs-MDT0001 zfs:myfs-mdt2/mdt2
83 host-ost1 - OST0000 zfs:/tmp/myfs-ost1/ost1
84 host-ost2 - OST0001 zfs:myfs-ost2/ost2</screen>
85 <para>The configuration file is edited manually.</para>
86 <para>Once the configuration file is updated to reflect the current
87 file system setup, you are ready to create a file system snapshot.</para>
90 <section xml:id="dbdoclet.zfssnapshotOps">
91 <title><indexterm><primary>operations</primary>
92 </indexterm>Snapshot Operations</title>
93 <section xml:id="dbdoclet.zfssnapshotCreate">
94 <title><indexterm><primary>operations</primary>
95 <secondary>create</secondary></indexterm>Creating a Snapshot
97 <para>To create a snapshot of an existing Lustre file system, run the
98 following <literal>lctl</literal> command on the MGS:</para>
99 <screen>lctl snapshot_create [-b | --barrier [on | off]] [-c | --comment
100 comment] &lt;-F | --fsname fsname&gt; [-h | --help] &lt;-n | --name ssname&gt;
101 [-r | --rsh remote_shell] [-t | --timeout timeout]</screen>
102 <informaltable frame="all">
104 <colspec colname="c1" colwidth="50*"/>
105 <colspec colname="c2" colwidth="50*"/>
109 <para><emphasis role="bold">Option</emphasis></para>
112 <para><emphasis role="bold">Description</emphasis></para>
119 <para> <literal>-b</literal></para>
122 <para>set write barrier before creating snapshot. The
123 default value is 'on'.</para>
128 <para> <literal>-c</literal></para>
131 <para>a description for the purpose of the snapshot</para>
137 <para> <literal>-F</literal></para>
140 <para>the filesystem name</para>
145 <para> <literal>-h</literal></para>
148 <para>help information</para>
153 <para> <literal>-n</literal></para>
156 <para>the name of the snapshot</para>
161 <para> <literal>-r</literal></para>
164 <para>the remote shell used for communication with
165 remote target. The default value is 'ssh'</para>
170 <para> <literal>-t</literal></para>
173 <para>the lifetime (seconds) for write barrier. The
174 default value is 30 seconds</para>
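<para>For example, to create a snapshot named
<replaceable>snapshot_20170602</replaceable> of a filesystem named
<replaceable>myfs</replaceable> with a descriptive comment (the names
here are illustrative):</para>
<screen>mgs# lctl snapshot_create -F myfs -n snapshot_20170602 -c "pre-upgrade checkpoint"</screen>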
181 <section xml:id="dbdoclet.zfssnapshotDelete">
182 <title><indexterm><primary>operations</primary>
183 <secondary>delete</secondary></indexterm>Delete a Snapshot
185 <para>To delete an existing snapshot, run the following
186 <literal>lctl</literal> command on the MGS:</para>
187 <screen>lctl snapshot_destroy [-f | --force] &lt;-F | --fsname fsname&gt;
188 &lt;-n | --name ssname&gt; [-r | --rsh remote_shell]</screen>
189 <informaltable frame="all">
191 <colspec colname="c1" colwidth="50*"/>
192 <colspec colname="c2" colwidth="50*"/>
196 <para><emphasis role="bold">Option</emphasis></para>
199 <para><emphasis role="bold">Description</emphasis>
207 <para> <literal>-f</literal></para>
210 <para>destroy the snapshot by force</para>
215 <para> <literal>-F</literal></para>
218 <para>the filesystem name</para>
223 <para> <literal>-h</literal></para>
226 <para>help information</para>
231 <para> <literal>-n</literal></para>
234 <para>the name of the snapshot</para>
239 <para> <literal>-r</literal></para>
242 <para>the remote shell used for communication with
243 remote target. The default value is 'ssh'</para>
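<para>For example, to destroy the snapshot
<replaceable>snapshot_20170602</replaceable> of the filesystem
<replaceable>myfs</replaceable> (the names are illustrative):</para>
<screen>mgs# lctl snapshot_destroy -F myfs -n snapshot_20170602</screen>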
250 <section xml:id="dbdoclet.zfssnapshotMount">
251 <title><indexterm><primary>operations</primary>
252 <secondary>mount</secondary></indexterm>Mounting a Snapshot
254 <para>Snapshots are treated as separate file systems and can be mounted on
255 Lustre clients. The snapshot file system must be mounted as a
256 read-only file system with the <literal>-o ro</literal> option.
257 If the <literal>mount</literal> command does not include the read-only
258 option, the mount will fail.</para>
259 <note><para>Before a snapshot can be mounted on the client, the snapshot
260 must first be mounted on the servers using the <literal>lctl</literal>
261 utility.</para></note>
262 <para>To mount a snapshot on the server, run the following
<literal>lctl</literal> command on the MGS:</para>
264 <screen>lctl snapshot_mount &lt;-F | --fsname fsname&gt; [-h | --help]
265 &lt;-n | --name ssname&gt; [-r | --rsh remote_shell]</screen>
266 <informaltable frame="all">
268 <colspec colname="c1" colwidth="50*"/>
269 <colspec colname="c2" colwidth="50*"/>
273 <para><emphasis role="bold">Option</emphasis></para>
276 <para><emphasis role="bold">Description</emphasis>
284 <para> <literal>-F</literal></para>
287 <para>the filesystem name</para>
292 <para> <literal>-h</literal></para>
295 <para>help information</para>
300 <para> <literal>-n</literal></para>
303 <para>the name of the snapshot</para>
308 <para> <literal>-r</literal></para>
311 <para>the remote shell used for communication with
312 remote target. The default value is 'ssh'</para>
318 <para>After the snapshot has been mounted on the servers, clients can
319 mount it as a read-only filesystem. For example, to mount a snapshot
320 named <replaceable>snapshot_20170602</replaceable> for a filesystem
321 named <replaceable>myfs</replaceable>, first run the following command
322 on the MGS:</para>
323 <screen>mgs# lctl snapshot_mount -F myfs -n snapshot_20170602</screen>
324 <para>After mounting on the server, use
325 <literal>lctl snapshot_list</literal> to get the fsname for the snapshot
326 itself as follows:</para>
327 <screen>ss_fsname=$(lctl snapshot_list -F myfs -n snapshot_20170602 |
328 awk '/^snapshot_fsname/ { print $2 }')</screen>
329 <para>Finally, mount the snapshot on the client:</para>
330 <screen>mount -t lustre -o ro $MGS_nid:/$ss_fsname $local_mount_point</screen>
332 <section xml:id="dbdoclet.zfssnapshotUnmount">
333 <title><indexterm><primary>operations</primary>
334 <secondary>unmount</secondary></indexterm>Unmounting a Snapshot
336 <para>To unmount a snapshot from the servers, first unmount the snapshot
337 file system from all clients, using the standard <literal>umount</literal>
338 command on each client. For example, to unmount the snapshot file system
339 named <replaceable>snapshot_20170602</replaceable> run the following
340 command on each client that has it mounted:</para>
341 <screen>client# umount $local_mount_point</screen>
342 <para>After all clients have unmounted the snapshot file system, run the
343 following <literal>lctl</literal> command on a server node where the
344 snapshot is mounted:</para>
345 <screen>lctl snapshot_umount &lt;-F | --fsname fsname&gt; [-h | --help]
346 &lt;-n | --name ssname&gt; [-r | --rsh remote_shell]</screen>
347 <informaltable frame="all">
349 <colspec colname="c1" colwidth="50*"/>
350 <colspec colname="c2" colwidth="50*"/>
354 <para><emphasis role="bold">Option</emphasis></para>
357 <para><emphasis role="bold">Description</emphasis>
365 <para> <literal>-F</literal></para>
368 <para>the filesystem name</para>
373 <para> <literal>-h</literal></para>
376 <para>help information</para>
381 <para> <literal>-n</literal></para>
384 <para>the name of the snapshot</para>
389 <para> <literal>-r</literal></para>
392 <para>the remote shell used for communication with
393 remote target. The default value is 'ssh'</para>
399 <para>For example:</para>
400 <screen>lctl snapshot_umount -F myfs -n snapshot_20170602</screen>
402 <section xml:id="dbdoclet.zfssnapshotList">
403 <title><indexterm><primary>operations</primary>
404 <secondary>list</secondary></indexterm>List Snapshots
406 <para>To list the available snapshots for a given file system, use the
407 following <literal>lctl</literal> command on the MGS:</para>
408 <screen>lctl snapshot_list [-d | --detail] &lt;-F | --fsname fsname&gt;
409 [-h | --help] [-n | --name ssname] [-r | --rsh remote_shell]</screen>
410 <informaltable frame="all">
412 <colspec colname="c1" colwidth="50*"/>
413 <colspec colname="c2" colwidth="50*"/>
417 <para><emphasis role="bold">Option</emphasis></para>
420 <para><emphasis role="bold">Description</emphasis>
428 <para> <literal>-d</literal></para>
431 <para>list every piece for the specified snapshot</para>
437 <para> <literal>-F</literal></para>
440 <para>the filesystem name</para>
445 <para> <literal>-h</literal></para>
448 <para>help information</para>
453 <para> <literal>-n</literal></para>
456 <para>the snapshot's name. If the snapshot name is not
457 supplied, all snapshots of this file system will be listed</para>
463 <para> <literal>-r</literal></para>
466 <para>the remote shell used for communication with
467 remote target. The default value is 'ssh'</para>
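<para>For example, to show detailed information for a single snapshot
(the names are illustrative):</para>
<screen>mgs# lctl snapshot_list -F myfs -n snapshot_20170602 -d</screen>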
474 <section xml:id="dbdoclet.zfssnapshotModify">
475 <title><indexterm><primary>operations</primary>
476 <secondary>modify</secondary></indexterm>Modify Snapshot Attributes
478 <para>Currently, a Lustre snapshot has five user-visible attributes:
479 snapshot name, snapshot comment, creation time, modification time, and
480 snapshot file system name. Of these, the first two can be modified.
481 Renaming must follow the general ZFS snapshot naming rules; for example,
482 the maximum name length is 256 bytes and the name cannot conflict with
reserved names.</para>
484 <para>To modify a snapshot’s attributes, use the following
485 <literal>lctl</literal> command on the MGS:</para>
486 <screen>lctl snapshot_modify [-c | --comment comment]
487 &lt;-F | --fsname fsname&gt; [-h | --help] &lt;-n | --name ssname&gt;
488 [-N | --new new_ssname] [-r | --rsh remote_shell]</screen>
489 <informaltable frame="all">
491 <colspec colname="c1" colwidth="50*"/>
492 <colspec colname="c2" colwidth="50*"/>
496 <para><emphasis role="bold">Option</emphasis></para>
499 <para><emphasis role="bold">Description</emphasis>
507 <para> <literal>-c</literal></para>
510 <para>update the snapshot's comment</para>
515 <para> <literal>-F</literal></para>
518 <para>the filesystem name</para>
523 <para> <literal>-h</literal></para>
526 <para>help information</para>
531 <para> <literal>-n</literal></para>
534 <para>the snapshot's name</para>
539 <para> <literal>-N</literal></para>
542 <para>rename the snapshot's name as
543 <replaceable>new_ssname</replaceable></para>
548 <para> <literal>-r</literal></para>
551 <para>the remote shell used for communication with
552 remote target. The default value is 'ssh'</para>
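<para>For example, to rename the snapshot
<replaceable>snapshot_20170602</replaceable> to
<replaceable>snapshot_backup1</replaceable> and update its comment (the
names are illustrative):</para>
<screen>mgs# lctl snapshot_modify -F myfs -n snapshot_20170602 \
        -N snapshot_backup1 -c "weekly backup"</screen>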
560 <section xml:id="dbdoclet.zfssnapshotBarrier">
561 <title><indexterm><primary>barrier</primary>
562 </indexterm>Global Write Barriers</title>
563 <para>Snapshots are non-atomic across multiple MDTs and OSTs, which means
564 that if there is activity on the file system while a snapshot is being
565 taken, there may be user-visible namespace inconsistencies with files
566 created or destroyed in the interval between the MDT and OST snapshots.
567 In order to create a consistent snapshot of the file system, a global
568 write barrier can be set to “freeze” the system. Once set, all
569 metadata modifications will be blocked until the write barrier is actively
570 removed (“thawed”) or expired. The user can set a timeout parameter on a
571 global barrier or the barrier can be explicitly removed. The default
572 timeout period is 30 seconds.</para>
573 <para>It is important to note that snapshots are usable without the global
574 barrier. Only files that are currently being modified by clients (write,
575 create, unlink) may be inconsistent as noted above if the barrier is not
576 used. Other files not currently being modified would be usable even
577 without the barrier.</para>
578 <para>The snapshot create command will call the write barrier internally
579 when requested using the <literal>-b</literal> option to
580 <literal>lctl snapshot_create</literal>. Explicit use of the barrier
581 is therefore not required when using snapshots; it is described here
582 as a way to quiesce the file system before a snapshot is created.</para>
583 <section xml:id="dbdoclet.zfssnapshotBarrierImpose">
584 <title><indexterm><primary>barrier</primary>
585 <secondary>impose</secondary></indexterm>Impose Barrier
587 <para>To impose a global write barrier, run the
588 <literal>lctl barrier_freeze</literal> command on the MGS:</para>
589 <screen>lctl barrier_freeze &lt;fsname&gt; [timeout (in seconds)]</screen>
590 <para>The default timeout is 30 seconds.</para>
591 <para>For example, to freeze the filesystem
592 <replaceable>testfs</replaceable> for <literal>15</literal> seconds:</para>
594 <screen>mgs# lctl barrier_freeze testfs 15</screen>
595 <para>If the command is successful, there will be no output from
596 the command. Otherwise, an error message will be printed.</para>
598 <section xml:id="dbdoclet.zfssnapshotBarrierRemove">
599 <title><indexterm><primary>barrier</primary>
600 <secondary>remove</secondary></indexterm>Remove Barrier
602 <para>To remove a global write barrier, run the
603 <literal>lctl barrier_thaw</literal> command on the MGS:</para>
604 <screen>lctl barrier_thaw &lt;fsname&gt;</screen>
605 <para>For example, to thaw the write barrier for the filesystem
606 <replaceable>testfs</replaceable>:</para>
608 <screen>mgs# lctl barrier_thaw testfs</screen>
609 <para>If the command is successful, there will be no output from
610 the command. Otherwise, an error message will be printed.</para>
612 <section xml:id="dbdoclet.zfssnapshotBarrierQuery">
613 <title><indexterm><primary>barrier</primary>
614 <secondary>query</secondary></indexterm>Query Barrier
616 <para>To see how much time is left on a global write barrier, run the
617 <literal>lctl barrier_stat</literal> command on the MGS:</para>
618 <screen>lctl barrier_stat &lt;fsname&gt;</screen>
619 <para>For example, to stat the write barrier for the filesystem
620 <replaceable>testfs</replaceable>:</para>
622 <screen>mgs# lctl barrier_stat testfs
623 The barrier for testfs is in 'frozen'
624 The barrier will be expired after 7 seconds</screen>
625 <para>If the command is successful, a status from the table below
626 will be printed. Otherwise, an error message will be printed.</para>
627 <para>The possible status and related meanings for the write barrier
628 are as follows:</para>
629 <table frame="all" xml:id="writebarrierstatus.tab1">
630 <title>Write Barrier Status</title>
632 <colspec colname="c1" colwidth="50*"/>
633 <colspec colname="c2" colwidth="50*"/>
637 <para><emphasis role="bold">Status</emphasis>
641 <para><emphasis role="bold">Meaning</emphasis>
649 <para> <literal>init</literal></para>
652 <para>The barrier has never been set on the system</para>
658 <para> <literal>freezing_p1</literal></para>
661 <para>In the first stage of setting the write barrier</para>
667 <para> <literal>freezing_p2</literal></para>
670 <para>In the second stage of setting the write barrier</para>
676 <para> <literal>frozen</literal></para>
679 <para>The write barrier has been set successfully</para>
685 <para> <literal>thawing</literal></para>
688 <para>The write barrier is being thawed</para>
693 <para> <literal>thawed</literal></para>
696 <para>The write barrier has been thawed</para>
701 <para> <literal>failed</literal></para>
704 <para>Failed to set write barrier</para>
709 <para> <literal>expired</literal></para>
712 <para>The write barrier is expired</para>
717 <para> <literal>rescan</literal></para>
720 <para>Scanning the MDTs' status; see the
721 <literal>barrier_rescan</literal> command</para>
726 <para> <literal>unknown</literal></para>
729 <para>Other cases</para>
735 <para>If the barrier is in <literal>freezing_p1</literal>,
736 <literal>freezing_p2</literal> or <literal>frozen</literal> status,
then the remaining lifetime will also be returned.</para>
738 <section xml:id="dbdoclet.zfssnapshotBarrierRescan">
739 <title><indexterm><primary>barrier</primary>
740 <secondary>rescan</secondary></indexterm>Rescan Barrier
742 <para>To rescan a global write barrier to check which MDTs are
743 active, run the <literal>lctl barrier_rescan</literal> command on the
MGS:</para>
745 <screen>lctl barrier_rescan &lt;fsname&gt; [timeout (in seconds)]</screen>
746 <para>The default timeout is 30 seconds.</para>
747 <para>For example, to rescan the barrier for filesystem
748 <replaceable>testfs</replaceable>:</para>
749 <screen>mgs# lctl barrier_rescan testfs
750 1 of 4 MDT(s) in the filesystem testfs are inactive</screen>
751 <para>If the command is successful, the number of unavailable MDTs
752 out of the total number of MDTs will be reported. Otherwise, an
753 error message will be printed.</para>
756 <section xml:id="dbdoclet.zfssnapshotLogs">
757 <title><indexterm><primary>logs</primary>
758 </indexterm>Snapshot Logs</title>
759 <para>A log of all snapshot activity can be found in the following file:
760 <literal>/var/log/lsnapshot.log</literal>. This file records when a
761 snapshot was created, when an attribute was changed, when it was
762 mounted, and other snapshot information.</para>
763 <para>The following is a sample <literal>/var/log/lsnapshot.log</literal>
file:</para>
765 <screen>Mon Mar 21 19:43:06 2016
766 (15826:jt_snapshot_create:1138:scratch:ssh): Create snapshot lss_0_0
767 successfully with comment <(null)>, barrier <enable>, timeout <30>
768 Mon Mar 21 19:43:11 2016(13030:jt_snapshot_create:1138:scratch:ssh):
769 Create snapshot lss_0_1 successfully with comment <(null)>, barrier
770 <disable>, timeout <-1>
771 Mon Mar 21 19:44:38 2016 (17161:jt_snapshot_mount:2013:scratch:ssh):
772 The snapshot lss_1a_0 is mounted
773 Mon Mar 21 19:44:46 2016
774 (17662:jt_snapshot_umount:2167:scratch:ssh): the snapshot lss_1a_0
776 Mon Mar 21 19:47:12 2016
777 (20897:jt_snapshot_destroy:1312:scratch:ssh): Destroy snapshot
778 lss_2_0 successfully with force <disable></screen>
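<para>Because each log line includes the internal operation name,
standard text tools can be used to review activity; for example, to list
only snapshot creation events (a sketch based on the log format shown
above):</para>
<screen># grep jt_snapshot_create /var/log/lsnapshot.log</screen>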
780 <section xml:id="dbdoclet.zfssnapshotLustreLogs">
781 <title><indexterm><primary>configlogs</primary>
782 </indexterm>Lustre Configuration Logs</title>
783 <para>A snapshot is independent from the original file system that it is
784 derived from and is treated as a new file system name that can be mounted
785 by Lustre client nodes. The file system name is part of the configuration
786 log names and exists in configuration log entries. Two commands exist to
787 manipulate configuration logs: <literal>lctl fork_lcfg</literal> and
788 <literal>lctl erase_lcfg</literal>.</para>
789 <para>The snapshot commands use this configuration log functionality
790 internally when needed, so explicit use of these commands is not
791 required when using snapshots. The following configuration log
792 commands are independent of snapshots and can be used with any
794 Lustre file system.</para>
795 <literal>lctl</literal> command on the MGS:</para>
796 <screen>lctl fork_lcfg</screen>
797 <para>Usage: <literal>fork_lcfg &lt;fsname&gt; &lt;newname&gt;</literal></para>
798 <para>To erase a configuration log, run the following
799 <literal>lctl</literal> command on the MGS:</para>
800 <screen>lctl erase_lcfg</screen>
801 <para>Usage: <literal>erase_lcfg &lt;fsname&gt;</literal></para>
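<para>For example, to fork the configuration logs of the filesystem
<replaceable>myfs</replaceable> under the new name
<replaceable>myfs2</replaceable>, and later erase the forked logs (the
names are illustrative):</para>
<screen>mgs# lctl fork_lcfg myfs myfs2
mgs# lctl erase_lcfg myfs2</screen>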