X-Git-Url: https://git.whamcloud.com/?a=blobdiff_plain;f=LustreMaintenance.xml;h=ac0a3bf79487f8f99206cb8f016aaec2ef8fceda;hb=00c99af814574fe85ae7bea886d9fffcce4d0261;hp=4082661e1bf987076d1aa518b89c8a5bcde8f018;hpb=5a81402497580d378efc149deccfb62b17e1c0f5;p=doc%2Fmanual.git

diff --git a/LustreMaintenance.xml b/LustreMaintenance.xml
index 4082661..ac0a3bf 100644
--- a/LustreMaintenance.xml
+++ b/LustreMaintenance.xml
@@ -1,63 +1,75 @@
 Lustre Maintenance
 Once you have the Lustre file system up and running, you can use the
 procedures in this section to perform these basic Lustre maintenance tasks:
-
+
<indexterm><primary>maintenance</primary></indexterm> <indexterm><primary>maintenance</primary><secondary>inactive OSTs</secondary></indexterm> @@ -74,7 +86,7 @@ <literal>exclude=testfs-OST0000:testfs-OST0001</literal>.</para> </note> </section> - <section xml:id="dbdoclet.50438199_15240"> + <section xml:id="lustremaint.findingNodes"> <title><indexterm><primary>maintenance</primary><secondary>finding nodes</secondary></indexterm> Finding Nodes in the Lustre File System There may be situations in which you need to find all nodes in @@ -105,7 +117,7 @@ Finding Nodes in the Lustre File System 0: testfs-OST0000_UUID ACTIVE 1: testfs-OST0001_UUID ACTIVE
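      <para>As an illustrative client-side check (a sketch only; it assumes the
      file system is mounted at the hypothetical mount point
      <literal>/mnt/testfs</literal>), a similar list of OSTs and their status
      can be obtained with <literal>lfs osts</literal>:</para>
      <screen>client$ lfs osts /mnt/testfs
OBDS:
0: testfs-OST0000_UUID ACTIVE
1: testfs-OST0001_UUID ACTIVE</screen>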
-
+
<indexterm><primary>maintenance</primary><secondary>mounting a server</secondary></indexterm> Mounting a Server Without Lustre Service If you are using a combined MGS/MDT, but you only want to start the MGS and not the MDT, run this command: @@ -114,7 +126,7 @@ Mounting a Server Without Lustre Service In this example, the combined MGS/MDT is testfs-MDT0000 and the mount point is /mnt/test/mdt. $ mount -t lustre -L testfs-MDT0000 -o nosvc /mnt/test/mdt
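      <para>When the maintenance that required the <literal>nosvc</literal>
      mount is complete, a minimal sketch of restoring normal service (assuming
      the same combined MGS/MDT and mount point as in the example above) is to
      unmount and then remount without the <literal>nosvc</literal> option:</para>
      <screen>$ umount /mnt/test/mdt
$ mount -t lustre -L testfs-MDT0000 /mnt/test/mdt</screen>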
-
+
<indexterm><primary>maintenance</primary><secondary>regenerating config logs</secondary></indexterm> Regenerating Lustre Configuration Logs If the Lustre file system configuration logs are in a state where @@ -221,18 +233,21 @@ mgs# lctl --device MGS llog_print fsname-OST0000 run, the configuration logs are re-generated as servers connect to the MGS.
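      <para>As a condensed, illustrative sketch of the writeconf sequence
      described in this section (assuming hypothetical devices
      <literal>/dev/mdt0</literal> for a combined MGS/MDT and
      <literal>/dev/ost0</literal> for a single OST, with all clients and
      servers unmounted first), the logs are erased on every target and the
      servers are then restarted, MGS/MDT first, so that the configuration
      logs are regenerated as the targets reconnect:</para>
      <screen>mds# tunefs.lustre --writeconf /dev/mdt0
oss# tunefs.lustre --writeconf /dev/ost0
mds# mount -t lustre /dev/mdt0 /mnt/mdt
oss# mount -t lustre /dev/ost0 /mnt/ost0</screen>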
-
+
<indexterm><primary>maintenance</primary><secondary>changing a NID</secondary></indexterm>
Changing a Server NID
-      In Lustre software release 2.3 or earlier, the tunefs.lustre 
-        --writeconf command is used to rewrite all of the configuration files.
-      If you need to change the NID on the MDT or OST, a new 
-        replace_nids command was added in Lustre software release 2.4 to simplify
-        this process. The replace_nids command differs from tunefs.lustre 
-        --writeconf in that it does not erase the entire configuration log, precluding the
-        need the need to execute the writeconf command on all servers and 
-        re-specify all permanent parameter settings. However, the writeconf command 
-      can still be used if desired. 
+      In order to completely rewrite the Lustre configuration, the
+      tunefs.lustre --writeconf command is used to
+      rewrite all of the configuration files.
+      If you need to change only the NID of the MDT or OST, the
+      replace_nids command can simplify this process.
+      The replace_nids command differs from
+      tunefs.lustre --writeconf in that it does not
+      erase the entire configuration log, precluding the need to
+      execute the writeconf command on all servers and
+      re-specify all permanent parameter settings. However, the
+      writeconf command can still be used if desired.
+
      Change a server NID in these situations:
@@ -287,7 +302,7 @@ Changing a Server NID
      The previous configuration log is backed up on the MGS disk with the
      suffix '.bak'.
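      <para>A minimal sketch of the <literal>replace_nids</literal> command
      (assuming a file system named <literal>testfs</literal>, a hypothetical
      new NID of <literal>192.168.1.110@tcp</literal>, the affected target
      stopped, and the MGS mounted) is:</para>
      <screen>mgs# lctl replace_nids testfs-OST0001 192.168.1.110@tcp</screen>
      <para>Multiple NIDs for the same server may be given as a
      comma-separated list.</para>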
-
+
<indexterm> <primary>maintenance</primary> <secondary>Clearing a config</secondary> @@ -338,7 +353,7 @@ Changing a Server NID
-
+
<indexterm>
        <primary>maintenance</primary>
        <secondary>adding an MDT</secondary>
      </indexterm>
      
@@ -350,7 +365,7 @@ Changing a Server NID
       user or application workloads from other users of the filesystem. It
       is possible to have multiple remote sub-directories reference the
       same MDT. However, the root directory will always be located on
-      MDT0. To add a new MDT into the file system:
+      MDT0000. To add a new MDT into the file system:
 
         Discover the maximum MDT index. Each MDT must have a unique index.
 
@@ -391,7 +406,7 @@ 
client# lfs mkdir -c 4 /mnt/testfs/new_directory_striped_across_4_mdts
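      <para>Once a new MDT has been added and mounted, a hypothetical example
      of placing a new directory directly on it (here assuming the new MDT has
      index 4) is:</para>
      <screen>client# lfs mkdir -i 4 /mnt/testfs/new_directory_on_mdt4</screen>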
-
+
<indexterm><primary>maintenance</primary><secondary>adding an OST</secondary></indexterm>
Adding a New OST to a Lustre File System
      A new OST can be added to an existing Lustre file system on either
@@ -403,7 +418,7 @@ Adding a New OST to a Lustre File System
        Add a new OST by using mkfs.lustre as when
        the filesystem was first formatted, see
-         for details. Each new OST
+         for details. Each new OST
       must have a unique index number; use lctl dl to
       see a list of all OSTs. For example, to add a new OST at index 12
       to the testfs filesystem, run the following commands
@@ -431,11 +446,11 @@ oss# mount -t lustre /dev/sda /mnt/testfs/ost12
          system on OST0004 that are larger than 4GB in size to other OSTs, enter:
 client# lfs find /test --ost test-OST0004 -size +4G | lfs_migrate -y
-        See  for details.
+        See  for details.
 
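      <para>As an illustrative sketch of the formatting step above (assuming
      the MGS is reachable at the hypothetical NID
      <literal>mgsnode@tcp0</literal> and the new OST device is
      <literal>/dev/sda</literal>), the new OST at index 12 could be formatted
      and mounted with:</para>
      <screen>oss# mkfs.lustre --fsname=testfs --mgsnode=mgsnode@tcp0 --ost --index=12 /dev/sda
oss# mkdir -p /mnt/testfs/ost12
oss# mount -t lustre /dev/sda /mnt/testfs/ost12</screen>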
-
+
<indexterm><primary>maintenance</primary><secondary>restoring an OST</secondary></indexterm>
 <indexterm><primary>maintenance</primary><secondary>removing an OST</secondary></indexterm>
Removing and Restoring MDTs and OSTs
@@ -461,14 +476,14 @@ Removing and Restoring MDTs and OSTs
            A hard drive has failed and a RAID resync/rebuild is underway,
            though the OST can also be marked degraded by the
            RAID system to avoid allocating new files on the slow OST, which
-            can reduce performance, see 
+            can reduce performance; see 
            for more details.
          
          
            The OST is nearing its space capacity, though the MDS will already
            try to avoid allocating new files on overly-full OSTs if possible,
-            see  for details.
+            see  for details.
          
        
@@ -477,7 +492,7 @@ Removing and Restoring MDTs and OSTs
        desire to continue using the filesystem before it is repaired. 
     
-
+
<indexterm><primary>maintenance</primary><secondary>removing an MDT</secondary></indexterm>Removing an MDT from the File System If the MDT is permanently inaccessible, lfs rm_entry {directory} can be used to delete the @@ -499,7 +514,7 @@ client$ lfs getstripe --mdt-index /mnt/lustre/local_dir0 The lfs getstripe --mdt-index command returns the index of the MDT that is serving the given directory.
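      <para>A hypothetical example of the <literal>lfs rm_entry</literal>
      usage described above, removing the name of a directory whose MDT is
      permanently inaccessible, is:</para>
      <screen>client# lfs rm_entry /mnt/lustre/remote_dir_on_failed_mdt</screen>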
-
+
<indexterm><primary>maintenance</primary></indexterm> <indexterm><primary>maintenance</primary><secondary>inactive MDTs</secondary></indexterm>Working with Inactive MDTs @@ -507,7 +522,7 @@ client$ lfs getstripe --mdt-index /mnt/lustre/local_dir0 the MDT is activated again. Clients accessing an inactive MDT will receive an EIO error.
-
+
<indexterm>
        <primary>maintenance</primary>
        <secondary>removing an OST</secondary>
      </indexterm>
      
@@ -610,20 +625,51 @@ client$ lfs getstripe --mdt-index /mnt/lustre/local_dir0
         </listitem>
         <listitem>
           <para>If there is not expected to be a replacement for this OST in
-          the near future, permanently deactivate it on all clients and
-          the MDS by running the following command on the MGS:
-          <screen>mgs# lctl conf_param <replaceable>ost_name</replaceable>.osc.active=0</screen></para>
+          the near future, permanently deactivate it on all clients and
+          the MDS by running the following command on the MGS:
+          <screen>mgs# lctl conf_param <replaceable>ost_name</replaceable>.osc.active=0</screen></para>
         <note><para>A deactivated OST still appears in the file system
-          configuration, though a replacement OST can be created using the
+          configuration, though a replacement OST can be created that
+          re-uses the same OST index with the
           <literal>mkfs.lustre --replace</literal> option, see
-          <xref linkend="section_restore_ost"/>.
+          <xref linkend="lustremaint.restore_ost"/>.
           </para></note>
+          <para>To completely remove the OST from the filesystem configuration,
+            the OST configuration records should be found in the startup
+            logs by running the command
+            "<literal>lctl --device MGS llog_print <replaceable>fsname</replaceable>-client</literal>"
+            on the MGS (and also
+            "<literal>... <replaceable>$fsname</replaceable>-MDT<replaceable>xxxx</replaceable></literal>"
+            for all the MDTs) to list all <literal>attach</literal>,
+            <literal>setup</literal>, <literal>add_osc</literal>,
+            <literal>add_pool</literal>, and other records related to the
+            removed OST(s). Once the <literal>index</literal> value is
+            known for each configuration record, the command
+            "<literal>lctl --device MGS llog_cancel <replaceable>llog_name</replaceable> -i <replaceable>index</replaceable></literal>"
+            will drop that record from the configuration log
+            <replaceable>llog_name</replaceable> for each of the
+            <literal><replaceable>fsname</replaceable>-client</literal> and
+            <literal><replaceable>fsname</replaceable>-MDTxxxx</literal>
+            configuration logs so that new mounts will no longer process it.
+            If a whole OSS is being removed, the <literal>add_uuid</literal>
+            records for the OSS should similarly be canceled.
+ <screen> +mgs# lctl --device MGS llog_print testfs-client | egrep "192.168.10.99@tcp|OST0003" +- { index: 135, event: add_uuid, nid: 192.168.10.99@tcp(0x20000c0a80a63), node: 192.168.10.99@tcp } +- { index: 136, event: attach, device: testfs-OST0003-osc, type: osc, UUID: testfs-clilov_UUID } +- { index: 137, event: setup, device: testfs-OST0003-osc, UUID: testfs-OST0003_UUID, node: 192.168.10.99@tcp } +- { index: 138, event: add_osc, device: testfs-clilov, ost: testfs-OST0003_UUID, index: 3, gen: 1 } +mgs# lctl --device MGS llog_cancel testfs-client -i 138 +mgs# lctl --device MGS llog_cancel testfs-client -i 137 +mgs# lctl --device MGS llog_cancel testfs-client -i 136 + </screen> + </para> </listitem> </orderedlist> </listitem> </orderedlist> </section> - <section remap="h3" xml:id="section_ydg_pgt_tl"> + <section remap="h3" xml:id="lustremaint.ydg_pgt_tl"> <title><indexterm> <primary>maintenance</primary> <secondary>backing up OST config</secondary> @@ -659,7 +705,7 @@ oss# mount -t ldiskfs <replaceable>/dev/ost_device</replaceable> /mnt/ost</scree </listitem> </orderedlist> </section> - <section xml:id="section_restore_ost"> + <section xml:id="lustremaint.restore_ost"> <title><indexterm> <primary>maintenance</primary> <secondary>restoring OST config</secondary> @@ -670,7 +716,7 @@ oss# mount -t ldiskfs <replaceable>/dev/ost_device</replaceable> /mnt/ost</scree </indexterm> Restoring OST Configuration Files If the original OST is still available, it is best to follow the OST backup and restore procedure given in either - , or + , or and . To replace an OST that was removed from service due to corruption @@ -712,7 +758,7 @@ oss# mount -t ldiskfs /dev/new_ost_dev / Recreate the OST configuration files, if unavailable. Follow the procedure in - to recreate the LAST_ID + to recreate the LAST_ID file for this OST index. The last_rcvd file will be recreated when the OST is first mounted using the default parameters, which are normally correct for all file systems. The @@ -731,7 +777,7 @@ oss0# dd if=/tmp/mountdata of=/mnt/ost/CONFIGS/mountdata bs=4 count=1 seek=5 ski
-
+
<indexterm>
        <primary>maintenance</primary>
        <secondary>reintroducing an OST</secondary>
      </indexterm>
      
@@ -745,7 +791,7 @@ oss0# dd if=/tmp/mountdata of=/mnt/ost/CONFIGS/mountdata bs=4 count=1 seek=5 ski
 client# lctl set_param osc.<replaceable>fsname</replaceable>-OST<replaceable>number</replaceable>-*.active=1</screen></para>
    </section>
  </section>
-  <section xml:id="dbdoclet.50438199_77819">
+  <section xml:id="lustremaint.abortRecovery">
    <title><indexterm><primary>maintenance</primary><secondary>aborting recovery</secondary></indexterm>
<indexterm><primary>backup</primary><secondary>aborting recovery</secondary></indexterm>
Aborting Recovery
@@ -754,7 +800,7 @@ Aborting Recovery
      The recovery process is blocked until all OSTs are available. 
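      <para>A minimal sketch of aborting recovery when starting a target
      (assuming the combined MGS/MDT labeled
      <literal>testfs-MDT0000</literal> and the mount point used earlier in
      this chapter) is to add the <literal>abort_recov</literal> mount
      option:</para>
      <screen>mds# mount -t lustre -L testfs-MDT0000 -o abort_recov /mnt/test/mdt</screen>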
-
+
<indexterm><primary>maintenance</primary><secondary>identifying OST host</secondary></indexterm> Determining Which Machine is Serving an OST In the course of administering a Lustre file system, you may need to determine which @@ -775,7 +821,7 @@ osc.testfs-OST0002-osc-f1579000.ost_conn_uuid=192.168.20.1@tcp osc.testfs-OST0003-osc-f1579000.ost_conn_uuid=192.168.20.1@tcp osc.testfs-OST0004-osc-f1579000.ost_conn_uuid=192.168.20.1@tcp
-
+
<indexterm><primary>maintenance</primary><secondary>changing failover node address</secondary></indexterm>
Changing the Address of a Failover Node
      To change the address of a failover node (e.g., to use node X instead of node Y), run 
@@ -788,13 +834,13 @@ Changing the Address of a Failover Node
 --failnode options, see .
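      <para>As an illustrative sketch only (assuming a hypothetical OST device
      <literal>/dev/ost_device</literal> and a new failover NID of
      <literal>192.168.0.7@tcp</literal>), the failover address could be
      rewritten on the OSS with:</para>
      <screen>oss# tunefs.lustre --erase-params --servicenode=192.168.0.7@tcp /dev/ost_device</screen>
      <para>Note that <literal>--erase-params</literal> clears all previously
      stored parameters, so any other permanent settings must be specified
      again on the same command line.</para>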
-
+
<indexterm><primary>maintenance</primary><secondary>separate a combined MGS/MDT</secondary></indexterm>
Separate a combined MGS/MDT
      These instructions assume the MGS node will be the same as the MDS
        node. For instructions on how to move the MGS to a different node, see
-        .
+        .
      These instructions are for doing the split without shutting down
      other servers and clients.
@@ -814,7 +860,7 @@ Changing the Address of a Failover Node
 mds# cp -r /mdt_mount_point/CONFIGS/filesystem_name-* /mgs_mount_point/CONFIGS/. 
 mds# umount /mgs_mount_point 
 mds# umount /mdt_mount_point 
-       See  for alternative method. 
+       See  for an alternative method. 
      
        Start the MGS. 
@@ -834,4 +880,55 @@ Changing the Address of a Failover Node
 
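      <para>After the configuration files have been copied as shown above, a
      minimal sketch of starting the new standalone MGS and confirming that it
      is running (assuming a hypothetical MGS device
      <literal>/dev/mgs</literal>) is:</para>
      <screen>mgs# mount -t lustre /dev/mgs /mgs_mount_point
mgs# lctl dl | grep MGS</screen>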
+
+ <indexterm><primary>maintenance</primary> + <secondary>set an MDT to readonly</secondary></indexterm> + Set an MDT to read-only + It is sometimes desirable to be able to mark the filesystem + read-only directly on the server, rather than remounting the clients and + setting the option there. This can be useful if there is a rogue client + that is deleting files, or when decommissioning a system to prevent + already-mounted clients from modifying it anymore. + Set the mdt.*.readonly parameter to + 1 to immediately set the MDT to read-only. All future + MDT access will immediately return a "Read-only file system" error + (EROFS) until the parameter is set to + 0 again. + Example of setting the readonly parameter to + 1, verifying the current setting, accessing from a + client, and setting the parameter back to 0: + mds# lctl set_param mdt.fs-MDT0000.readonly=1 +mdt.fs-MDT0000.readonly=1 + +mds# lctl get_param mdt.fs-MDT0000.readonly +mdt.fs-MDT0000.readonly=1 + +client$ touch test_file +touch: cannot touch ‘test_file’: Read-only file system + +mds# lctl set_param mdt.fs-MDT0000.readonly=0 +mdt.fs-MDT0000.readonly=0 +
+
<indexterm><primary>maintenance</primary>
      <secondary>Tune fallocate</secondary></indexterm>
      Tune Fallocate for ldiskfs
      This section shows how to tune, enable, or disable fallocate for
      ldiskfs OSTs.
      The default mode=0 is the standard
      "allocate unwritten extents" behavior used by ext4. This is by far the
      fastest for space allocation, but requires the unwritten extents to be
      split and/or zeroed when they are overwritten.
      The OST fallocate mode can also be set to 1 to use
      "zeroed extents", which may be handled by "WRITE SAME", "TRIM zeroes data",
      or other low-level functionality in the underlying block device.
      Setting mode=-1 completely disables fallocate.
      Example: To completely disable fallocate:
      lctl set_param osd-ldiskfs.*.fallocate_zero_blocks=-1
      Example: To enable fallocate to use 'zeroed extents':
      lctl set_param osd-ldiskfs.*.fallocate_zero_blocks=1
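      <para>A quick way to confirm the current fallocate mode on all ldiskfs
      OSTs of a server (an illustrative check; the <literal>testfs</literal>
      target name shown in the output is hypothetical) is:</para>
      <screen>oss# lctl get_param osd-ldiskfs.*.fallocate_zero_blocks
osd-ldiskfs.testfs-OST0000.fallocate_zero_blocks=1</screen>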
+