LUDOC-504 nodemap: servers must be in a trusted+admin group
[doc/manual.git] / ConfiguringFailover.xml
index dc3511d..8f029bb 100644
@@ -1,23 +1,30 @@
-<?xml version='1.0' encoding='UTF-8'?><chapter xmlns="http://docbook.org/ns/docbook" xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US" xml:id="configuringfailover">
-  <title xml:id="configuringfailover.title">Configuring Failover in a Lustre File System</title>
-  <para>This chapter describes how to configure failover in a Lustre file system. It
-    includes:</para>
+<?xml version='1.0' encoding='UTF-8'?>
+<chapter xmlns="http://docbook.org/ns/docbook"
+ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US"
+ xml:id="configuringfailover">
+  <title xml:id="configuringfailover.title">Configuring Failover in a Lustre
+  File System</title>
+  <para>This chapter describes how to configure failover in a Lustre file
+  system. It includes:</para>
   <itemizedlist>
     <listitem>
       <para>
-        <xref xmlns:xlink="http://www.w3.org/1999/xlink" linkend="dbdoclet.50438188_82389"/></para>
+        <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+         linkend="high_availability"/></para>
     </listitem>
     <listitem>
-      <para><xref xmlns:xlink="http://www.w3.org/1999/xlink" linkend="dbdoclet.50438188_92688"
+      <para><xref xmlns:xlink="http://www.w3.org/1999/xlink"
+       linkend="failover_setup"
         /></para>
     </listitem>
     <listitem>
-      <para><xref xmlns:xlink="http://www.w3.org/1999/xlink" linkend="section_tnq_kbr_xl"/></para>
+      <para><xref xmlns:xlink="http://www.w3.org/1999/xlink"
+       linkend="administering_failover"/></para>
     </listitem>
   </itemizedlist>
   <para>For an overview of failover functionality in a Lustre file system, see <xref
       xmlns:xlink="http://www.w3.org/1999/xlink" linkend="understandingfailover"/>.</para>
-  <section xml:id="dbdoclet.50438188_82389">
+  <section xml:id="high_availability">
     <title><indexterm>
         <primary>High availability</primary>
         <see>failover</see>
           <primary>failover</primary>
           <secondary>power control device</secondary>
         </indexterm>Selecting Power Equipment</title>
-      <para>Failover in a Lustre file system requires the use of a remote power control (RPC)
-        mechanism, which comes in different configurations. For example, Lustre server nodes may be
-        equipped with IPMI/BMC devices that allow remote power control. In the past, software or
-        even “sneakerware” has been used, but these are not recommended. For recommended devices,
-        refer to the list of supported RPC devices on the website for the PowerMan cluster power
-        management utility:</para>
+      <para>Failover in a Lustre file system requires the use of a remote
+        power control (RPC) mechanism, which comes in different configurations.
+        For example, Lustre server nodes may be equipped with IPMI/BMC devices
+        that allow remote power control. In the past, software or even
+        “sneakerware” has been used, but these are not recommended. For
+        recommended devices, refer to the list of supported RPC devices on the
+        website for the PowerMan cluster power management utility:</para>
       <para><link xmlns:xlink="http://www.w3.org/1999/xlink"
-          xlink:href="http://code.google.com/p/powerman/wiki/SupportedDevs"
-          >http://code.google.com/p/powerman/wiki/SupportedDevs</link></para>
+             xlink:href="https://linux.die.net/man/7/powerman-devices">
+        https://linux.die.net/man/7/powerman-devices</link></para>
     </section>
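As a concrete illustration of one such RPC mechanism, a server whose BMC is reachable over the management network can be power-controlled with ipmitool. This is only a sketch; the BMC hostname and credentials are hypothetical:

    # Query, and if necessary force off, a failed server through its BMC
    # ("oss1-bmc" and the admin/secret credentials are placeholders)
    ipmitool -I lanplus -H oss1-bmc -U admin -P secret chassis power status
    ipmitool -I lanplus -H oss1-bmc -U admin -P secret chassis power off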
     <section remap="h3">
       <title><indexterm>
         nodes and the risk of unrecoverable data corruption. A variety of power management tools
         will work. Two packages that have been commonly used with the Lustre software are PowerMan
         and Linux-HA (aka STONITH).</para>
-      <para>The PowerMan cluster power management utility is used to control RPC devices from a
-        central location. PowerMan provides native support for several RPC varieties and Expect-like
-        configuration simplifies the addition of new devices. The latest versions of PowerMan are
+      <para>The PowerMan cluster power management utility is used to control
+        RPC devices from a central location. PowerMan provides native support
+        for several RPC varieties, and Expect-like configuration simplifies
+        the addition of new devices. The latest versions of PowerMan are
         available at: </para>
       <para><link xmlns:xlink="http://www.w3.org/1999/xlink"
-          xlink:href="http://code.google.com/p/powerman/"
-        >http://code.google.com/p/powerman/</link></para>
+             xlink:href="https://github.com/chaos/powerman">
+          https://github.com/chaos/powerman</link></para>
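A minimal PowerMan configuration for this kind of setup might look like the following sketch, assuming the ipmipower backend shipped with PowerMan; the node names, BMC hostnames, and credentials are hypothetical, and the device script path can vary by distribution:

    # /etc/powerman/powerman.conf
    include "/etc/powerman/ipmipower.dev"
    device "ipmi0" "ipmipower" "/usr/sbin/ipmipower -h oss[1-2]-bmc -u admin -p secret |&"
    node "oss[1-2]" "ipmi0" "oss[1-2]-bmc"

With the powerman daemon running, nodes can then be queried and powered off from a central host, for example with "pm -q" and "pm -0 oss1".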
       <para>STONITH, or “Shoot The Other Node In The Head”, is a set of power management tools
         provided with the Linux-HA package prior to Red Hat Enterprise Linux 6. Linux-HA has native
         support for many power control devices, is extensible (uses Expect scripts to automate
         up failover with Pacemaker, see:</para>
       <itemizedlist>
         <listitem>
-          <para>Pacemaker Project website:  <link xmlns:xlink="http://www.w3.org/1999/xlink"
-              xlink:href="http://clusterlabs.org/"><link xlink:href="http://clusterlabs.org/"
-                >http://clusterlabs.org/</link></link></para>
+          <para>Pacemaker Project website:
+            <link xmlns:xlink="http://www.w3.org/1999/xlink"
+             xlink:href="https://clusterlabs.org/">https://clusterlabs.org/
+            </link></para>
         </listitem>
         <listitem>
-          <para>Article <emphasis role="italic">Using Pacemaker with a Lustre File
-            System</emphasis>:  <link xmlns:xlink="http://www.w3.org/1999/xlink"
-              xlink:href="https://wiki.hpdd.intel.com/display/PUB/Using+Pacemaker+with+a+Lustre+File+System"
-                ><link
-                xlink:href="https://wiki.hpdd.intel.com/display/PUB/Using+Pacemaker+with+a+Lustre+File+System"
-                >https://wiki.hpdd.intel.com/display/PUB/Using+Pacemaker+with+a+Lustre+File+System</link></link></para>
+          <para>Article
+            <emphasis role="italic">Using Pacemaker with a Lustre File System
+            </emphasis>:
+            <link xmlns:xlink="http://www.w3.org/1999/xlink"
+             xlink:href="https://wiki.whamcloud.com/display/PUB/Using+Pacemaker+with+a+Lustre+File+System">
+              https://wiki.whamcloud.com/display/PUB/Using+Pacemaker+with+a+Lustre+File+System</link></para>
         </listitem>
       </itemizedlist>
     </section>
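In a Pacemaker-based cluster, the pieces described above map onto a fencing (STONITH) resource plus a resource that mounts the Lustre target. The following is only a sketch using the pcs shell, not the supported procedure from the articles above; the resource names, BMC address, credentials, and device path are hypothetical:

    # IPMI fencing for one OSS node
    pcs stonith create fence-oss1 fence_ipmilan \
        ip=oss1-bmc username=admin password=secret lanplus=1 pcmk_host_list=oss1
    # Mount the OST as a cluster-managed resource (runs mount -t lustre under the hood)
    pcs resource create testfs-OST0000 ocf:heartbeat:Filesystem \
        device=/dev/sdb directory=/mnt/testfs/ost0 fstype=lustre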
   </section>
-  <section xml:id="dbdoclet.50438188_92688">
+  <section xml:id="failover_setup">
     <title><indexterm>
         <primary>failover</primary>
         <secondary>setup</secondary>
             information</para>
         </listitem>
       </itemizedlist></para>
-    <para>When the target is formatted using the <literal>mkfs.lustre</literal>command, the failover
+    <para>When the target is formatted using the <literal>mkfs.lustre</literal> command, the failover
       service node(s) for the target are designated using the <literal>--servicenode</literal>
       option. In the example below, an OST with index <literal>0</literal> in the  file system
         <literal>testfs</literal> is formatted with two service nodes designated to serve as a
               --index=0 --servicenode=192.168.10.7@o2ib \
               --servicenode=192.168.10.8@o2ib \  
               /dev/sdb</screen></para>
-    <para>More than two potential service nodes caan be designated for a target. The target can then
+    <para>More than two potential service nodes can be designated for a target. The target can then
       be mounted on any of the designated service nodes.</para>
-    <para>When HA is configured on a storage target, the Lustre software enables multi-mount
-      protection (MMP) on that storage target. MMP prevents multiple nodes from simultaneously
-      mounting and thus corrupting the data on the target. For more about MMP, see <xref
-        xmlns:xlink="http://www.w3.org/1999/xlink" linkend="managingfailover"/>.</para>
+    <para>When HA is configured on a storage target, the Lustre software
+      enables multi-mount protection (MMP) on that storage target. MMP prevents
+      multiple nodes from simultaneously mounting and thus corrupting the data
+      on the target. For more about MMP, see
+      <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+      linkend="managingfailover"/>.</para>
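On an ldiskfs-backed target, the presence of the MMP feature can be confirmed from the superblock; the device path below is hypothetical:

    # "mmp" should appear in the feature list once the target has been
    # formatted with --servicenode (or --failnode)
    dumpe2fs -h /dev/sdb | grep -i mmp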
     <para>If the MGT has been formatted with multiple service nodes designated, this information
       must be conveyed to the Lustre client in the mount command used to mount the file system. In
       the example below, NIDs for two MGSs that have been designated as service nodes for the MGT
       associated with each target and the service node on which the target is mounted. Later, when
       the client attempts to access data on a target, it will try the NID for each specified service
       node until it connects to the target.</para>
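The client mount command described above takes roughly the following form, with the failover MGS NIDs separated by a colon; the NIDs and mount point here are hypothetical:

    # Client mount listing both MGS service nodes; the client tries each NID
    # in turn until it reaches the MGS
    mount -t lustre 192.168.10.1@o2ib:192.168.10.2@o2ib:/testfs /mnt/testfs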
-    <para>Previous to Lustre software release 2.0, the <literal>--failnode</literal> option to
-        <literal>mkfs.lustre</literal> was used to designate a failover service node for a primary
-      server for a target. When the <literal>--failnode</literal> option is used, certain
-      restrictions apply:<itemizedlist>
-        <listitem>
-          <para>The target must be initially mounted on the primary service node, not the failover
-            node designated by the <literal>--failnode</literal> option.</para>
-        </listitem>
-        <listitem>
-          <para>If the <literal>tunefs.lustre –-writeconf</literal> option is used to erase and
-            regenerate the configuration log for the file system, a target cannot be initially
-            mounted on a designated failnode.</para>
-        </listitem>
-        <listitem>
-          <para>If a <literal>--failnode</literal> option is added to a target to designate a
-            failover server for the target, the target must be re-mounted on the primary node before
-            the <literal>--failnode</literal> option takes effect</para>
-        </listitem>
-      </itemizedlist></para>
   </section>
-  <section xml:id="section_tnq_kbr_xl">
+  <section xml:id="administering_failover">
     <title>Administering Failover in a Lustre File System</title>
     <para>For additional information about administering failover features in a Lustre file system, see:<itemizedlist>
         <listitem>
-          <para><xref xmlns:xlink="http://www.w3.org/1999/xlink" linkend="dbdoclet.50438194_57420"
-            /></para>
+          <para><xref xmlns:xlink="http://www.w3.org/1999/xlink"
+                 linkend="failover_ost" /></para>
         </listitem>
         <listitem>
-          <para><xref xmlns:xlink="http://www.w3.org/1999/xlink" linkend="dbdoclet.50438194_41817"
+          <para><xref xmlns:xlink="http://www.w3.org/1999/xlink"
+                 linkend="failover_nids"
             /></para>
         </listitem>
         <listitem>
-          <para><xref xmlns:xlink="http://www.w3.org/1999/xlink" linkend="dbdoclet.50438199_62333"
+          <para><xref xmlns:xlink="http://www.w3.org/1999/xlink"
+                 linkend="lustremaint.ChangeAddrFailoverNode"
             /></para>
         </listitem>
         <listitem>
-          <para><xref xmlns:xlink="http://www.w3.org/1999/xlink" linkend="dbdoclet.50438219_75432"
+          <para><xref xmlns:xlink="http://www.w3.org/1999/xlink"
+                 linkend="mkfs.lustre"
             /></para>
         </listitem>
       </itemizedlist></para>
   </section>
 </chapter>
+<!--
+  vim:expandtab:shiftwidth=2:tabstop=8:
+  -->