LUDOC-394 manual: Remove extra 'held' word
[doc/manual.git] / ConfiguringFailover.xml
index 53399f6..8f029bb 100644 (file)
-<?xml version="1.0" encoding="UTF-8"?>
-<article version="5.0" xml:lang="en-US" xmlns="http://docbook.org/ns/docbook" xmlns:xl="http://www.w3.org/1999/xlink">
-  <info>
-    <title>Configuring Lustre Failover</title>
-  </info>
-  <informaltable frame="none">
-    <tgroup cols="2">
-      <colspec colname="c1" colwidth="50*"/>
-      <colspec colname="c2" colwidth="50*"/>
-      
-      
-      <tbody>
-        <row>
-          <entry align="left"><para>Lustre 2.0 Operations Manual</para></entry>
-          <entry align="right" valign="top"><para><link xl:href="index.html"><inlinemediaobject><imageobject role="html">
-                    <imagedata contentdepth="26" contentwidth="30" fileref="./shared/toc01.gif" scalefit="1"/>
-                  </imageobject>
-<imageobject role="fo">
-                    <imagedata contentdepth="100%" contentwidth="" depth="" fileref="./shared/toc01.gif" scalefit="1" width="100%"/>
-                  </imageobject>
-</inlinemediaobject></link><link xl:href="ConfiguringLustre.html"><inlinemediaobject><imageobject role="html">
-                    <imagedata contentdepth="26" contentwidth="30" fileref="./shared/prev01.gif" scalefit="1"/>
-                  </imageobject>
-<imageobject role="fo">
-                    <imagedata contentdepth="100%" contentwidth="" depth="" fileref="./shared/prev01.gif" scalefit="1" width="100%"/>
-                  </imageobject>
-</inlinemediaobject></link><link xl:href="III_LustreAdministration.html"><inlinemediaobject><imageobject role="html">
-                    <imagedata contentdepth="26" contentwidth="30" fileref="./shared/next01.gif" scalefit="1"/>
-                  </imageobject>
-<imageobject role="fo">
-                    <imagedata contentdepth="100%" contentwidth="" depth="" fileref="./shared/next01.gif" scalefit="1" width="100%"/>
-                  </imageobject>
-</inlinemediaobject></link><link xl:href="ix.html"><inlinemediaobject><imageobject role="html">
-                    <imagedata contentdepth="26" contentwidth="30" fileref="./shared/index01.gif" scalefit="1"/>
-                  </imageobject>
-<imageobject role="fo">
-                    <imagedata contentdepth="100%" contentwidth="" depth="" fileref="./shared/index01.gif" scalefit="1" width="100%"/>
-                  </imageobject>
-</inlinemediaobject></link></para></entry>
-        </row>
-      </tbody>
-    </tgroup>
-  </informaltable>
-  <para><link xl:href=""/></para>
-  <informaltable frame="none">
-    <tgroup cols="1">
-      <colspec colname="c1" colwidth="100*"/>
-      
-      <tbody>
-        <row>
-          <entry align="right"><para><anchor xml:id="dbdoclet.50438188_pgfId-874" xreflabel=""/>C H A P T E R  11<anchor xml:id="dbdoclet.50438188_30183" xreflabel=""/></para></entry>
-        </row>
-      </tbody>
-    </tgroup>
-  </informaltable>
-  <informaltable frame="none">
-    <tgroup cols="1">
-      <colspec colname="c1" colwidth="100*"/>
-      
-      <tbody>
-        <row>
-          <entry align="right"><para><anchor xml:id="dbdoclet.50438188_pgfId-1292188" xreflabel=""/><anchor xml:id="dbdoclet.50438188_50628" xreflabel=""/>Configuring Lustre Failover</para></entry>
-        </row>
-      </tbody>
-    </tgroup>
-  </informaltable>
-  <para><anchor xml:id="dbdoclet.50438188_pgfId-1292189" xreflabel=""/>This chapter describes how to configure Lustre failover using the Heartbeat cluster infrastructure daemon. It includes:</para>
-  <itemizedlist><listitem>
-      <para><anchor xml:id="dbdoclet.50438188_pgfId-1292193" xreflabel=""/><link xl:href="ConfiguringFailover.html#50438188_82389">Creating a Failover Environment</link></para>
+<?xml version='1.0' encoding='UTF-8'?>
+<chapter xmlns="http://docbook.org/ns/docbook"
+ xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US"
+ xml:id="configuringfailover">
+  <title xml:id="configuringfailover.title">Configuring Failover in a Lustre
+  File System</title>
+  <para>This chapter describes how to configure failover in a Lustre file
+  system. It includes:</para>
+  <itemizedlist>
+    <listitem>
+      <para>
+        <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+         linkend="high_availability"/></para>
     </listitem>
-<listitem>
-      <para> </para>
+    <listitem>
+      <para><xref xmlns:xlink="http://www.w3.org/1999/xlink"
+       linkend="failover_setup"
+        /></para>
     </listitem>
-<listitem>
-      <para><anchor xml:id="dbdoclet.50438188_pgfId-1293185" xreflabel=""/><link xl:href="ConfiguringFailover.html#50438188_92688">Setting up High-Availability (HA) Software with Lustre</link></para>
+    <listitem>
+      <para><xref xmlns:xlink="http://www.w3.org/1999/xlink"
+       linkend="administering_failover"/></para>
     </listitem>
-<listitem>
-      <para> </para>
-    </listitem>
-</itemizedlist>
-   <informaltable frame="none">
-    <tgroup cols="1">
-      <colspec colname="c1" colwidth="100*"/>
-      <tbody>
-        <row>
-          <entry><para><emphasis role="bold">Note -</emphasis><anchor xml:id="dbdoclet.50438188_pgfId-1292610" xreflabel=""/><emphasis>Using Lustre Failover is optional.</emphasis></para></entry>
-        </row>
-      </tbody>
-    </tgroup>
-  </informaltable>
-   <section remap="h2">
-    <title><anchor xml:id="dbdoclet.50438188_pgfId-1292208" xreflabel=""/></title>
-    <section remap="h2">
-      <title>11.1 <anchor xml:id="dbdoclet.50438188_82389" xreflabel=""/><anchor xml:id="dbdoclet.50438188_60346" xreflabel=""/>Creating a Failover Environment</title>
-      <para><anchor xml:id="dbdoclet.50438188_pgfId-1292209" xreflabel=""/>Lustre provides failover mechanisms only at the file system level. No failover functionality is provided for system-level components, such as node failure detection or power control, as would typically be provided in a complete failover solution. Additional tools are also needed to provide resource fencing, control and monitoring.</para>
-      <section remap="h3">
-        <title><anchor xml:id="dbdoclet.50438188_pgfId-1292210" xreflabel=""/>11.1.1 Power Management Software</title>
-        <para><anchor xml:id="dbdoclet.50438188_pgfId-1292211" xreflabel=""/>Lustre failover requires power control and management capability to verify that a failed node is shut down before I/O is directed to the failover node. This avoids double-mounting the two nodes, and the risk of unrecoverable data corruption. A variety of power management tools will work, but two packages that are commonly used with Lustre are STONITH and PowerMan.</para>
-        <para><anchor xml:id="dbdoclet.50438188_pgfId-1292212" xreflabel=""/>Shoot The Other Node In The HEAD (STONITH), is a set of power management tools provided with the Linux-HA package. STONITH has native support for many power control devices and is extensible. It uses expect scripts to automate control.</para>
-        <para><anchor xml:id="dbdoclet.50438188_pgfId-1292213" xreflabel=""/>PowerMan, available from the Lawrence Livermore National Laboratory (LLNL), is used to control remote power control (RPC) devices from a central location. PowerMan provides native support for several RPC varieties and expect-like configuration simplifies the addition of new devices.</para>
-        <para><anchor xml:id="dbdoclet.50438188_pgfId-1292214" xreflabel=""/>The latest versions of PowerMan are available at:</para>
-        <para><anchor xml:id="dbdoclet.50438188_pgfId-1292216" xreflabel=""/><link xl:href="http://sourceforge.net/projects/powerman">http://sourceforge.net/projects/powerman</link></para>
-        <para><anchor xml:id="dbdoclet.50438188_pgfId-1292217" xreflabel=""/>For more information about PowerMan, go to:</para>
-        <para><anchor xml:id="dbdoclet.50438188_pgfId-1292219" xreflabel=""/><link xl:href="https://computing.llnl.gov/linux/powerman.html">https://computing.llnl.gov/linux/powerman.html</link></para>
-      </section>
-      <section remap="h3">
-        <title><anchor xml:id="dbdoclet.50438188_pgfId-1292220" xreflabel=""/>11.1.2 Power Equipment</title>
-        <para><anchor xml:id="dbdoclet.50438188_pgfId-1292221" xreflabel=""/>Lustre failover also requires the use of RPC devices, which come in different configurations. Lustre server nodes may be equipped with some kind of service processor that allows remote power control. If a Lustre server node is not equipped with a service processor, then a multi-port, Ethernet-addressable RPC may be used as an alternative. For recommended products, refer to the list of supported RPC devices on the PowerMan website.</para>
-        <para><anchor xml:id="dbdoclet.50438188_pgfId-1292223" xreflabel=""/><link xl:href="https://computing.llnl.gov/linux/powerman.html">https://computing.llnl.gov/linux/powerman.html</link></para>
-      </section>
+  </itemizedlist>
+  <para>For an overview of failover functionality in a Lustre file system, see <xref
+      xmlns:xlink="http://www.w3.org/1999/xlink" linkend="understandingfailover"/>.</para>
+  <section xml:id="high_availability">
+    <title><indexterm>
+        <primary>High availability</primary>
+        <see>failover</see>
+      </indexterm><indexterm>
+        <primary>failover</primary>
+      </indexterm>Setting Up a Failover Environment</title>
+    <para>The Lustre software provides failover mechanisms only at the layer of the Lustre file
+      system. No failover functionality is provided for system-level components such as failing
+      hardware or applications, or even for the failure of an entire node, as would typically be
+      provided in a complete failover solution. Failover functionality such as node monitoring,
+      failure detection, and resource fencing must be provided by external HA software, such as
+      PowerMan or the open source Corosync and Pacemaker packages provided by Linux operating system
+      vendors. Corosync provides support for detecting failures, and Pacemaker provides the actions
+      to take once a failure has been detected.</para>
+    <section remap="h3">
+      <title><indexterm>
+          <primary>failover</primary>
+          <secondary>power control device</secondary>
+        </indexterm>Selecting Power Equipment</title>
+      <para>Failover in a Lustre file system requires the use of a remote
+        power control (RPC) mechanism, which comes in different configurations.
+        For example, Lustre server nodes may be equipped with IPMI/BMC devices
+        that allow remote power control. In the past, software or even
+        “sneakerware” has been used, but these are not recommended. For
+        recommended devices, refer to the list of supported RPC devices on the
+        website for the PowerMan cluster power management utility:</para>
+      <para><link xmlns:xlink="http://www.w3.org/1999/xlink"
+             xlink:href="https://linux.die.net/man/7/powerman-devices">
+        https://linux.die.net/man/7/powerman-devices</link></para>
+    </section>
+    <section remap="h3">
+      <title><indexterm>
+          <primary>failover</primary>
+          <secondary>power management software</secondary>
+        </indexterm>Selecting Power Management Software</title>
+      <para>Lustre failover requires RPC and management capability to verify that a failed node is
+        shut down before I/O is directed to the failover node. This avoids mounting the same target
+        on both nodes at once, which risks unrecoverable data corruption. A variety of power
+        management tools will work. Two packages that have been commonly used with the Lustre
+        software are PowerMan and Linux-HA (also known as STONITH).</para>
+      <para>The PowerMan cluster power management utility is used to control
+        RPC devices from a central location. PowerMan provides native support
+        for several RPC varieties, and its Expect-like configuration simplifies
+        the addition of new devices. The latest versions of PowerMan are
+        available at:</para>
+      <para><link xmlns:xlink="http://www.w3.org/1999/xlink"
+             xlink:href="https://github.com/chaos/powerman">
+          https://github.com/chaos/powerman</link></para>
+      <para>STONITH, or “Shoot The Other Node In The Head”, is a set of power management tools
+        provided with the Linux-HA package prior to Red Hat Enterprise Linux 6. Linux-HA has native
+        support for many power control devices, is extensible (uses Expect scripts to automate
+        control), and provides the software to detect and respond to failures. With Red Hat
+        Enterprise Linux 6, Linux-HA is being replaced in the open source community by the
+        combination of Corosync and Pacemaker. For Red Hat Enterprise Linux subscribers, cluster
+        management using CMAN is available from Red Hat.</para>
     </section>
-    <section remap="h2">
-      <title>11.2 <anchor xml:id="dbdoclet.50438188_92688" xreflabel=""/>Setting up High-Availability (HA) Software with Lustre</title>
-      <para><anchor xml:id="dbdoclet.50438188_pgfId-1292225" xreflabel=""/>Lustre must be combined with high-availability (HA) software to enable a complete Lustre failover solution. Lustre can be used with several HA packages including:</para>
-      <itemizedlist><listitem>
-          <para><anchor xml:id="dbdoclet.50438188_pgfId-1293083" xreflabel=""/><emphasis>Red Hat Cluster Manager</emphasis>  - For more information about setting up Lustre failover with Red Hat Cluster Manager, see the Lustre wiki topic <link xl:href="http://wiki.lustre.org/index.php/Using_Red_Hat_Cluster_Manager_with_Lustre">Using Red Hat Cluster Manager with Lustre</link>.</para>
+    <section>
+      <title><indexterm>
+          <primary>failover</primary>
+          <secondary>high-availability (HA) software</secondary>
+        </indexterm>Selecting High-Availability (HA) Software</title>
+      <para>The Lustre file system must be set up with high-availability (HA) software to enable a
+        complete Lustre failover solution. Except for PowerMan, the HA software packages mentioned
+        above provide both power management and cluster management. For information about setting
+        up failover with Pacemaker, see:</para>
+      <itemizedlist>
+        <listitem>
+          <para>Pacemaker Project website:
+            <link xmlns:xlink="http://www.w3.org/1999/xlink"
+             xlink:href="https://clusterlabs.org/">https://clusterlabs.org/
+            </link></para>
         </listitem>
-<listitem>
-          <para> </para>
+        <listitem>
+          <para>Article
+            <emphasis role="italic">Using Pacemaker with a Lustre File System
+            </emphasis>:
+            <link xmlns:xlink="http://www.w3.org/1999/xlink"
+             xlink:href="https://wiki.whamcloud.com/display/PUB/Using+Pacemaker+with+a+Lustre+File+System">
+              https://wiki.whamcloud.com/display/PUB/Using+Pacemaker+with+a+Lustre+File+System</link></para>
         </listitem>
-<listitem>
-          <para><anchor xml:id="dbdoclet.50438188_pgfId-1293110" xreflabel=""/><emphasis>Pacemaker</emphasis>  - For more information about setting up Lustre failover with Pacemaker, see the Lustre wiki topic <link xl:href="http://wiki.lustre.org/index.php/Using_Pacemaker_with_Lustre">Using Pacemaker with Lustre</link>.<anchor xml:id="dbdoclet.50438188_61775" xreflabel=""/></para>
+      </itemizedlist>
+    </section>
+  </section>
+  <section xml:id="failover_setup">
+    <title><indexterm>
+        <primary>failover</primary>
+        <secondary>setup</secondary>
+      </indexterm>Preparing a Lustre File System for Failover</title>
+    <para>To prepare a Lustre file system to be configured and managed as an HA system by a
+      third-party HA application, each storage target (MGT, MDT, OST) must be associated with a
+      second node to create a failover pair. This configuration information is then communicated by
+      the MGS to a client when the client mounts the file system.</para>
+    <para>The per-target configuration is relayed to the MGS at mount time. Some rules related to
+      this are:<itemizedlist>
+        <listitem>
+          <para>When a target is <emphasis role="underline"><emphasis role="italic"
+                >initially</emphasis></emphasis> mounted, the MGS reads the configuration
+            information from the target (such as mgt vs. ost, failnode, fsname) to configure the
+            target into a Lustre file system. If the MGS is reading the initial mount configuration,
+            the mounting node becomes that target's “primary” node.</para>
         </listitem>
-<listitem>
-          <para> </para>
+        <listitem>
+          <para>When a target is <emphasis role="underline"><emphasis role="italic"
+                >subsequently</emphasis></emphasis> mounted, the MGS reads the current configuration
+            from the target and, as needed, reconfigures the target information in the MGS
+            database.</para>
         </listitem>
-</itemizedlist>
-      <!--
-Begin SiteCatalyst code version: G.5.
--->
-      <!--
-End SiteCatalyst code version: G.5.
--->
-        <informaltable frame="none">
-        <tgroup cols="3">
-          <colspec colname="c1" colwidth="33*"/>
-          <colspec colname="c2" colwidth="33*"/>
-          <colspec colname="c3" colwidth="33*"/>
-          
-          
-          
-          <tbody>
-            <row>
-              <entry align="left"><para>Lustre 2.0 Operations Manual</para></entry>
-              <entry align="right"><para>821-2076-10</para></entry>
-              <entry align="right" valign="top"><para><link xl:href="index.html"><inlinemediaobject><imageobject role="html">
-                        <imagedata contentdepth="26" contentwidth="30" fileref="./shared/toc01.gif" scalefit="1"/>
-                      </imageobject>
-<imageobject role="fo">
-                        <imagedata contentdepth="100%" contentwidth="" depth="" fileref="./shared/toc01.gif" scalefit="1" width="100%"/>
-                      </imageobject>
-</inlinemediaobject></link><link xl:href="ConfiguringLustre.html"><inlinemediaobject><imageobject role="html">
-                        <imagedata contentdepth="26" contentwidth="30" fileref="./shared/prev01.gif" scalefit="1"/>
-                      </imageobject>
-<imageobject role="fo">
-                        <imagedata contentdepth="100%" contentwidth="" depth="" fileref="./shared/prev01.gif" scalefit="1" width="100%"/>
-                      </imageobject>
-</inlinemediaobject></link><link xl:href="III_LustreAdministration.html"><inlinemediaobject><imageobject role="html">
-                        <imagedata contentdepth="26" contentwidth="30" fileref="./shared/next01.gif" scalefit="1"/>
-                      </imageobject>
-<imageobject role="fo">
-                        <imagedata contentdepth="100%" contentwidth="" depth="" fileref="./shared/next01.gif" scalefit="1" width="100%"/>
-                      </imageobject>
-</inlinemediaobject></link><link xl:href="ix.html"><inlinemediaobject><imageobject role="html">
-                        <imagedata contentdepth="26" contentwidth="30" fileref="./shared/index01.gif" scalefit="1"/>
-                      </imageobject>
-<imageobject role="fo">
-                        <imagedata contentdepth="100%" contentwidth="" depth="" fileref="./shared/index01.gif" scalefit="1" width="100%"/>
-                      </imageobject>
-</inlinemediaobject></link></para></entry>
-            </row>
-          </tbody>
-        </tgroup>
-      </informaltable>
-      <para><link xl:href=""/></para>
-      <para><link xl:href="copyright.html">Copyright</link> © 2011, Oracle and/or its affiliates. All rights reserved.</para>
-    </section>
+      </itemizedlist></para>
+    <para>When the target is formatted using the <literal>mkfs.lustre</literal> command, the failover
+      service node(s) for the target are designated using the <literal>--servicenode</literal>
+      option. In the example below, an OST with index <literal>0</literal> in the file system
+        <literal>testfs</literal> is formatted with two service nodes designated to serve as a
+      failover
+      pair:<screen>mkfs.lustre --reformat --ost --fsname testfs --mgsnode=192.168.10.1@o2ib \
+              --index=0 --servicenode=192.168.10.7@o2ib \
+              --servicenode=192.168.10.8@o2ib \
+              /dev/sdb</screen></para>
+    <para>More than two potential service nodes can be designated for a target. The target can then
+      be mounted on any of the designated service nodes.</para>
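+    <para>As a minimal sketch, the OST formatted in the example above could be started on either
+      designated service node with an ordinary Lustre mount. The mount point
+      <literal>/mnt/testfs/ost0</literal> is an assumption chosen for illustration, and in a
+      production cluster the HA software would normally issue this command rather than an
+      administrator:<screen># Run on 192.168.10.7 or 192.168.10.8, never on both at once.
+mkdir -p /mnt/testfs/ost0
+mount -t lustre /dev/sdb /mnt/testfs/ost0</screen></para>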
+    <para>When HA is configured on a storage target, the Lustre software
+      enables multi-mount protection (MMP) on that storage target. MMP prevents
+      multiple nodes from simultaneously mounting and thus corrupting the data
+      on the target. For more about MMP, see
+      <xref xmlns:xlink="http://www.w3.org/1999/xlink"
+      linkend="managingfailover"/>.</para>
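+    <para>For an ldiskfs-backed target, one way to check whether MMP is enabled is to inspect the
+      superblock with <literal>dumpe2fs</literal>. This is a sketch only. It assumes the
+      <literal>/dev/sdb</literal> device from the earlier formatting example and does not apply
+      to ZFS-backed targets.<screen># Print the superblock header and show any MMP-related fields.
+dumpe2fs -h /dev/sdb | grep -i mmp</screen></para>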
+    <para>If the MGT has been formatted with multiple service nodes designated, this information
+      must be conveyed to the Lustre client in the mount command used to mount the file system. In
+      the example below, NIDs for two MGSs that have been designated as service nodes for the MGT
+      are specified in the mount command executed on the
+      client:<screen>mount -t lustre 10.10.120.1@tcp1:10.10.120.2@tcp1:/testfs /lustre/testfs</screen></para>
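+    <para>The same colon-separated list of MGS NIDs can be used in a client's
+      <literal>/etc/fstab</literal>. The entry below is a sketch that reuses the NIDs and mount
+      point from the example above. The <literal>_netdev</literal> option is an assumption about
+      the client's boot configuration and is included so that the mount is deferred until
+      networking is available:<screen>10.10.120.1@tcp1:10.10.120.2@tcp1:/testfs /lustre/testfs lustre defaults,_netdev 0 0</screen></para>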
+    <para>When a client mounts the file system, the MGS provides configuration information to the
+      client for the MDT(s) and OST(s) in the file system along with the NIDs for all service nodes
+      associated with each target and the service node on which the target is mounted. Later, when
+      the client attempts to access data on a target, it will try the NID for each specified service
+      node until it connects to the target.</para>
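+    <para>To see which service node a client is currently using for a target, the import state of
+      the corresponding client device can be examined. The sketch below assumes the
+      <literal>testfs</literal> file system and OST index <literal>0</literal> from the earlier
+      formatting example. The output typically lists the failover NIDs known for the target and
+      the NID of the current connection, although the exact fields may vary between Lustre
+      releases:<screen># Run on a client that has testfs mounted.
+lctl get_param osc.testfs-OST0000-*.import</screen></para>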
+  </section>
+  <section xml:id="administering_failover">
+    <title>Administering Failover in a Lustre File System</title>
+    <para>For additional information about administering failover features in a Lustre file system, see:<itemizedlist>
+        <listitem>
+          <para><xref xmlns:xlink="http://www.w3.org/1999/xlink"
+                 linkend="failover_ost" /></para>
+        </listitem>
+        <listitem>
+          <para><xref xmlns:xlink="http://www.w3.org/1999/xlink"
+                 linkend="failover_nids"
+            /></para>
+        </listitem>
+        <listitem>
+          <para><xref xmlns:xlink="http://www.w3.org/1999/xlink"
+                 linkend="lustremaint.ChangeAddrFailoverNode"
+            /></para>
+        </listitem>
+        <listitem>
+          <para><xref xmlns:xlink="http://www.w3.org/1999/xlink"
+                 linkend="mkfs.lustre"
+            /></para>
+        </listitem>
+      </itemizedlist></para>
   </section>
-</article>
+</chapter>
+<!--
+  vim:expandtab:shiftwidth=2:tabstop=8:
+  -->