LUDOC-379: Ladvise Lockahead

diff --git a/ConfiguringQuotas.xml b/ConfiguringQuotas.xml

index fa124b7..7e6838b 100644
--- a/ConfiguringQuotas.xml
+++ b/ConfiguringQuotas.xml
@@ -1,577 +1,889 @@
-<?xml version='1.0' encoding='UTF-8'?>
-<!-- This document was created with Syntext Serna Free. --><chapter xmlns="http://docbook.org/ns/docbook" xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US" xml:id="configuringquotas">
-  <title xml:id="configuringquotas.title">Configuring and Managing Quotas</title>
-  <para>This chapter describes how to configure quotas and includes the following sections:</para>
-  <itemizedlist>
-    <listitem>
-      <para><xref linkend="dbdoclet.50438217_54945"/></para>
-    </listitem>
-    <listitem>
-      <para><xref linkend="dbdoclet.50438217_31982"/></para>
-    </listitem>
-    <listitem>
-      <para><xref linkend="dbdoclet.50438217_49939"/></para>
-    </listitem>
-    <listitem>
-      <para><xref linkend="dbdoclet.50438217_15106"/></para>
-    </listitem>
-    <listitem>
-      <para><xref linkend="dbdoclet.50438217_27895"/></para>
-    </listitem>
-    <listitem>
-      <para><xref linkend="dbdoclet.50438217_20772"/></para>
-    </listitem>
-  </itemizedlist>
-  <section xml:id="dbdoclet.50438217_54945">
-      <title>
-          <indexterm><primary>Quotas</primary><secondary>configuring</secondary></indexterm>
-          Working with Quotas</title>
-    <para>Quotas allow a system administrator to limit the amount of disk space a user or group can use in a directory. Quotas are set by root, and can be specified for individual users and/or groups. Before a file is written to a partition where quotas are set, the quota of the creator&apos;s group is checked. If a quota exists, then the file size counts towards the group&apos;s quota. If no quota exists, then the owner&apos;s user quota is checked before the file is written. Similarly, inode usage for specific functions can be controlled if a user over-uses the allocated space.</para>
-    <para>Lustre quota enforcement differs from standard Linux quota enforcement in several ways:</para>
+<?xml version='1.0' encoding='utf-8'?>
+<chapter xmlns="http://docbook.org/ns/docbook"
+xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US"
+xml:id="configuringquotas">
+  <title xml:id="configuringquotas.title">Configuring and Managing
+  Quotas</title>
+  <section xml:id="quota_configuring">
+    <title>
+    <indexterm>
+      <primary>Quotas</primary>
+      <secondary>configuring</secondary>
+    </indexterm>Working with Quotas</title>
+    <para>Quotas allow a system administrator to limit the amount of disk
+    space a user, group, or project can use. Quotas are set by root, and can
+    be specified for individual users, groups, and/or projects. Before a file
+    is written to a partition where quotas are set, the quota of the creator's
+    group is checked. If a quota exists, then the file size counts towards
+    the group's quota. If no quota exists, then the owner's user quota is
+    checked before the file is written. Similarly, inode usage for specific
+    functions can be controlled if a user over-uses the allocated space.</para>
+    <para>Lustre quota enforcement differs from standard Linux quota
+    enforcement in several ways:</para>
      <itemizedlist>
        <listitem>
-        <para>Quotas are administered via the <literal>lfs</literal> command (post-mount).</para>
+        <para>Quotas are administered via the
+        <literal>lfs</literal> and
+        <literal>lctl</literal> commands (post-mount).</para>
        </listitem>
        <listitem>
-        <para>Quotas are distributed (as Lustre is a distributed file system), which has several ramifications.</para>
-      </listitem>
-      <listitem>
-        <para>Quotas are allocated and consumed in a quantized fashion.</para>
-      </listitem>
-      <listitem>
-        <para>Client does not set the <literal>usrquota</literal> or <literal>grpquota</literal> options to mount. When quota is enabled, it is enabled for all clients of the file system; started automatically using <literal>quota_type</literal> or started manually with <literal>lfs quotaon</literal>.</para>
-      </listitem>
-    </itemizedlist>
-    <caution>
-      <para>Although quotas are available in Lustre, root quotas are NOT enforced.</para>
-      <para><literal>lfs setquota -u root</literal> (limits are not enforced)</para>
-      <para><literal>lfs quota -u root</literal> (usage includes internal Lustre data that is dynamic in size and does not accurately reflect mount point visible block and inode usage).</para>
-    </caution>
-  </section>
-  <section xml:id="dbdoclet.50438217_31982">
-    <title><indexterm><primary>Quotas</primary><secondary>enabling disk</secondary></indexterm>Enabling Disk Quotas</title>
-    <para>Use this procedure to enable (configure) disk quotas in Lustre.</para>
-    <orderedlist>
-      <listitem>
-        <para>If you have re-complied your Linux kernel, be sure that <literal>CONFIG_QUOTA</literal> and <literal>CONFIG_QUOTACTL</literal> are enabled. Also, verify that <literal>CONFIG_QFMT_V1</literal> and/or <literal>CONFIG_QFMT_V2</literal> are enabled.</para>
-        <para>Quota is enabled in all Linux 2.6 kernels supplied for Lustre.</para>
-      </listitem>
-      <listitem>
-        <para>Start the server.</para>
-      </listitem>
-      <listitem>
-        <para>
-        Mount the Lustre file system on the client and verify that the <literal>lquota</literal> module has loaded properly by using the <literal>lsmod</literal> command.
-        
-        </para>
-        <screen>$ lsmod
-[root@oss161 ~]# lsmod
-Module                     Size                    Used by
-obdfilter          220532                  1
-fsfilt_ldiskfs             52228                   1
-ost                        96712                   1
-mgc                        60384                   1
-ldiskfs                    186896                  2 fsfilt_ldiskfs
-lustre                     401744                  0
-lov                        289064                  1 lustre
-lquota                     107048                  4 obdfilter
-mdc                        95016                   1 lustre
-ksocklnd           111812                  1</screen>
-      </listitem>
-    </orderedlist>
-    <para>The Lustre mount command no longer recognizes the <literal>usrquota</literal> and <literal>grpquota</literal> options. If they were previously specified, remove them from <literal>/etc/fstab</literal>.</para>
-    <para>When quota is enabled, it is enabled for all file system clients (started automatically using <literal>quota_type</literal> or manually with <literal>lfs quotaon</literal>).</para>
-    <note>
-      <para>Lustre with the Linux kernel 2.4 does <emphasis>not</emphasis> support quotas.</para>
-    </note>
-    <para>To enable quotas automatically when the file system is started, you must set the <literal>mdt.quota_type</literal> and <literal>ost.quota_type</literal> parameters, respectively, on the MDT and OSTs. The parameters can be set to the string <literal>u</literal> (user), <literal>g</literal> (group) or <literal>ug</literal> for both users and groups.</para>
-    <para>You can enable quotas at <literal>mkfs</literal> time (<literal>mkfs.lustre --param mdt.quota_type=ug</literal>) or with <literal>tunefs.lustre</literal>. As an example:</para>
-    <screen>tunefs.lustre --param ost.quota_type=ug $ost_dev</screen>
-    <caution>
-      <para>If you are using <literal>mkfs.lustre --param mdt.quota_type=ug</literal> or <literal>tunefs.lustre --param ost.quota_type=ug</literal>, be sure to run the command on all OSTs and the MDT. Otherwise, abnormal results may occur.</para>
-    </caution>
-    <section remap="h4">
-      <title><indexterm><primary>Quotas</primary><secondary>administrating</secondary></indexterm>Administrative and Operational Quotas</title>
-      <para>Lustre has two kinds of quota files:</para>
-      <itemizedlist>
-        <listitem>
-          <para>Administrative quotas (for the MDT), which contain limits for users/groups for the entire cluster.</para>
-        </listitem>
-        <listitem>
-          <para>Operational quotas (for the MDT and OSTs), which contain quota information dedicated to a cluster node.</para>
-        </listitem>
-      </itemizedlist>
-      <para>Lustre 1.6.5 introduced the v2 file format for administrative quota files, with continued support for the old file format (v1). The mdt.quota_type parameter also handles &apos;1&apos; and &apos;2&apos; options, to specify the Lustre quota versions that will be used. For example:</para>
-      <screen>--param mdt.quota_type=ug1
---param mdt.quota_type=u2</screen>
-      <para>Lustre 1.6.6 introduced the v2 file format for operational quotas, with continued support for the old file format (v1). The ost.quota_type parameter handles &apos;1&apos; and &apos;2&apos; options, to specify the Lustre quota versions that will be used. For example:</para>
-      <screen>--param ost.quota_type=ug2
---param ost.quota_type=u1</screen>
-      <para>For more information about the v1 and v2 formats, see <xref linkend="dbdoclet.50438217_66360"/>.</para>
-    </section>
-  </section>
-  <section xml:id="dbdoclet.50438217_49939">
-    <title><indexterm><primary>Quotas</primary><secondary>creating</secondary></indexterm>Creating Quota Files and Quota Administration</title>
-    <para>Once each quota-enabled file system is remounted, it is capable of working with disk quotas. However, the file system is not yet ready to support quotas. If <literal>umount</literal> has been done regularly, run the <literal>lfs</literal> command with the <literal>quotaon</literal> option. If <literal>umount</literal> has not been done, perform these steps:</para>
-    <orderedlist>
-      <listitem>
-        <para>Take Lustre &apos;&apos;offline&apos;&apos;.</para>
-        <para>That is, verify that no write operations (append, write, truncate, create or delete) are being performed (preparing to run <literal>lfs quotacheck</literal>). Operations that do not change Lustre files (such as read or mount) are okay to run.</para>
-        <caution>
-          <para>When <literal>lfs quotacheck</literal> is run, Lustre must NOT be performing any write operations. Failure to follow this caution may cause the statistic information of quota to be inaccurate. For example, the number of blocks used by OSTs for users or groups will be inaccurate, which can cause unexpected quota problems.</para>
-        </caution>
-      </listitem>
-      <listitem>
-        <para> Run the <emphasis role="bold">
-            <literal>lfs</literal>
-          </emphasis> command with the <emphasis role="bold">
-            <literal>quotacheck</literal>
-          </emphasis> option:</para>
-        <screen># lfs quotacheck -ug /mnt/lustre</screen>
-        <para>By default, quota is turned on after <literal>quotacheck</literal> completes. Available options are:</para>
+               <para>The quota feature in Lustre software is distributed
+        throughout the system (as the Lustre file system is a distributed file
+        system). Because of this, quota setup and behavior on Lustre is
+        different from local disk quotas in the following ways:</para>
          <itemizedlist>
-          <listitem>
-            <para><literal>u</literal>  -- checks the user disk quota information</para>
+        <listitem>
+          <para>No single point of administration: some commands must be
+          executed on the MGS, other commands on the MDSs and OSSs, and still
+          other commands on the client.</para>
            </listitem>
            <listitem>
-            <para><literal>g</literal>  -- checks the group disk quota information</para>
+          <para>Granularity: a local quota is typically specified with
+          kilobyte resolution, whereas Lustre uses one megabyte as the
+          smallest quota resolution.</para>
            </listitem>
+          <listitem>
+          <para>Accuracy: quota information is distributed throughout
+the file system and can only be accurately calculated with a completely
+quiet file system.</para>
+        </listitem>
          </itemizedlist>
        </listitem>
-    </orderedlist>
-    <para>The lfsquotacheck command checks all objects on all OSTs and the MDS to sum up for every UID/GID. It reads all Lustre metadata and re-computes the number of blocks/inodes that each UID/GID has used. If there are many files in Lustre, it may take a long time to complete.</para>
-    <note>
-      <para>User and group quotas are separate. If either quota limit is reached, a process with the corresponding UID/GID cannot allocate more space on the file system.</para>
-    </note>
-    <note>
-      <para>When <literal>lfs quotacheck</literal> runs, it creates a quota file -- a sparse file with a size proportional to the highest UID in use and UID/GID distribution. As a general rule, if the highest UID in use is large, then the sparse file will be large, which may affect functions such as creating a snapshot.</para>
-    </note>
-    <note>
-      <para>For Lustre 1.6 releases before version 1.6.5, and 1.4 releases before version 1.4.12, if the underlying <literal>ldiskfs</literal> file system has not unmounted gracefully (due to a crash, for example), re-run <literal>quotacheck</literal> to obtain accurate quota information. Lustre 1.6.5 and 1.4.12 use journaled quota, so it is not necessary to run <literal>quotacheck</literal> after an unclean shutdown.</para>
-      <para>In certain failure situations (e.g., when a broken Lustre installation or build is used), re-run <literal>quotacheck</literal> after checking the server kernel logs and fixing the root problem.</para>
-    </note>
-    <para>The <literal>lfs</literal> command includes several command options to work with quotas:</para>
-    <itemizedlist>
-      <listitem>
-        <para><varname>quotaon</varname>  -- enables disk quotas on the specified file system. The file system quota files must be present in the root directory of the file system.</para>
-      </listitem>
        <listitem>
-        <para><varname>quotaoff</varname>  -- disables disk quotas on the specified file system.</para>
-      </listitem>
-      <listitem>
-        <para><varname>quota</varname>  -- displays general quota information (disk usage and limits)</para>
+        <para>Quotas are allocated and consumed in a quantized fashion.</para>
        </listitem>
        <listitem>
-        <para><varname>setquota</varname>  -- specifies quota limits and tunes the grace period. By default, the grace period is one week.</para>
+        <para>Client does not set the
+        <literal>usrquota</literal> or
+        <literal>grpquota</literal> options to mount. As of Lustre software
+        release 2.4, space accounting is always enabled by default and quota
+        enforcement can be enabled/disabled on a per-file system basis with
+        <literal>lctl conf_param</literal>. It is worth noting that both
+        <literal>lfs quotaon</literal> and
+        <literal>quota_type</literal> are deprecated as of Lustre software
+        release 2.4.0.</para>
        </listitem>
      </itemizedlist>
-    <para> Usage:</para>
-    <screen>lfs quotaon [-ugf] &lt;filesystem&gt;
-lfs quotaoff [-ug] &lt;filesystem&gt;
-lfs quota [-q] [-v] [-o obd_uuid] [-u|-g &lt;uname&gt;|uid|gname|gid&gt;]  &lt;filesystem&gt;
-lfs quota -t &lt;-u|-g&gt; &lt;filesystem&gt;
-lfs setquota &lt;-u|--user|-g|--group&gt; &lt;username|groupname&gt; [-b &lt;block-softlimit&gt;] [\
--B &lt;block-hardlimit&gt;] [-i &lt;inode-softlimit&gt;] [-I &lt;inode-hardlimit&gt;] &lt;filesystem&gt;</screen>
-    <para>Examples:</para>
-    <para>In all of the examples below, the file system is <literal>/mnt</literal> lustre.</para>
-    <para>To turn on user and group quotas, run:</para>
-    <screen>$ lfs quotaon -ug /mnt/lustre</screen>
-    <para>To turn off user and group quotas, run:</para>
-    <screen>$ lfs quotaoff -ug /mnt/lustre</screen>
-    <para>To display general quota information (disk usage and limits) for the user running the command and his primary group, run:</para>
-    <screen>$ lfs quota /mnt/lustre </screen>
-    <para>To display general quota information for a specific user (&quot;<literal>bob</literal>&quot; in this example), run:</para>
-    <screen>$ lfs quota -u bob /mnt/lustre</screen>
-    <para>To display general quota information for a specific user (&quot;<literal>bob</literal>&quot; in this example) and detailed quota statistics for each MDT and OST, run:</para>
-    <screen>$ lfs quota -u bob -v /mnt/lustre</screen>
-    <para>To display general quota information for a specific group (&quot;<literal>eng</literal>&quot; in this example), run:</para>
-    <screen>$ lfs quota -g eng /mnt/lustre</screen>
-    <para>To display block and inode grace times for user quotas, run:</para>
-    <screen>$ lfs quota -t -u /mnt/lustre</screen>
-    <para>To set user and group quotas for a specific user (&quot;bob&quot; in this example), run:</para>
-    <screen>$ lfs setquota -u bob 307200 309200 10000 11000 /mnt/lustre</screen>
-    <para>In this example, the quota for user &quot;bob&quot; is set to 300 MB (309200*1024) and the hard limit is 11,000 files. Therefore, the inode hard limit should be 11000.</para>
-    <note>
-      <para>For the Lustre command <literal>$lfssetquota/quota ...</literal> the qunit for block is KB (1024) and the qunit for inode is 1.</para>
-    </note>
-    <para>The quota command displays the quota allocated and consumed for each Lustre device. Using the previous <literal>setquota</literal> example, running this <literal>lfs</literal> quota command:</para>
-    <screen>$ lfs quota -u bob -v /mnt/lustre </screen>
-    <para>displays this command output:</para>
-    <screen>Disk quotas for user bob (uid 6000):
-Filesystem         kbytes          quota           limit           grace   \
-        files           quota           limit           grace
-/mnt/lustre                0               30720           30920           \
--               0               10000           11000           -
-lustre-MDT0000_UUID        0               -               16384           \
--               0               -               2560            -
-lustre-OST0000_UUID        0               -               16384           \
--               0               -               0               -
-lustre-OST0001_UUID        0               -               16384           \
--               0               -               0               -</screen>
+    <caution>
+      <para>Although a quota feature is available in the Lustre software, root
+      quotas are NOT enforced.</para>
+      <para>
+      <literal>lfs setquota -u root</literal> (limits are not enforced)</para>
+      <para>
+      <literal>lfs quota -u root</literal> (usage includes internal Lustre data
+      that is dynamic in size and does not accurately reflect mount point
+      visible block and inode usage).</para>
+    </caution>
    </section>
-  <section xml:id="dbdoclet.50438217_15106">
-    <title><indexterm><primary>Quotas</primary><secondary>allocating</secondary></indexterm>Quota Allocation</title>
-    <para>In Lustre, quota must be properly allocated or users may experience unnecessary failures. The file system block quota is divided up among the OSTs within the file system. Each OST requests an allocation which is increased up to the quota limit. The quota allocation is then <emphasis role="italic">quantized</emphasis> to reduce the number of quota-related request traffic. By default, Lustre supports both user and group quotas to limit disk usage and file counts.</para>
-    <para>The quota system in Lustre is completely compatible with the quota systems used on other file systems. The Lustre quota system distributes quotas from the quota master. Generally, the MDS is the quota master for both inodes and blocks. All OSTs and the MDS are quota slaves to the OSS nodes. To reduce quota requests and get reasonably accurate quota distribution, the transfer quota unit (qunit) between quota master and quota slaves is changed dynamically by the lquota module. The default minimum value of qunit is 1 MB for blocks and 2 for inodes. The proc entries to set these values are: <literal>/proc/fs/lustre/mds/lustre-MDT*/quota_least_bunit</literal> and <literal>/proc/fs/lustre/mds/lustre-MDT*/quota_least_iunit</literal>. The default maximum value of <literal>qunit</literal> is 128 MB for blocks and 5120 for inodes. The proc entries to set these values are <literal>quota_bunit_sz</literal> and <literal>quota_iunit_sz</literal> in the MDT and OSTs.</para>
-    <note>
-      <para>In general, the <literal>quota_bunit_sz</literal> value should be larger than 1 MB. For testing purposes, it can be set to 4 KB, if necessary.</para>
-    </note>
-    <para>The file system block quota is divided up among the OSTs and the MDS within the file system. Only the MDS uses the file system inode quota.</para>
-    <para>This means that the minimum quota for block is 1 MB* (the number of OSTs + the number of MDSs), which is 1 MB* (number of OSTs + 1). If you attempt to assign a smaller quota, users maybe not be able to create files. As noted, the default minimum quota for inodes is 2. The default is established at file system creation time, but can be tuned via <literal>/proc</literal> values (described below). The inode quota is also allocated in a quantized manner on the MDS.</para>
-    <para>If we look at the <literal>setquota</literal> example again, running this <literal>lfs quota</literal> command:</para>
-    <screen># lfs quota -u bob -v /mnt/lustre
+  <section xml:id="enabling_disk_quotas">
+    <title>
+    <indexterm>
+      <primary>Quotas</primary>
+      <secondary>enabling disk</secondary>
+    </indexterm>Enabling Disk Quotas</title>
+    <para>The design of quotas on Lustre has management and enforcement
+    separated from resource usage and accounting. Lustre software is
+    responsible for management and enforcement. The back-end file
+    system is responsible for resource usage and accounting. Because of
+    this, enabling quotas begins with enabling quotas on the back-end
+    disk system. Because quota setup depends on the Lustre
+    software version in use, you may first need to run
+    <literal>lctl get_param version</literal> to identify
+    <xref linkend="whichversion"/> you are currently using.
+    </para>
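+    <para>For example, the version can be queried on a node where the
+    Lustre software is running as follows (output is omitted here because it
+    varies by installation):</para>
+    <screen>
+# lctl get_param version
+</screen>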
+    <section>
+      <title>Enabling Disk Quotas (Lustre Software Prior to Release 2.4)
+      </title>
+      <para>
+      For Lustre software releases older than release 2.4,
+      <literal>lfs quotacheck</literal> must first be run from a client node to
+      create quota files on the Lustre targets (i.e. the MDT and OSTs).
+      <literal>lfs quotacheck</literal> requires the file system to be quiescent
+      (i.e. no modifying operations like write, truncate, create or delete
+      should run concurrently). Failure to follow this caution may result in
+      inaccurate user/group disk usage. Operations that do not change Lustre
+      files (such as read or mount) are okay to run.
+      <literal>lfs quotacheck</literal> performs a scan on all the Lustre
+      targets to calculate the block/inode usage for each user/group. If the
+      Lustre file system has many files,
+      <literal>quotacheck</literal> may take a long time to complete. Several
+      options can be passed to
+      <literal>lfs quotacheck</literal>:</para>
+      <screen>
+# lfs quotacheck -ug /mnt/testfs
  </screen>
-    <para>displays this command output:</para>
-    <screen>Disk quotas for user bob (uid 500):
-Filesystem         kbytes          quota           limit           grace   \
-        files           quota           limit           grace
-/mnt/lustre                30720*          30720           30920           \
-6d23h56m44s     10101*          10000           11000           6d23h59m50s
-lustre-MDT0000_UUID        0               -               1024            \
--               10101           -               10240
-lustre-OST0000_UUID        0               -               1024            \
--               -               -               -
-lustre-OST0001_UUID        30720*          -               28872           \
--               -               -               -</screen>
-    <para>The total quota limit of 30,920 is allotted to user bob, which is further distributed to two OSTs and one MDS.</para>
-    <note>
-      <para>Values appended with &apos;<literal>*</literal>&apos; show the limit that has been over-used (exceeding the quota), and receives this message Disk quota exceeded. For example:</para>
-      <para><screen>$ cp: writing `/mnt/lustre/var/cache/fontconfig/ beeeeb3dfe132a8a0633a017c99ce0-x86.cache&apos;: Disk quota exceeded.</screen></para>
-    </note>
-    <para>The requested quota of 300 MB is divided across the OSTs.</para>
-    <note>
-      <para>It is very important to note that the block quota is consumed per OST and the MDS per block and inode (there is only one MDS for inodes). Therefore, when the quota is consumed on one OST, the client may not be able to create files regardless of the quota available on other OSTs.</para>
-    </note>
-    <section remap="h5">
-      <title>Additional information:</title>
-      <para><emphasis role="bold">Grace period</emphasis> -- The period of time (in seconds) within which users are allowed to exceed their soft limit. There are four types of grace periods:</para>
        <itemizedlist>
          <listitem>
-          <para> user block soft limit</para>
+          <para>
+          <literal>u</literal> -- checks the user disk quota information</para>
          </listitem>
          <listitem>
-          <para> user inode soft limit</para>
-        </listitem>
-        <listitem>
-          <para> group block soft limit</para>
-        </listitem>
-        <listitem>
-          <para> group inode soft limit</para>
+          <para>
+          <literal>g</literal> -- checks the group disk quota information</para>
          </listitem>
        </itemizedlist>
-      <para>The grace periods are applied to all users. The user block soft limit is for all users who are using a blocks quota.</para>
-      <para><emphasis role="bold">Soft limit</emphasis> -- Once you are beyond the soft limit, the quota module begins to time, but you still can write block and inode. When you are always beyond the soft limit and use up your grace time, you get the same result as the hard limit. For inodes and blocks, it is the same. Usually, the soft limit MUST be less than the hard limit; if not, the quota module never triggers the timing. If the soft limit is not needed, leave it as zero (0).</para>
-      <para><emphasis role="bold">Hard limit</emphasis> -- When you are beyond the hard limit, you get <literal>-EQUOTA</literal> and cannot write inode/block any more. The hard limit is the absolute limit. When a grace period is set, you can exceed the soft limit within the grace period if are under the hard limits.</para>
-      <para>Lustre quota allocation is controlled by two variables, <literal>quota_bunit_sz</literal> and <literal>quota_iunit_sz</literal> referring to KBs and inodes, respectively. These values can be accessed on the MDS as <literal>/proc/fs/lustre/mds/*/quota_*</literal> and on the OST as <literal>/proc/fs/lustre/obdfilter/*/quota_*</literal>. The <literal>quota_bunit_sz </literal>and <literal>quota_iunit_sz</literal> variables are the maximum qunit values for blocks and inodes, respectively. At any time, module lquota chooses a reasonable qunit between the minimum and maximum values.</para>
-      <para>The /proc values are bounded by two other variables <literal>quota_btune_sz</literal> and <literal>quota_itune_sz</literal>. By default, the <literal>*tune_sz</literal> variables are set at 1/2 the <literal>*unit_sz</literal> variables, and you cannot set <literal>*tune_sz</literal> larger than <literal>*unit_sz</literal>. You must set <literal>bunit_sz</literal> first if it is increasing by more than 2x, and <literal>btune_sz</literal> first if it is decreasing by more than 2x.</para>
-      <para><emphasis role="bold">Total number of inodes</emphasis> -- To determine the total number of inodes, use <literal>lfs df -i</literal> (and also <literal>/proc/fs/lustre/*/*/filestotal</literal>). For more information on using the <literal>lfs df -i</literal> command and the command output, see <xref linkend="dbdoclet.50438209_35838"/>.</para>
-      <para>Unfortunately, the <literal>statfs</literal> interface does not report the free inode count directly, but instead reports the total inode and used inode counts. The free inode count is calculated for <literal>df</literal> from (total inodes - used inodes).</para>
-      <para>It is not critical to know a file system&apos;s total inode count. Instead, you should know (accurately), the free inode count and the used inode count for a file system. Lustre manipulates the total inode count in order to accurately report the other two values.</para>
-      <para>The values set for the MDS must match the values set on the OSTs.</para>
-      <para>The <literal>quota_bunit_sz</literal> parameter displays bytes, however <literal>lfs setquota</literal> uses KBs. The <literal>quota_bunit_sz</literal> parameter must be a multiple of 1024. A proper minimum KB size for <literal>lfs setquota</literal> can be calculated as:</para>
-      <informalexample>
+      <para>By default, quota is turned on after
+      <literal>quotacheck</literal> completes. However, this setting isn't
+      persistent and quota will have to be enabled again (via
+      <literal>lfs quotaon</literal>) if one of the Lustre targets is
+      restarted.
+      <literal>lfs quotaoff</literal> is used to turn off quota.</para>
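+      <para>For example, assuming the file system is mounted at
+      <literal>/mnt/testfs</literal> as in the
+      <literal>quotacheck</literal> example above, user and group quotas
+      can be turned back on after a target restart, or turned off, with:</para>
+      <screen>
+# lfs quotaon -ug /mnt/testfs
+# lfs quotaoff -ug /mnt/testfs
+</screen>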
+      <para>To enable quota permanently with a Lustre software release older
+      than release 2.4, the
+      <literal>quota_type</literal> parameter must be used. This requires
+      setting
+      <literal>mdd.quota_type</literal> and
+      <literal>ost.quota_type</literal>, respectively, on the MDT and OSTs.
+      <literal>quota_type</literal> can be set to the string
+      <literal>u</literal> (user),
+      <literal>g</literal> (group) or
+      <literal>ug</literal> for both users and groups. This parameter can be
+      specified at
+      <literal>mkfs</literal> time (
+      <literal>mkfs.lustre --param mdd.quota_type=ug</literal>) or with
+      <literal>tunefs.lustre</literal>. As an example:</para>
+      <screen>
+tunefs.lustre --param ost.quota_type=ug $ost_dev
+</screen>
+      <para>When using
+      <literal>mkfs.lustre --param mdd.quota_type=ug</literal> or
+      <literal>tunefs.lustre --param ost.quota_type=ug</literal>, be sure to
+      run the command on all OSTs and the MDT. Otherwise, abnormal results may
+      occur.</para>
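+      <para>As a sketch of what this means on a single OSS, the loop below
+      applies the parameter to every OST device on that node. The device
+      paths are hypothetical placeholders; substitute the actual OST devices
+      and repeat the loop on each OSS:</para>
+      <screen>
+for ost_dev in /dev/sdb /dev/sdc; do
+    tunefs.lustre --param ost.quota_type=ug "$ost_dev"
+done
+</screen>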
+      <warning>
          <para>
-          <emphasis role="bold">Size in KBs = minimum_quota_bunit_sz * (number of OSTS + 1) = 1024 * (number of OSTs +1)</emphasis>
-        </para>
-      </informalexample>
-      <para>We add one (1) to the number of OSTs as the MDS also consumes KBs. As inodes are only consumed on the MDS, the minimum inode size for <literal>lfs setquota</literal> is equal to <literal>quota_iunit_sz</literal>.</para>
-      <note>
-        <para>Setting the quota below this limit may prevent the user from all file creation.</para>
-      </note>
+        In Lustre software releases before 2.4, when new OSTs are
+        added to the file system, quotas are not automatically propagated to
+        the new OSTs. As a workaround, clear and then reset quotas for each
+        user or group using the
+        <literal>lfs setquota</literal> command. In the example below, quotas
+        are cleared and reset for user
+        <literal>bob</literal> on file system
+        <literal>testfs</literal>:
+        <screen>
+$ lfs setquota -u bob -b 0 -B 0 -i 0 -I 0 /mnt/testfs
+$ lfs setquota -u bob -b 307200 -B 309200 -i 10000 -I 11000 /mnt/testfs
+</screen></para>
+      </warning>
      </section>
-  </section>
-  <section xml:id="dbdoclet.50438217_27895">
-    <title><indexterm><primary>Quotas</primary><secondary>known issues</secondary></indexterm>Known Issues with Quotas</title>
-    <para>Using quotas in Lustre can be complex and there are several known issues.</para>
-    <section remap="h3">
-      <title>Granted Cache and Quota Limits</title>
-      <para>In Lustre, granted cache does not respect quota limits. In this situation, OSTs grant cache to Lustre client to accelerate I/O. Granting cache causes writes to be successful in OSTs, even if they exceed the quota limits, and will overwrite them.</para>
-      <para>The sequence is:</para>
-      <orderedlist>
-        <listitem>
-          <para>A user writes files to Lustre.</para>
-        </listitem>
+    <section remap="h3" condition="l24">
+      <title>Enabling Disk Quotas (Lustre Software Release 2.4 and
+      later)</title>
+         <para>Quota setup is orchestrated by the MGS, and <emphasis>all setup
+      commands in this section must be run on the MGS. Project quotas require
+      Lustre release 2.10 or later</emphasis>. Once set up, verification of the
+      quota state must be performed on the MDT. Although quota enforcement is
+      managed by the Lustre software, each OSD implementation relies on the
+      back-end file system to maintain per-user/group/project block and inode
+      usage. Hence, differences exist when setting up quotas with ldiskfs or
+      ZFS back-ends:</para>
+      <itemizedlist>
          <listitem>
-          <para>If the Lustre client has enough granted cache, then it returns &apos;success&apos; to users and arranges the writes to the OSTs.</para>
+          <para>For ldiskfs backends,
+          <literal>mkfs.lustre</literal> now creates empty quota files and
+          enables the QUOTA feature flag in the superblock which turns quota
+          accounting on at mount time automatically. e2fsck was also modified
+          to fix the quota files when the QUOTA feature flag is present. The
+             project quota feature is disabled by default, and
+          <literal>tune2fs</literal> needs to be run to enable every target
+          manually.</para>
          </listitem>
          <listitem>
-          <para>Because Lustre clients have delivered success to users, the OSTs cannot fail these writes.</para>
+          <para>For ZFS backend, <emphasis>the project quota feature is not
+             supported yet.</emphasis> Accounting ZAPs are created and maintained
+          by the ZFS file system itself. While ZFS tracks per-user and group
+             block usage, it does not handle inode accounting for ZFS versions
+          prior to zfs-0.7.0. The ZFS OSD implements its own support for inode
+          tracking. Two options are available:</para>
+          <orderedlist>
+            <listitem>
+              <para>The ZFS OSD can estimate the number of inodes in-use based
+              on the number of blocks used by a given user or group. This mode
+              can be enabled by running the following command on the server
+              running the target:
+              <literal>lctl set_param
+              osd-zfs.${FSNAME}-${TARGETNAME}.quota_iused_estimate=1</literal>.
+              </para>
+            </listitem>
+            <listitem>
+              <para>Similarly to block accounting, dedicated ZAPs are also
+              created by the ZFS OSD to maintain per-user and group inode usage.
+              This is the default mode which corresponds to
+              <literal>quota_iused_estimate</literal> set to 0.</para>
+            </listitem>
+          </orderedlist>
          </listitem>
-      </orderedlist>
-      <para>Because of granted cache, writes always overwrite quota limitations. For example, if you set a 400 GB quota on user A and use IOR to write for user A from a bundle of clients, you will write much more data than 400 GB, and cause an out-of-quota error (<literal>-EDQUOT</literal>).</para>
+      </itemizedlist>
        <note>
-        <para>The effect of granted cache on quota limits can be mitigated, but not eradicated. Reduce the <literal>max_dirty_buffer</literal> in the clients (can be set from 0 to 512). To set <literal>max_dirty_buffer</literal> to 0:</para>
-        <itemizedlist>
-          <listitem>
-            <para>In releases after Lustre 1.6.5, <literal>lctl set_param osc.*.max_dirty_mb=0</literal>.</para>
-          </listitem>
-          <listitem>
-            <para>In releases before Lustre 1.6.5, <literal>proc/fs/lustre/osc/*/max_dirty_mb; do echo 512 &gt; $O</literal></para>
-          </listitem>
-        </itemizedlist>
+      <para>Lustre file systems formatted with a Lustre release prior to 2.4.0
+      can still be safely upgraded to release 2.4.0, but will not have
+      functional space usage reporting until
+      <literal>tunefs.lustre --quota</literal> is run against all targets. This
+      command sets the QUOTA feature flag in the superblock and runs e2fsck (as
+      a result, the target must be offline) to build the per-UID/GID disk usage
+      database.</para>
+      <para condition="l2A">Lustre file systems formatted with a Lustre release
+      prior to 2.10 can still be safely upgraded to release 2.10, but will not
+      have functional project quota usage reporting until
+      <literal>tune2fs -O project</literal> is run against all ldiskfs backend
+      targets. This command sets the PROJECT feature flag in the superblock and
+      runs e2fsck (as a result, the target must be offline). See
+      <xref linkend="quota_interoperability"/> for further important
+      considerations.</para>
        </note>
-    </section>
-    <section remap="h3">
-      <title><indexterm><primary>Quotas</primary><secondary>limits</secondary></indexterm>Quota Limits</title>
-      <para>Available quota limits depend on the Lustre version you are using.</para>
+      <caution>
+        <para>Lustre software release 2.4 and later requires a version of
+        e2fsprogs that supports quota (i.e. 1.42.13.wc5 or newer;
+       1.42.13.wc6 or newer is needed for project quota support) to be
+       installed on the server nodes using ldiskfs backend (e2fsprogs is not
+       needed with ZFS backend). In general, we recommend using the latest
+       e2fsprogs version available on
+       <link xl:href="http://downloads.hpdd.intel.com/e2fsprogs/">
+        http://downloads.hpdd.intel.com/public/e2fsprogs/</link>.</para>
+        <para>The ldiskfs OSD relies on the standard Linux quota to maintain
+        accounting information on disk. As a consequence, the Linux kernel
+        running on the Lustre servers using ldiskfs backend must have
+        <literal>CONFIG_QUOTA</literal>,
+        <literal>CONFIG_QUOTACTL</literal> and
+        <literal>CONFIG_QFMT_V2</literal> enabled.</para>
+      </caution>
+      <para>As of Lustre software release 2.4.0, quota enforcement is thus
+      turned on/off independently of space accounting which is always enabled.
+      <literal>lfs quotaon</literal> and <literal>lfs quotaoff</literal> as
+      well as the per-target
+      <literal>quota_type</literal> parameter are deprecated in favor of a
+      single per-file system quota parameter controlling inode/block quota
+      enforcement. Like all permanent parameters, this quota parameter can be
+      set via
+      <literal>lctl conf_param</literal> on the MGS via the following
+      syntax:</para>
+      <screen>
+lctl conf_param <replaceable>fsname</replaceable>.quota.<replaceable>ost|mdt</replaceable>=<replaceable>u|g|p|ugp|none</replaceable>
+</screen>
        <itemizedlist>
          <listitem>
-          <para> Lustre version 1.4.11 and earlier (for 1.4.x releases) and Lustre version 1.6.4 and earlier (for 1.6.x releases) support quota limits less than 4 TB.</para>
+          <para>
+          <literal>ost</literal> -- to configure block quota managed by
+          OSTs</para>
          </listitem>
          <listitem>
-          <para> Lustre versions 1.4.12, 1.6.5 and later support quota limits of 4 TB and greater in Lustre configurations with OST storage limits of 4 TB and less.</para>
+          <para>
+          <literal>mdt</literal> -- to configure inode quota managed by
+          MDTs</para>
          </listitem>
          <listitem>
-          <para> Future Lustre versions are expected to support quota limits of 4 TB and greater with no OST storage limits.</para>
+          <para>
+          <literal>u</literal> -- to enable quota enforcement for users
+          only</para>
          </listitem>
          <listitem>
-          <informaltable frame="all">
-            <tgroup cols="3">
-              <colspec colname="c1" colwidth="33*"/>
-              <colspec colname="c2" colwidth="33*"/>
-              <colspec colname="c3" colwidth="33*"/>
-              <thead>
-                <row>
-                  <entry>
-                    <para><emphasis role="bold">Lustre Version</emphasis></para>
-                  </entry>
-                  <entry>
-                    <para><emphasis role="bold">Quota Limit Per User/Per Group</emphasis></para>
-                  </entry>
-                  <entry>
-                    <para><emphasis role="bold">OST Storage Limit</emphasis></para>
-                  </entry>
-                </row>
-              </thead>
-              <tbody>
-                <row>
-                  <entry>
-                    <para> 1.4.11 and earlier</para>
-                  </entry>
-                  <entry>
-                    <para> &lt; 4TB</para>
-                  </entry>
-                  <entry>
-                    <para> n/a</para>
-                  </entry>
-                </row>
-                <row>
-                  <entry>
-                    <para> 1.4.12</para>
-                  </entry>
-                  <entry>
-                    <para> =&gt; 4TB</para>
-                  </entry>
-                  <entry>
-                    <para> &lt;= 4TB of storage</para>
-                  </entry>
-                </row>
-                <row>
-                  <entry>
-                    <para> 1.6.4 and earlier</para>
-                  </entry>
-                  <entry>
-                    <para> &lt; 4TB</para>
-                  </entry>
-                  <entry>
-                    <para> n/a</para>
-                  </entry>
-                </row>
-                <row>
-                  <entry>
-                    <para> 1.6.5</para>
-                  </entry>
-                  <entry>
-                    <para> =&gt; 4TB</para>
-                  </entry>
-                  <entry>
-                    <para> &lt;= 4TB of storage</para>
-                  </entry>
-                </row>
-                <row>
-                  <entry>
-                    <para> Future Lustre versions</para>
-                  </entry>
-                  <entry>
-                    <para> =&gt; 4TB</para>
-                  </entry>
-                  <entry>
-                    <para> No storage limit</para>
-                  </entry>
-                </row>
-              </tbody>
-            </tgroup>
-          </informaltable>
+          <para>
+          <literal>g</literal> -- to enable quota enforcement for groups
+          only</para>
          </listitem>
-      </itemizedlist>
-    </section>
-    <section xml:id="dbdoclet.50438217_66360">
-      <title><indexterm><primary>Quotas</primary><secondary>file formats</secondary></indexterm>Quota File Formats</title>
-      <para>Lustre 1.6.5 introduced the v2 file format for administrative quotas, with 64-bit limits that support large-limits handling. The old quota file format (v1), with 32-bit limits, is also supported. Lustre 1.6.6 introduced the v2 file format for operational quotas. A few notes regarding the current quota file formats:</para>
-      <para>Lustre 1.6.5 and later use <literal>mdt.quota_type</literal> to force a specific administrative quota version (v2 or v1).</para>
-      <itemizedlist>
          <listitem>
-          <para> For the v2 quota file format, (<literal>OBJECTS/admin_quotafile_v2.{usr,grp}</literal>)</para>
+          <para>
+          <literal>p</literal> -- to enable quota enforcement for projects
+          only</para>
          </listitem>
          <listitem>
-          <para> For the v1 quota file format, (<literal>OBJECTS/admin_quotafile.{usr,grp}</literal>)</para>
+          <para>
+          <literal>ugp</literal> -- to enable quota enforcement for all users,
+          groups and projects</para>
          </listitem>
-      </itemizedlist>
-      <para>Lustre 1.6.6 and later use <literal>ost.quota_type</literal> to force a specific operational quota version (v2 or v1).</para>
-      <itemizedlist>
          <listitem>
-          <para>For the v2 quota file format, (<literal>lquota_v2.{user,group}</literal>)</para>
-        </listitem>
-        <listitem>
-          <para>For the v1 quota file format, (<literal>lquota.{user,group}</literal>)</para>
+          <para>
+          <literal>none</literal> -- to disable quota enforcement for all users,
+          groups and projects</para>
          </listitem>
        </itemizedlist>
-      <para>The quota_type specifier can be used to set different combinations of administrative/operational quota file versions on a Lustre node:</para>
+      <para>Examples:</para>
+      <para>To turn on user, group, and project quotas for block only on
+      file system
+      <literal>testfs1</literal>, <emphasis>on the MGS</emphasis> run:</para>
+      <screen>$ lctl conf_param testfs1.quota.ost=ugp
+</screen>
+      <para>To turn on group quotas for inodes on file system
+      <literal>testfs2</literal>, on the MGS run:</para>
+      <screen>$ lctl conf_param testfs2.quota.mdt=g
+</screen>
+      <para>To turn off user, group, and project quotas for both inode and block
+      on file system
+      <literal>testfs3</literal>, on the MGS run:</para>
+      <screen>$ lctl conf_param testfs3.quota.ost=none
+</screen>
+      <screen>$ lctl conf_param testfs3.quota.mdt=none
+</screen>
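+      <para>The settings above can also be generated from a script. The
+      following is a minimal sketch, not part of the Lustre tools: the helper
+      name <literal>quota_conf_cmd</literal> is hypothetical, and it only
+      prints the <literal>lctl conf_param</literal> command line for review
+      after checking the target and value against the lists documented above.
+      It does not run anything on the MGS.</para>

```shell
#!/bin/bash
# Hypothetical helper (not a Lustre tool): print, but do not run, the
# lctl conf_param command for a quota enforcement setting.
quota_conf_cmd() {
    local fsname="$1" target="$2" value="$3"

    # A file system name is required.
    [ -n "$fsname" ] || return 1

    # "ost" configures block quota, "mdt" configures inode quota.
    case "$target" in
        ost|mdt) ;;
        *) return 1 ;;
    esac

    # Only the values listed in this section are accepted by this sketch.
    case "$value" in
        u|g|p|ugp|none) ;;
        *) return 1 ;;
    esac

    echo "lctl conf_param ${fsname}.quota.${target}=${value}"
}

# Print the commands used in the examples above.
quota_conf_cmd testfs1 ost ugp
quota_conf_cmd testfs3 mdt none
```

+      <para>The function returns a non-zero status and prints nothing when an
+      argument is not recognized, so a typo in the target or value is caught
+      before any command reaches the MGS.</para>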
+      <section>
+           <title>
+           <indexterm>
+             <primary>Quotas</primary>
+             <secondary>verifying</secondary>
+           </indexterm>Quota Verification</title>
+         <para>Once the quota parameters have been configured, all targets
+      which are part of the file system will be automatically notified of the
+      new quota settings and enable/disable quota enforcement as needed. The
+      per-target enforcement status can still be verified by running the
+      following <emphasis>command on the MDS(s)</emphasis>:</para>
+      <screen>
+$ lctl get_param osd-*.*.quota_slave.info
+osd-zfs.testfs-MDT0000.quota_slave.info=
+target name:    testfs-MDT0000
+pool ID:        0
+type:           md
+quota enabled:  ug
+conn to master: setup
+user uptodate:  glb[1],slv[1],reint[0]
+group uptodate: glb[1],slv[1],reint[0]
+</screen>
+      </section>
+    </section>
+  </section>
+  <section xml:id="quota_administration">
+    <title>
+    <indexterm>
+      <primary>Quotas</primary>
+      <secondary>creating</secondary>
+    </indexterm>Quota Administration</title>
+       <para>Once the file system is up and running, quota limits on blocks
+    and inodes can be set for user, group, and project. This is <emphasis>
+    controlled entirely from a client</emphasis> via three quota
+    parameters:</para>
+    <para>
+    <emphasis role="bold">Grace period</emphasis> -- The period of time (in
+    seconds) within which users are allowed to exceed their soft limit. There
+    are six types of grace periods:</para>
+    <itemizedlist>
+      <listitem>
+        <para>user block soft limit</para>
+      </listitem>
+      <listitem>
+        <para>user inode soft limit</para>
+      </listitem>
+      <listitem>
+        <para>group block soft limit</para>
+      </listitem>
+      <listitem>
+        <para>group inode soft limit</para>
+      </listitem>
+      <listitem>
+        <para>project block soft limit</para>
+      </listitem>
+      <listitem>
+        <para>project inode soft limit</para>
+      </listitem>
+    </itemizedlist>
+    <para>Each grace period applies to all IDs of its type. For example, the
+    user block soft limit grace period applies to all users who are using a
+    block quota.</para>
+    <para>
+    <emphasis role="bold">Soft limit</emphasis> -- The grace timer is started
+    once the soft limit is exceeded. At this point, the user/group/project
+    can still allocate blocks/inodes. If the user/group/project is still above
+    the soft limit when the grace time expires, the soft limit becomes a hard
+    limit and the user/group/project can no longer allocate any new
+    blocks/inodes. The user/group/project should then delete files to get back
+    under the soft limit. The soft limit MUST be smaller than the hard limit.
+    If the soft limit is not needed, it should be set to zero (0).</para>
+    <para>
+    <emphasis role="bold">Hard limit</emphasis> -- Block or inode allocation
+    will fail with
+    <literal>EDQUOT</literal> (i.e. quota exceeded) when the hard limit is
+    reached. The hard limit is the absolute limit. When a grace period is set,
+    one can exceed the soft limit within the grace period if under the hard
+    limit.</para>
+    <para>Due to the distributed nature of a Lustre file system and the need to
+    maintain performance under load, those quota parameters may not be 100%
+    accurate. The quota settings can be manipulated via the
+    <literal>lfs</literal> command, executed on a client, which includes
+    several options to work with quotas:</para>
+    <itemizedlist>
+      <listitem>
+        <para>
+        <varname>quota</varname> -- displays general quota information (disk
+        usage and limits)</para>
+      </listitem>
+      <listitem>
+        <para>
+        <varname>setquota</varname> -- specifies quota limits and tunes the
+        grace period. By default, the grace period is one week.</para>
+      </listitem>
+    </itemizedlist>
+    <para>Usage:</para>
+    <screen>
+lfs quota [-q] [-v] [-h] [-o obd_uuid] [-u|-g|-p <replaceable>uname|uid|gname|gid|projid</replaceable>] <replaceable>/mount_point</replaceable>
+lfs quota -t {-u|-g|-p} <replaceable>/mount_point</replaceable>
+lfs setquota {-u|--user|-g|--group|-p|--project} <replaceable>username|groupname|projid</replaceable> [-b <replaceable>block_softlimit</replaceable>] \
+             [-B <replaceable>block_hardlimit</replaceable>] [-i <replaceable>inode_softlimit</replaceable>] \
+             [-I <replaceable>inode_hardlimit</replaceable>] <replaceable>/mount_point</replaceable>
+</screen>
+    <para>To display general quota information (disk usage and limits) for the
+    user running the command and their primary group, run:</para>
+    <screen>
+$ lfs quota /mnt/testfs
+</screen>
+    <para>To display general quota information for a specific user ("
+    <literal>bob</literal>" in this example), run:</para>
+    <screen>
+$ lfs quota -u bob /mnt/testfs
+</screen>
+    <para>To display general quota information for a specific user ("
+    <literal>bob</literal>" in this example) and detailed quota statistics for
+    each MDT and OST, run:</para>
+    <screen>
+$ lfs quota -u bob -v /mnt/testfs
+</screen>
+    <para>To display general quota information for a specific project ("
+    <literal>1</literal>" in this example), run:</para>
+    <screen>
+$ lfs quota -p 1 /mnt/testfs
+</screen>
+    <para>To display general quota information for a specific group ("
+    <literal>eng</literal>" in this example), run:</para>
+    <screen>
+$ lfs quota -g eng /mnt/testfs
+</screen>
+    <para>To limit quota usage for a specific project ID on a specific
+    directory ("<literal>/mnt/testfs/dir</literal>" in this example), run:</para>
+    <screen>
+$ chattr +P /mnt/testfs/dir
+$ chattr -p 1 /mnt/testfs/dir
+$ lfs setquota -p 1 -b 307200 -B 309200 -i 10000 -I 11000 /mnt/testfs
+</screen>
+    <para>Please note that if it is desired to have
+    <literal>lfs quota -p</literal> show the space/inode usage under the
+    directory properly (much faster than <literal>du</literal>), then the
+    user/admin needs to use different project IDs for different directories.
+    </para>
+    <para>To display block and inode grace times for user quotas, run:</para>
+    <screen>
+$ lfs quota -t -u /mnt/testfs
+</screen>
+    <para>To set user or group quotas for a specific ID ("bob" in this
+    example), run:</para>
+    <screen>
+$ lfs setquota -u bob -b 307200 -B 309200 -i 10000 -I 11000 /mnt/testfs
+</screen>
+    <para>In this example, the block soft limit for user "bob" is set to
+    300 MB (307200*1024 bytes) and the block hard limit to 309200 KB. The inode
+    soft limit is 10,000 files and the inode hard limit is 11,000 files.</para>
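+    <para>The block limits in the command above are expressed in kilobytes, so
+    a limit chosen in megabytes has to be multiplied by 1024 first. The
+    following is a small sketch of that arithmetic; the helper name
+    <literal>mb_to_kb</literal> is hypothetical, and the script only prints
+    the resulting command instead of running it.</para>

```shell
#!/bin/bash
# Hypothetical helper: convert a size in MB to the KB units used by the
# block limits in the setquota example.
mb_to_kb() {
    echo $(( $1 * 1024 ))
}

soft_kb=$(mb_to_kb 300)   # 300 MB soft limit
hard_kb=309200            # hard limit from the example, already in KB

# Print the command for review; it is not executed here.
echo "lfs setquota -u bob -b ${soft_kb} -B ${hard_kb} -i 10000 -I 11000 /mnt/testfs"
```

+    <para>With these values the script prints the same command line as the
+    example above, with a block soft limit of 307200.</para>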
+    <para>The quota command displays the quota allocated and consumed by each
+    Lustre target. Using the previous
+    <literal>setquota</literal> example, running this
+    <literal>lfs quota</literal> command:</para>
+    <screen>
+$ lfs quota -u bob -v /mnt/testfs
+</screen>
+    <para>displays this command output:</para>
+    <screen>
+Disk quotas for user bob (uid 6000):
+Filesystem          kbytes quota limit grace files quota limit grace
+/mnt/testfs         0      30720 30920 -     0     10000 11000 -
+testfs-MDT0000_UUID 0      -      8192 -     0     -     2560  -
+testfs-OST0000_UUID 0      -      8192 -     0     -     0     -
+testfs-OST0001_UUID 0      -      8192 -     0     -     0     -
+Total allocated inode limit: 2560, total allocated block limit: 24576
+</screen>
+    <para>Global quota limits are stored in dedicated index files (there is one
+    such index per quota type) on the quota master target (aka QMT). The QMT
+    runs on MDT0000 and exports the global indexes via /proc. The global
+    indexes can thus be dumped via the following command:
+    <screen>
+# lctl get_param qmt.testfs-QMT0000.*.glb-*
+</screen>The format of global indexes depends on the OSD type. The ldiskfs OSD
+uses IAM files while the ZFS OSD creates dedicated ZAPs.</para>
+    <para>Each slave also stores a copy of this global index locally. When the
+    global index is modified on the master, a glimpse callback is issued on the
+    global quota lock to notify all slaves that the global index has been
+    modified. This glimpse callback includes information about the identifier
+    subject to the change. If the global index on the QMT is modified while a
+    slave is disconnected, the index version is used to determine whether the
+    slave copy of the global index is no longer up to date. If so, the slave
+    fetches the whole index again and updates the local copy. The slave copy of
+    the global index is also exported via /proc and can be accessed via the
+    following command:
+    <screen>
+lctl get_param osd-*.*.quota_slave.limit*
+</screen></para>
+    <note>
+      <para>Prior to 2.4, global quota limits used to be stored in
+      administrative quota files using the on-disk format of the Linux quota
+      file. When upgrading MDT0000 to 2.4, those administrative quota files are
+      converted into IAM indexes automatically, conserving existing quota
+      limits previously set by the administrator.</para>
+    </note>
+  </section>
+  <section xml:id="quota_allocation">
+    <title>
+    <indexterm>
+      <primary>Quotas</primary>
+      <secondary>allocating</secondary>
+    </indexterm>Quota Allocation</title>
+    <para>In a Lustre file system, quota must be properly allocated or users
+    may experience unnecessary failures. The file system block quota is divided
+    up among the OSTs within the file system. Each OST requests an allocation
+    which is increased up to the quota limit. The quota allocation is then
+    <emphasis role="italic">quantized</emphasis> to reduce the amount of
+    quota-related request traffic.</para>
+    <para>The Lustre quota system distributes quotas from the Quota Master
+    Target (aka QMT). Only one QMT instance is supported for now and only runs
+    on the same node as MDT0000. All OSTs and MDTs set up a Quota Slave Device
+    (aka QSD) which connects to the QMT to allocate/release quota space. The
+    QSD is set up directly from the OSD layer.</para>
+    <para>To reduce quota requests, quota space is initially allocated to QSDs
+    in very large chunks. How much unused quota space can be held by a target
+    is controlled by the qunit size. When quota space for a given ID is close
+    to exhaustion on the QMT, the qunit size is reduced and QSDs are notified
+    of the new qunit size value via a glimpse callback. Slaves are then
+    responsible for releasing quota space above the new qunit value. The qunit
+    size isn't shrunk indefinitely and there is a minimal value of 1MB for
+    blocks and 1,024 for inodes. This means that the quota space rebalancing
+    process will stop when this minimum value is reached. As a result, quota
+    exceeded can be returned while many slaves still have 1MB or 1,024 inodes
+    of spare quota space.</para>
+    <para>If we look at the
+    <literal>setquota</literal> example again, running this
+    <literal>lfs quota</literal> command:</para>
+    <screen>
+# lfs quota -u bob -v /mnt/testfs
+</screen>
+    <para>displays this command output:</para>
+    <screen>
+Disk quotas for user bob (uid 500):
+Filesystem          kbytes quota limit grace       files  quota limit grace
+/mnt/testfs         30720* 30720 30920 6d23h56m44s 10101* 10000 11000 6d23h59m50s
+testfs-MDT0000_UUID 0      -     0     -           10101  -     10240
+testfs-OST0000_UUID 0      -     1024  -           -      -     -
+testfs-OST0001_UUID 30720* -     29896 -           -      -     -
+Total allocated inode limit: 10240, total allocated block limit: 30920
+</screen>
+    <para>The total quota limit of 30,920 is allocated to user bob, which is
+    further distributed to two OSTs.</para>
+    <para>Values appended with '
+    <literal>*</literal>' show that the quota limit has been exceeded, causing
+    the following error when trying to write or create a file:</para>
+    <para>
+      <screen>
+cp: writing `/mnt/testfs/foo`: Disk quota exceeded.
+</screen>
+    </para>
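+    <para>The '<literal>*</literal>' marker can also be picked out
+    mechanically. The following is a minimal sketch that reads
+    <literal>lfs quota -v</literal> style output on standard input and prints
+    the first column of every row that contains a value ending in
+    '<literal>*</literal>'. The function name
+    <literal>over_quota_rows</literal> is hypothetical, and the sample input
+    is copied from the output above rather than produced by a live file
+    system.</para>

```shell
#!/bin/bash
# Hypothetical helper: print the first column of each input row that has
# at least one value ending in '*' (the over-quota marker).
over_quota_rows() {
    local name rest
    while read -r name rest; do
        # Pad with spaces so a trailing '*' on the last field also matches.
        case " $rest " in
            *'* '*) echo "$name" ;;
        esac
    done
}

# Sample rows taken from the example output in this section.
printf '%s\n' \
    "/mnt/testfs         30720* 30720 30920 6d23h56m44s 10101* 10000 11000 6d23h59m50s" \
    "testfs-MDT0000_UUID 0      -     0     -           10101  -     10240" \
    "testfs-OST0000_UUID 0      -     1024  -           -      -     -" \
    "testfs-OST0001_UUID 30720* -     29896 -           -      -     -" \
    | over_quota_rows
```

+    <para>For the sample rows this prints <literal>/mnt/testfs</literal> and
+    <literal>testfs-OST0001_UUID</literal>, the two rows carrying the
+    marker.</para>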
+    <note>
+      <para>It is very important to note that the block quota is consumed per
+      OST and the inode quota per MDT. Therefore, when the quota is consumed on
+      one OST (resp. MDT), the client may not be able to create files
+      regardless of the quota available on other OSTs (resp. MDTs).</para>
+      <para>Setting the quota limit below the minimal qunit size may prevent
+      the user/group from all file creation. It is thus recommended to use
+      soft/hard limits which are a multiple of the number of OSTs * the minimal
+      qunit size.</para>
+    </note>
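+    <para>The recommendation in the note above can be turned into a small
+    calculation. The following sketch rounds a block limit up to the next
+    multiple of the number of OSTs times the minimal block qunit size of 1MB
+    (1024 KB) stated earlier in this section. The function name
+    <literal>round_block_limit_kb</literal> is hypothetical, and rounding up
+    (rather than down) is an assumption made for this illustration.</para>

```shell
#!/bin/bash
# Hypothetical helper: round a block limit (in KB) up to a multiple of
# (number of OSTs * minimal block qunit size).
round_block_limit_kb() {
    local limit_kb="$1" num_osts="$2"
    local min_qunit_kb=1024                      # 1MB minimal block qunit
    local step=$(( num_osts * min_qunit_kb ))
    # Integer ceiling division, then scale back up to KB.
    echo $(( (limit_kb + step - 1) / step * step ))
}

# Hard limit from the setquota example, on a file system with two OSTs.
round_block_limit_kb 309200 2
```

+    <para>For 309200 KB and two OSTs the step is 2048 KB, so the script prints
+    309248. A limit that is already a multiple of the step, such as 307200,
+    is returned unchanged.</para>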
+    <para>To determine the total number of inodes, use
+    <literal>lfs df -i</literal> (and also
+    <literal>lctl get_param *.*.filestotal</literal>). For more information on
+    using the
+    <literal>lfs df -i</literal> command and the command output, see
+    <xref linkend="dbdoclet.50438209_35838" />.</para>
+    <para>Unfortunately, the
+    <literal>statfs</literal> interface does not report the free inode count
+    directly, but instead reports the total inode and used inode counts. The
+    free inode count is calculated for
+    <literal>df</literal> from (total inodes - used inodes). It is not critical
+    to know the total inode count for a file system. Instead, you should
+    accurately know the free inode count and the used inode count for a file
+    system. The Lustre software manipulates the total inode count in order to
+    accurately report the other two values.</para>
+  </section>
+  <section xml:id="quota_interoperability">
+    <title>
+    <indexterm>
+      <primary>Quotas</primary>
+      <secondary>Interoperability</secondary>
+    </indexterm>Quotas and Version Interoperability</title>
+    <para>The new quota protocol introduced in Lustre software release 2.4.0
+    <emphasis role="bold">is not compatible</emphasis> with previous
+    versions. As a consequence,
+    <emphasis role="bold">all Lustre servers must be upgraded to release 2.4.0
+    for quota to be functional</emphasis>. Quota limits set on the Lustre file
+    system prior to the upgrade will be automatically migrated to the new quota
+    index format. As for accounting information with ldiskfs backend, it will
+    be regenerated by running
+    <literal>tunefs.lustre --quota</literal> against all targets. It is worth
+    noting that running
+    <literal>tunefs.lustre --quota</literal> is
+    <emphasis role="bold">mandatory</emphasis> for all targets formatted with a
+    Lustre software release older than release 2.4.0, otherwise quota
+    enforcement as well as accounting won't be functional.</para>
+    <para>Besides, the quota protocol in release 2.4 takes for granted that the
+    Lustre client supports the
+    <literal>OBD_CONNECT_EINPROGRESS</literal> connect flag. Clients supporting
+    this flag will retry indefinitely when the server returns
+    <literal>EINPROGRESS</literal> in a reply. Here is the list of Lustre client
+    versions that are compatible with release 2.4:</para>
+    <itemizedlist>
+      <listitem>
+        <para>Release 2.3-based clients and later</para>
+      </listitem>
+      <listitem>
+        <para>Release 1.8 clients newer or equal to release 1.8.9-wc1</para>
+      </listitem>
+      <listitem>
+        <para>Release 2.1 clients newer or equal to release 2.1.4</para>
+      </listitem>
+    </itemizedlist>
+    <para condition="l2A">To use the project quota functionality introduced in
+    Lustre 2.10, <emphasis role="bold">all Lustre servers and clients must be
+    upgraded to Lustre release 2.10 or later for project quota to work
+    correctly</emphasis>.  Otherwise, project quota will be inaccessible on
+    clients and not be accounted for on OSTs.</para>
+  </section>
+  <section xml:id="granted_cache_and_quota_limits">
+    <title>
+    <indexterm>
+      <primary>Quotas</primary>
+      <secondary>known issues</secondary>
+    </indexterm>Granted Cache and Quota Limits</title>
+    <para>In a Lustre file system, granted cache does not respect quota limits.
+    In this situation, OSTs grant cache to a Lustre client to accelerate I/O.
+    Granting cache causes writes to succeed on OSTs even if they exceed the
+    quota limits, effectively overriding them.</para>
+    <para>The sequence is:</para>
+    <orderedlist>
+      <listitem>
+        <para>A user writes files to the Lustre file system.</para>
+      </listitem>
+      <listitem>
+        <para>If the Lustre client has enough granted cache, then it returns
+        'success' to users and arranges the writes to the OSTs.</para>
+      </listitem>
+      <listitem>
+        <para>Because Lustre clients have delivered success to users, the OSTs
+        cannot fail these writes.</para>
+      </listitem>
+    </orderedlist>
+    <para>Because of granted cache, writes always override quota limitations.
+    For example, if you set a 400 GB quota on user A and use IOR to write for
+    user A from a bundle of clients, you will write much more data than 400 GB,
+    and cause an out-of-quota error (
+    <literal>EDQUOT</literal>).</para>
+    <note>
+      <para>The effect of granted cache on quota limits can be mitigated, but
+      not eliminated. To do so, reduce the maximum amount of dirty data on the
+      clients (the minimum value is 1 MB):</para>
        <itemizedlist>
          <listitem>
-          <para>&quot;1&quot; - v1 (32-bit) administrative quota file, v1 (32-bit) operational quota file (default in releases <emphasis role="underline">before</emphasis> Lustre 1.6.5)</para>
-        </listitem>
-        <listitem>
-          <para>&quot;2&quot; - v2 (64-bit) administrative quota file, v1 (32-bit) operational quota file (default in Lustre 1.6.5)</para>
-        </listitem>
-        <listitem>
-          <para>&quot;3&quot; - v2 (64-bit) administrative quota file, v2 (64-bit) operational quota file (default in releases <emphasis role="underline">after</emphasis> Lustre 1.6.5)</para>
+          <para>
+            <literal>lctl set_param osc.*.max_dirty_mb=8</literal>
+          </para>
          </listitem>
        </itemizedlist>
-      <para>If quotas do not exist or look broken, then <literal>quotacheck</literal> creates quota files of a required name and format.</para>
-      <para>If Lustre is using the v2 quota file format when only v1 quota files exist, then <literal>quotacheck</literal> converts old v1 quota files to new v2 quota files. This conversion is triggered automatically, and is transparent to users. If an old quota file does not exist or looks broken, then the new v2 quota file will be empty. In case of an error, details can be found in the kernel log of the corresponding MDS/OST. During conversion of a v1 quota file to a v2 quota file, the v2 quota file is marked as broken, to avoid it being used if a crash occurs. The quota module does not use broken quota files (keeping quota off).</para>
-      <para>In most situations, Lustre administrators do not need to set specific versioning options. Upgrading Lustre without using <literal>quota_type</literal> to force specific quota file versions results in quota files being upgraded automatically to the latest version. The option ensures backward compatibility, preventing a quota file upgrade to a version which is not supported by earlier Lustre versions.</para>
-    </section>
+    </note>
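+    <para>As a rough illustration of why lowering
+    <literal>max_dirty_mb</literal> helps, the sketch below estimates how much
+    dirty data could be cached across all clients. It assumes that the limit
+    applies to each OSC device separately and that every client has one OSC
+    per OST. The function name, the client and OSC counts, and the earlier
+    setting of 32 are hypothetical, and the real overrun also depends on how
+    much grant the OSTs have handed out.</para>

```python
def worst_case_dirty_mb(num_clients, oscs_per_client, max_dirty_mb):
    """Rough upper bound, in MB, on dirty data cached across all clients.

    Assumption (not from the manual): max_dirty_mb limits each OSC device
    separately, so the bound is the product of the three inputs.
    """
    return num_clients * oscs_per_client * max_dirty_mb


# Hypothetical cluster: 100 clients, each with 8 OSCs (one per OST).
before = worst_case_dirty_mb(100, 8, 32)  # hypothetical earlier setting
after = worst_case_dirty_mb(100, 8, 8)    # value used in the note above

print(f"before: {before} MB, after: {after} MB")
```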
    </section>
-  <section xml:id="dbdoclet.50438217_20772">
-    <title><indexterm><primary>Quotas</primary><secondary>statistics</secondary></indexterm>Lustre Quota Statistics</title>
-    <para>Lustre includes statistics that monitor quota activity, such as the kinds of quota RPCs sent during a specific period, the average time to complete the RPCs, etc. These statistics are useful to measure performance of a Lustre file system.</para>
-    <para>Each quota statistic consists of a quota event and <literal>min_time</literal>, <literal>max_time</literal> and <literal>sum_time</literal> values for the event.</para>
+  <section xml:id="lustre_quota_statistics">
+    <title>
+    <indexterm>
+      <primary>Quotas</primary>
+      <secondary>statistics</secondary>
+    </indexterm>Lustre Quota Statistics</title>
+    <para>The Lustre software includes statistics that monitor quota activity,
+    such as the kinds of quota RPCs sent during a specific period, the average
+    time to complete the RPCs, etc. These statistics are useful to measure
+    performance of a Lustre file system.</para>
+    <para>Each quota statistic consists of a quota event and
+    <literal>min_time</literal>,
+    <literal>max_time</literal> and
+    <literal>sum_time</literal> values for the event.</para>
      <informaltable frame="all">
        <tgroup cols="2">
-        <colspec colname="c1" colwidth="50*"/>
-        <colspec colname="c2" colwidth="50*"/>
+        <colspec colname="c1" colwidth="50*" />
+        <colspec colname="c2" colwidth="50*" />
          <thead>
            <row>
              <entry>
-              <para><emphasis role="bold">Quota Event</emphasis></para>
+              <para>
+                <emphasis role="bold">Quota Event</emphasis>
+              </para>
              </entry>
              <entry>
-              <para><emphasis role="bold">Description</emphasis></para>
+              <para>
+                <emphasis role="bold">Description</emphasis>
+              </para>
              </entry>
            </row>
          </thead>
          <tbody>
            <row>
              <entry>
-              <para> <emphasis role="bold">sync_acq_req</emphasis></para>
+              <para>
+                <emphasis role="bold">sync_acq_req</emphasis>
+              </para>
              </entry>
              <entry>
-              <para> Quota slaves send a acquiring_quota request and wait for its return.</para>
+              <para>Quota slaves send an acquiring_quota request and wait for
+              its return.</para>
              </entry>
            </row>
            <row>
              <entry>
-              <para> <emphasis role="bold">sync_rel_req</emphasis></para>
+              <para>
+                <emphasis role="bold">sync_rel_req</emphasis>
+              </para>
              </entry>
              <entry>
-              <para> Quota slaves send a releasing_quota request and wait for its return.</para>
+              <para>Quota slaves send a releasing_quota request and wait for
+              its return.</para>
              </entry>
            </row>
            <row>
              <entry>
-              <para> <emphasis role="bold">async_acq_req</emphasis></para>
+              <para>
+                <emphasis role="bold">async_acq_req</emphasis>
+              </para>
              </entry>
              <entry>
-              <para> Quota slaves send an acquiring_quota request and do not wait for its return.</para>
+              <para>Quota slaves send an acquiring_quota request and do not
+              wait for its return.</para>
              </entry>
            </row>
            <row>
              <entry>
-              <para> <emphasis role="bold">async_rel_req</emphasis></para>
+              <para>
+                <emphasis role="bold">async_rel_req</emphasis>
+              </para>
              </entry>
              <entry>
-              <para> Quota slaves send a releasing_quota request and do not wait for its return.</para>
+              <para>Quota slaves send a releasing_quota request and do not wait
+              for its return.</para>
              </entry>
            </row>
            <row>
              <entry>
-              <para> <emphasis role="bold">wait_for_blk_quota (lquota_chkquota)</emphasis></para>
+              <para>
+                <emphasis role="bold">wait_for_blk_quota
+                (lquota_chkquota)</emphasis>
+              </para>
              </entry>
              <entry>
-              <para> Before data is written to OSTs, the OSTs check if the remaining block quota is sufficient. This is done in the lquota_chkquota function.</para>
+              <para>Before data is written to OSTs, the OSTs check if the
+              remaining block quota is sufficient. This is done in the
+              lquota_chkquota function.</para>
              </entry>
            </row>
            <row>
              <entry>
-              <para> <emphasis role="bold">wait_for_ino_quota (lquota_chkquota)</emphasis></para>
+              <para>
+                <emphasis role="bold">wait_for_ino_quota
+                (lquota_chkquota)</emphasis>
+              </para>
              </entry>
              <entry>
-              <para> Before files are created on the MDS, the MDS checks if the remaining inode quota is sufficient. This is done in the lquota_chkquota function.</para>
+              <para>Before files are created on the MDS, the MDS checks if the
+              remaining inode quota is sufficient. This is done in the
+              lquota_chkquota function.</para>
              </entry>
            </row>
            <row>
              <entry>
-              <para> <emphasis role="bold">wait_for_blk_quota (lquota_pending_commit)</emphasis></para>
+              <para>
+                <emphasis role="bold">wait_for_blk_quota
+                (lquota_pending_commit)</emphasis>
+              </para>
              </entry>
              <entry>
-              <para> After blocks are written to OSTs, relative quota information is updated. This is done in the lquota_pending_commit function.</para>
+              <para>After blocks are written to OSTs, the related quota
+              information is updated. This is done in the lquota_pending_commit
+              function.</para>
              </entry>
            </row>
            <row>
              <entry>
-              <para> <emphasis role="bold">wait_for_ino_quota (lquota_pending_commit)</emphasis></para>
+              <para>
+                <emphasis role="bold">wait_for_ino_quota
+                (lquota_pending_commit)</emphasis>
+              </para>
              </entry>
              <entry>
-              <para> After files are created, relative quota information is updated. This is done in the lquota_pending_commit function.</para>
+              <para>After files are created, the related quota information is
+              updated. This is done in the lquota_pending_commit
+              function.</para>
              </entry>
            </row>
            <row>
              <entry>
-              <para> <emphasis role="bold">wait_for_pending_blk_quota_req (qctxt_wait_pending_dqacq)</emphasis></para>
+              <para>
+                <emphasis role="bold">wait_for_pending_blk_quota_req
+                (qctxt_wait_pending_dqacq)</emphasis>
+              </para>
              </entry>
              <entry>
-              <para> On the MDS or OSTs, there is one thread sending a quota request for a specific UID/GID for block quota at any time. At that time, if other threads need to do this too, they should wait. This is done in the qctxt_wait_pending_dqacq function.</para>
+              <para>On the MDS or OSTs, only one thread at a time sends a
+              block quota request for a specific UID/GID. Other threads that
+              need to send such a request in the meantime must wait. This is
+              done in the qctxt_wait_pending_dqacq function.</para>
              </entry>
            </row>
            <row>
              <entry>
-              <para> <emphasis role="bold">wait_for_pending_ino_quota_req (qctxt_wait_pending_dqacq)</emphasis></para>
+              <para>
+                <emphasis role="bold">wait_for_pending_ino_quota_req
+                (qctxt_wait_pending_dqacq)</emphasis>
+              </para>
              </entry>
              <entry>
-              <para> On the MDS, there is one thread sending a quota request for a specific UID/GID for inode quota at any time. If other threads need to do this too, they should wait. This is done in the qctxt_wait_pending_dqacq function.</para>
+              <para>On the MDS, only one thread at a time sends an inode quota
+              request for a specific UID/GID. Other threads that need to send
+              such a request in the meantime must wait. This is done in the
+              qctxt_wait_pending_dqacq function.</para>
              </entry>
            </row>
            <row>
              <entry>
-              <para> <emphasis role="bold">nowait_for_pending_blk_quota_req (qctxt_wait_pending_dqacq)</emphasis></para>
+              <para>
+                <emphasis role="bold">nowait_for_pending_blk_quota_req
+                (qctxt_wait_pending_dqacq)</emphasis>
+              </para>
              </entry>
              <entry>
-              <para> On the MDS or OSTs, there is one thread sending a quota request for a specific UID/GID for block quota at any time. When threads enter qctxt_wait_pending_dqacq, they do not need to wait. This is done in the qctxt_wait_pending_dqacq function.</para>
+              <para>On the MDS or OSTs, only one thread at a time sends a
+              block quota request for a specific UID/GID. When threads enter
+              qctxt_wait_pending_dqacq, they do not need to wait.
+              This is done in the qctxt_wait_pending_dqacq function.</para>
              </entry>
            </row>
            <row>
              <entry>
-              <para> <emphasis role="bold">nowait_for_pending_ino_quota_req (qctxt_wait_pending_dqacq)</emphasis></para>
+              <para>
+                <emphasis role="bold">nowait_for_pending_ino_quota_req
+                (qctxt_wait_pending_dqacq)</emphasis>
+              </para>
              </entry>
              <entry>
-              <para> On the MDS, there is one thread sending a quota request for a specific UID/GID for inode quota at any time. When threads enter qctxt_wait_pending_dqacq, they do not need to wait. This is done in the qctxt_wait_pending_dqacq function.</para>
+              <para>On the MDS, only one thread at a time sends an inode quota
+              request for a specific UID/GID. When threads enter
+              qctxt_wait_pending_dqacq, they do not need to wait. This is
+              done in the qctxt_wait_pending_dqacq function.</para>
              </entry>
            </row>
            <row>
              <entry>
-              <para> <emphasis role="bold">quota_ctl</emphasis></para>
+              <para>
+                <emphasis role="bold">quota_ctl</emphasis>
+              </para>
              </entry>
              <entry>
-              <para> The quota_ctl statistic is generated when lfs <literal>setquota</literal>, <literal>lfs quota</literal> and so on, are issued.</para>
+              <para>The quota_ctl statistic is generated when
+              <literal>lfs setquota</literal>,
+              <literal>lfs quota</literal> and similar commands are
+              issued.</para>
              </entry>
            </row>
            <row>
              <entry>
-              <para> <emphasis role="bold">adjust_qunit</emphasis></para>
+              <para>
+                <emphasis role="bold">adjust_qunit</emphasis>
+              </para>
              </entry>
              <entry>
-              <para> Each time qunit is adjusted, it is counted.</para>
+              <para>Counted each time qunit is adjusted.</para>
              </entry>
            </row>
          </tbody>
@@ -579,21 +891,40 @@ lustre-OST0001_UUID        30720*          -               28872           \
      </informaltable>
      <section remap="h3">
        <title>Interpreting Quota Statistics</title>
-      <para>Quota statistics are an important measure of a Lustre file system&apos;s performance. Interpreting these statistics correctly can help you diagnose problems with quotas, and may indicate adjustments to improve system performance.</para>
+      <para>Quota statistics are an important measure of the performance of a
+      Lustre file system. Interpreting these statistics correctly can help you
+      diagnose problems with quotas, and may indicate adjustments to improve
+      system performance.</para>
        <para>For example, if you run this command on the OSTs:</para>
-      <screen>cat /proc/fs/lustre/lquota/lustre-OST0000/stats</screen>
+      <screen>
+lctl get_param lquota.testfs-OST0000.stats
+</screen>
        <para>You will get a result similar to this:</para>
-      <screen>snapshot_time                                1219908615.506895 secs.usecs
+      <screen>
+snapshot_time                                1219908615.506895 secs.usecs
  async_acq_req                              1 samples [us]  32 32 32
  async_rel_req                              1 samples [us]  5 5 5
  nowait_for_pending_blk_quota_req(qctxt_wait_pending_dqacq) 1 samples [us] 2\
   2 2
  quota_ctl                          4 samples [us]  80 3470 4293
  adjust_qunit                               1 samples [us]  70 70 70
-....</screen>
-      <para>In the first line, <literal>snapshot_time</literal> indicates when the statistics were taken. The remaining lines list the quota events and their associated data.</para>
-      <para>In the second line, the <literal>async_acq_req</literal> event occurs one time. The <literal>min_time</literal>, <literal>max_time</literal> and <literal>sum_time</literal> statistics for this event are 32, 32 and 32, respectively. The unit is microseconds (μs).</para>
-      <para>In the fifth line, the quota_ctl event occurs four times. The <literal>min_time</literal>, <literal>max_time</literal> and <literal>sum_time</literal> statistics for this event are 80, 3470 and 4293, respectively. The unit is microseconds (μs).</para>
+....
+</screen>
+      <para>In the first line,
+      <literal>snapshot_time</literal> indicates when the statistics were taken.
+      The remaining lines list the quota events and their associated
+      data.</para>
+      <para>In the second line, the
+      <literal>async_acq_req</literal> event occurs one time. The
+      <literal>min_time</literal>,
+      <literal>max_time</literal> and
+      <literal>sum_time</literal> statistics for this event are 32, 32 and 32,
+      respectively. The unit is microseconds (μs).</para>
+      <para>In the fifth line, the
+      <literal>quota_ctl</literal> event occurs four times. The
+      <literal>min_time</literal>,
+      <literal>max_time</literal> and
+      <literal>sum_time</literal> statistics for this event are 80, 3470 and
+      4293, respectively. The unit is microseconds (μs).</para>
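+      <para>The field layout described above can be read programmatically.
+      The following is a minimal sketch, not part of the Lustre tools, that
+      splits one event line into its sample count and the three time values.
+      The function name is hypothetical, and the average it derives
+      (<literal>sum_time</literal> divided by the sample count) is an
+      assumption about how the fields relate.</para>

```python
def parse_stat_line(line):
    """Parse one quota event line from the stats output.

    Expected layout (taken from the sample output above):
        <event> <count> samples [us] <min_time> <max_time> <sum_time>
    """
    fields = line.split()
    idx = fields.index("samples")          # the count sits just before this
    event = " ".join(fields[:idx - 1])     # event name, before the count
    count = int(fields[idx - 1])
    min_t, max_t, sum_t = (int(v) for v in fields[idx + 2:idx + 5])
    return {
        "event": event,
        "count": count,
        "min_time": min_t,
        "max_time": max_t,
        "sum_time": sum_t,
        # Derived value: mean time per sample, in microseconds.
        "avg_time": sum_t / count,
    }


stat = parse_stat_line("quota_ctl    4 samples [us]  80 3470 4293")
print(stat["event"], stat["count"], stat["avg_time"])
```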
      </section>
    </section>
  </chapter>