ConfiguringQuotas.xml

   1 <?xml version='1.0' encoding='utf-8'?>
   2 <chapter xmlns="http://docbook.org/ns/docbook"
   3 xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US"
   4 xml:id="configuringquotas">
   5   <title xml:id="configuringquotas.title">Configuring and Managing
   6   Quotas</title>
   7   <para>This chapter describes how to configure quotas and includes the
   8   following sections:</para>
   9   <itemizedlist>
  10     <listitem>
  11       <para>
  12         <xref linkend="quota_configuring" />
  13       </para>
  14     </listitem>
  15     <listitem>
  16       <para>
  17         <xref linkend="enabling_disk_quotas" />
  18       </para>
  19     </listitem>
  20     <listitem>
  21       <para>
  22         <xref linkend="quota_administration" />
  23       </para>
  24     </listitem>
  25     <listitem>
  26       <para>
  27         <xref linkend="quota_allocation" />
  28       </para>
  29     </listitem>
  30     <listitem>
  31       <para>
  32         <xref linkend="quota_interoperability" />
  33       </para>
  34     </listitem>
  35     <listitem>
  36       <para>
  37         <xref linkend="granted_cache_and_quota_limits" />
  38       </para>
  39     </listitem>
  40     <listitem>
  41       <para>
  42         <xref linkend="lustre_quota_statistics" />
  43       </para>
  44     </listitem>
  45   </itemizedlist>
  46   <section xml:id="quota_configuring">
  47     <title>
  48     <indexterm>
  49       <primary>Quotas</primary>
  50       <secondary>configuring</secondary>
  51     </indexterm>Working with Quotas</title>
  52     <para>Quotas allow a system administrator to limit the amount of disk space
  53     a user or group can use. Quotas are set by root, and can be specified for
  54     individual users and/or groups. Before a file is written to a partition
  55     where quotas are set, the quota of the creator's group is checked. If a
  56     quota exists, then the file size counts towards the group's quota. If no
  57     quota exists, then the owner's user quota is checked before the file is
  58     written. Similarly, quotas can limit the number of inodes (files) that a
  59     user or group may consume.</para>
  60     <para>Lustre quota enforcement differs from standard Linux quota
  61     enforcement in several ways:</para>
  62     <itemizedlist>
  63       <listitem>
  64         <para>Quotas are administered via the
  65         <literal>lfs</literal> and
  66         <literal>lctl</literal> commands (post-mount).</para>
  67       </listitem>
  68       <listitem>
  69         <para>Quotas are distributed (as the Lustre file system is a
  70         distributed file system), which has several ramifications.</para>
  71       </listitem>
  72       <listitem>
  73         <para>Quotas are allocated and consumed in a quantized fashion.</para>
  74       </listitem>
  75       <listitem>
  76         <para>Clients do not set the
  77         <literal>usrquota</literal> or
  78         <literal>grpquota</literal> options at mount time. As of Lustre software
  79         release 2.4, space accounting is always enabled by default and quota
  80         enforcement can be enabled/disabled on a per-file system basis with
  81         <literal>lctl conf_param</literal>. It is worth noting that both
  82         <literal>lfs quotaon</literal> and
  83         <literal>quota_type</literal> are deprecated as of Lustre software
  84         release 2.4.0.</para>
  85       </listitem>
  86     </itemizedlist>
  87     <caution>
  88       <para>Although a quota feature is available in the Lustre software, root
  89       quotas are NOT enforced.</para>
  90       <para>
  91       <literal>lfs setquota -u root</literal> (limits are not enforced)</para>
  92       <para>
  93       <literal>lfs quota -u root</literal> (usage includes internal Lustre data
  94       that is dynamic in size and does not accurately reflect mount point
  95       visible block and inode usage).</para>
  96     </caution>
  97   </section>
  98   <section xml:id="enabling_disk_quotas">
  99     <title>
 100     <indexterm>
 101       <primary>Quotas</primary>
 102       <secondary>enabling disk</secondary>
 103     </indexterm>Enabling Disk Quotas</title>
 104     <para>Prior to Lustre software release 2.4.0, enabling quota involved a
 105     full file system scan via
 106     <literal>lfs quotacheck</literal>. All file systems formatted with Lustre
 107     software release 2.4.0 or newer no longer require quotacheck to be run
 108     since up-to-date accounting information is now always maintained by the
 109     OSD layer, regardless of the quota enforcement status.</para>
 110     <section remap="h3" condition="l24">
 111       <title>Enabling Disk Quotas (Lustre Software Release 2.4 and
 112       later)</title>
 113       <para>Although quota enforcement is managed by the Lustre software, each
 114       OSD implementation relies on the backend file system to maintain
 115       per-user/group block and inode usage:</para>
 116       <itemizedlist>
 117         <listitem>
 118           <para>For ldiskfs backends,
 119           <literal>mkfs.lustre</literal> now creates empty quota files and
 120           enables the QUOTA feature flag in the superblock which turns quota
 121           accounting on at mount time automatically. e2fsck was also modified
 122           to fix the quota files when the QUOTA feature flag is present.</para>
 123         </listitem>
 124         <listitem>
 125           <para>For ZFS backends, accounting ZAPs are created and maintained by
 126           the ZFS file system itself. While ZFS tracks per-user and group block
 127           usage, it does not handle inode accounting. The ZFS OSD implements
 128           its own support for inode tracking. Two options are available:</para>
 129           <orderedlist>
 130             <listitem>
 131               <para>The ZFS OSD can estimate the number of inodes in-use based
 132               on the number of blocks used by a given user or group. This mode
 133               can be enabled by running the following command on the server
 134               running the target:
 135               <literal>lctl set_param
 136               osd-zfs.${FSNAME}-${TARGETNAME}.quota_iused_estimate=1</literal>.</para>
 137             </listitem>
 138             <listitem>
 139               <para>Similarly to block accounting, dedicated ZAPs are also
 140               created by the ZFS OSD to maintain per-user and group inode usage.
 141               This is the default mode which corresponds to
 142               <literal>quota_iused_estimate</literal> set to 0.</para>
 143             </listitem>
 144           </orderedlist>
 145         </listitem>
 146       </itemizedlist>
 147       <para>As a result,
 148       <literal>lfs quotacheck</literal> is now deprecated and no longer
 149       required when running Lustre software release 2.4 on the servers.</para>
 150       <para>Lustre file systems formatted with a Lustre release prior to 2.4.0
 151       can still be safely upgraded to release 2.4.0, but will not have functional
 152       space usage reporting until
 153       <literal>tunefs.lustre --quota</literal> is run against all targets. This
 154       command sets the QUOTA feature flag in the superblock and runs e2fsck (as
 155       a result, the target must be offline) to build the per-UID/GID disk usage
 156       database.</para>
 157       <caution>
 158         <para>Lustre software release 2.4 and later requires a version of
 159         e2fsprogs that supports quota (i.e. 1.42.3.wc1 or newer) to be
 160         installed on server nodes using the ldiskfs backend (e2fsprogs is not
 161         needed with the ZFS backend). In general, we recommend using the latest
 162         e2fsprogs version available at
 163         <link xl:href="http://downloads.hpdd.intel.com/e2fsprogs/">
 164         http://downloads.hpdd.intel.com/public/e2fsprogs/</link>.</para>
 165         <para>The ldiskfs OSD relies on the standard Linux quota to maintain
 166         accounting information on disk. As a consequence, the Linux kernel
 167         running on the Lustre servers using ldiskfs backend must have
 168         <literal>CONFIG_QUOTA</literal>,
 169         <literal>CONFIG_QUOTACTL</literal> and
 170         <literal>CONFIG_QFMT_V2</literal> enabled.</para>
 171       </caution>
 172       <para>As of Lustre software release 2.4.0, quota enforcement is thus
 173       turned on/off independently of space accounting, which is always enabled.
 174       <literal>lfs quotaon</literal> and
 175       <literal>lfs quotaoff</literal>, as well as the per-target
 176       <literal>quota_type</literal> parameter are deprecated in favor of a
 177       single per-file system quota parameter controlling inode/block quota
 178       enforcement. Like all permanent parameters, this quota parameter can be
 179       set via
 180       <literal>lctl conf_param</literal> on the MGS via the following
 181       syntax:</para>
 182       <screen>
 183 lctl conf_param <replaceable>fsname</replaceable>.quota.<replaceable>ost|mdt</replaceable>=<replaceable>u|g|ug|none</replaceable>
 187 </screen>
 188       <itemizedlist>
 189         <listitem>
 190           <para>
 191           <literal>ost</literal>-- to configure block quota managed by
 192           OSTs</para>
 193         </listitem>
 194         <listitem>
 195           <para>
 196           <literal>mdt</literal>-- to configure inode quota managed by
 197           MDTs</para>
 198         </listitem>
 199         <listitem>
 200           <para>
 201           <literal>u</literal>-- to enable quota enforcement for users
 202           only</para>
 203         </listitem>
 204         <listitem>
 205           <para>
 206           <literal>g</literal>-- to enable quota enforcement for groups
 207           only</para>
 208         </listitem>
 209         <listitem>
 210           <para>
 211           <literal>ug</literal>-- to enable quota enforcement for both users
 212           and groups</para>
 213         </listitem>
 214         <listitem>
 215           <para>
 216           <literal>none</literal>-- to disable quota enforcement for both users
 217           and groups</para>
 218         </listitem>
 219       </itemizedlist>
 220       <para>Examples:</para>
 221       <para>To turn on user and group quotas for block only on file system
 222       <literal>testfs1</literal>, run:</para>
 223       <screen>
 224 $ lctl conf_param testfs1.quota.ost=ug
 225 </screen>
 226       <para>To turn on group quotas for inodes on file system
 227       <literal>testfs2</literal>, run:</para>
 228       <screen>
 229 $ lctl conf_param testfs2.quota.mdt=g
 230 </screen>
 231       <para>To turn off user and group quotas for both inode and block on file
 232       system
 233       <literal>testfs3</literal>, run:</para>
 234       <screen>
 235 $ lctl conf_param testfs3.quota.ost=none
 236 </screen>
 237       <screen>
 238 $ lctl conf_param testfs3.quota.mdt=none
 239 </screen>
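      <para>The syntax above can also be generated and validated from a
      script. The following Python sketch is illustrative only: the function
      name <literal>quota_conf_param</literal> is hypothetical, and it only
      builds the command string from the documented syntax without running
      <literal>lctl</literal>.</para>
      <programlisting language="python"><![CDATA[
VALID_TARGETS = ("ost", "mdt")
VALID_VALUES = ("u", "g", "ug", "none")


def quota_conf_param(fsname, target, value):
    """Build an 'lctl conf_param' command line for quota enforcement."""
    if target not in VALID_TARGETS:
        raise ValueError(f"target must be one of {VALID_TARGETS}")
    if value not in VALID_VALUES:
        raise ValueError(f"value must be one of {VALID_VALUES}")
    return f"lctl conf_param {fsname}.quota.{target}={value}"


# Reproduce the two commands that disable enforcement on testfs3.
for target in VALID_TARGETS:
    print(quota_conf_param("testfs3", target, "none"))
]]></programlisting>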
 240       <para>Once the quota parameter is set on the MGS, all targets which are part
 241       of the file system will be notified of the new quota settings and
 242       enable/disable quota enforcement as needed. The per-target enforcement
 243       status can still be verified by running the following command on the
 244       Lustre servers:</para>
 245       <screen>
 246 $ lctl get_param osd-*.*.quota_slave.info
 247 osd-zfs.testfs-MDT0000.quota_slave.info=
 248 target name:    testfs-MDT0000
 249 pool ID:        0
 250 type:           md
 251 quota enabled:  ug
 252 conn to master: setup
 253 user uptodate:  glb[1],slv[1],reint[0]
 254 group uptodate: glb[1],slv[1],reint[0]
 255 </screen>
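      <para>The per-target status lends itself to scripted checks. The
      following Python sketch is illustrative only: the function name
      <literal>parse_quota_slave_info</literal> is hypothetical, and the parsing
      assumes the "key: value" layout of the sample output above rather than a
      guaranteed output format.</para>
      <programlisting language="python"><![CDATA[
def parse_quota_slave_info(text):
    """Map each target name to its 'quota enabled' value.

    Assumes the 'key: value' layout of the sample output in this
    section; this is not a documented, stable output format.
    """
    status = {}
    target = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip parameter-name lines such as 'osd-...info='
        key, value = key.strip(), value.strip()
        if key == "target name":
            target = value
        elif key == "quota enabled" and target is not None:
            status[target] = value
    return status


SAMPLE = """\
osd-zfs.testfs-MDT0000.quota_slave.info=
target name:    testfs-MDT0000
pool ID:        0
type:           md
quota enabled:  ug
conn to master: setup
user uptodate:  glb[1],slv[1],reint[0]
group uptodate: glb[1],slv[1],reint[0]
"""

for name, enabled in parse_quota_slave_info(SAMPLE).items():
    print(name, enabled)
]]></programlisting>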
 256       <caution>
 257         <para>Lustre software release 2.4 comes with a new quota protocol and a
 258         new on-disk format. Be sure to check the Interoperability section below
 259         (see
 260         <xref linkend="quota_interoperability" />) when migrating to release
 261         2.4.</para>
 262       </caution>
 263     </section>
 264     <section remap="h3">
 265       <title>Enabling Disk Quotas (Lustre Software Releases Prior to Release
 266       2.4)</title>
 267       <para>
 268       <note>
 270         <para>
 271         In Lustre software releases previous to release 2.4, when new OSTs are
 272         added to the file system, quotas are not automatically propagated to
 273         the new OSTs. As a workaround, clear and then reset quotas for each
 274         user or group using the
 275         <literal>lfs setquota</literal> command. In the example below, quotas
 276         are cleared and reset for user
 277         <literal>bob</literal> on file system
 278         <literal>testfs</literal>:
 279         <screen>
 280 $ lfs setquota -u bob -b 0 -B 0 -i 0 -I 0 /mnt/testfs
 281 $ lfs setquota -u bob -b 307200 -B 309200 -i 10000 -I 11000 /mnt/testfs
 282 </screen></para>
 283       </note>For Lustre software releases older than release 2.4,
 284       <literal>lfs quotacheck</literal> must first be run from a client node to
 285       create quota files on the Lustre targets (i.e. the MDT and OSTs).
 286       <literal>lfs quotacheck</literal> requires the file system to be quiescent
 287       (i.e. no modifying operations like write, truncate, create or delete
 288       should run concurrently). Failure to follow this caution may result in
 289       inaccurate user/group disk usage. Operations that do not change Lustre
 290       files (such as read or mount) are okay to run.
 291       <literal>lfs quotacheck</literal> performs a scan on all the Lustre
 292       targets to calculate the block/inode usage for each user/group. If the
 293       Lustre file system has many files,
 294       <literal>quotacheck</literal> may take a long time to complete. Several
 295       options can be passed to
 296       <literal>lfs quotacheck</literal>:</para>
 297       <screen>
 298 # lfs quotacheck -ug /mnt/testfs
 299 </screen>
 300       <itemizedlist>
 301         <listitem>
 302           <para>
 303           <literal>u</literal>-- checks the user disk quota information</para>
 304         </listitem>
 305         <listitem>
 306           <para>
 307           <literal>g</literal>-- checks the group disk quota information</para>
 308         </listitem>
 309       </itemizedlist>
 310       <para>By default, quota is turned on after
 311       <literal>quotacheck</literal> completes. However, this setting isn't
 312       persistent and quota will have to be enabled again (via
 313       <literal>lfs quotaon</literal>) if one of the Lustre targets is
 314       restarted.
 315       <literal>lfs quotaoff</literal> is used to turn off quota.</para>
 316       <para>To enable quota permanently with a Lustre software release older
 317       than release 2.4, the
 318       <literal>quota_type</literal> parameter must be used. This requires
 319       setting
 320       <literal>mdd.quota_type</literal> and
 321       <literal>ost.quota_type</literal>, respectively, on the MDT and OSTs.
 322       <literal>quota_type</literal> can be set to the string
 323       <literal>u</literal> (user),
 324       <literal>g</literal> (group) or
 325       <literal>ug</literal> for both users and groups. This parameter can be
 326       specified at
 327       <literal>mkfs</literal> time (
 328       <literal>mkfs.lustre --param mdd.quota_type=ug</literal>) or with
 329       <literal>tunefs.lustre</literal>. As an example:</para>
 330       <screen>
 331 tunefs.lustre --param ost.quota_type=ug $ost_dev
 332 </screen>
 333       <para>When using
 334       <literal>mkfs.lustre --param mdd.quota_type=ug</literal> or
 335       <literal>tunefs.lustre --param ost.quota_type=ug</literal>, be sure to
 336       run the command on all OSTs and the MDT. Otherwise, abnormal results may
 337       occur.</para>
 338     </section>
 339   </section>
 340   <section xml:id="quota_administration">
 341     <title>
 342     <indexterm>
 343       <primary>Quotas</primary>
 344       <secondary>creating</secondary>
 345     </indexterm>Quota Administration</title>
 346     <para>Once the file system is up and running, quota limits on blocks and
 347     files can be set for both user and group. This is controlled via three
 348     quota parameters:</para>
 349     <para>
 350     <emphasis role="bold">Grace period</emphasis>-- The period of time (in
 351     seconds) within which users are allowed to exceed their soft limit. There
 352     are four types of grace periods:</para>
 353     <itemizedlist>
 354       <listitem>
 355         <para>user block soft limit</para>
 356       </listitem>
 357       <listitem>
 358         <para>user inode soft limit</para>
 359       </listitem>
 360       <listitem>
 361         <para>group block soft limit</para>
 362       </listitem>
 363       <listitem>
 364         <para>group inode soft limit</para>
 365       </listitem>
 366     </itemizedlist>
 367     <para>Each grace period applies to all users (or groups) of its type; for
 368     example, the user block grace period applies to every user with a block quota.</para>
 369     <para>
 370     <emphasis role="bold">Soft limit</emphasis>-- The grace timer is started
 371     once the soft limit is exceeded. At this point, the user/group can still
 372     allocate blocks/inodes. If the user/group is still above the soft limit when
 373     the grace time expires, the soft limit becomes a hard limit and the
 374     user/group can no longer allocate any new blocks/inodes. The user/group
 375     should then delete files to get back under the soft limit. The soft limit MUST be
 376     smaller than the hard limit. If the soft limit is not needed, it should be
 377     set to zero (0).</para>
 378     <para>
 379     <emphasis role="bold">Hard limit</emphasis>-- Block or inode allocation
 380     will fail with
 381     <literal>EDQUOT</literal> (i.e. quota exceeded) when the hard limit is
 382     reached. The hard limit is the absolute limit. When a grace period is set,
 383     one can exceed the soft limit within the grace period if under the hard
 384     limit.</para>
 385     <para>Due to the distributed nature of a Lustre file system and the need to
 386     maintain performance under load, those quota parameters may not be 100%
 387     accurate. The quota settings can be manipulated via the
 388     <literal>lfs</literal> command which includes several options to work with
 389     quotas:</para>
 390     <itemizedlist>
 391       <listitem>
 392         <para>
 393         <varname>quota</varname>-- displays general quota information (disk
 394         usage and limits)</para>
 395       </listitem>
 396       <listitem>
 397         <para>
 398         <varname>setquota</varname>-- specifies quota limits and tunes the
 399         grace period. By default, the grace period is one week.</para>
 400       </listitem>
 401     </itemizedlist>
 402     <para>Usage:</para>
 403     <screen>
 404 lfs quota [-q] [-v] [-h] [-o obd_uuid] [-u|-g <replaceable>uname|uid|gname|gid</replaceable>] <replaceable>/mount_point</replaceable>
 405 lfs quota -t <replaceable>-u|-g</replaceable> <replaceable>/mount_point</replaceable>
 406 lfs setquota <replaceable>-u|--user|-g|--group</replaceable> <replaceable>username|groupname</replaceable> [-b <replaceable>block_softlimit</replaceable>] \
 407              [-B <replaceable>block_hardlimit</replaceable>] [-i <replaceable>inode_softlimit</replaceable>] \
 408              [-I <replaceable>inode_hardlimit</replaceable>] <replaceable>/mount_point</replaceable>
 420 </screen>
 421     <para>To display general quota information (disk usage and limits) for the
 422     user running the command and that user's primary group, run:</para>
 423     <screen>
 424 $ lfs quota /mnt/testfs
 425 </screen>
 426     <para>To display general quota information for a specific user ("
 427     <literal>bob</literal>" in this example), run:</para>
 428     <screen>
 429 $ lfs quota -u bob /mnt/testfs
 430 </screen>
 431     <para>To display general quota information for a specific user ("
 432     <literal>bob</literal>" in this example) and detailed quota statistics for
 433     each MDT and OST, run:</para>
 434     <screen>
 435 $ lfs quota -u bob -v /mnt/testfs
 436 </screen>
 437     <para>To display general quota information for a specific group ("
 438     <literal>eng</literal>" in this example), run:</para>
 439     <screen>
 440 $ lfs quota -g eng /mnt/testfs
 441 </screen>
 442     <para>To display block and inode grace times for user quotas, run:</para>
 443     <screen>
 444 $ lfs quota -t -u /mnt/testfs
 445 </screen>
 446     <para>To set user or group quotas for a specific ID ("bob" in this
 447     example), run:</para>
 448     <screen>
 449 $ lfs setquota -u bob -b 307200 -B 309200 -i 10000 -I 11000 /mnt/testfs
 450 </screen>
 451     <para>In this example, the block soft limit for user "bob" is set to
 452     307200 KB (300 MB) and the block hard limit to 309200 KB (about 302 MB). The
 453     inode soft limit is 10,000 files and the inode hard limit is 11,000 files.</para>
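    <para>As a quick check of the arithmetic, the block limits in this example
    can be converted to megabytes. The following Python sketch is illustrative
    only: the helper name <literal>kb_to_mb</literal> is hypothetical, and it
    assumes the block limits are expressed in kilobytes, as the
    <literal>kbytes</literal> column of the
    <literal>lfs quota</literal> output suggests.</para>
    <programlisting language="python"><![CDATA[
def kb_to_mb(kilobytes):
    """Convert a block limit in kilobytes to megabytes (1 MB = 1024 KB)."""
    return kilobytes / 1024


soft_kb, hard_kb = 307200, 309200  # values passed to -b and -B above
print(f"block soft limit: {kb_to_mb(soft_kb):.2f} MB")
print(f"block hard limit: {kb_to_mb(hard_kb):.2f} MB")
]]></programlisting>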
 454     <para>The quota command displays the quota allocated and consumed by each
 455     Lustre target. Using the previous
 456     <literal>setquota</literal> example, running this
 457     <literal>lfs quota</literal> command:</para>
 458     <screen>
 459 $ lfs quota -u bob -v /mnt/testfs
 460 </screen>
 461     <para>displays this command output:</para>
 462     <screen>
 463 Disk quotas for user bob (uid 6000):
 464 Filesystem          kbytes quota limit grace files quota limit grace
 465 /mnt/testfs         0      30720 30920 -     0     10000 11000 -
 466 testfs-MDT0000_UUID 0      -      8192 -     0     -     2560  -
 467 testfs-OST0000_UUID 0      -      8192 -     0     -     0     -
 468 testfs-OST0001_UUID 0      -      8192 -     0     -     0     -
 469 Total allocated inode limit: 2560, total allocated block limit: 24576
 470 </screen>
 471     <para>Global quota limits are stored in dedicated index files (there is one
 472     such index per quota type) on the quota master target (aka QMT). The QMT
 473     runs on MDT0000 and exports the global indexes via /proc. The global
 474     indexes can thus be dumped via the following command:
 475     <screen>
 476 # lctl get_param qmt.testfs-QMT0000.*.glb-*
 477 </screen>The format of global indexes depends on the OSD type. The ldiskfs OSD
 478 uses IAM files while the ZFS OSD creates dedicated ZAPs.</para>
 479     <para>Each slave also stores a copy of this global index locally. When the
 480     global index is modified on the master, a glimpse callback is issued on the
 481     global quota lock to notify all slaves that the global index has been
 482     modified. This glimpse callback includes information about the identifier
 483     subject to the change. If the global index on the QMT is modified while a
 484     slave is disconnected, the index version is used to determine whether the
 485     slave copy of the global index is no longer up to date. If so, the slave
 486     fetches the whole index again and updates the local copy. The slave copy of
 487     the global index is also exported via /proc and can be accessed via the
 488     following command:
 489     <screen>
 490 lctl get_param osd-*.*.quota_slave.limit*
 491 </screen></para>
 492     <note>
 493       <para>Prior to 2.4, global quota limits used to be stored in
 494       administrative quota files using the on-disk format of the Linux quota
 495       file. When upgrading MDT0000 to 2.4, those administrative quota files are
 496       converted into IAM indexes automatically, conserving existing quota
 497       limits previously set by the administrator.</para>
 498     </note>
 499   </section>
 500   <section xml:id="quota_allocation">
 501     <title>
 502     <indexterm>
 503       <primary>Quotas</primary>
 504       <secondary>allocating</secondary>
 505     </indexterm>Quota Allocation</title>
 506     <para>In a Lustre file system, quota must be properly allocated or users
 507     may experience unnecessary failures. The file system block quota is divided
 508     up among the OSTs within the file system. Each OST requests an allocation
 509     which is increased up to the quota limit. The quota allocation is then
 510     <emphasis role="italic">quantized</emphasis> to reduce the amount of
 511     quota-related request traffic.</para>
 512     <para>The Lustre quota system distributes quotas from the Quota Master
 513     Target (aka QMT). Only one QMT instance is supported for now and only runs
 514     on the same node as MDT0000. All OSTs and MDTs set up a Quota Slave Device
 515     (aka QSD) which connects to the QMT to allocate/release quota space. The
 516     QSD is set up directly from the OSD layer.</para>
 517     <para>To reduce quota requests, quota space is initially allocated to QSDs
 518     in very large chunks. How much unused quota space can be held by a target
 519     is controlled by the qunit size. When quota space for a given ID is close
 520     to exhaustion on the QMT, the qunit size is reduced and QSDs are notified
 521     of the new qunit size value via a glimpse callback. Slaves are then
 522     responsible for releasing quota space above the new qunit value. The qunit
 523     size isn't shrunk indefinitely and there is a minimal value of 1MB for
 524     blocks and 1,024 for inodes. This means that the quota space rebalancing
 525     process will stop when this minimum value is reached. As a result, a quota
 526     exceeded error can be returned while many slaves still have 1MB or 1,024 inodes
 527     of spare quota space.</para>
 528     <para>If we look at the
 529     <literal>setquota</literal> example again, running this
 530     <literal>lfs quota</literal> command:</para>
 531     <screen>
 532 # lfs quota -u bob -v /mnt/testfs
 533 </screen>
 534     <para>displays this command output:</para>
 535     <screen>
 536 Disk quotas for user bob (uid 500):
 537 Filesystem          kbytes quota limit grace       files  quota limit grace
 538 /mnt/testfs         30720* 30720 30920 6d23h56m44s 10101* 10000 11000 6d23h59m50s
 540 testfs-MDT0000_UUID 0      -     0     -           10101  -     10240
 541 testfs-OST0000_UUID 0      -     1024  -           -      -     -
 542 testfs-OST0001_UUID 30720* -     29896 -           -      -     -
 543 Total allocated inode limit: 10240, total allocated block limit: 30920
 544 </screen>
 545     <para>The total block quota limit of 30,920 KB allocated to user bob is
 546     distributed across the two OSTs (1,024 KB and 29,896 KB).</para>
 547     <para>Values appended with '
 548     <literal>*</literal>' show that the quota limit has been exceeded, causing
 549     the following error when trying to write or create a file:</para>
 550     <para>
 551       <screen>
 552 cp: writing `/mnt/testfs/foo`: Disk quota exceeded.
 553 </screen>
 554     </para>
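    <para>Entries marked with an asterisk can be picked out of the verbose
    output with a short script. The following Python sketch is illustrative
    only: the function name <literal>exceeded_entries</literal> is
    hypothetical, and it assumes whitespace-separated columns with the file
    system or target name first, as in the sample output above.</para>
    <programlisting language="python"><![CDATA[
def exceeded_entries(output):
    """Return (name, value) pairs for every field marked with '*'.

    Assumes whitespace-separated columns with the file system or target
    name first, as in the sample output in this section.
    """
    hits = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        name = fields[0]
        for field in fields[1:]:
            if field.endswith("*"):
                hits.append((name, field.rstrip("*")))
    return hits


SAMPLE = """\
Disk quotas for user bob (uid 500):
Filesystem          kbytes quota limit grace       files  quota limit grace
/mnt/testfs         30720* 30720 30920 6d23h56m44s 10101* 10000 11000 6d23h59m50s
testfs-MDT0000_UUID 0      -     0     -           10101  -     10240
testfs-OST0000_UUID 0      -     1024  -           -      -     -
testfs-OST0001_UUID 30720* -     29896 -           -      -     -
Total allocated inode limit: 10240, total allocated block limit: 30920
"""

for name, value in exceeded_entries(SAMPLE):
    print(name, value)
]]></programlisting>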
 555     <note>
 556       <para>It is very important to note that the block quota is consumed per
 557       OST and the inode quota per MDT. Therefore, when the quota is consumed on
 558       one OST (resp. MDT), the client may not be able to create files
 559       regardless of the quota available on other OSTs (resp. MDTs).</para>
 560       <para>Setting the quota limit below the minimal qunit size may prevent
 561       the user/group from all file creation. It is thus recommended to use
 562       soft/hard limits which are a multiple of the number of OSTs * the minimal
 563       qunit size.</para>
 564     </note>
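    <para>The recommendation in the note above can be sketched as a small
    calculation. The following Python example is illustrative only: the
    function name <literal>round_up_block_limit</literal> is hypothetical, and
    it assumes limits expressed in kilobytes and the 1 MB (1,024 KB) minimal
    block qunit size described earlier in this section.</para>
    <programlisting language="python"><![CDATA[
MIN_BLOCK_QUNIT_KB = 1024  # minimal block qunit size of 1 MB, per the text above


def round_up_block_limit(limit_kb, num_osts, qunit_kb=MIN_BLOCK_QUNIT_KB):
    """Round limit_kb up to the next multiple of num_osts * qunit_kb."""
    if limit_kb <= 0 or num_osts <= 0:
        raise ValueError("limit_kb and num_osts must be positive")
    step = num_osts * qunit_kb
    return -(-limit_kb // step) * step  # ceiling division, then scale back


# With 2 OSTs the step is 2048 KB, so 309200 KB is rounded up to 309248 KB.
print(round_up_block_limit(309200, num_osts=2))
]]></programlisting>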
 565     <para>To determine the total number of inodes, use
 566     <literal>lfs df -i</literal> (and also
 567     <literal>lctl get_param *.*.filestotal</literal>). For more information on
 568     using the
 569     <literal>lfs df -i</literal> command and the command output, see
 570     <xref linkend="dbdoclet.50438209_35838" />.</para>
 571     <para>Unfortunately, the
 572     <literal>statfs</literal> interface does not report the used inode count
 573     directly, but instead reports the total inode and free inode counts. The
 574     used inode count is calculated for
 575     <literal>df</literal> from (total inodes - free inodes). It is not critical
 576     to know the total inode count for a file system. Instead, you should know
 577     (accurately), the free inode count and the used inode count for a file
 578     system. The Lustre software manipulates the total inode count in order to
 579     accurately report the other two values.</para>
 580   </section>
 581   <section xml:id="quota_interoperability">
 582     <title>
 583     <indexterm>
 584       <primary>Quotas</primary>
 585       <secondary>Interoperability</secondary>
 586     </indexterm>Interoperability</title>
 587     <para>The new quota protocol introduced in Lustre software release 2.4.0
 588     <emphasis role="bold">isn't compatible</emphasis> with the old one. As a
 589     consequence,
 590     <emphasis role="bold">all Lustre servers must be upgraded to release 2.4.0
 591     for quota to be functional</emphasis>. Quota limits set on the Lustre file
 592     system prior to the upgrade will be automatically migrated to the new quota
 593     index format. As for accounting information with the ldiskfs backend, it will
 594     be regenerated by running
 595     <literal>tunefs.lustre --quota</literal> against all targets. It is worth
 596     noting that running
 597     <literal>tunefs.lustre --quota</literal> is
 598     <emphasis role="bold">mandatory</emphasis> for all targets formatted with a
 599     Lustre software release older than release 2.4.0, otherwise quota
 600     enforcement as well as accounting won't be functional.</para>
 601     <para>Besides, the quota protocol in release 2.4 takes for granted that the
 602     Lustre client supports the
 603     <literal>OBD_CONNECT_EINPROGRESS</literal> connect flag. Clients supporting
 604     this flag will retry indefinitely when the server returns
 605     <literal>EINPROGRESS</literal> in a reply. The following Lustre client
 606     versions are compatible with release 2.4:</para>
 607     <itemizedlist>
 608       <listitem>
 609         <para>Release 2.3-based clients and beyond</para>
 610       </listitem>
 611       <listitem>
 612         <para>Release 1.8 clients at release 1.8.9-wc1 or newer</para>
 613       </listitem>
 614       <listitem>
 615         <para>Release 2.1 clients at release 2.1.4 or newer</para>
 616       </listitem>
 617     </itemizedlist>
 618   </section>
 619   <section xml:id="granted_cache_and_quota_limits">
 620     <title>
 621     <indexterm>
 622       <primary>Quotas</primary>
 623       <secondary>known issues</secondary>
 624     </indexterm>Granted Cache and Quota Limits</title>
 625     <para>In a Lustre file system, granted cache does not respect quota limits.
 626     In this situation, OSTs grant cache to a Lustre client to accelerate I/O.
 627     Granting cache causes writes to succeed on the OSTs even if they exceed
 628     the quota limits, effectively overriding those limits.</para>
 629     <para>The sequence is:</para>
 630     <orderedlist>
 631       <listitem>
 632         <para>A user writes files to the Lustre file system.</para>
 633       </listitem>
 634       <listitem>
 635         <para>If the Lustre client has enough granted cache, then it returns
 636         'success' to users and arranges the writes to the OSTs.</para>
 637       </listitem>
 638       <listitem>
 639         <para>Because Lustre clients have delivered success to users, the OSTs
 640         cannot fail these writes.</para>
 641       </listitem>
 642     </orderedlist>
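    <para>The sequence above can be illustrated with a toy model. This is only
    a sketch for reasoning about the behavior, not how the Lustre client or
    OST code is structured: the class names, the unit-less sizes, and the
    single-user accounting are all assumptions made for the example.</para>

```python
EDQUOT = "EDQUOT"


class ToyOST:
    """Hypothetical OST that tracks one user's usage against a quota limit."""

    def __init__(self, quota_limit: int):
        self.quota_limit = quota_limit
        self.usage = 0

    def sync_write(self, size: int) -> str:
        # Write without grant: the OST can still check quota and refuse.
        if self.usage + size > self.quota_limit:
            return EDQUOT
        self.usage += size
        return "success"

    def flush_granted(self, size: int) -> None:
        # Write covered by grant: success was already reported to the user,
        # so the OST accepts it regardless of the quota limit.
        self.usage += size


class ToyClient:
    """Hypothetical client holding some granted cache from one OST."""

    def __init__(self, ost: ToyOST, grant: int):
        self.ost = ost
        self.grant = grant
        self.dirty = 0

    def write(self, size: int) -> str:
        if self.grant >= size:
            # Enough grant: report success now and write back later.
            self.grant -= size
            self.dirty += size
            return "success"
        return self.ost.sync_write(size)

    def flush(self) -> None:
        self.ost.flush_granted(self.dirty)
        self.dirty = 0


ost = ToyOST(quota_limit=100)
ost.sync_write(90)                 # usage is now 90 of 100
client = ToyClient(ost, grant=50)
print(client.write(40))            # success, covered by grant
client.flush()
print(ost.usage)                   # 130, above the limit of 100
print(client.write(40))            # EDQUOT, only 10 units of grant remain
```

    <para>In this model the quota check only takes effect once the grant is
    exhausted, which is why the limit can be exceeded.</para>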
    <para>Because of granted cache, writes can exceed quota limits.
    For example, if you set a 400 GB quota on user A and use IOR to write as
    user A from a large number of clients, you can write much more than 400 GB
    of data before an out-of-quota error (<literal>EDQUOT</literal>) is
    returned.</para>
 648     <note>
      <para>The effect of granted cache on quota limits can be mitigated, but
      not eliminated. Reduce the maximum amount of dirty data on the clients
      (the minimum value is 1 MB):</para>
 652       <itemizedlist>
 653         <listitem>
 654           <para>
 655             <literal>lctl set_param osc.*.max_dirty_mb=8</literal>
 656           </para>
 657         </listitem>
 658       </itemizedlist>
 659     </note>
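    <para>To get a feel for why lowering
    <literal>max_dirty_mb</literal> helps, the following sketch estimates how
    much unflushed data could be outstanding across a cluster. It is a rough
    model only: it assumes the setting bounds dirty data per OSC device (one
    per OST on each client), and the client and OST counts, as well as the
    comparison value of 32, are made-up inputs. Actual grant accounting is
    more involved.</para>

```python
def worst_case_dirty_mb(clients: int, oscs_per_client: int, max_dirty_mb: int) -> int:
    """Hypothetical estimate: every OSC on every client is full of dirty data."""
    return clients * oscs_per_client * max_dirty_mb


# Illustrative inputs only: 100 clients, each with 16 OSC devices.
for setting in (32, 8):
    total = worst_case_dirty_mb(clients=100, oscs_per_client=16, max_dirty_mb=setting)
    print(f"max_dirty_mb={setting}: up to {total} MB of unflushed writes")
```

    <para>With these inputs the estimate drops from 51200 MB to 12800 MB, which
    shows the effect being reduced but not removed.</para>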
 660   </section>
 661   <section xml:id="lustre_quota_statistics">
 662     <title>
 663     <indexterm>
 664       <primary>Quotas</primary>
 665       <secondary>statistics</secondary>
 666     </indexterm>Lustre Quota Statistics</title>
 667     <para>The Lustre software includes statistics that monitor quota activity,
 668     such as the kinds of quota RPCs sent during a specific period, the average
 669     time to complete the RPCs, etc. These statistics are useful to measure
 670     performance of a Lustre file system.</para>
 671     <para>Each quota statistic consists of a quota event and
 672     <literal>min_time</literal>,
 673     <literal>max_time</literal> and
 674     <literal>sum_time</literal> values for the event.</para>
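    <para>As a reading aid, the following sketch shows one way such a
    statistic could be accumulated from individual timings. It is an
    illustration, not the Lustre implementation: the class and method names
    are hypothetical, and the sample values are arbitrary.</para>

```python
from dataclasses import dataclass


@dataclass
class QuotaEventStat:
    """Hypothetical record for one quota event."""

    event: str
    samples: int = 0
    min_time: int = 0
    max_time: int = 0
    sum_time: int = 0

    def record(self, elapsed_us: int) -> None:
        # The first sample initializes both extremes.
        if self.samples == 0:
            self.min_time = self.max_time = elapsed_us
        else:
            self.min_time = min(self.min_time, elapsed_us)
            self.max_time = max(self.max_time, elapsed_us)
        self.samples += 1
        self.sum_time += elapsed_us

    def mean_time(self) -> float:
        # The average is derived from sum_time and the sample count.
        return self.sum_time / self.samples if self.samples else 0.0


stat = QuotaEventStat("quota_ctl")
for elapsed in (10, 30, 20):
    stat.record(elapsed)
print(stat.min_time, stat.max_time, stat.sum_time, stat.mean_time())
```

    <para>Because only the sum is stored, an average time per event has to be
    computed by dividing
    <literal>sum_time</literal> by the number of samples.</para>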
 675     <informaltable frame="all">
 676       <tgroup cols="2">
 677         <colspec colname="c1" colwidth="50*" />
 678         <colspec colname="c2" colwidth="50*" />
 679         <thead>
 680           <row>
 681             <entry>
 682               <para>
 683                 <emphasis role="bold">Quota Event</emphasis>
 684               </para>
 685             </entry>
 686             <entry>
 687               <para>
 688                 <emphasis role="bold">Description</emphasis>
 689               </para>
 690             </entry>
 691           </row>
 692         </thead>
 693         <tbody>
 694           <row>
 695             <entry>
 696               <para>
 697                 <emphasis role="bold">sync_acq_req</emphasis>
 698               </para>
 699             </entry>
 700             <entry>
              <para>Quota slaves send an acquiring_quota request and wait for
              its return.</para>
 703             </entry>
 704           </row>
 705           <row>
 706             <entry>
 707               <para>
 708                 <emphasis role="bold">sync_rel_req</emphasis>
 709               </para>
 710             </entry>
 711             <entry>
 712               <para>Quota slaves send a releasing_quota request and wait for
 713               its return.</para>
 714             </entry>
 715           </row>
 716           <row>
 717             <entry>
 718               <para>
 719                 <emphasis role="bold">async_acq_req</emphasis>
 720               </para>
 721             </entry>
 722             <entry>
 723               <para>Quota slaves send an acquiring_quota request and do not
 724               wait for its return.</para>
 725             </entry>
 726           </row>
 727           <row>
 728             <entry>
 729               <para>
 730                 <emphasis role="bold">async_rel_req</emphasis>
 731               </para>
 732             </entry>
 733             <entry>
 734               <para>Quota slaves send a releasing_quota request and do not wait
 735               for its return.</para>
 736             </entry>
 737           </row>
 738           <row>
 739             <entry>
 740               <para>
 741                 <emphasis role="bold">wait_for_blk_quota
 742                 (lquota_chkquota)</emphasis>
 743               </para>
 744             </entry>
 745             <entry>
 746               <para>Before data is written to OSTs, the OSTs check if the
 747               remaining block quota is sufficient. This is done in the
 748               lquota_chkquota function.</para>
 749             </entry>
 750           </row>
 751           <row>
 752             <entry>
 753               <para>
 754                 <emphasis role="bold">wait_for_ino_quota
 755                 (lquota_chkquota)</emphasis>
 756               </para>
 757             </entry>
 758             <entry>
 759               <para>Before files are created on the MDS, the MDS checks if the
 760               remaining inode quota is sufficient. This is done in the
 761               lquota_chkquota function.</para>
 762             </entry>
 763           </row>
 764           <row>
 765             <entry>
 766               <para>
 767                 <emphasis role="bold">wait_for_blk_quota
 768                 (lquota_pending_commit)</emphasis>
 769               </para>
 770             </entry>
 771             <entry>
              <para>After blocks are written to OSTs, the related quota
              information is updated. This is done in the lquota_pending_commit
              function.</para>
 775             </entry>
 776           </row>
 777           <row>
 778             <entry>
 779               <para>
 780                 <emphasis role="bold">wait_for_ino_quota
 781                 (lquota_pending_commit)</emphasis>
 782               </para>
 783             </entry>
 784             <entry>
              <para>After files are created, the related quota information is
              updated. This is done in the lquota_pending_commit
              function.</para>
 788             </entry>
 789           </row>
 790           <row>
 791             <entry>
 792               <para>
 793                 <emphasis role="bold">wait_for_pending_blk_quota_req
 794                 (qctxt_wait_pending_dqacq)</emphasis>
 795               </para>
 796             </entry>
 797             <entry>
              <para>On the MDS or OSTs, only one thread at a time sends a
              block quota request for a specific UID/GID. Other threads that
              need to send the same request must wait. This is done in the
              qctxt_wait_pending_dqacq function.</para>
 803             </entry>
 804           </row>
 805           <row>
 806             <entry>
 807               <para>
 808                 <emphasis role="bold">wait_for_pending_ino_quota_req
 809                 (qctxt_wait_pending_dqacq)</emphasis>
 810               </para>
 811             </entry>
 812             <entry>
              <para>On the MDS, only one thread at a time sends an inode quota
              request for a specific UID/GID. Other threads that need to send
              the same request must wait. This is done in the
              qctxt_wait_pending_dqacq function.</para>
 817             </entry>
 818           </row>
 819           <row>
 820             <entry>
 821               <para>
 822                 <emphasis role="bold">nowait_for_pending_blk_quota_req
 823                 (qctxt_wait_pending_dqacq)</emphasis>
 824               </para>
 825             </entry>
 826             <entry>
              <para>On the MDS or OSTs, only one thread at a time sends a
              block quota request for a specific UID/GID. This event is
              recorded when a thread enters the qctxt_wait_pending_dqacq
              function and does not need to wait.</para>
 831             </entry>
 832           </row>
 833           <row>
 834             <entry>
 835               <para>
 836                 <emphasis role="bold">nowait_for_pending_ino_quota_req
 837                 (qctxt_wait_pending_dqacq)</emphasis>
 838               </para>
 839             </entry>
 840             <entry>
              <para>On the MDS, only one thread at a time sends an inode quota
              request for a specific UID/GID. This event is recorded when a
              thread enters the qctxt_wait_pending_dqacq function and does not
              need to wait.</para>
 845             </entry>
 846           </row>
 847           <row>
 848             <entry>
 849               <para>
 850                 <emphasis role="bold">quota_ctl</emphasis>
 851               </para>
 852             </entry>
 853             <entry>
              <para>The quota_ctl statistic is generated when commands such as
              <literal>lfs setquota</literal> and
              <literal>lfs quota</literal> are issued.</para>
 857             </entry>
 858           </row>
 859           <row>
 860             <entry>
 861               <para>
 862                 <emphasis role="bold">adjust_qunit</emphasis>
 863               </para>
 864             </entry>
 865             <entry>
              <para>Incremented each time qunit is adjusted.</para>
 867             </entry>
 868           </row>
 869         </tbody>
 870       </tgroup>
 871     </informaltable>
 872     <section remap="h3">
 873       <title>Interpreting Quota Statistics</title>
 874       <para>Quota statistics are an important measure of the performance of a
 875       Lustre file system. Interpreting these statistics correctly can help you
 876       diagnose problems with quotas, and may indicate adjustments to improve
 877       system performance.</para>
 878       <para>For example, if you run this command on the OSTs:</para>
 879       <screen>
lctl get_param lquota.testfs-OST0000.stats
 881 </screen>
 882       <para>You will get a result similar to this:</para>
 883       <screen>
snapshot_time                                1219908615.506895 secs.usecs
async_acq_req                              1 samples [us]  32 32 32
async_rel_req                              1 samples [us]  5 5 5
nowait_for_pending_blk_quota_req(qctxt_wait_pending_dqacq) 1 samples [us] 2\
 2 2
quota_ctl                          4 samples [us]  80 3470 4293
adjust_qunit                               1 samples [us]  70 70 70
....
 892 </screen>
 893       <para>In the first line,
 894       <literal>snapshot_time</literal> indicates when the statistics were taken.
 895       The remaining lines list the quota events and their associated
 896       data.</para>
      <para>In the second line, the
      <literal>async_acq_req</literal> event occurred once. The
      <literal>min_time</literal>,
      <literal>max_time</literal> and
      <literal>sum_time</literal> statistics for this event are 32, 32 and 32,
      respectively. The unit is microseconds (μs).</para>
      <para>In the fifth line (counting the wrapped
      <literal>nowait_for_pending_blk_quota_req</literal> entry as one line),
      the
      <literal>quota_ctl</literal> event occurred four times. The
      <literal>min_time</literal>,
      <literal>max_time</literal> and
      <literal>sum_time</literal> statistics for this event are 80, 3470 and
      4293, respectively. The unit is microseconds (μs).</para>
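      <para>Output in this form can also be post-processed, for example to
      derive an average time per event. The following is a minimal sketch
      rather than a supported tool: the function name is hypothetical, and it
      assumes every event line has the layout shown in the sample output
      (event name without spaces, sample count, the words
      <literal>samples</literal> and
      <literal>[us]</literal>, then the three time values), with long lines
      wrapped using a trailing backslash.</para>

```python
SAMPLE = r"""snapshot_time                                1219908615.506895 secs.usecs
async_acq_req                              1 samples [us]  32 32 32
async_rel_req                              1 samples [us]  5 5 5
nowait_for_pending_blk_quota_req(qctxt_wait_pending_dqacq) 1 samples [us] 2\
 2 2
quota_ctl                          4 samples [us]  80 3470 4293
adjust_qunit                               1 samples [us]  70 70 70
"""


def parse_quota_stats(text: str) -> dict[str, dict[str, int]]:
    """Hypothetical parser for the stats layout shown in the sample output."""
    # Rejoin lines that were wrapped with a trailing backslash.
    joined = text.replace("\\\n", "")
    stats = {}
    for line in joined.splitlines():
        fields = line.split()
        # Expected fields: event, count, "samples", "[us]", min, max, sum.
        # Anything else (such as snapshot_time) is skipped.
        if len(fields) != 7 or fields[2] != "samples":
            continue
        min_time, max_time, sum_time = (int(value) for value in fields[4:7])
        stats[fields[0]] = {
            "samples": int(fields[1]),
            "min_time": min_time,
            "max_time": max_time,
            "sum_time": sum_time,
        }
    return stats


stats = parse_quota_stats(SAMPLE)
quota_ctl = stats["quota_ctl"]
# Average time per quota_ctl event in microseconds.
print(quota_ctl["sum_time"] / quota_ctl["samples"])  # 1073.25
```

      <para>For the sample output, the four
      <literal>quota_ctl</literal> events average about 1073 μs each, although
      the maximum of 3470 μs shows that one of them dominated the
      total.</para>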
 908     </section>
 909   </section>
 910 </chapter>