1 <?xml version='1.0' encoding='utf-8'?>
2 <chapter xmlns="http://docbook.org/ns/docbook"
3 xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US"
4 xml:id="configuringquotas">
5 <title xml:id="configuringquotas.title">Configuring and Managing
7 <section xml:id="quota_configuring">
10 <primary>Quotas</primary>
11 <secondary>configuring</secondary>
12 </indexterm>Working with Quotas</title>
<para>Quotas allow a system administrator to limit the amount of disk
space a user, group, or project can use. Quotas are set by root, and can
be specified for individual users, groups, and/or projects. Before a file
is written to a partition where quotas are set, the quota of the creator's
group is checked. If a quota exists, then the file size counts towards
the group's quota. If no quota exists, then the owner's user quota is
checked before the file is written. Similarly, inode usage can be limited
so that a user, group, or project cannot create more than an allotted
number of files.</para>
21 <para>Lustre quota enforcement differs from standard Linux quota
22 enforcement in several ways:</para>
25 <para>Quotas are administered via the
26 <literal>lfs</literal> and
27 <literal>lctl</literal> commands (post-mount).</para>
<para>The quota feature in Lustre software is distributed
throughout the system (as the Lustre file system is a distributed file
system). Because of this, quota setup and behavior on Lustre are
somewhat different from local disk quotas in the following ways:</para>
36 <para>No single point of administration: some commands must be
37 executed on the MGS, other commands on the MDSs and OSSs, and still
38 other commands on the client.</para>
<para>Granularity: a local quota is typically specified with
kilobyte resolution, whereas Lustre uses one megabyte as the smallest quota
46 <para>Accuracy: quota information is distributed throughout the file
47 system and can only be accurately calculated with a quiescent file
48 system in order to minimize performance overhead during normal use.
54 <para>Quotas are allocated and consumed in a quantized fashion.</para>
<para>Clients do not set the
<literal>usrquota</literal> or
<literal>grpquota</literal> mount options. Space accounting is
enabled by default, and quota enforcement can be enabled or disabled on
a per-filesystem basis with <literal>lctl set_param -P</literal>.</para>
62 <para condition="l28">It is worth noting that the
63 <literal>lfs quotaon</literal>, <literal>lfs quotaoff</literal>,
64 <literal>lfs quotacheck</literal> and <literal>quota_type</literal>
65 sub-commands are deprecated as of Lustre 2.4.0, and removed completely
66 in Lustre 2.8.0.</para>
70 <para>Although a quota feature is available in the Lustre software, root
71 quotas are NOT enforced.</para>
73 <literal>lfs setquota -u root</literal> (limits are not enforced)</para>
75 <literal>lfs quota -u root</literal> (usage includes internal Lustre data
76 that is dynamic in size and does not accurately reflect mount point
77 visible block and inode usage).</para>
80 <section xml:id="enabling_disk_quotas">
83 <primary>Quotas</primary>
84 <secondary>enabling disk</secondary>
85 </indexterm>Enabling Disk Quotas</title>
86 <para>The design of quotas on Lustre has management and enforcement
87 separated from resource usage and accounting. Lustre software is
88 responsible for management and enforcement. The back-end file
89 system is responsible for resource usage and accounting. Because of
90 this, it is necessary to begin enabling quotas by enabling quotas on the
94 <para>Quota setup is orchestrated by the MGS and <emphasis>all setup
95 commands in this section must be run directly on the MGS</emphasis>.
96 Support for project quotas specifically requires Lustre Release 2.10 or
97 later. A <emphasis>patched server</emphasis> may be required, depending
98 on the kernel version and backend filesystem type:</para>
99 <informaltable frame="all">
101 <colspec colname="c1" colwidth="50*" />
102 <colspec colname="c2" colwidth="50*" align="center" />
107 <emphasis role="bold">Configuration</emphasis>
112 <emphasis role="bold">Patched Server Required?</emphasis>
<emphasis>ldiskfs with kernel version &lt; 4.5</emphasis>
122 <entry><para>Yes</para></entry>
126 <emphasis>ldiskfs with kernel version >= 4.5</emphasis>
128 <entry><para>No</para></entry>
<emphasis>zfs version >=0.8 with kernel
version &lt; 4.5</emphasis>
135 <entry><para>Yes</para></entry>
<emphasis>zfs version >=0.8 with kernel
version >= 4.5</emphasis>
142 <entry><para>No</para></entry>
147 <para>*Note: Project quotas are not supported on zfs versions earlier
<para>Once set up, verification of the quota state must be performed on the
MDT. Although quota enforcement is managed by the Lustre software, each
OSD implementation relies on the back-end file system to maintain
per-user/group/project block and inode usage. Hence, differences exist
when setting up quotas with ldiskfs or ZFS back-ends:</para>
<para>For ldiskfs backends,
<literal>mkfs.lustre</literal> now creates empty quota files and
enables the QUOTA feature flag in the superblock, which turns quota
accounting on at mount time automatically. e2fsck was also modified
to fix the quota files when the QUOTA feature flag is present. The
project quota feature is disabled by default, and
<literal>tune2fs</literal> must be run manually on every target to
enable it. If user, group, and project quota usage is inconsistent,
run <literal>e2fsck -f</literal> on all unmounted MDTs and OSTs.
<para>For ZFS backends, <emphasis>the project quota feature is not
supported on zfs versions less than 0.8.0.</emphasis> Accounting ZAPs
171 are created and maintained by the ZFS file system itself. While ZFS
172 tracks per-user and group block usage, it does not handle inode
173 accounting for ZFS versions prior to zfs-0.7.0. The ZFS OSD previously
174 implemented its own support for inode tracking. Two options are
178 <para>The ZFS OSD can estimate the number of inodes in-use based
179 on the number of blocks used by a given user or group. This mode
180 can be enabled by running the following command on the server
182 <literal>lctl set_param
183 osd-zfs.${FSNAME}-${TARGETNAME}.quota_iused_estimate=1</literal>.
<para>Similarly to block accounting, dedicated ZAPs are also
created by the ZFS OSD to maintain per-user and group inode usage.
This is the default mode, which corresponds to
<literal>quota_iused_estimate</literal> set to 0.</para>
196 <para>To (re-)enable space usage quota on ldiskfs filesystems, run
197 <literal>tune2fs -O quota</literal> against all targets. This command
198 sets the QUOTA feature flag in the superblock and runs e2fsck internally.
199 As a result, the target must be offline to build the per-UID/GID disk
200 usage database.</para>
<para condition="l2A">Lustre filesystems formatted with a Lustre release
prior to 2.10 can still be safely upgraded to release 2.10, but project
quota usage reporting will not be functional until 2.15.0 or until
<literal>tune2fs -O project</literal> is run against all ldiskfs backend
targets. This command sets the PROJECT feature flag in the superblock and
runs e2fsck (as a result, the target must be offline). See
<xref linkend="quota_interoperability"/> for further important
considerations.</para>
<para>Lustre requires a version of e2fsprogs that supports quota
to be installed on the server nodes when using the ldiskfs backend
(e2fsprogs is not needed with the ZFS backend). In general, we recommend
using the latest e2fsprogs version available on
<link xl:href="https://downloads.whamcloud.com/public/e2fsprogs/">
https://downloads.whamcloud.com/public/e2fsprogs/</link>.</para>
217 <para>The ldiskfs OSD relies on the standard Linux quota to maintain
218 accounting information on disk. As a consequence, the Linux kernel
219 running on the Lustre servers using ldiskfs backend must have
220 <literal>CONFIG_QUOTA</literal>,
221 <literal>CONFIG_QUOTACTL</literal> and
222 <literal>CONFIG_QFMT_V2</literal> enabled.</para>
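<para>As a rough sketch of how these kernel options might be checked on a
server, the helper below scans a kernel configuration file for the three
options named above. The function name
<literal>check_quota_kconfig</literal> and the
<literal>/boot/config-$(uname -r)</literal> location are assumptions made
for illustration; some distributions keep the kernel configuration
elsewhere.</para>

```shell
# Hypothetical helper (not part of Lustre): report whether a kernel config
# file enables the quota options needed by the ldiskfs OSD.
check_quota_kconfig() {
    config_file="$1"
    missing=""
    for opt in CONFIG_QUOTA CONFIG_QUOTACTL CONFIG_QFMT_V2; do
        # Accept options that are built in (=y) or built as a module (=m).
        if ! grep -Eq "^${opt}=(y|m)$" "$config_file"; then
            missing="$missing $opt"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}

# Example call (the path is an assumption; adjust for the distribution):
# check_quota_kconfig "/boot/config-$(uname -r)"
```

<para>The helper prints <literal>ok</literal> when all three options are
present, and otherwise lists the ones it could not find and returns a
non-zero status.</para>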
<para>Quota enforcement is turned on/off independently of space
accounting, which is always enabled. There is a single per-file
system quota parameter controlling inode/block quota enforcement.
Like all permanent parameters, this quota parameter is set with
<literal>lctl set_param -P</literal> on the MGS:</para>
230 lctl set_param -P osd-*.<replaceable>fsname</replaceable>-*.quota_slave_<replaceable>md|dt</replaceable>.enabled=<replaceable>u|g|p|none</replaceable>
235 <literal>dt</literal> -- to configure data/block quota managed
236 by OSTs (and MDTs for DoM files)</para>
240 <literal>md</literal> -- to configure metadata/inode quota
241 managed by MDTs</para>
245 <literal>u</literal> -- to enable quota enforcement for users
250 <literal>g</literal> -- to enable quota enforcement for groups
255 <literal>p</literal> -- to enable quota enforcement for projects
260 <literal>ug</literal> -- to enable quota enforcement for all
261 users and groups</para>
265 <literal>ugp</literal> -- to enable quota enforcement for all
266 users, groups, and projects</para>
270 <literal>none</literal> -- to disable quota enforcement for
271 all users, groups and projects</para>
274 <para>Examples:</para>
<para>To turn on user, group, and project quotas for block only on
file system
<literal>testfs1</literal>, <emphasis>on the MGS</emphasis> run:</para>
278 <screen>mgs# lctl set_param -P osd-*.testfs1*.quota_slave_dt.enabled=ugp</screen>
279 <para>To turn on only group quotas for inodes on file system
280 <literal>testfs2</literal>, on the MGS run:</para>
281 <screen>mgs# lctl set_param -P osd*.testfs2*.quota_slave_md.enabled=g</screen>
<para>To turn off user, group, and project quotas for both inode and block
on file system
<literal>testfs3</literal>, on the MGS run:</para>
285 <screen>mgs# lctl set_param -P osd*.testfs3*.quota*.enabled=none</screen>
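<para>The examples above all follow the same pattern, which can be sketched
as a small dry-run helper. The function name
<literal>quota_enforce_cmd</literal> is hypothetical; it only prints the
<literal>lctl set_param -P</literal> command line for review and accepts
only the values listed in this section. It does not run anything on the
MGS.</para>

```shell
# Hypothetical dry-run helper: print (do not run) the enforcement command.
#   $1 = file system name
#   $2 = md (inode quota) or dt (block quota)
#   $3 = one of the values listed above: u, g, p, ug, ugp, none
quota_enforce_cmd() {
    fsname="$1"; kind="$2"; types="$3"
    case "$kind" in
        md|dt) ;;
        *) echo "invalid quota kind: $kind"; return 1 ;;
    esac
    case "$types" in
        none|u|g|p|ug|ugp) ;;
        *) echo "invalid quota types: $types"; return 1 ;;
    esac
    # The glob characters are quoted so that lctl, not the shell, expands them.
    echo "lctl set_param -P osd-*.${fsname}-*.quota_slave_${kind}.enabled=${types}"
}

quota_enforce_cmd testfs1 dt ugp
quota_enforce_cmd testfs2 md g
```

<para>Printing the command first makes it easier to check the file system
name and quota types before the setting is made permanent.</para>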
286 <section xml:id="quota_verification">
289 <primary>Quotas</primary>
290 <secondary>verifying</secondary>
291 </indexterm>Quota Verification</title>
292 <para>Once the quota parameters have been configured, all targets
293 which are part of the file system will be automatically notified of the
294 new quota settings and enable/disable quota enforcement as needed. The
295 per-target enforcement status can still be verified by running the
296 following <emphasis>command on the servers</emphasis>:</para>
298 $ lctl get_param osd-*.*.quota_slave_*.enabled
299 osd-zfs.testfs1-MDT0000.quota_slave_dt.enabled=ugp
300 osd-zfs.testfs1-OST0000.quota_slave_dt.enabled=ugp
304 <section xml:id="quota_administration">
307 <primary>Quotas</primary>
308 <secondary>creating</secondary>
309 </indexterm>Quota Administration</title>
310 <para>Once the file system is up and running, quota limits on blocks
311 and inodes can be set for user, group, and project. This is <emphasis>
312 controlled entirely from a client</emphasis> via three quota
315 <emphasis role="bold">Grace period</emphasis>-- The period of time (in
316 seconds) within which users are allowed to exceed their soft limit. There
317 are six types of grace periods:</para>
320 <para>user block soft limit</para>
323 <para>user inode soft limit</para>
326 <para>group block soft limit</para>
329 <para>group inode soft limit</para>
332 <para>project block soft limit</para>
335 <para>project inode soft limit</para>
<para>Each grace period applies to all users, groups, or projects of its
type. For example, the user block grace period applies to every user that
has a block quota.</para>
<emphasis role="bold">Soft limit</emphasis> -- The grace timer is started
once the soft limit is exceeded. At this point, the user/group/project
can still allocate blocks/inodes. If the user/group/project is still above
the soft limit when the grace time expires, the soft limit becomes a hard
limit and no new blocks/inodes can be allocated. The user/group/project
should then delete files to get back under the soft limit.
The soft limit MUST be smaller than the hard limit. If the soft limit is
not needed, it should be set to zero (0).</para>
<emphasis role="bold">Hard limit</emphasis> -- Block or inode allocation
fails with
<literal>EDQUOT</literal> (i.e. quota exceeded) when the hard limit is
353 reached. The hard limit is the absolute limit. When a grace period is set,
354 one can exceed the soft limit within the grace period if under the hard
<para>Due to the distributed nature of a Lustre file system and the need to
maintain performance under load, these quota parameters may not be 100%
accurate. The quota settings can be manipulated via the
<literal>lfs</literal> command, executed on a client, which includes several
options to work with quotas:</para>
364 <varname>quota</varname> -- displays general quota information (disk
365 usage and limits)</para>
369 <varname>setquota</varname> -- specifies quota limits and tunes the
370 grace period. By default, the grace period is one week.</para>
lfs quota [-q] [-v] [-h] [-o obd_uuid] [-u|-g|-p <replaceable>uname|uid|gname|gid|projid</replaceable>] <replaceable>/mount_point</replaceable>
lfs quota -t {-u|-g|-p} <replaceable>/mount_point</replaceable>
lfs setquota {-u|--user|-g|--group|-p|--project} <replaceable>username|groupname|projid</replaceable> [-b <replaceable>block_softlimit</replaceable>] \
             [-B <replaceable>block_hardlimit</replaceable>] [-i <replaceable>inode_softlimit</replaceable>] \
             [-I <replaceable>inode_hardlimit</replaceable>] <replaceable>/mount_point</replaceable>
<para>To display general quota information (disk usage and limits) for the
user running the command and that user's primary group, run:</para>
384 $ lfs quota /mnt/testfs
386 <para>To display general quota information for a specific user ("
387 <literal>bob</literal>" in this example), run:</para>
389 $ lfs quota -u bob /mnt/testfs
391 <para>To display general quota information for a specific user ("
392 <literal>bob</literal>" in this example) and detailed quota statistics for
393 each MDT and OST, run:</para>
395 $ lfs quota -u bob -v /mnt/testfs
397 <para>To display general quota information for a specific project ("
398 <literal>1</literal>" in this example), run:</para>
400 $ lfs quota -p 1 /mnt/testfs
402 <para>To display general quota information for a specific group ("
403 <literal>eng</literal>" in this example), run:</para>
405 $ lfs quota -g eng /mnt/testfs
407 <para>To limit quota usage for a specific project ID on a specific
408 directory ("<literal>/mnt/testfs/dir</literal>" in this example), run:</para>
410 $ lfs project -s -p 1 -r /mnt/testfs/dir
411 $ lfs setquota -p 1 -b 307200 -B 309200 -i 10000 -I 11000 /mnt/testfs
<para>To recursively list the project attribute of all descendants of a
directory ("<literal>/mnt/testfs/dir</literal>" in this example), run:</para>
416 $ lfs project -r /mnt/testfs/dir
<para>Note that for
<literal>lfs quota -p</literal> to report the space/inode usage under a
directory correctly (which is much faster than <literal>du</literal>),
different directories must use different project IDs.
423 <para>To display block and inode grace times for user quotas, run:</para>
425 $ lfs quota -t -u /mnt/testfs
427 <para>To set user or group quotas for a specific ID ("bob" in this
428 example), run:</para>
430 $ lfs setquota -u bob -b 307200 -B 309200 -i 10000 -I 11000 /mnt/testfs
<para>In this example, user "bob" gets a block soft limit of 307200 KB
(300 MB) and a block hard limit of 309200 KB, together with an inode soft
limit of 10,000 files and an inode hard limit of 11,000 files.</para>
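<para>The block limit values in the <literal>setquota</literal> example can
be derived with simple shell arithmetic. The sketch below assumes that block
limits given without a unit suffix are in kilobytes of 1024 bytes; the
function names <literal>mib_to_kib</literal> and
<literal>kib_to_mib</literal> are illustrative only.</para>

```shell
# Convert a size in MiB to the KiB value passed to -b or -B.
mib_to_kib() {
    echo $(( $1 * 1024 ))
}

# Convert a KiB limit back to whole MiB (integer division, remainder dropped).
kib_to_mib() {
    echo $(( $1 / 1024 ))
}

mib_to_kib 300      # prints 307200, the -b value used in the example
kib_to_mib 309200   # prints 301, so the -B value is just under 302 MiB
```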
<para>The quota command displays the quota allocated and consumed by each
Lustre target. Using the previous
<literal>setquota</literal> example, running this
<literal>lfs quota</literal> command:</para>
440 $ lfs quota -u bob -v /mnt/testfs
442 <para>displays this command output:</para>
444 Disk quotas for user bob (uid 6000):
445 Filesystem kbytes quota limit grace files quota limit grace
446 /mnt/testfs 0 30720 30920 - 0 10000 11000 -
447 testfs-MDT0000_UUID 0 - 8192 - 0 - 2560 -
448 testfs-OST0000_UUID 0 - 8192 - 0 - 0 -
449 testfs-OST0001_UUID 0 - 8192 - 0 - 0 -
450 Total allocated inode limit: 2560, total allocated block limit: 24576
<para>Global quota limits are stored in dedicated index files (there is one
such index per quota type) on the quota master target (aka QMT). The QMT
runs on MDT0000 and exports the global indices via <literal>lctl
get_param</literal>. The global indices can thus be dumped via the
458 # lctl get_param qmt.testfs-QMT0000.*.glb-*
</screen>The format of the global indices depends on the OSD type. The ldiskfs OSD
uses IAM files while the ZFS OSD creates dedicated ZAPs.</para>
<para>Each slave also stores a copy of this global index locally. When the
global index is modified on the master, a glimpse callback is issued on the
global quota lock to notify all slaves that the global index has been
modified. This glimpse callback includes information about the identifier
subject to the change. If the global index on the QMT is modified while a
slave is disconnected, the index version is used to determine whether the
slave copy of the global index is out of date. If so, the slave
fetches the whole index again and updates the local copy. The slave copy of
the global index can also be accessed via the following command:
471 lctl get_param osd-*.*.quota_slave.limit*
474 <section condition='l2C' xml:id="default_quota">
477 <primary>Quotas</primary>
478 <secondary>default</secondary>
479 </indexterm>Default Quota</title>
<para>The default quota is used to enforce quota limits for any user,
group, or project that does not have quotas set by the administrator.</para>
482 <para>The default quota can be disabled by setting limits to
483 <literal>0</literal>.</para>
484 <section xml:id="defalut_quota_usage">
487 <primary>Quotas</primary>
488 <secondary>usage</secondary>
489 </indexterm>Usage</title>
491 lfs quota [-U|--default-usr|-G|--default-grp|-P|--default-prj] <replaceable>/mount_point</replaceable>
492 lfs setquota {-U|--default-usr|-G|--default-grp|-P|--default-prj} [-b <replaceable>block-softlimit</replaceable>] \
493 [-B <replaceable>block_hardlimit</replaceable>] [-i <replaceable>inode_softlimit</replaceable>] [-I <replaceable>inode_hardlimit</replaceable>] <replaceable>/mount_point</replaceable>
494 lfs setquota {-u|-g|-p} <replaceable>username|groupname</replaceable> --default <replaceable>/mount_point</replaceable>
496 <para>To set the default user quota:</para>
498 # lfs setquota -U -b 10G -B 11G -i 100K -I 105K /mnt/testfs
500 <para>To set the default group quota:</para>
502 # lfs setquota -G -b 10G -B 11G -i 100K -I 105K /mnt/testfs
504 <para>To set the default project quota:</para>
506 # lfs setquota -P -b 10G -B 11G -i 100K -I 105K /mnt/testfs
508 <para>To disable the default user quota:</para>
510 # lfs setquota -U -b 0 -B 0 -i 0 -I 0 /mnt/testfs
512 <para>To disable the default group quota:</para>
514 # lfs setquota -G -b 0 -B 0 -i 0 -I 0 /mnt/testfs
516 <para>To disable the default project quota:</para>
518 # lfs setquota -P -b 0 -B 0 -i 0 -I 0 /mnt/testfs
520 <para>To set user 'bob' to use the default user quota:</para>
522 # lfs setquota -u bob --default /mnt/testfs
524 <para>To set group 'bob' to use the default group quota:</para>
526 # lfs setquota -g bob --default /mnt/testfs
528 <para>To set project 1000 to use the default project quota:</para>
530 # lfs setquota -p 1000 --default /mnt/testfs
If quota limits are set for a specific user, group, or project, those
limits are used instead of the default quota. Any user, group, or project
can be switched back to the default quota by setting its quota limits with
the '<literal>--default</literal>' option.
542 <section xml:id="quota_allocation">
545 <primary>Quotas</primary>
546 <secondary>allocating</secondary>
547 </indexterm>Quota Allocation</title>
<para>In a Lustre file system, quota must be properly allocated or users
may experience unnecessary failures. The file system block quota is divided
up among the OSTs within the file system. Each OST requests an allocation
which is increased up to the quota limit. The quota allocation is then
<emphasis role="italic">quantized</emphasis> to reduce
quota-related request traffic.</para>
<para>The Lustre quota system distributes quotas from the Quota Master
Target (aka QMT). Only one QMT instance is supported for now, and it runs
only on the same node as MDT0000. All OSTs and MDTs set up a Quota Slave Device
(aka QSD) which connects to the QMT to allocate/release quota space. The
QSD is set up directly from the OSD layer.</para>
559 <para>To reduce quota requests, quota space is initially allocated to QSDs
560 in very large chunks. How much unused quota space can be held by a target
561 is controlled by the qunit size. When quota space for a given ID is close
562 to exhaustion on the QMT, the qunit size is reduced and QSDs are notified
563 of the new qunit size value via a glimpse callback. Slaves are then
564 responsible for releasing quota space above the new qunit value. The qunit
565 size isn't shrunk indefinitely and there is a minimal value of 1MB for
566 blocks and 1,024 for inodes. This means that the quota space rebalancing
567 process will stop when this minimum value is reached. As a result, quota
568 exceeded can be returned while many slaves still have 1MB or 1,024 inodes
569 of spare quota space.</para>
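<para>The effect of the minimum qunit size can be illustrated with a little
arithmetic. The sketch below is one reading of the description above, not
the actual allocation algorithm: it assumes each target may hold on to at
most one minimum-size block qunit (1 MB, written here as 1024 KB), and it
also shows how a block limit could be rounded up to a multiple of the number
of targets times that minimum, in line with the guidance given later in this
section. The variable and function names are hypothetical.</para>

```shell
# Minimum block qunit quoted in the text: 1 MB, expressed in KB.
MIN_BLOCK_QUNIT_KB=1024

# Upper bound on block quota (KB) that can sit unused on slaves once the
# qunit has shrunk to its minimum: one minimum qunit per target.
max_stranded_kb() {
    num_targets="$1"
    echo $(( num_targets * MIN_BLOCK_QUNIT_KB ))
}

# Round a block limit (KB) up to a multiple of num_targets * minimum qunit.
round_limit_kb() {
    limit_kb="$1"; num_targets="$2"
    step=$(( num_targets * MIN_BLOCK_QUNIT_KB ))
    echo $(( (limit_kb + step - 1) / step * step ))
}

max_stranded_kb 8          # prints 8192
round_limit_kb 309200 8    # prints 311296
```

<para>With eight targets, for instance, up to 8 MB of quota could remain
spread across the slaves when quota exceeded is first returned.</para>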
570 <para>If we look at the
571 <literal>setquota</literal> example again, running this
572 <literal>lfs quota</literal> command:</para>
574 # lfs quota -u bob -v /mnt/testfs
576 <para>displays this command output:</para>
578 Disk quotas for user bob (uid 500):
579 Filesystem kbytes quota limit grace files quota limit grace
580 /mnt/testfs 30720* 30720 30920 6d23h56m44s 10101* 10000 11000
582 testfs-MDT0000_UUID 0 - 0 - 10101 - 10240
583 testfs-OST0000_UUID 0 - 1024 - - - -
584 testfs-OST0001_UUID 30720* - 29896 - - - -
585 Total allocated inode limit: 10240, total allocated block limit: 30920
587 <para>The total quota limit of 30,920 is allocated to user bob, which is
588 further distributed to two OSTs.</para>
589 <para>Values appended with '
590 <literal>*</literal>' show that the quota limit has been exceeded, causing
591 the following error when trying to write or create a file:</para>
594 $ cp: writing `/mnt/testfs/foo`: Disk quota exceeded.
598 <para>It is very important to note that the block quota is consumed per
599 OST and the inode quota per MDS. Therefore, when the quota is consumed on
600 one OST (resp. MDT), the client may not be able to create files
601 regardless of the quota available on other OSTs (resp. MDTs).</para>
602 <para>Setting the quota limit below the minimal qunit size may prevent
603 the user/group from all file creation. It is thus recommended to use
604 soft/hard limits which are a multiple of the number of OSTs * the minimal
<para>To determine the total number of inodes, use
<literal>lfs df -i</literal> (and also
<literal>lctl get_param *.*.filestotal</literal>). For more information on
the
<literal>lfs df -i</literal> command and the command output, see
<xref linkend="file_striping.checking_free_space" />.</para>
613 <para>Unfortunately, the
614 <literal>statfs</literal> interface does not report the free inode count
615 directly, but instead reports the total inode and used inode counts. The
616 free inode count is calculated for
617 <literal>df</literal> from (total inodes - used inodes). It is not critical
618 to know the total inode count for a file system. Instead, you should know
619 (accurately), the free inode count and the used inode count for a file
620 system. The Lustre software manipulates the total inode count in order to
621 accurately report the other two values.</para>
623 <section xml:id="quota_interoperability">
626 <primary>Quotas</primary>
627 <secondary>Interoperability</secondary>
628 </indexterm>Quotas and Version Interoperability</title>
<para condition="l2A">To use the project quota functionality introduced in
Lustre 2.10, <emphasis role="bold">all Lustre servers and clients must be
upgraded to Lustre release 2.10 or later for project quota to work
correctly</emphasis>. Otherwise, project quota will be inaccessible on
clients and not be accounted for on OSTs. Furthermore, the
<emphasis role="bold">servers may be required to use a patched
kernel</emphasis>; for more information see
<xref linkend="enabling_disk_quotas"/>.</para>
<para condition="l2E"><literal>df</literal> and <literal>lfs df</literal>
will return the amount of space available to that project rather than the
total filesystem space, if the project quota limit is smaller.
<emphasis role="bold">Only clients need to be upgraded to Lustre
release 2.14 or later to apply this new behavior</emphasis>.</para>
643 <section xml:id="granted_cache_and_quota_limits">
646 <primary>Quotas</primary>
647 <secondary>known issues</secondary>
648 </indexterm>Granted Cache and Quota Limits</title>
<para>In a Lustre file system, granted cache does not respect quota limits.
OSTs grant cache to a Lustre client to accelerate I/O.
Granting cache causes writes to succeed on OSTs even if they exceed
the quota limits, so usage can overrun those limits.</para>
653 <para>The sequence is:</para>
656 <para>A user writes files to the Lustre file system.</para>
659 <para>If the Lustre client has enough granted cache, then it returns
660 'success' to users and arranges the writes to the OSTs.</para>
663 <para>Because Lustre clients have delivered success to users, the OSTs
664 cannot fail these writes.</para>
<para>Because of granted cache, writes can overrun quota limits.
For example, if you set a 400 GB quota on user A and use IOR to write for
user A from a bundle of clients, you will write much more than 400 GB
before the out-of-quota error (
<literal>EDQUOT</literal>) is returned.</para>
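<para>A very rough estimate of how far usage might overrun a quota can be
made from the client-side dirty cache. The sketch below assumes, for
illustration only, that every client can hold up to
<literal>max_dirty_mb</literal> of already-acknowledged dirty data for each
OST it writes to; the function name <literal>max_overrun_mb</literal> and
the sample numbers are hypothetical, and real behavior also depends on the
grant held by each client.</para>

```shell
# Assumption: each client may hold up to max_dirty_mb of dirty data per OST,
# all of it already reported to the application as written successfully.
max_overrun_mb() {
    num_clients="$1"; num_osts="$2"; max_dirty_mb="$3"
    echo $(( num_clients * num_osts * max_dirty_mb ))
}

max_overrun_mb 100 16 32   # prints 51200
max_overrun_mb 100 16 8    # prints 12800
```

<para>Under these assumptions, lowering the per-OSC dirty limit shrinks the
possible overrun proportionally, but it cannot remove it.</para>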
673 <para>The effect of granted cache on quota limits can be mitigated, but
674 not eradicated. Reduce the maximum amount of dirty data on the clients
675 (minimal value is 1MB):</para>
679 <literal>lctl set_param osc.*.max_dirty_mb=8</literal>
685 <section xml:id="lustre_quota_statistics">
688 <primary>Quotas</primary>
689 <secondary>statistics</secondary>
690 </indexterm>Lustre Quota Statistics</title>
691 <para>The Lustre software includes statistics that monitor quota activity,
692 such as the kinds of quota RPCs sent during a specific period, the average
693 time to complete the RPCs, etc. These statistics are useful to measure
694 performance of a Lustre file system.</para>
695 <para>Each quota statistic consists of a quota event and
696 <literal>min_time</literal>,
697 <literal>max_time</literal> and
698 <literal>sum_time</literal> values for the event.</para>
699 <informaltable frame="all">
701 <colspec colname="c1" colwidth="50*" />
702 <colspec colname="c2" colwidth="50*" />
707 <emphasis role="bold">Quota Event</emphasis>
712 <emphasis role="bold">Description</emphasis>
721 <emphasis role="bold">sync_acq_req</emphasis>
<para>Quota slaves send an acquiring_quota request and wait for
732 <emphasis role="bold">sync_rel_req</emphasis>
736 <para>Quota slaves send a releasing_quota request and wait for
743 <emphasis role="bold">async_acq_req</emphasis>
747 <para>Quota slaves send an acquiring_quota request and do not
748 wait for its return.</para>
754 <emphasis role="bold">async_rel_req</emphasis>
758 <para>Quota slaves send a releasing_quota request and do not wait
759 for its return.</para>
765 <emphasis role="bold">wait_for_blk_quota
766 (lquota_chkquota)</emphasis>
770 <para>Before data is written to OSTs, the OSTs check if the
771 remaining block quota is sufficient. This is done in the
772 lquota_chkquota function.</para>
778 <emphasis role="bold">wait_for_ino_quota
779 (lquota_chkquota)</emphasis>
783 <para>Before files are created on the MDS, the MDS checks if the
784 remaining inode quota is sufficient. This is done in the
785 lquota_chkquota function.</para>
791 <emphasis role="bold">wait_for_blk_quota
792 (lquota_pending_commit)</emphasis>
796 <para>After blocks are written to OSTs, relative quota
797 information is updated. This is done in the lquota_pending_commit
804 <emphasis role="bold">wait_for_ino_quota
805 (lquota_pending_commit)</emphasis>
809 <para>After files are created, relative quota information is
810 updated. This is done in the lquota_pending_commit
817 <emphasis role="bold">wait_for_pending_blk_quota_req
818 (qctxt_wait_pending_dqacq)</emphasis>
822 <para>On the MDS or OSTs, there is one thread sending a quota
823 request for a specific UID/GID for block quota at any time. At
824 that time, if other threads need to do this too, they should
825 wait. This is done in the qctxt_wait_pending_dqacq
832 <emphasis role="bold">wait_for_pending_ino_quota_req
833 (qctxt_wait_pending_dqacq)</emphasis>
837 <para>On the MDS, there is one thread sending a quota request for
838 a specific UID/GID for inode quota at any time. If other threads
839 need to do this too, they should wait. This is done in the
840 qctxt_wait_pending_dqacq function.</para>
846 <emphasis role="bold">nowait_for_pending_blk_quota_req
847 (qctxt_wait_pending_dqacq)</emphasis>
851 <para>On the MDS or OSTs, there is one thread sending a quota
852 request for a specific UID/GID for block quota at any time. When
853 threads enter qctxt_wait_pending_dqacq, they do not need to wait.
854 This is done in the qctxt_wait_pending_dqacq function.</para>
860 <emphasis role="bold">nowait_for_pending_ino_quota_req
861 (qctxt_wait_pending_dqacq)</emphasis>
865 <para>On the MDS, there is one thread sending a quota request for
866 a specific UID/GID for inode quota at any time. When threads
867 enter qctxt_wait_pending_dqacq, they do not need to wait. This is
868 done in the qctxt_wait_pending_dqacq function.</para>
874 <emphasis role="bold">quota_ctl</emphasis>
<para>The quota_ctl statistic is generated when
<literal>lfs setquota</literal>,
<literal>lfs quota</literal> and so on, are issued.</para>
886 <emphasis role="bold">adjust_qunit</emphasis>
890 <para>Each time qunit is adjusted, it is counted.</para>
897 <title>Interpreting Quota Statistics</title>
898 <para>Quota statistics are an important measure of the performance of a
899 Lustre file system. Interpreting these statistics correctly can help you
900 diagnose problems with quotas, and may indicate adjustments to improve
901 system performance.</para>
902 <para>For example, if you run this command on the OSTs:</para>
904 lctl get_param lquota.testfs-OST0000.stats
906 <para>You will get a result similar to this:</para>
908 snapshot_time 1219908615.506895 secs.usecs
909 async_acq_req 1 samples [us] 32 32 32
910 async_rel_req 1 samples [us] 5 5 5
911 nowait_for_pending_blk_quota_req(qctxt_wait_pending_dqacq) 1 samples [us] 2\
913 quota_ctl 4 samples [us] 80 3470 4293
914 adjust_qunit 1 samples [us] 70 70 70
917 <para>In the first line,
918 <literal>snapshot_time</literal> indicates when the statistics were taken.
919 The remaining lines list the quota events and their associated
921 <para>In the second line, the
922 <literal>async_acq_req</literal> event occurs one time. The
923 <literal>min_time</literal>,
924 <literal>max_time</literal> and
925 <literal>sum_time</literal> statistics for this event are 32, 32 and 32,
926 respectively. The unit is microseconds (μs).</para>
927 <para>In the fifth line, the quota_ctl event occurs four times. The
928 <literal>min_time</literal>,
929 <literal>max_time</literal> and
930 <literal>sum_time</literal> statistics for this event are 80, 3470 and
931 4293, respectively. The unit is microseconds (μs).</para>
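<para>The reading of the sample output above can be reproduced with a short
script. The following is a minimal sketch, not part of the Lustre tools: the
names <literal>QuotaStat</literal> and <literal>parse_quota_stats</literal>
are hypothetical, and the mean is derived here as
<literal>sum_time</literal> divided by the sample count. It assumes each
event line has the layout shown above (event, count, the word "samples",
unit, min, max, sum) and skips any line that does not match, such as
<literal>snapshot_time</literal> or a wrapped line.</para>

```python
from typing import NamedTuple


class QuotaStat(NamedTuple):
    """One event line from the quota stats output (hypothetical helper)."""
    event: str
    samples: int
    unit: str
    min_time: int
    max_time: int
    sum_time: int

    @property
    def mean_time(self) -> float:
        # Mean is not printed by lctl; it is derived here as sum / samples.
        return self.sum_time / self.samples if self.samples else 0.0


def parse_quota_stats(text: str) -> dict:
    """Map event name to QuotaStat for every well-formed event line."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Assumed layout: event N samples [unit] min max sum
        if len(fields) != 7 or fields[2] != "samples":
            continue  # e.g. snapshot_time, or a wrapped/truncated line
        event, count, _, unit, lo, hi, total = fields
        stats[event] = QuotaStat(
            event, int(count), unit.strip("[]"), int(lo), int(hi), int(total)
        )
    return stats


# Event lines copied from the sample output above; the wrapped
# nowait_for_pending_blk_quota_req line is left out because it is incomplete.
SAMPLE = """\
snapshot_time 1219908615.506895 secs.usecs
async_acq_req 1 samples [us] 32 32 32
async_rel_req 1 samples [us] 5 5 5
quota_ctl 4 samples [us] 80 3470 4293
adjust_qunit 1 samples [us] 70 70 70
"""

stats = parse_quota_stats(SAMPLE)
for name, s in stats.items():
    print(f"{name}: {s.samples} samples, mean {s.mean_time:.2f} {s.unit}")
```

<para>For the <literal>quota_ctl</literal> line, four samples with a
<literal>sum_time</literal> of 4293 give a mean of 1073.25 microseconds per
call. The <literal>max_time</literal> of 3470 shows that a single slow call
accounts for most of that total.</para>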
934 <section xml:id="quota_pools" condition='l2E'>
937 <primary>Quotas</primary>
938 <secondary>pools</secondary>
939 </indexterm>Pool Quotas</title>
The OST Pool Quotas feature makes it possible to limit a user's, group's, or
project's disk usage at the OST pool level. Each OST Pool Quota (PQ) maps
directly to the OST pool of the same name, so a PQ can be adjusted with the
standard <literal>lctl pool_new/add/remove/erase</literal> commands. Every PQ
is a subset of the global pool, which includes all OSTs and MDTs (DOM case).
It may be initially confusing to be prevented from using all of one quota
because of a different quota setting. In Lustre, a quota is a limit, not a right
to use an amount. You cannot always use your full quota: an OST may be out
of space, or some other quota may be limiting. For example, if there is an inode
quota and a space quota, and you reach your inode limit while you still have
plenty of space, you cannot use that space. Quotas may also easily be
over-allocated: for example, every user may be given 10PB of quota in a 15PB system.
That does not give each user the right to use 10PB; it means that no user can use more
than 10PB. Users may well get ENOSPC long before that, but they will not
get EDQUOT. This behavior already exists in Lustre, but pool quotas
increase the number of limits in play: in addition to the global user, group, and
project space quotas, each of those limits can now also be defined for each pool. In all cases,
the net effect is that the actual amount of space you can use is limited to the
smallest (min) quota out of everything that is applicable.
961 <link xl:href="http://wiki.lustre.org/OST_Pool_Quotas_HLD">
962 OST Pool Quotas HLD</link>
965 <title>DOM and MDT pools</title>
From the quota master's point of view, "data" MDTs are regular members of
the global pool together with OSTs. However, Pool Quotas support only OSTs,
because there is currently no mechanism to group MDTs into pools.
<title>lfs quota/setquota options to set up quota pools</title>
The same long option <literal>--pool</literal> is used to set up and report
Pool Quotas with <literal>lfs setquota</literal> and <literal>lfs quota</literal>.
<literal>lfs setquota --pool <replaceable>pool_name</replaceable></literal>
sets the hard and soft block usage limits for the user, group, or
project in the specified pool.
984 <literal>lfs quota --pool <replaceable>pool_name</replaceable></literal>
985 shows the user, group, or project usage for the specified pool name.
989 <title>Quota pools interoperability</title>
Both clients and servers must run Lustre 2.14 or later to support Pool Quotas.
<para>Pool Quotas may work with older clients if the servers
support Pool Quotas, but older clients cannot view or modify
them. Because quota enforcement is done on the servers, only
a single client that supports Pool Quotas is needed to configure them.
This can be done by mounting a client directly on the MDS if needed.
1002 <section remap="h3">
1003 <title>Pool Quotas Hard Limit setup example</title>
Suppose you need to set up quota limits for an existing OST pool named
<literal>flash_pool</literal>:
# set a limit for the global pool; pool quotas do not work properly without it
lfs setquota -u <replaceable>ivan</replaceable> -B<replaceable>100T /mnt/testfs</replaceable>
# set a 1TiB block hard limit for ivan in flash_pool
lfs setquota -u <replaceable>ivan</replaceable> --pool <replaceable>flash_pool</replaceable> -B<replaceable>1T /mnt/testfs</replaceable>
<para>A system-wide (global) limit is required before setting a Pool Quota limit.
If you do not need to limit a user across all OSTs and MDTs in the file system,
but only per pool, it is recommended to set an unrealistically large global limit.
Without a global limit in place, the Pool Quota limit will not be enforced.
The global limit may be hard or soft, but at least one of them must be set.
1025 <section remap="h3">
1026 <title>Pool Quotas Soft Limit setup example</title>
# notify OSTs to enforce quota for ivan
lfs setquota -u <replaceable>ivan</replaceable> -B<replaceable>10T /mnt/testfs</replaceable>
# set a 1TiB block soft limit for ivan in flash_pool
lfs setquota -u <replaceable>ivan</replaceable> --pool <replaceable>flash_pool</replaceable> -b<replaceable>1T /mnt/testfs</replaceable>
# set a 600 s block grace period for all users in flash_pool
lfs setquota -t -u --block-grace <replaceable>600</replaceable> --pool <replaceable>flash_pool /mnt/testfs</replaceable>