1 <?xml version='1.0' encoding='UTF-8'?>
2 <chapter xmlns="http://docbook.org/ns/docbook"
3 xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US"
4 xml:id="managingsecurity">
5 <title xml:id="managingsecurity.title">Managing Security in a Lustre File System</title>
6 <para>This chapter describes security features of the Lustre file system and
7 includes the following sections:</para>
10 <para><xref linkend="managingSecurity.acl"/></para>
13 <para><xref linkend="managingSecurity.root_squash"/></para>
16 <para><xref linkend="managingSecurity.isolation"/></para>
19 <para><xref linkend="managingSecurity.sepol"/></para>
22 <section xml:id="managingSecurity.acl">
23 <title><indexterm><primary>Access Control List (ACL)</primary></indexterm>
25 <para>An access control list (ACL) is a set of data that informs an
26 operating system about permissions or access rights that each user or
27 group has to specific system objects, such as directories or files. Each
28 object has a unique security attribute that identifies users who have
29 access to it. The ACL lists each object and user access privileges such as
30 read, write or execute.</para>
31 <section xml:id="managingSecurity.acl.howItWorks" remap="h3">
32 <title><indexterm><primary>Access Control List (ACL)</primary><secondary>
33 how they work</secondary></indexterm>How ACLs Work</title>
34 <para>Implementing ACLs varies between operating systems. Systems that
35 support the Portable Operating System Interface (POSIX) family of
36 standards share a simple yet powerful file system permission model,
37 which should be well-known to the Linux/UNIX administrator. ACLs add
38 finer-grained permissions to this model, allowing for more complicated
39 permission schemes. For a detailed explanation of ACLs on a Linux
40 operating system, refer to the SUSE Labs article
41 <link xl:href="http://wiki.lustre.org/images/5/57/PosixAccessControlInLinux.pdf">
42 Posix Access Control Lists on Linux</link>.</para>
43 <para>The Lustre software implements ACLs according to this model and
44 works with the standard Linux ACL tools, <literal>setfacl</literal>,
45 <literal>getfacl</literal>, and the historical <literal>chacl</literal>, normally installed with the ACL package.</para>
47 <para>ACL support is a file system-wide feature: either all clients have
48 ACLs enabled or none do. You cannot specify which clients should enable ACLs.</para>
52 <section xml:id="managingSecurity.acl.using" remap="h3">
54 <primary>Access Control List (ACL)</primary>
55 <secondary>using</secondary>
56 </indexterm>Using ACLs with the Lustre Software</title>
57 <para>POSIX Access Control Lists (ACLs) can be used with the Lustre
58 software. An ACL consists of file entries representing permissions based
59 on standard POSIX file system object permissions that define three
60 classes of users (owner, group and other). Each class is associated with
61 a set of permissions [read (r), write (w) and execute (x)].</para>
64 <para>Owner class permissions define access privileges of the file owner.</para>
68 <para>Group class permissions define access privileges of the owning group.</para>
72 <para>Other class permissions define access privileges of all users
73 not in the owner or group class.</para>
76 <para>The <literal>ls -l</literal> command displays the owner, group, and
77 other class permissions in the first column of its output (for example,
78 <literal>-rw-r-----</literal> for a regular file with read and write
79 access for the owner class, read access for the group class, and no
80 access for others).</para>
81 <para>Minimal ACLs have three entries. Extended ACLs have more than the
82 three entries. Extended ACLs also contain a mask entry and may contain
83 any number of named user and named group entries.</para>
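<para>For illustration (the file names are hypothetical), a minimal ACL and an
extended ACL might be reported by <literal>getfacl</literal> as follows; note
the <literal>mask</literal> and named user entries in the extended ACL:</para>
<screen>$ getfacl --omit-header minimal_file
user::rw-
group::r--
other::---
$ getfacl --omit-header extended_file
user::rw-
user:chirag:rw-
group::r--
mask::rw-
other::---</screen>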
84 <para>The MDS needs to be configured to enable ACLs. Use
85 <literal>--mountfsoptions</literal> to enable ACLs when creating your configuration:</para>
87 <screen>$ mkfs.lustre --fsname spfs --mountfsoptions=acl --mdt --mgs /dev/sda</screen>
88 <para>Alternatively, you can enable ACLs at run time by using the
89 <literal>acl</literal> option with <literal>mount</literal> when mounting the MDT:</para>
91 <screen>$ mount -t lustre -o acl /dev/sda /mnt/mdt</screen>
92 <para>To check ACLs on the MDS:</para>
93 <screen>$ lctl get_param -n mdc.home-MDT0000-mdc-*.connect_flags | grep acl
acl</screen>
94 <para>To mount the client with no ACLs:</para>
95 <screen>$ mount -t lustre -o noacl ibmds2@o2ib:/home /home</screen>
96 <para>ACLs are enabled in a Lustre file system on a system-wide basis;
97 either all clients enable ACLs or none do. Activating ACLs is controlled
98 by the MDS mount options <literal>acl</literal> / <literal>noacl</literal>
99 (enable/disable ACLs). Client-side mount options <literal>acl</literal>/<literal>noacl</literal> are
100 ignored: you do not need to change the client configuration, and the
101 <literal>acl</literal> string will not appear in the client <literal>/etc/mtab</literal>. The client
102 <literal>acl</literal> mount option is no longer needed. If a client is mounted with
103 that option, then this message appears in the MDS syslog:</para>
104 <screen>...MDS requires ACL support but client does not</screen>
105 <para>The message is harmless but indicates a configuration issue, which
106 should be corrected.</para>
107 <para>If ACLs are not enabled on the MDS, then any attempts to reference
108 an ACL on a client return an <literal>Operation not supported</literal> error.</para>
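<para>For example, if ACLs are not enabled on the MDS, an attempt to set an ACL
from a client might fail as follows (the user and file names are
hypothetical):</para>
<screen>[root@client lustre]# setfacl -m user:chirag:rwx testfile
setfacl: testfile: Operation not supported</screen>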
110 <section xml:id="managingSecurity.acl.examples" remap="h3">
112 <primary>Access Control List (ACL)</primary>
113 <secondary>examples</secondary>
114 </indexterm>Examples</title>
115 <para>These examples are taken directly from the POSIX paper referenced
116 above. ACLs on a Lustre file system work exactly like ACLs on any Linux
117 file system. They are manipulated with the standard tools in the
118 standard manner. Below, we create a directory and allow a specific user access to it.</para>
120 <screen>[root@client lustre]# umask 027
121 [root@client lustre]# mkdir rain
122 [root@client lustre]# ls -ld rain
123 drwxr-x--- 2 root root 4096 Feb 20 06:50 rain
124 [root@client lustre]# getfacl rain
# file: rain
# owner: root
# group: root
user::rwx
group::r-x
other::---
132 [root@client lustre]# setfacl -m user:chirag:rwx rain
133 [root@client lustre]# ls -ld rain
134 drwxrwx---+ 2 root root 4096 Feb 20 06:50 rain
135 [root@client lustre]# getfacl --omit-header rain
user::rwx
user:chirag:rwx
group::r-x
mask::rwx
other::---
143 <section xml:id="managingSecurity.root_squash">
145 <primary>root squash</primary>
146 </indexterm>Using Root Squash</title>
147 <para>Root squash is a security feature which restricts super-user access
148 rights to a Lustre file system. Without the root squash feature enabled,
149 Lustre file system users on untrusted clients could access or modify files
150 owned by root on the file system, including deleting them. Using the root
151 squash feature restricts file access and modification as the root user to
152 only the specified clients. Note, however, that this does
153 <emphasis>not</emphasis> prevent users on untrusted clients from accessing
154 files owned by <emphasis>other</emphasis> users.</para>
155 <para>The root squash feature works by re-mapping the user ID (UID) and
156 group ID (GID) of the root user to a UID and GID specified by the system
157 administrator, via the Lustre configuration management server (MGS). The
158 root squash feature also enables the Lustre file system administrator to
159 specify a set of clients for which UID/GID re-mapping does not apply.
161 <note><para>Nodemaps (<xref linkend="lustrenodemap.title" />) are an
162 alternative to root squash, since they also allow root squash to be applied on a
163 per-client basis. With UID maps, the clients can even have a local root UID without
164 actually having root access to the filesystem itself.</para></note>
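<para>As a sketch only (the nodemap name, NID range, and squash IDs are
hypothetical; see <xref linkend="lustrenodemap.title" /> for the complete
procedure), per-client squashing with nodemaps might look like this:</para>
<screen>mgs# lctl nodemap_add untrusted
mgs# lctl nodemap_add_range --name untrusted --range 192.168.2.[0-255]@tcp
mgs# lctl nodemap_modify --name untrusted --property squash_uid --value 65534
mgs# lctl nodemap_modify --name untrusted --property squash_gid --value 65534
mgs# lctl nodemap_activate 1</screen>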
165 <section xml:id="managingSecurity.root_squash.config" remap="h3">
167 <primary>root squash</primary>
168 <secondary>configuring</secondary>
169 </indexterm>Configuring Root Squash</title>
170 <para>Root squash functionality is managed by two configuration
171 parameters, <literal>root_squash</literal> and
172 <literal>nosquash_nids</literal>.</para>
175 <para>The <literal>root_squash</literal> parameter specifies the UID
176 and GID with which the root user accesses the Lustre file system.
180 <para>The <literal>nosquash_nids</literal> parameter specifies the set
181 of clients to which root squash does not apply. LNet NID range
182 syntax is used for this parameter (see the NID range syntax rules
183 described in <xref linkend="managingSecurity.root_squash.tips"/>). For example:</para>
187 <screen>nosquash_nids=172.16.245.[0-255/2]@tcp</screen>
188 <para>In this example, root squash does not apply to TCP clients on subnet
189 172.16.245.0 that have an even number as the last component of their IP address.</para>
192 <section xml:id="managingSecurity.root_squash.tuning">
194 <primary>root squash</primary><secondary>enabling</secondary>
195 </indexterm>Enabling and Tuning Root Squash</title>
196 <para>The default value for <literal>nosquash_nids</literal> is NULL,
197 which means that root squashing applies to all clients. Setting the root
198 squash UID and GID to 0 turns root squash off.</para>
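<para>For example, root squash could be turned off on a file system named
<literal>testfs</literal> (the name used in the examples below) with the
<literal>lctl conf_param</literal> command described later in this
section:</para>
<screen>mgs# lctl conf_param testfs.mdt.root_squash="0:0"</screen>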
199 <para>Root squash parameters can be set when the MDT is created
200 (<literal>mkfs.lustre --mdt</literal>). For example:</para>
201 <screen>mds# mkfs.lustre --reformat --fsname=testfs --mdt --mgs \
202 --param "mdt.root_squash=500:501" \
203 --param "mdt.nosquash_nids='0@elan1 192.168.1.[10,11]'" /dev/sda1</screen>
204 <para>Root squash parameters can also be changed on an unmounted device
205 with <literal>tunefs.lustre</literal>. For example:</para>
206 <screen>tunefs.lustre --param "mdt.root_squash=65534:65534" \
207 --param "mdt.nosquash_nids=192.168.0.13@tcp0" /dev/sda1
209 <para>Root squash parameters can also be changed with the
210 <literal>lctl conf_param</literal> command. For example:</para>
211 <screen>mgs# lctl conf_param testfs.mdt.root_squash="1000:101"
212 mgs# lctl conf_param testfs.mdt.nosquash_nids="*@tcp"</screen>
213 <para>To retrieve the current root squash parameter settings, the
214 following <literal>lctl get_param</literal> commands can be used:</para>
215 <screen>mgs# lctl get_param mdt.*.root_squash
216 mgs# lctl get_param mdt.*.nosquash_nids</screen>
218 <para>When using the <literal>lctl conf_param</literal> command, keep in mind:</para>
221 <para><literal>lctl conf_param</literal> must be run on a live MGS
225 <para><literal>lctl conf_param</literal> causes the parameter to
226 change on all MDSs</para>
229 <para><literal>lctl conf_param</literal> is to be used once per parameter.</para>
234 <para>The root squash settings can also be changed temporarily with
235 <literal>lctl set_param</literal> or persistently with
236 <literal>lctl set_param -P</literal>. For example:</para>
237 <screen>mgs# lctl set_param mdt.testfs-MDT0000.root_squash="1:0"
238 mgs# lctl set_param -P mdt.testfs-MDT0000.root_squash="1:0"</screen>
239 <para>The <literal>nosquash_nids</literal> list can be cleared with either of the following commands:</para>
240 <screen>mgs# lctl conf_param testfs.mdt.nosquash_nids="NONE"</screen>
242 <screen>mgs# lctl conf_param testfs.mdt.nosquash_nids="clear"</screen>
243 <para>If the <literal>nosquash_nids</literal> value consists of several
244 NID ranges (e.g. <literal>0@elan</literal>, <literal>1@elan1</literal>),
245 the list of NID ranges must be quoted with single (') or double
246 (") quotation marks. List elements must be separated with a
247 space. For example:</para>
248 <screen>mds# mkfs.lustre ... --param "mdt.nosquash_nids='0@elan1 1@elan2'" /dev/sda1
249 lctl conf_param testfs.mdt.nosquash_nids="24@elan 15@elan1"</screen>
250 <para>These are examples of incorrect syntax:</para>
251 <screen>mds# mkfs.lustre ... --param "mdt.nosquash_nids=0@elan1 1@elan2" /dev/sda1
252 lctl conf_param testfs.mdt.nosquash_nids=24@elan 15@elan1</screen>
253 <para>To check root squash parameters, use the <literal>lctl get_param</literal> command:</para>
255 <screen>mds# lctl get_param mdt.testfs-MDT0000.root_squash
256 lctl get_param mdt.*.nosquash_nids</screen>
258 <para>An empty <literal>nosquash_nids</literal> list is reported as <literal>NONE</literal>.</para>
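<para>For example, on a file system where no exemptions have been configured,
the output might look like this (the file system name is hypothetical):</para>
<screen>mds# lctl get_param mdt.testfs-MDT0000.nosquash_nids
mdt.testfs-MDT0000.nosquash_nids=NONE</screen>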
261 <section xml:id="managingSecurity.root_squash.tips" remap="h3">
263 <primary>root squash</primary>
264 <secondary>tips</secondary>
265 </indexterm>Tips on Using Root Squash</title>
266 <para>Lustre configuration management limits root squash in several ways.
270 <para>The <literal>lctl conf_param</literal> value overwrites the
271 parameter's previous value. If the new value uses an incorrect
272 syntax, then the system continues with the old parameters and the
273 previously-correct value is lost on remount. Be careful when tuning
274 the root squash parameters.</para>
277 <para><literal>mkfs.lustre</literal> and
278 <literal>tunefs.lustre</literal> do not perform parameter syntax
279 checking. If the root squash parameters are incorrect, they are
280 ignored on mount and the default values are used instead.</para>
283 <para>Root squash parameters are parsed with rigorous syntax checking.
284 The root_squash parameter should be specified as
285 <literal><decnum>:<decnum></literal>. The
286 <literal>nosquash_nids</literal> parameter should follow LNet NID
287 range list syntax.</para>
290 <para>LNet NID range syntax:</para>
291 <screen><nidlist> :== <nidrange> [ ' ' <nidrange> ]
292 <nidrange> :== <addrrange> '@' <net>
293 <addrrange> :== '*' |
294 <ipaddr_range> |
295 <numaddr_range>
296 <ipaddr_range> :==
297 <numaddr_range>.<numaddr_range>.<numaddr_range>.<numaddr_range>
298 <numaddr_range> :== <number> |
                    <expr_list>
300 <expr_list> :== '[' <range_expr> [ ',' <range_expr>] ']'
301 <range_expr> :== <number> |
302 <number> '-' <number> |
303 <number> '-' <number> '/' <number>
304 <net> :== <netname> | <netname><number>
305 <netname> :== "lo" | "tcp" | "o2ib"
306 | "ra" | "elan"
307 <number> :== <nonnegative decimal> | <hexadecimal></screen>
309 <para>For networks using numeric addresses (e.g. elan), the address
310 range must be specified in the
311 <literal><numaddr_range></literal> syntax. For networks using
312 IP addresses, the address range must be specified in the
313 <literal><ipaddr_range></literal> syntax. For example, if elan is using
314 numeric addresses, <literal>1.2.3.4@elan</literal> is incorrect.
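<para>For example, the following values (the addresses are hypothetical) all
follow the NID range syntax above:</para>
<screen>nosquash_nids="*@tcp"                          # all TCP clients
nosquash_nids="192.168.0.[10-20]@tcp"          # a range of IP addresses
nosquash_nids="[0-63/4]@elan"                  # every fourth numeric address from 0 to 60
nosquash_nids="0@elan1 192.168.1.[2,4-6]@tcp"  # two space-separated ranges</screen>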
319 <section xml:id="managingSecurity.isolation">
320 <title><indexterm><primary>Isolation</primary></indexterm>
321 Isolating Clients to a Sub-directory Tree</title>
322 <para>Isolation is the Lustre implementation of the generic concept of
323 multi-tenancy, which aims to provide separate namespaces within a single
324 filesystem. Lustre Isolation enables different populations of users on
325 the same file system beyond normal Unix permissions/ACLs, even when users
326 on the clients may have root access. Those tenants share the same file
327 system, but they are isolated from each other: they cannot access or even
328 see each other’s files, and are not aware that they are sharing common
329 file system resources.</para>
330 <para>Lustre Isolation leverages the Fileset feature
331 (<xref linkend="SystemConfigurationUtilities.fileset" />)
332 to mount only a subdirectory of the filesystem rather than the root directory.
334 In order to achieve isolation, the subdirectory mount, which presents to
335 tenants only their own fileset, has to be imposed on the clients. To that
336 end, we make use of the nodemap feature
337 (<xref linkend="lustrenodemap.title" />). We group all clients used by a
338 tenant under a common nodemap entry, and we assign to this nodemap entry
339 the fileset to which the tenant is restricted.</para>
340 <section xml:id="managingSecurity.isolation.clientid" remap="h3">
341 <title><indexterm><primary>Isolation</primary><secondary>
342 client identification</secondary></indexterm>Identifying Clients</title>
343 <para>Enforcing multi-tenancy on Lustre relies on the ability to properly
344 identify the client nodes used by a tenant, and trust those identities.
345 This can be achieved by having physical hardware and/or network
346 security, so that client nodes have well-known NIDs. It is also possible
347 to make use of strong authentication with Kerberos or Shared-Secret Key
348 (see <xref linkend="lustressk" />).
349 Kerberos prevents NID spoofing, as every client needs its own
350 credentials, based on its NID, in order to connect to the servers.
351 Shared-Secret Key also prevents tenant impersonation, because keys
352 can be linked to a specific nodemap. See
353 <xref linkend="ssknodemaprole" /> for detailed explanations.
356 <section xml:id="managingSecurity.isolation.configuring" remap="h3">
357 <title><indexterm><primary>Isolation</primary><secondary>
358 configuring</secondary></indexterm>Configuring Isolation</title>
359 <para>Isolation on Lustre can be achieved by setting the
360 <literal>fileset</literal> parameter on a nodemap entry. All clients
361 belonging to this nodemap entry will automatically mount this fileset
362 instead of the root directory. For example:</para>
363 <screen>mgs# lctl nodemap_set_fileset --name tenant1 --fileset '/dir1'</screen>
364 <para>As a result, all clients matching the <literal>tenant1</literal> nodemap will
365 be automatically presented the fileset <literal>/dir1</literal> when
366 mounting. This means these clients are doing an implicit subdirectory
367 mount on the subdirectory <literal>/dir1</literal>.
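<para>The mount command used on those clients does not change. For example
(the MGS NID, file system name and mount point are hypothetical):</para>
<screen>client# mount -t lustre mgsnode@tcp:/testfs /mnt/lustre</screen>
<para>The directory tree visible under <literal>/mnt/lustre</literal> is then
the content of <literal>/dir1</literal> on the <literal>testfs</literal> file
system.</para>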
371 If the subdirectory defined as a fileset does not exist on the file system,
372 any client belonging to the nodemap will be prevented from mounting Lustre.
376 <para>To delete the fileset parameter, just set it to an empty string:
378 <screen>mgs# lctl nodemap_set_fileset --name tenant1 --fileset ''</screen>
380 <section xml:id="managingSecurity.isolation.permanent" remap="h3">
381 <title><indexterm><primary>Isolation</primary><secondary>
382 making permanent</secondary></indexterm>Making Isolation Permanent
384 <para>In order to make isolation permanent, the fileset parameter on the
385 nodemap has to be set with <literal>lctl set_param</literal> with the
386 <literal>-P</literal> option.</para>
387 <screen>mgs# lctl set_param nodemap.tenant1.fileset=/dir1
388 mgs# lctl set_param -P nodemap.tenant1.fileset=/dir1</screen>
389 <para>This way the fileset parameter will be stored in the Lustre config
390 logs, letting the servers retrieve the information after a restart.
394 <section xml:id="managingSecurity.sepol" condition='l2D'>
395 <title><indexterm><primary>selinux policy check</primary></indexterm>
396 Checking SELinux Policy Enforced by Lustre Clients</title>
397 <para>SELinux provides a mechanism in Linux for supporting Mandatory Access
398 Control (MAC) policies. When a MAC policy is enforced, the operating
399 system’s (OS) kernel defines application rights, firewalling applications
400 from compromising the entire system. Regular users do not have the ability to
401 override the policy.</para>
402 <para>One purpose of SELinux is to protect the
403 <emphasis role="bold">OS</emphasis> from privilege escalation. To that
404 end, SELinux defines confined and unconfined domains for processes and
405 users. Each process, user, and file is assigned a security context, and
406 rules define the allowed operations by processes and users on files.
408 <para>Another purpose of SELinux can be to protect
409 <emphasis role="bold">data</emphasis> sensitivity, thanks to Multi-Level
410 Security (MLS). MLS works on top of SELinux, by defining the concept of
411 security levels in addition to domains. Each process, user and file is
412 assigned a security level, and the model states that processes and users
413 can read files at the same or a lower security level, but can only write to
414 files at their own or a higher security level.
416 <para>From a file system perspective, the security context of files must be
417 stored permanently. Lustre makes use of the
418 <literal>security.selinux</literal> extended attributes on files to hold
419 this information. Lustre supports SELinux on the client side. All you have
420 to do to have MAC and MLS on Lustre is to enforce the appropriate SELinux
421 policy (as provided by the Linux distribution) on all Lustre clients. No
422 SELinux is required on Lustre servers.
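<para>For example, the stored security context of a file can be displayed from
a client with standard tools (the mount point and file name are hypothetical,
and the exact output depends on the coreutils version and the policy in
place):</para>
<screen>client# ls -Z /mnt/lustre/file1
unconfined_u:object_r:user_home_t:s0 /mnt/lustre/file1</screen>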
424 <para>Because Lustre is a distributed file system, the particular concern when
425 using MLS is that Lustre must make sure data is always accessed
426 by nodes with the SELinux MLS policy properly enforced. Otherwise, data is
427 not protected. This means Lustre has to check that SELinux is properly
428 enforced on the client side, with the correct, unaltered policy. If SELinux
429 is not enforced as expected on a client, the server denies it access to the Lustre file system.</para>
432 <section xml:id="managingSecurity.sepol.determining" remap="h3">
433 <title><indexterm><primary>selinux policy check</primary><secondary>
434 determining</secondary></indexterm>Determining SELinux Policy Info
436 <para>A string that represents the SELinux Status info will be used by
437 servers as a reference, to check if clients are enforcing SELinux
438 properly. This reference string can be obtained on a client node known
439 to enforce the right SELinux policy, by calling the
440 <literal>l_getsepol</literal> command line utility:</para>
441 <screen>client# l_getsepol
442 SELinux status info: 1:mls:31:40afb76d077c441b69af58cccaaa2ca63641ed6e21b0a887dc21a684f508b78f</screen>
443 <para>The string describing the SELinux policy has the following format:</para>
445 <para><literal>mode:name:version:hash</literal></para>
449 <para><literal>mode</literal> is a digit telling if SELinux is in
450 Permissive mode (0) or Enforcing mode (1)</para>
453 <para><literal>name</literal> is the name of the SELinux policy
457 <para><literal>version</literal> is the version of the SELinux policy</para>
461 <para><literal>hash</literal> is the computed hash of the binary
462 representation of the policy, as exported in
463 /etc/selinux/<literal>name</literal>/policy/policy.
464 <literal>version</literal></para>
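<para>For example, for the policy string shown above (name
<literal>mls</literal>, version <literal>31</literal>), the hash is computed
over <literal>/etc/selinux/mls/policy/policy.31</literal>.</para>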
468 <section xml:id="managingSecurity.sepol.configuring" remap="h3">
469 <title><indexterm><primary>selinux policy check</primary><secondary>
470 enforcing</secondary></indexterm>Enforcing SELinux Policy Check</title>
471 <para>SELinux policy check can be enforced by setting the
472 <literal>sepol</literal> parameter on a nodemap entry. All clients
473 belonging to this nodemap entry must enforce the SELinux policy
474 described by this parameter, otherwise they are denied access to the
475 Lustre file system. For example:</para>
476 <screen>mgs# lctl nodemap_set_sepol --name restricted
477 --sepol '1:mls:31:40afb76d077c441b69af58cccaaa2ca63641ed6e21b0a887dc21a684f508b78f'</screen>
478 <para>As a result, all clients matching the <literal>restricted</literal> nodemap
479 must enforce the SELinux policy whose description matches
480 <literal>1:mls:31:40afb76d077c441b69af58cccaaa2ca63641ed6e21b0a887dc21a684f508b78f</literal>.
481 If not, they will get Permission Denied when trying to mount or access
482 files on the Lustre file system.</para>
483 <para>To delete the <literal>sepol</literal> parameter, just set it to an empty string:</para>
485 <screen>mgs# lctl nodemap_set_sepol --name restricted --sepol ''</screen>
486 <para>See <xref linkend="lustrenodemap.title" /> for more details about
487 the Nodemap feature.</para>
489 <section xml:id="managingSecurity.sepol.permanent" remap="h3">
490 <title><indexterm><primary>selinux policy check</primary><secondary>
491 making permanent</secondary></indexterm>Making SELinux Policy Check Permanent</title>
493 <para>In order to make SELinux Policy check permanent, the sepol parameter
494 on the nodemap has to be set with <literal>lctl set_param</literal> with
495 the <literal>-P</literal> option.</para>
496 <screen>mgs# lctl set_param nodemap.restricted.sepol=1:mls:31:40afb76d077c441b69af58cccaaa2ca63641ed6e21b0a887dc21a684f508b78f
497 mgs# lctl set_param -P nodemap.restricted.sepol=1:mls:31:40afb76d077c441b69af58cccaaa2ca63641ed6e21b0a887dc21a684f508b78f</screen>
498 <para>This way the sepol parameter will be stored in the Lustre config
499 logs, letting the servers retrieve the information after a restart.
502 <section xml:id="managingSecurity.sepol.client" remap="h3">
503 <title><indexterm><primary>selinux policy check</primary><secondary>
504 sending client</secondary></indexterm>Sending SELinux Status Info from Clients</title>
506 <para>In order for Lustre clients to send their SELinux status
507 information, in case SELinux is enabled locally, the
508 <literal>send_sepol</literal> ptlrpc kernel module's parameter has to be
509 set to a non-zero value (see the example after the list below). <literal>send_sepol</literal> accepts the following values:</para>
513 <para>0: do not send SELinux policy info;</para>
516 <para>-1: fetch SELinux policy info for every request;</para>
519 <para>N > 0: only fetch SELinux policy info every N seconds. Use
520 <literal>N = 2^31-1</literal> to have SELinux policy info
521 fetched only at mount time.</para>
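<para>A minimal sketch, assuming the standard kernel module parameter
interfaces (the <literal>sysfs</literal> path can only be written if the module
declares the parameter writable):</para>
<screen># refresh the SELinux policy info at most once per hour, effective immediately
client# echo 3600 > /sys/module/ptlrpc/parameters/send_sepol
# or make the setting persistent across reboots
client# echo "options ptlrpc send_sepol=3600" > /etc/modprobe.d/lustre-sepol.conf</screen>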
524 <para>Clients that are part of a nodemap on which
525 <literal>sepol</literal> is defined must send SELinux status info,
526 and the SELinux policy they enforce must match the representation
527 stored in the nodemap. Otherwise, they will be denied access to the
528 Lustre file system.</para>
533 <!-- vim:expandtab:shiftwidth=2:tabstop=8: -->