1 <?xml version='1.0' encoding='UTF-8'?><chapter xmlns="http://docbook.org/ns/docbook" xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en-US" xml:id="managingsecurity">
2 <title xml:id="managingsecurity.title">Managing Security in a Lustre File System</title>
3 <para>This chapter describes security features of the Lustre file system and
4 includes the following sections:</para>
7 <para><xref linkend="managingSecurity.acl"/></para>
10 <para><xref linkend="managingSecurity.root_squash"/></para>
13 <para><xref linkend="managingSecurity.isolation"/></para>
16 <para><xref linkend="managingSecurity.sepol"/></para>
19 <section xml:id="managingSecurity.acl">
20 <title><indexterm><primary>Access Control List (ACL)</primary></indexterm>
<para>An access control list (ACL) is a set of data that informs an
23 operating system about permissions or access rights that each user or
24 group has to specific system objects, such as directories or files. Each
25 object has a unique security attribute that identifies users who have
26 access to it. The ACL lists each object and user access privileges such as
27 read, write or execute.</para>
28 <section xml:id="managingSecurity.acl.howItWorks" remap="h3">
29 <title><indexterm><primary>Access Control List (ACL)</primary><secondary>
30 how they work</secondary></indexterm>How ACLs Work</title>
<para>ACL implementations vary between operating systems. Systems that
32 support the Portable Operating System Interface (POSIX) family of
33 standards share a simple yet powerful file system permission model,
34 which should be well-known to the Linux/UNIX administrator. ACLs add
35 finer-grained permissions to this model, allowing for more complicated
36 permission schemes. For a detailed explanation of ACLs on a Linux
37 operating system, refer to the SUSE Labs article
38 <link xl:href="http://wiki.lustre.org/images/5/57/PosixAccessControlInLinux.pdf">
39 Posix Access Control Lists on Linux</link>.</para>
40 <para>We have implemented ACLs according to this model. The Lustre
41 software works with the standard Linux ACL tools, setfacl, getfacl, and
42 the historical chacl, normally installed with the ACL package.</para>
<para>ACL support is a system-wide feature, meaning that either all clients
have ACLs enabled or none do. You cannot specify which clients should
49 <section xml:id="managingSecurity.acl.using" remap="h3">
51 <primary>Access Control List (ACL)</primary>
52 <secondary>using</secondary>
53 </indexterm>Using ACLs with the Lustre Software</title>
54 <para>POSIX Access Control Lists (ACLs) can be used with the Lustre
55 software. An ACL consists of file entries representing permissions based
56 on standard POSIX file system object permissions that define three
57 classes of user (owner, group and other). Each class is associated with
58 a set of permissions [read (r), write (w) and execute (x)].</para>
61 <para>Owner class permissions define access privileges of the file
65 <para>Group class permissions define access privileges of the owning
69 <para>Other class permissions define access privileges of all users
70 not in the owner or group class.</para>
73 <para>The <literal>ls -l</literal> command displays the owner, group, and
74 other class permissions in the first column of its output (for example,
<literal>-rw-r-----</literal> for a regular file with read and write
76 access for the owner class, read access for the group class, and no
77 access for others).</para>
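<para>As an illustration only, the following Python sketch splits such a mode string into the file type and the three permission classes. The function name <literal>split_mode</literal> is hypothetical and is not part of any Lustre or ACL tool. A trailing <literal>+</literal>, which <literal>ls -l</literal> prints when an extended ACL is present, is reported separately.</para>

```python
def split_mode(mode):
    """Split an `ls -l` mode string such as '-rw-r-----' into its parts."""
    # `ls -l` appends '+' when the file carries an extended ACL.
    has_acl_marker = mode.endswith("+")
    core = mode[:10]
    if len(core) != 10:
        raise ValueError("expected at least 10 characters, e.g. '-rw-r-----'")
    # Character 0 is the file type; the next nine are three rwx triplets.
    return {
        "type": core[0],
        "owner": core[1:4],
        "group": core[4:7],
        "other": core[7:10],
        "acl_marker": has_acl_marker,
    }


print(split_mode("-rw-r-----"))
print(split_mode("drwxrwx---+"))
```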
78 <para>Minimal ACLs have three entries. Extended ACLs have more than the
79 three entries. Extended ACLs also contain a mask entry and may contain
80 any number of named user and named group entries.</para>
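<para>The distinction between minimal and extended ACLs can be sketched in Python as follows. This is a simplified illustration, not how the ACL tools classify entries internally: it assumes access ACL entries written in the three-field <literal>tag:qualifier:permissions</literal> text form printed by <literal>getfacl</literal>, and the function name <literal>classify_acl</literal> and the sample entries are hypothetical.</para>

```python
def classify_acl(entries):
    """Return 'minimal' or 'extended' for getfacl-style access ACL entries."""
    has_mask = False
    has_named = False
    for entry in entries:
        # Each entry looks like 'user::rwx' or 'user:chirag:rwx'.
        tag, qualifier, _perms = entry.split(":")
        if tag == "mask":
            has_mask = True
        elif tag in ("user", "group") and qualifier:
            # A non-empty qualifier means a named user or named group.
            has_named = True
    return "extended" if (has_mask or has_named) else "minimal"


minimal = ["user::rwx", "group::r-x", "other::---"]
extended = ["user::rwx", "user:chirag:rwx", "group::r-x", "mask::rwx", "other::---"]
print(classify_acl(minimal))
print(classify_acl(extended))
```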
81 <para>The MDS needs to be configured to enable ACLs. Use
82 <literal>--mountfsoptions</literal> to enable ACLs when creating your
<screen>$ mkfs.lustre --fsname spfs --mountfsoptions=acl --mdt --mgs /dev/sda</screen>
<para>Alternatively, you can enable ACLs at run time by using the
<literal>-o acl</literal> option with <literal>mount</literal>:
88 <screen>$ mount -t lustre -o acl /dev/sda /mnt/mdt</screen>
89 <para>To check ACLs on the MDS:</para>
<screen>$ lctl get_param -n mdc.home-MDT0000-mdc-*.connect_flags | grep acl
acl</screen>
91 <para>To mount the client with no ACLs:</para>
92 <screen>$ mount -t lustre -o noacl ibmds2@o2ib:/home /home</screen>
93 <para>ACLs are enabled in a Lustre file system on a system-wide basis;
94 either all clients enable ACLs or none do. Activating ACLs is controlled
95 by MDS mount options <literal>acl</literal> / <literal>noacl</literal>
96 (enable/disable ACLs). Client-side mount options acl/noacl are ignored.
97 You do not need to change the client configuration, and the
98 'acl' string will not appear in the client /etc/mtab. The
99 client acl mount option is no longer needed. If a client is mounted with
100 that option, then this message appears in the MDS syslog:</para>
101 <screen>...MDS requires ACL support but client does not</screen>
102 <para>The message is harmless but indicates a configuration issue, which
103 should be corrected.</para>
104 <para>If ACLs are not enabled on the MDS, then any attempts to reference
105 an ACL on a client return an Operation not supported error.</para>
107 <section xml:id="managingSecurity.acl.examples" remap="h3">
109 <primary>Access Control List (ACL)</primary>
110 <secondary>examples</secondary>
111 </indexterm>Examples</title>
112 <para>These examples are taken directly from the POSIX paper referenced
113 above. ACLs on a Lustre file system work exactly like ACLs on any Linux
114 file system. They are manipulated with the standard tools in the
115 standard manner. Below, we create a directory and allow a specific user
<screen>[root@client lustre]# umask 027
[root@client lustre]# mkdir rain
[root@client lustre]# ls -ld rain
drwxr-x--- 2 root root 4096 Feb 20 06:50 rain
[root@client lustre]# getfacl rain
[root@client lustre]# setfacl -m user:chirag:rwx rain
[root@client lustre]# ls -ld rain
drwxrwx---+ 2 root root 4096 Feb 20 06:50 rain
[root@client lustre]# getfacl --omit-header rain
140 <section xml:id="managingSecurity.root_squash">
142 <primary>root squash</primary>
143 </indexterm>Using Root Squash</title>
144 <para>Root squash is a security feature which restricts super-user access
145 rights to a Lustre file system. Without the root squash feature enabled,
146 Lustre file system users on untrusted clients could access or modify files
147 owned by root on the file system, including deleting them. Using the root
148 squash feature restricts file access/modifications as the root user to
149 only the specified clients. Note, however, that this does
150 <emphasis>not</emphasis> prevent users on insecure clients from accessing
151 files owned by <emphasis>other</emphasis> users.</para>
152 <para>The root squash feature works by re-mapping the user ID (UID) and
153 group ID (GID) of the root user to a UID and GID specified by the system
154 administrator, via the Lustre configuration management server (MGS). The
155 root squash feature also enables the Lustre file system administrator to
specify a set of clients to which UID/GID re-mapping does not apply.
<note><para>Nodemaps (<xref linkend="lustrenodemap.title" />) are an
alternative to root squash, since they also allow root squashing on a per-client
basis. With UID maps, clients can even have a local root UID without
actually having root access to the file system itself.</para></note>
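<para>The re-mapping described above can be sketched in Python as follows. This is a simplified model for illustration, not the Lustre implementation: the function name <literal>effective_ids</literal> is hypothetical, and the exempt clients are modelled as a plain set of exact NID strings rather than NID ranges.</para>

```python
ROOT_UID = 0


def effective_ids(uid, gid, client_nid, squash_uid, squash_gid, nosquash_nids):
    """Return the (uid, gid) used for access after root squash is applied."""
    # Setting both squash IDs to 0 turns root squash off.
    if squash_uid == 0 and squash_gid == 0:
        return uid, gid
    # Clients listed as exempt keep their root identity.
    if client_nid in nosquash_nids:
        return uid, gid
    # Only the root user is re-mapped; other users are left untouched.
    if uid == ROOT_UID:
        return squash_uid, squash_gid
    return uid, gid


exempt = {"192.168.0.13@tcp"}  # hypothetical trusted client
print(effective_ids(0, 0, "192.168.0.99@tcp", 65534, 65534, exempt))
print(effective_ids(0, 0, "192.168.0.13@tcp", 65534, 65534, exempt))
```

<para>Note that a non-root user passes through unchanged on every client, which matches the caveat that root squash does not protect files owned by other users.</para>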
162 <section xml:id="managingSecurity.root_squash.config" remap="h3">
164 <primary>root squash</primary>
165 <secondary>configuring</secondary>
166 </indexterm>Configuring Root Squash</title>
167 <para>Root squash functionality is managed by two configuration
168 parameters, <literal>root_squash</literal> and
169 <literal>nosquash_nids</literal>.</para>
172 <para>The <literal>root_squash</literal> parameter specifies the UID
173 and GID with which the root user accesses the Lustre file system.
177 <para>The <literal>nosquash_nids</literal> parameter specifies the set
178 of clients to which root squash does not apply. LNet NID range
179 syntax is used for this parameter (see the NID range syntax rules
180 described in <xref linkend="managingSecurity.root_squash"/>). For
184 <screen>nosquash_nids=172.16.245.[0-255/2]@tcp</screen>
185 <para>In this example, root squash does not apply to TCP clients on subnet
186 172.16.245.0 that have an even number as the last component of their IP
189 <section xml:id="managingSecurity.root_squash.tuning">
191 <primary>root squash</primary><secondary>enabling</secondary>
192 </indexterm>Enabling and Tuning Root Squash</title>
193 <para>The default value for <literal>nosquash_nids</literal> is NULL,
194 which means that root squashing applies to all clients. Setting the root
195 squash UID and GID to 0 turns root squash off.</para>
196 <para>Root squash parameters can be set when the MDT is created
197 (<literal>mkfs.lustre --mdt</literal>). For example:</para>
<screen>mds# mkfs.lustre --reformat --fsname=testfs --mdt --mgs \
       --param "mdt.root_squash=500:501" \
       --param "mdt.nosquash_nids='0@elan1 192.168.1.[10,11]'" /dev/sda1</screen>
201 <para>Root squash parameters can also be changed on an unmounted device
202 with <literal>tunefs.lustre</literal>. For example:</para>
<screen>tunefs.lustre --param "mdt.root_squash=65534:65534"  \
--param "mdt.nosquash_nids=192.168.0.13@tcp0" /dev/sda1
206 <para>Root squash parameters can also be changed with the
207 <literal>lctl conf_param</literal> command. For example:</para>
<screen>mgs# lctl conf_param testfs.mdt.root_squash="1000:101"
mgs# lctl conf_param testfs.mdt.nosquash_nids="*@tcp"</screen>
210 <para>To retrieve the current root squash parameter settings, the
211 following <literal>lctl get_param</literal> commands can be used:</para>
<screen>mgs# lctl get_param mdt.*.root_squash
mgs# lctl get_param mdt.*.nosquash_nids</screen>
215 <para>When using the lctl conf_param command, keep in mind:</para>
218 <para><literal>lctl conf_param</literal> must be run on a live MGS
222 <para><literal>lctl conf_param</literal> causes the parameter to
223 change on all MDSs</para>
226 <para><literal>lctl conf_param</literal> is to be used once per a
231 <para>The root squash settings can also be changed temporarily with
232 <literal>lctl set_param</literal> or persistently with
233 <literal>lctl set_param -P</literal>. For example:</para>
<screen>mgs# lctl set_param mdt.testfs-MDT0000.root_squash="1:0"
mgs# lctl set_param -P mdt.testfs-MDT0000.root_squash="1:0"</screen>
236 <para>The <literal>nosquash_nids</literal> list can be cleared with:</para>
237 <screen>mgs# lctl conf_param testfs.mdt.nosquash_nids="NONE"</screen>
239 <screen>mgs# lctl conf_param testfs.mdt.nosquash_nids="clear"</screen>
240 <para>If the <literal>nosquash_nids</literal> value consists of several
241 NID ranges (e.g. <literal>0@elan</literal>, <literal>1@elan1</literal>),
the list of NID ranges must be quoted with single (') or double
(") quotation marks. List elements must be separated with a
244 space. For example:</para>
<screen>mds# mkfs.lustre ... --param "mdt.nosquash_nids='0@elan1 1@elan2'" /dev/sda1
mgs# lctl conf_param testfs.mdt.nosquash_nids="24@elan 15@elan1"</screen>
247 <para>These are examples of incorrect syntax:</para>
<screen>mds# mkfs.lustre ... --param "mdt.nosquash_nids=0@elan1 1@elan2" /dev/sda1
mgs# lctl conf_param testfs.mdt.nosquash_nids=24@elan 15@elan1</screen>
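<para>Why the unquoted form fails can be seen by looking at how a POSIX shell splits the command line into words. The following Python sketch uses the standard <literal>shlex</literal> module as a stand-in for the shell's word splitting; it is an approximation for illustration and does not invoke <literal>lctl</literal>.</para>

```python
import shlex

quoted = 'lctl conf_param testfs.mdt.nosquash_nids="24@elan 15@elan1"'
unquoted = 'lctl conf_param testfs.mdt.nosquash_nids=24@elan 15@elan1'

# With quotes, the whole NID list stays attached to the parameter.
quoted_args = shlex.split(quoted)
# Without quotes, the second NID range becomes a separate argument.
unquoted_args = shlex.split(unquoted)

print(len(quoted_args), quoted_args[-1])
print(len(unquoted_args), unquoted_args[-1])
```

<para>In the unquoted case the parameter value ends at the first space, so only the first NID range is passed as part of <literal>nosquash_nids</literal>.</para>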
250 <para>To check root squash parameters, use the lctl get_param command:
<screen>mds# lctl get_param mdt.testfs-MDT0000.root_squash
mds# lctl get_param mdt.*.nosquash_nids</screen>
255 <para>An empty nosquash_nids list is reported as NONE.</para>
258 <section xml:id="managingSecurity.root_squash.tips" remap="h3">
260 <primary>root squash</primary>
261 <secondary>tips</secondary>
262 </indexterm>Tips on Using Root Squash</title>
263 <para>Lustre configuration management limits root squash in several ways.
<para>The <literal>lctl conf_param</literal> value overwrites the
parameter's previous value. If the new value uses an incorrect
syntax, then the system continues with the old parameters, but the
previously correct value is lost on remount. Take care when
tuning root squash.</para>
274 <para><literal>mkfs.lustre</literal> and
275 <literal>tunefs.lustre</literal> do not perform parameter syntax
276 checking. If the root squash parameters are incorrect, they are
277 ignored on mount and the default values are used instead.</para>
<para>Root squash parameters are parsed with rigorous syntax checking.
The <literal>root_squash</literal> parameter should be specified as
<literal>&lt;decnum&gt;:&lt;decnum&gt;</literal>. The
<literal>nosquash_nids</literal> parameter should follow LNet NID
range list syntax.</para>
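<para>Because a malformed value can silently replace a good one, it can help to validate the value before passing it to the Lustre tools. The following Python sketch checks that a <literal>root_squash</literal> value consists of two decimal numbers separated by a colon. The function name <literal>parse_root_squash</literal> is hypothetical, and the check is an approximation of the documented form, not the parser used by Lustre.</para>

```python
import re

# Two runs of decimal digits separated by a single colon.
_ROOT_SQUASH = re.compile(r"([0-9]+):([0-9]+)")


def parse_root_squash(value):
    """Return (uid, gid) for a 'UID:GID' string, or raise ValueError."""
    match = _ROOT_SQUASH.fullmatch(value)
    if match is None:
        raise ValueError("root_squash must look like UID:GID, got %r" % value)
    return int(match.group(1)), int(match.group(2))


print(parse_root_squash("500:501"))
```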
287 <para>LNet NID range syntax:</para>
<screen>&lt;nidlist&gt;       :== &lt;nidrange&gt; [ ' ' &lt;nidrange&gt; ]
&lt;nidrange&gt;      :== &lt;addrrange&gt; '@' &lt;net&gt;
&lt;addrrange&gt;     :== '*' |
                    &lt;ipaddr_range&gt; |
                    &lt;numaddr_range&gt;
&lt;ipaddr_range&gt;  :==
        &lt;numaddr_range&gt;.&lt;numaddr_range&gt;.&lt;numaddr_range&gt;.&lt;numaddr_range&gt;
&lt;numaddr_range&gt; :== &lt;number&gt; |
                    &lt;expr_list&gt;
&lt;expr_list&gt;     :== '[' &lt;range_expr&gt; [ ',' &lt;range_expr&gt;] ']'
&lt;range_expr&gt;    :== &lt;number&gt; |
                    &lt;number&gt; '-' &lt;number&gt; |
                    &lt;number&gt; '-' &lt;number&gt; '/' &lt;number&gt;
&lt;net&gt;           :== &lt;netname&gt; | &lt;netname&gt;&lt;number&gt;
&lt;netname&gt;       :== "lo" | "tcp" | "o2ib"
                    | "ra" | "elan"
&lt;number&gt;        :== &lt;nonnegative decimal&gt; | &lt;hexadecimal&gt;</screen>
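<para>To make the grammar concrete, the following Python sketch tests whether a single NID falls inside a NID range list. It is a minimal illustration under stated assumptions, not the LNet parser: all function names are hypothetical, only decimal numbers are handled, and network names are compared as exact strings.</para>

```python
def _expr_matches(expr, value):
    """Match one range expression: 'N', 'N-M', or 'N-M/STEP'."""
    span, _, step = expr.partition("/")
    low, sep, high = span.partition("-")
    low = int(low)
    high = int(high) if sep else low
    step = int(step) if step else 1
    return value in range(low, high + 1, step)


def _component_matches(pattern, value):
    """Match one address component: a number or a bracketed list."""
    if pattern.startswith("[") and pattern.endswith("]"):
        return any(_expr_matches(e, value) for e in pattern[1:-1].split(","))
    return int(pattern) == value


def nid_matches(nidrange, nid):
    """Return True if `nid` (address@net) falls inside `nidrange`."""
    addr_pat, _, net_pat = nidrange.partition("@")
    addr, _, net = nid.partition("@")
    if net_pat != net:
        return False
    if addr_pat == "*":
        return True
    patterns = addr_pat.split(".")
    parts = addr.split(".")
    if len(patterns) != len(parts):
        return False
    return all(_component_matches(p, int(v)) for p, v in zip(patterns, parts))


def nidlist_matches(nidlist, nid):
    """A NID list is a space-separated sequence of NID ranges."""
    return any(nid_matches(r, nid) for r in nidlist.split())


print(nid_matches("172.16.245.[0-255/2]@tcp", "172.16.245.4@tcp"))
print(nid_matches("172.16.245.[0-255/2]@tcp", "172.16.245.5@tcp"))
```

<para>With the range <literal>172.16.245.[0-255/2]@tcp</literal>, the expression starts at 0 and advances in steps of 2, so only addresses whose last component is even are matched.</para>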
<para>For networks using numeric addresses (e.g. elan), the address
range must be specified in the
<literal>&lt;numaddr_range&gt;</literal> syntax. For networks using
IP addresses, the address range must be specified in the
<literal>&lt;ipaddr_range&gt;</literal> syntax. For example, if elan is using
numeric addresses, <literal>1.2.3.4@elan</literal> is incorrect.
316 <section xml:id="managingSecurity.isolation">
317 <title><indexterm><primary>Isolation</primary></indexterm>
318 Isolating Clients to a Sub-directory Tree</title>
<para>Isolation is the Lustre implementation of the generic concept of
multi-tenancy, which aims at providing separate namespaces from a single
file system. Lustre Isolation lets different populations of users share
the same file system with separation that goes beyond normal Unix
permissions/ACLs, even when users on the clients may have root access.
Those tenants share the same file system, but they are isolated from each
other: they cannot access or even see each other’s files, and are not
aware that they are sharing common file system resources.</para>
327 <para>Lustre Isolation leverages the Fileset feature
328 (<xref linkend="SystemConfigurationUtilities.fileset" />)
329 to mount only a subdirectory of the filesystem rather than the root
In order to achieve isolation, the subdirectory mount, which presents to
tenants only their own fileset, has to be imposed on the clients. To that
end, we make use of the nodemap feature
(<xref linkend="lustrenodemap.title" />). We group all clients used by a
tenant under a common nodemap entry, and we assign to this nodemap entry
the fileset to which the tenant is restricted.</para>
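<para>The relationship between clients, nodemap entries, and filesets can be modelled with a small Python sketch. Everything in it is illustrative: the table, the NIDs, the second tenant, and the function name <literal>presented_root</literal> are assumptions, and the fallback for clients that match no entry is a choice made for this sketch rather than a description of Lustre behavior.</para>

```python
# Hypothetical nodemap table: names, NIDs, and the second tenant are
# illustrative assumptions, not values read from a real MGS.
NODEMAPS = {
    "tenant1": {"nids": {"192.168.1.10@tcp", "192.168.1.11@tcp"}, "fileset": "/dir1"},
    "tenant2": {"nids": {"192.168.2.20@tcp"}, "fileset": "/dir2"},
}


def presented_root(client_nid, nodemaps):
    """Return (nodemap name, directory the client sees as its root)."""
    for name, entry in nodemaps.items():
        if client_nid in entry["nids"]:
            # An empty fileset means no restriction: the client sees "/".
            return name, entry["fileset"] or "/"
    # Clients that match no entry are treated here as unrestricted;
    # this fallback is an assumption of the sketch.
    return None, "/"


print(presented_root("192.168.1.10@tcp", NODEMAPS))
print(presented_root("192.168.2.20@tcp", NODEMAPS))
```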
337 <section xml:id="managingSecurity.isolation.clientid" remap="h3">
338 <title><indexterm><primary>Isolation</primary><secondary>
339 client identification</secondary></indexterm>Identifying Clients</title>
340 <para>Enforcing multi-tenancy on Lustre relies on the ability to properly
341 identify the client nodes used by a tenant, and trust those identities.
342 This can be achieved by having physical hardware and/or network
343 security, so that client nodes have well-known NIDs. It is also possible
344 to make use of strong authentication with Kerberos or Shared-Secret Key
345 (see <xref linkend="lustressk" />).
346 Kerberos prevents NID spoofing, as every client needs its own
347 credentials, based on its NID, in order to connect to the servers.
348 Shared-Secret Key also prevents tenant impersonation, because keys
349 can be linked to a specific nodemap. See
350 <xref linkend="ssknodemaprole" /> for detailed explanations.
353 <section xml:id="managingSecurity.isolation.configuring" remap="h3">
354 <title><indexterm><primary>Isolation</primary><secondary>
355 configuring</secondary></indexterm>Configuring Isolation</title>
356 <para>Isolation on Lustre can be achieved by setting the
357 <literal>fileset</literal> parameter on a nodemap entry. All clients
358 belonging to this nodemap entry will automatically mount this fileset
359 instead of the root directory. For example:</para>
360 <screen>mgs# lctl nodemap_set_fileset --name tenant1 --fileset '/dir1'</screen>
361 <para>So all clients matching the <literal>tenant1</literal> nodemap will
362 be automatically presented the fileset <literal>/dir1</literal> when
363 mounting. This means these clients are doing an implicit subdirectory
364 mount on the subdirectory <literal>/dir1</literal>.
If the subdirectory defined as the fileset does not exist on the file system,
any client belonging to the nodemap is prevented from mounting
373 <para>To delete the fileset parameter, just set it to an empty string:
375 <screen>mgs# lctl nodemap_set_fileset --name tenant1 --fileset ''</screen>
377 <section xml:id="managingSecurity.isolation.permanent" remap="h3">
378 <title><indexterm><primary>Isolation</primary><secondary>
379 making permanent</secondary></indexterm>Making Isolation Permanent
381 <para>In order to make isolation permanent, the fileset parameter on the
382 nodemap has to be set with <literal>lctl set_param</literal> with the
383 <literal>-P</literal> option.</para>
<screen>mgs# lctl set_param nodemap.tenant1.fileset=/dir1
mgs# lctl set_param -P nodemap.tenant1.fileset=/dir1</screen>
386 <para>This way the fileset parameter will be stored in the Lustre config
387 logs, letting the servers retrieve the information after a restart.
391 <section xml:id="managingSecurity.sepol" condition='l2D'>
392 <title><indexterm><primary>selinux policy check</primary></indexterm>
393 Checking SELinux Policy Enforced by Lustre Clients</title>
394 <para>SELinux provides a mechanism in Linux for supporting Mandatory Access
395 Control (MAC) policies. When a MAC policy is enforced, the operating
396 system’s (OS) kernel defines application rights, firewalling applications
397 from compromising the entire system. Regular users do not have the ability to
398 override the policy.</para>
<para>One purpose of SELinux is to protect the
<emphasis role="bold">OS</emphasis> from privilege escalation. To that
end, SELinux defines confined and unconfined domains for processes and
users. Each process, user, and file is assigned a security context, and
rules define the operations that processes and users may perform on files.
405 <para>Another purpose of SELinux can be to protect
406 <emphasis role="bold">data</emphasis> sensitivity, thanks to Multi-Level
407 Security (MLS). MLS works on top of SELinux, by defining the concept of
408 security levels in addition to domains. Each process, user and file is
409 assigned a security level, and the model states that processes and users
410 can read the same or lower security level, but can only write to their own
411 or higher security level.
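<para>The read and write rules stated above can be written as two small Python predicates. This is a deliberately simplified model: security levels are represented as plain integers, which is an assumption of the sketch, and the function names are hypothetical.</para>

```python
def can_read(subject_level, object_level):
    """A subject may read objects at its own level or below."""
    return subject_level >= object_level


def can_write(subject_level, object_level):
    """A subject may write objects at its own level or above."""
    return object_level >= subject_level


# A subject at level 2 reading and writing objects at levels 1, 2 and 3.
for level in (1, 2, 3):
    print(level, can_read(2, level), can_write(2, level))
```

<para>The two rules overlap only at the subject's own level, which is the one level it can both read and write.</para>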
413 <para>From a file system perspective, the security context of files must be
414 stored permanently. Lustre makes use of the
415 <literal>security.selinux</literal> extended attributes on files to hold
416 this information. Lustre supports SELinux on the client side. All you have
417 to do to have MAC and MLS on Lustre is to enforce the appropriate SELinux
418 policy (as provided by the Linux distribution) on all Lustre clients. No
419 SELinux is required on Lustre servers.
<para>Because Lustre is a distributed file system, using MLS requires
that data is only ever accessed by nodes on which the SELinux MLS policy
is properly enforced. Otherwise, data is
not protected. This means Lustre has to check that SELinux is properly
enforced on the client side, with the right, unaltered policy. If SELinux
is not enforced as expected on a client, the server denies its access to
429 <section xml:id="managingSecurity.sepol.determining" remap="h3">
430 <title><indexterm><primary>selinux policy check</primary><secondary>
431 determining</secondary></indexterm>Determining SELinux Policy Info
433 <para>A string that represents the SELinux Status info will be used by
434 servers as a reference, to check if clients are enforcing SELinux
435 properly. This reference string can be obtained on a client node known
436 to enforce the right SELinux policy, by calling the
437 <literal>l_getsepol</literal> command line utility:</para>
<screen>client# l_getsepol
SELinux status info: 1:mls:31:40afb76d077c441b69af58cccaaa2ca63641ed6e21b0a887dc21a684f508b78f</screen>
440 <para>The string describing the SELinux policy has the following
442 <para><literal>mode:name:version:hash</literal></para>
446 <para><literal>mode</literal> is a digit telling if SELinux is in
447 Permissive mode (0) or Enforcing mode (1)</para>
450 <para><literal>name</literal> is the name of the SELinux policy
454 <para><literal>version</literal> is the version of the SELinux
<para><literal>hash</literal> is the computed hash of the binary
representation of the policy, as exported in
/etc/selinux/<literal>name</literal>/policy/policy.<literal>version</literal></para>
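<para>The four fields of the status string can be separated with a few lines of Python, as in the sketch below. The names <literal>SepolInfo</literal>, <literal>parse_sepol</literal>, and <literal>policy_path</literal> are hypothetical helpers introduced for illustration; they are not part of <literal>l_getsepol</literal> or of Lustre.</para>

```python
from collections import namedtuple

SepolInfo = namedtuple("SepolInfo", ["enforcing", "name", "version", "hash"])


def parse_sepol(value):
    """Parse a 'mode:name:version:hash' SELinux status string."""
    fields = value.split(":")
    if len(fields) != 4:
        raise ValueError("expected mode:name:version:hash")
    mode, name, version, digest = fields
    if mode not in ("0", "1"):
        raise ValueError("mode must be 0 (Permissive) or 1 (Enforcing)")
    return SepolInfo(mode == "1", name, int(version), digest)


def policy_path(info):
    """Location of the binary policy the hash is computed from."""
    return "/etc/selinux/{}/policy/policy.{}".format(info.name, info.version)


info = parse_sepol(
    "1:mls:31:40afb76d077c441b69af58cccaaa2ca63641ed6e21b0a887dc21a684f508b78f"
)
print(info.enforcing, info.name, info.version)
print(policy_path(info))
```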
465 <section xml:id="managingSecurity.sepol.configuring" remap="h3">
466 <title><indexterm><primary>selinux policy check</primary><secondary>
467 enforcing</secondary></indexterm>Enforcing SELinux Policy Check</title>
468 <para>SELinux policy check can be enforced by setting the
469 <literal>sepol</literal> parameter on a nodemap entry. All clients
470 belonging to this nodemap entry must enforce the SELinux policy
471 described by this parameter, otherwise they are denied access to the
472 Lustre file system. For example:</para>
<screen>mgs# lctl nodemap_set_sepol --name restricted \
     --sepol '1:mls:31:40afb76d077c441b69af58cccaaa2ca63641ed6e21b0a887dc21a684f508b78f'</screen>
<para>All clients matching the <literal>restricted</literal> nodemap
must enforce the SELinux policy whose description matches
<literal>1:mls:31:40afb76d077c441b69af58cccaaa2ca63641ed6e21b0a887dc21a684f508b78f</literal>.
If not, they will get Permission Denied when trying to mount or access
files on the Lustre file system.</para>
480 <para>To delete the <literal>sepol</literal> parameter, just set it to an
482 <screen>mgs# lctl nodemap_set_sepol --name restricted --sepol ''</screen>
483 <para>See <xref linkend="lustrenodemap.title" /> for more details about
484 the Nodemap feature.</para>
486 <section xml:id="managingSecurity.sepol.permanent" remap="h3">
487 <title><indexterm><primary>selinux policy check</primary><secondary>
488 making permanent</secondary></indexterm>Making SELinux Policy Check
490 <para>In order to make SELinux Policy check permanent, the sepol parameter
491 on the nodemap has to be set with <literal>lctl set_param</literal> with
492 the <literal>-P</literal> option.</para>
<screen>mgs# lctl set_param nodemap.restricted.sepol=1:mls:31:40afb76d077c441b69af58cccaaa2ca63641ed6e21b0a887dc21a684f508b78f
mgs# lctl set_param -P nodemap.restricted.sepol=1:mls:31:40afb76d077c441b69af58cccaaa2ca63641ed6e21b0a887dc21a684f508b78f</screen>
495 <para>This way the sepol parameter will be stored in the Lustre config
496 logs, letting the servers retrieve the information after a restart.
499 <section xml:id="managingSecurity.sepol.client" remap="h3">
500 <title><indexterm><primary>selinux policy check</primary><secondary>
501 sending client</secondary></indexterm>Sending SELinux Status Info from
<para>In order for Lustre clients to send their SELinux status
information when SELinux is enabled locally, the
<literal>send_sepol</literal> parameter of the ptlrpc kernel module has to be
set to a non-zero value. <literal>send_sepol</literal> accepts various
510 <para>0: do not send SELinux policy info;</para>
513 <para>-1: fetch SELinux policy info for every request;</para>
516 <para>N > 0: only fetch SELinux policy info every N seconds. Use
517 <literal>N = 2^31-1</literal> to have SELinux policy info
518 fetched only at mount time.</para>
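<para>The effect of the three kinds of <literal>send_sepol</literal> value can be summarized with a small decision function. The Python sketch below is only a model of the behavior described in the list above, not client code: the function name <literal>should_fetch</literal> and the way elapsed time is passed in are assumptions, and values that the list does not describe are rejected by choice of this sketch.</para>

```python
MOUNT_ONLY = 2**31 - 1  # large enough that a refresh never comes due


def should_fetch(send_sepol, seconds_since_last_fetch):
    """Decide whether SELinux policy info is fetched for this request."""
    if send_sepol == 0:
        # Policy info is never sent.
        return False
    if send_sepol == -1:
        # Policy info is fetched for every request.
        return True
    if send_sepol > 0:
        # Policy info is refreshed once the interval has elapsed.
        return seconds_since_last_fetch >= send_sepol
    raise ValueError("value not covered by this sketch: %d" % send_sepol)


print(should_fetch(0, 100))
print(should_fetch(-1, 0))
print(should_fetch(60, 30))
print(should_fetch(60, 61))
```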
<para>Clients that are part of a nodemap on which
<literal>sepol</literal> is defined must send SELinux status info,
and the SELinux policy they enforce must match the representation
stored in the nodemap. Otherwise they will be denied access to the
Lustre file system.</para>
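<para>The access rule in the preceding paragraph can be expressed as a short Python predicate. This is a simplified model rather than server code: the function name <literal>client_allowed</literal> is hypothetical, and comparing the two status strings for exact equality is an assumption of the sketch.</para>

```python
def client_allowed(nodemap_sepol, client_sepol):
    """Return True if a client passes the SELinux policy check."""
    if not nodemap_sepol:
        # No sepol defined on the nodemap: nothing to check.
        return True
    if client_sepol is None:
        # The client did not send any SELinux status info.
        return False
    # The client's status string must match the stored reference.
    return client_sepol == nodemap_sepol


reference = "1:mls:31:40afb76d077c441b69af58cccaaa2ca63641ed6e21b0a887dc21a684f508b78f"
print(client_allowed(reference, reference))
print(client_allowed(reference, None))
```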