Whamcloud - gitweb
fs/lustre-release.git
John L. Hammond [Fri, 22 Jan 2021 16:56:06 +0000 (10:56 -0600)]
LU-14359 hsm: support a flatter HSM archive format

Add versioning (v1 and v2) to the HSM archive format (directory
layout):
  v1: (oid & 0xffff)/-/-/-/-/-/FID
  v2: ((oid ^ seq) & 0xffff)/FID

v1 is the original layout and the default. v2 is the new layout which
should be selected for new installs.
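
As a rough illustration of the v2 layout above, the following userspace
sketch computes the bucket directory and the full archive path for a FID.
The struct and function names are hypothetical, and the "%04x" bucket
formatting and the hex "seq:oid:ver" rendering of the FID are assumptions
of this sketch, not taken from the patch.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative FID triple (sequence, object id, version); hypothetical type. */
struct demo_fid {
	uint64_t seq;
	uint32_t oid;
	uint32_t ver;
};

/* v2 bucket: ((oid ^ seq) & 0xffff), as given in the commit message. */
static unsigned int hsm_v2_bucket(const struct demo_fid *fid)
{
	return (unsigned int)((fid->oid ^ fid->seq) & 0xffff);
}

/*
 * Build "<root>/<bucket>/<FID>" into buf. The "%04x" bucket format and
 * the FID rendering are assumptions. Returns the snprintf() result.
 */
static int hsm_v2_path(char *buf, size_t len, const char *root,
		       const struct demo_fid *fid)
{
	return snprintf(buf, len, "%s/%04x/0x%llx:0x%x:0x%x", root,
			hsm_v2_bucket(fid),
			(unsigned long long)fid->seq, fid->oid, fid->ver);
}
```

Under these assumptions, a FID with seq 0x200000401, oid 0x1 and ver 0
under the root /archive lands in bucket 0400, one directory level below
the archive root instead of the six levels used by v1.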

Add an option --archive-format to select the archive format.

Add YAML configuration file support to lhsmtool_posix with properties
archive_format and archive_path. Add an option --config to set the
config file.

Adapt sanity-hsm and test-framework to allow testing of both archive
formats.

Lustre-change: https://review.whamcloud.com/41312
Lustre-commit: 65062463199fa76b6313e9452e3ab9590cbedaa2

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I6d6bd0c8817a491848b554fa76078d876549cc1f
Reviewed-on: https://review.whamcloud.com/43490
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Andreas Dilger [Wed, 19 May 2021 02:36:27 +0000 (20:36 -0600)]
RM-620 build: New tag 2.14.0-ddn4

New tag 2.14.0-ddn4

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ife64820c72a134ce5ae749d5c61cbc8511b3a9de

Bobi Jam [Tue, 9 Mar 2021 09:15:20 +0000 (17:15 +0800)]
LU-14502 lov: fault page update cp_lov_index

In fault IO, vvp_io_fault_start() could find an existing cl_page
associated with the vmpage covering the fault index, and the page
may still refer to another mirror of an old IO.

This patch updates the fault page's cp_lov_index in lov_io_fault_start().

Lustre-commit: e9bac5fa455eab5371cdfb141b73a3beb0cc8d9c
Lustre-change: https://review.whamcloud.com/41954

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I50639700159a76061437fd2f1a09dadf25cfd33f
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43454
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Jian Yu [Wed, 5 May 2021 17:36:09 +0000 (10:36 -0700)]
LU-14604 kernel: kernel update RHEL8.3 [4.18.0-240.22.1.el8_3]

Update RHEL8.3 kernel to 4.18.0-240.22.1.el8_3.

Test-Parameters: trivial \
clientdistro=el8.3 serverdistro=el8.3 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.3 serverdistro=el8.3 testlist=sanity

Change-Id: I1a3152d95822a74e05f9b44f590a6cdb1f8b02b6
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43547
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Jian Yu [Mon, 10 May 2021 21:18:49 +0000 (14:18 -0700)]
LU-14671 kernel: kernel update SLES15 SP2 [5.3.18-24.61.1]

Update SLES15 SP2 kernel to 5.3.18-24.61.1 for Lustre client.

Test-Parameters: trivial \
env=SANITY_EXCEPT="100 130 136 817" \
clientdistro=sles15sp2 serverdistro=el7.9 \
testlist=sanity

Change-Id: Ie0aab7cc7200796ed8e4d75862ceaef020943c08
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43631
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Jian Yu [Mon, 10 May 2021 19:38:11 +0000 (12:38 -0700)]
LU-14670 kernel: kernel update RHEL7.9 [3.10.0-1160.25.1.el7]

Update RHEL7.9 kernel to 3.10.0-1160.25.1.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: Ic846d648c45476cc4886ce86577605bf3e66d935
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43628
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Jian Yu [Mon, 10 May 2021 21:59:02 +0000 (14:59 -0700)]
LU-14672 kernel: kernel update SLES12 SP5 [4.12.14-122.66.2]

Update SLES12 SP5 kernel to 4.12.14-122.66.2 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 430c 817" testlist=sanity

Change-Id: Ib2bf4795ccb21dbd0bb9202228ff32d73a203eee
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43634
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Jian Yu [Tue, 4 May 2021 17:41:23 +0000 (10:41 -0700)]
EX-2659 tests: add sanity-lipe.sh to test LiPE utilities

This patch adds sanity-lipe.sh test script to test
lipe_find and lipe_scan utilities for LiPE.

Lustre-change: https://review.whamcloud.com/42151
Lustre-commit: 5d67c987c8d2dc393b1e0952fe01d33978efdea0

Test-Parameters: trivial testlist=sanity-lipe

Change-Id: I69d82f7e3675becb4e38915ff363e853d0accb77
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43536
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Minh Diep [Fri, 14 May 2021 16:56:06 +0000 (09:56 -0700)]
EX-3176 build: remove extra_version from kernel rpm

This will allow us to use the same kernel for both
ES5 and ES6

Change-Id: I7e49f97b28d6e74ab6fe79f0438900c3ebd665df
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43708
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Tue, 11 May 2021 07:11:36 +0000 (00:11 -0700)]
LU-14055 lmv: reduce struct lmv_obd size

The lmv_obd struct contains lmv_mdt_descs which is large enough
to reference 512 * 512 = 262144 targets, but there can be only
65536 OSTs or MDTs in a single filesystem today.

Shrink the allocation size to match the current limits, reducing
the size of obd_device.u since this is the largest union member.

This reduces the size of each obd_device from 6752 to 4568 bytes.

Lustre-change: https://review.whamcloud.com/41162
Lustre-commit: e11deeb1e6d114608eac4ee998d4cea22e30b0f5

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I752b021bdb5d02e3ead3bb266121be5dbf3ebbe5
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43651
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Mr NeilBrown [Tue, 11 May 2021 07:05:12 +0000 (00:05 -0700)]
LU-13783 libcfs: support absence of account_page_dirtied

Some kernels export neither account_page_dirtied nor
kallsyms_lookup_name.
For these kernels we need to use __set_page_dirty() and suffer the
cost of dropping and reclaiming the page-tree lock.

Lustre-change: https://review.whamcloud.com/40827
Lustre-commit: 6be4b3118c16039cff52e9a781b7d1852489a969

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I69d934480832f3909d3ec103f11e1d62489d70d7
Reviewed-on: https://review.whamcloud.com/43650
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Jian Yu [Tue, 11 May 2021 07:02:23 +0000 (00:02 -0700)]
LU-13783 libcfs: use lsmcontext in security_release_secctx

Kernel linux-hwe-5.8 (5.8.0-22.23~20.04.1) introduces
struct lsmcontext and uses it in security_release_secctx(),
which reduces the arguments from 2 to 1.

Lustre-change: https://review.whamcloud.com/43284
Lustre-commit: c9e644add7091299d030a96e46384912ac2bef50

Change-Id: I37e185493001d335b40ea0a6102db593cb18beb3
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43649
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Jian Yu [Tue, 11 May 2021 06:51:33 +0000 (23:51 -0700)]
LU-13783 libcfs: add cfs_kallsyms_lookup_name()

The inline kallsyms_lookup_name() added by
commit d7249d9d70a caused the following failures:

libcfs/include/libcfs/linux/linux-misc.h:150:21:
error: conflicting types for ‘kallsyms_lookup_name’
  150 | static inline void *kallsyms_lookup_name(char *func)
      |                     ^~~~~~~~~~~~~~~~~~~~

include/linux/kallsyms.h:76:15:
note: previous declaration of ‘kallsyms_lookup_name’ was here
   76 | unsigned long kallsyms_lookup_name(const char *name);
      |               ^~~~~~~~~~~~~~~~~~~~

This patch removes the inline kallsyms_lookup_name() definition
from linux-misc.h and adds cfs_kallsyms_lookup_name(), which wraps
kallsyms_lookup_name() if it is exported and returns NULL if
kallsyms_lookup_name() is not exported.
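
The wrapper pattern described here can be sketched roughly as follows.
This is an illustration only: the HAVE_KALLSYMS_LOOKUP_NAME macro stands
for a configure-time check and is an assumed name, and the body is a
userspace approximation rather than the code from the patch.

```c
#include <stddef.h>

/*
 * Assumed configure-time macro: defined only when the running kernel
 * exports kallsyms_lookup_name(). The name is hypothetical here.
 */
#ifdef HAVE_KALLSYMS_LOOKUP_NAME
unsigned long kallsyms_lookup_name(const char *name);
#endif

/*
 * Wrapper with its own name, so it can never clash with the kernel's
 * declaration of kallsyms_lookup_name() in <linux/kallsyms.h>.
 */
static inline void *cfs_kallsyms_lookup_name(const char *name)
{
#ifdef HAVE_KALLSYMS_LOOKUP_NAME
	return (void *)kallsyms_lookup_name(name);
#else
	(void)name;	/* symbol lookup unavailable on this kernel */
	return NULL;
#endif
}
```

Callers then have to treat a NULL result as "symbol not reachable" and
fall back to another code path, which is what the related LU-13783
patches do for apply_workqueue_attrs() and account_page_dirtied.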

Lustre-change: https://review.whamcloud.com/43296
Lustre-commit: 783002035ae9612b5b0aa80f2342a2ee9e81c374

Fixes: d7249d9d70a ("LU-13783 libcfs: provide fallback kallsyms_lookup_name()")
Change-Id: I4b2d4499948a8586b48db68484491ec76c3a609d
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43648
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mr NeilBrown [Tue, 11 May 2021 06:41:37 +0000 (23:41 -0700)]
LU-13783 libcfs: provide fallback kallsyms_lookup_name()

Since Linux 5.7, kallsyms_lookup_name() is no longer exported, so we
cannot rely on it.

So test for this, and when not available provide a fallback which just
returns NULL.

As this was the only way to access apply_workqueue_attrs() in recent
kernels, we need to cope with the absence of that function.

Lustre-change: https://review.whamcloud.com/40826
Lustre-commit: d7249d9d70ac0fcfa665ece78634b495bc9a22cd

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I09cc00047ec163a9395c5acd415505a8586e4e99
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43647
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mr NeilBrown [Tue, 11 May 2021 06:38:41 +0000 (23:38 -0700)]
LU-13783 libcfs: don't depend on sysctl support for debugfs

Since Linux v5.8-rc1~55^2~6 sysctl support routines like
proc_dointvec() expect a pointer to kernel-space, not userspace.

So stop using these functions for debugfs files, and instead
provide bespoke functions.

Lustre-change: https://review.whamcloud.com/40832
Lustre-commit: d707b390aec5e95a1aec9910fb3c8248c231cbfb

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I340a748bbfbd066054a73299ce32698aa39a0e2d
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-on: https://review.whamcloud.com/43646
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Mr NeilBrown [Tue, 11 May 2021 06:35:09 +0000 (23:35 -0700)]
LU-13783 libcfs: support __vmalloc with only 2 args.

Since v5.8-rc1~201^2~19 Commit 88dca4ca5a93 ("mm: remove the pgprot
argument to __vmalloc") __vmalloc only takes 2 arguments.

So introduce __ll_vmalloc which takes 2 args, and calls
__vmalloc with correct number of args.

Lustre-change: https://review.whamcloud.com/40328
Lustre-commit: 2a32eaa35dd7b96bb29f6a17991f48fe07fa833e

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I2c89512a12e28b27544a891620e448a9b752b089
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-on: https://review.whamcloud.com/43645
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Mr NeilBrown [Tue, 11 May 2021 06:27:46 +0000 (23:27 -0700)]
LU-13783 libcfs: support removal of kernel_setsockopt()

Linux 5.8 removes kernel_setsockopt() and kernel_getsockopt(), and
provides some helper functions for some accesses that are
not trivial.

This patch adds those helpers to libcfs when they are not available,
and changes (nearly) all calls to kernel_[gs]etsockopt() to
use either direct access or a helper call.

->keepalive() is not available before v4.11-rc1~94^2~43^2~14
and there is no helper function, so for SO_KEEPALIVE we
need to have #ifdef code in the C file.

TCP_BACKOFF* settings are not converted as they are not available in
any upstream kernel, so no conversion is possible.

Also include some minor style fixes and change lnet_sock_setbuf() and
lnet_sock_getbuf() to be 'void' functions.

Lustre-change: https://review.whamcloud.com/39259
Lustre-commit: 99d9638d6c074b48f1c21c5c94d6dfe347eed3ee

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I539cf8d20555ddb3565fa75130fdd3acf709c545
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43644
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Mr NeilBrown [Tue, 11 May 2021 06:13:38 +0000 (23:13 -0700)]
LU-13783 libcfs: switch from ->mmap_sem to mmap_lock()

In Linux 5.8, ->mmap_sem is gone and the preferred interface
for locking the mmap is the suite of mmap*lock() functions.

So provide those functions when not available, and use them
as needed in Lustre.

Lustre-change: https://review.whamcloud.com/40288
Lustre-commit: 5309e108582c692f3b60705818fddc4a3b3b1345

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I4ce3959f9e93eae10a7b7db03e2b0a1525723138
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43643
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Shaun Tancheff [Tue, 11 May 2021 04:22:12 +0000 (21:22 -0700)]
LU-13344 libcfs: Abstract proc_fs with proc_ops

Linux 5.6 introduces proc_ops with v5.5-8862-gd56c0d45f0e2
proc: decouple proc from VFS with "struct proc_ops"

Map proc_ops and its members to file_operations and
the appropriate members for older kernels.

One remaining 'PROC_OWNER()' macro is left to deal with
proc_ops being unable to sensibly map the owner member.

Lustre-change: https://review.whamcloud.com/37873
Lustre-commit: 13cd0f9f667c6e138a8cb235d4920f8b749cb154

Test-Parameters: trivial
HPE-bug-id: LUS-8589
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I3d8940a91b331c4f6bb31a9432194cc082c9cecd
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43642
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Shaun Tancheff [Tue, 11 May 2021 04:16:20 +0000 (21:16 -0700)]
LU-13344 all: Separate debugfs and procfs handling

Linux 5.6 introduces proc_ops with v5.5-8862-gd56c0d45f0e2
proc: decouple proc from VFS with "struct proc_ops"

Separate debugfs usage and procfs usage to prepare for the divergence
of debugfs using file_operations and procfs using proc_ops

Lustre-change: https://review.whamcloud.com/37834
Lustre-commit: 76626d6c52b19b5cca04007c4b1656cc52a487c1

HPE-bug-id: LUS-8589
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I1746e563b55a9e89f90ac01843c304fe6b690d8b
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43641
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Shaun Tancheff [Tue, 11 May 2021 03:59:41 +0000 (20:59 -0700)]
LU-13485 libcfs: FIELD_SIZEOF macro removed

Linux v4.15-rc2-5-g4229a470175b introduced sizeof_field() macro
Linux v5.5-rc4-1-g1f07dcc459d5 removed FIELD_SIZEOF() macro

Provide a sizeof_field() macro in terms of FIELD_SIZEOF()
when sizeof_field() is not provided.
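
A minimal sketch of such a compatibility definition is shown below. The
FIELD_SIZEOF() fallback is included only so that the fragment compiles
outside the kernel, and struct demo is a hypothetical type used for
illustration.

```c
#include <stddef.h>

/* Stand-in for the older kernel macro, so the sketch builds in userspace. */
#ifndef FIELD_SIZEOF
#define FIELD_SIZEOF(t, f) (sizeof(((t *)0)->f))
#endif

/* Provide sizeof_field() in terms of FIELD_SIZEOF() when it is missing. */
#ifndef sizeof_field
#define sizeof_field(type, member) FIELD_SIZEOF(type, member)
#endif

/* Hypothetical structure used only to exercise the macro. */
struct demo {
	char name[16];
	int id;
};
```

With this in place, callers can use sizeof_field(struct demo, name)
everywhere, regardless of which of the two macros the kernel provides.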

Lustre-change: https://review.whamcloud.com/39710
Lustre-commit: 03b7befcc0a9308cbac91370046f6c00e5cf1005

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I48ca9abb931d58919d788199e5089984c9e854dd
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43640
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mr NeilBrown [Tue, 11 May 2021 03:55:32 +0000 (20:55 -0700)]
LU-6142 lustre: change super/file/inode operations to const

All 'struct file_operations', 'struct inode_operations', 'struct
export_operations' and 'struct super_operations' are changed to
'const'.  This potentially allows them to be placed in read-only
memory, and ensures they are never changed.

Lustre-change: https://review.whamcloud.com/39394
Lustre-commit: 140b9e6d736a8c11d660094fc11ee61a89264b13

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I8b236f0248eca11f91f11da02fe18be3f6d2e17c
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43639
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mr NeilBrown [Tue, 11 May 2021 03:47:57 +0000 (20:47 -0700)]
LU-6142 lustre: make various 'struct file_operations' static

These 'struct file_operations' are only used locally, so make them
static.
Except lprocfs_evict_client_fops() which isn't used at all and doesn't
exist, so discard the declaration.

Lustre-change: https://review.whamcloud.com/39741
Lustre-commit: 950200a21fb0636c53eefc9b6337bf1d10ad121e

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ib6c51683c1e765db202b3f72d2accebe17191303
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43638
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Sat, 7 Nov 2020 07:53:28 +0000 (00:53 -0700)]
LU-930 misc: limit CDEBUG console message frequency

Some CDEBUG() messages have variable message levels, but if printed
to the console they are not rate limited like CWARN() and CERROR():

 server_bulk_callback()) event type 5, status -110
 server_bulk_callback()) event type 5, status -110
 server_bulk_callback()) event type 5, status -110
 :

Instead, use CDEBUG_LIMIT() for those messages to limit them.
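
The general idea of limiting a repeated console message can be sketched
as below. This is a simplified fixed-interval limiter written for
illustration; the structure and function names are hypothetical and it is
not the libcfs implementation behind CDEBUG_LIMIT().

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical per-call-site state; not the libcfs structure. */
struct demo_limit_state {
	time_t next_allowed;	/* earliest time the next message may print */
	unsigned int skipped;	/* total messages suppressed so far */
};

/*
 * Return true if a message may be printed at time 'now'. At most one
 * message is let through per 'interval' seconds; the rest are counted
 * in st->skipped instead of being printed.
 */
static bool demo_limit_allow(struct demo_limit_state *st, time_t now,
			     time_t interval)
{
	if (now < st->next_allowed) {
		st->skipped++;
		return false;
	}
	st->next_allowed = now + interval;
	return true;
}
```

A call site would keep one static state per message and print only when
demo_limit_allow() returns true, so a burst of identical
server_bulk_callback() lines collapses to one line per interval.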

Lustre-change: https://review.whamcloud.com/40571
Lustre-commit: 7462e8cad730897f459da31886c57585654f26b8

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9081398c7d014b2873e764dc283ce2f4623ebbe5
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43400
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Qian Yingjin [Wed, 12 May 2021 03:43:28 +0000 (11:43 +0800)]
EX-3080 pcc: avoid dead lock for auto attach in PCC-RO

This patch releases the pcc inode lock when calling
ll_layout_refresh() in @pcc_try_auto_attach(), as holding it may cause
the following deadlock:
1. The client is writing or truncating a file in readonly mode.
   At this time, it will send a write layout intent lock to clear
   the readonly state on the layout on MDT.
2. A read process tries to auto attach the file with the pcc inode
   lock held. During the progress of auto attach, it will call
   ll_layout_refresh(). If the client-side enqueue request for a
   layout lock returns a blocked lock, it will sleep and wait for
   the lock to be granted.
3. MDT will take the EX layout lock to cancel all cached layout locks
   on the client to change the layout for clearing the PCC-RO state.
4. When the client handles the revocation of the layout lock, it needs
   to invalidate the PCC state, which must be done under the
   protection of the pcc inode lock.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I18890d19d03726a5991c923505e8c5363382fdc2
Reviewed-on: https://review.whamcloud.com/43668
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Minh Diep [Wed, 12 May 2021 05:58:28 +0000 (22:58 -0700)]
EX-3124 build: liblnetconfig.so.4 is needed by liblustreapi.so

Need to include llnetconfig
libssh >= 0.8.0 does not provide libssh_thread.so anymore

Change-Id: Ia3884dd1c45712c099ab1e03739f6ba684c11ae1
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43670
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Yang Sheng [Tue, 11 May 2021 16:57:47 +0000 (00:57 +0800)]
EX-3144 pcc: revalidate the pointer after attach

We need to refresh the pointer again since the lock may
be released in pcc_try_readonly_open_attach.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I470358dfde525e08e7110e862b30b527e5db94fe
Reviewed-on: https://review.whamcloud.com/43662
Reviewed-by: Yingjin Qian <qian@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Qian Yingjin [Tue, 11 May 2021 03:38:36 +0000 (11:38 +0800)]
EX-3133 pcc: keep PCC copy when it is being attached

When detaching a file from the PCC backend via FID, if the file is
being attached, the corresponding PCC copy should not be purged from
the PCC backend. Just keep the PCC copy to finish the attach process.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I8a8f7c6986d51eaf9b2516e5dd5a6f21aa38b7db
Reviewed-on: https://review.whamcloud.com/43637
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Qian Yingjin [Fri, 19 Mar 2021 08:45:26 +0000 (16:45 +0800)]
EX-2861 pcc: don't reopen mountpoint for each cache file

When scanning and processing files in the PCC cache filesystem
(e.g. "llapi_pcc_scan_detach()"), the code looks for the Lustre
mountpoint and reopens it for every file processed.

This patch changes it to open the Lustre mountpoint only once,
then reuse the file handle for all of the later calls. The file
handle is closed when the processing has finished.

This patch also switches to llapi_fid_parse to get the FID from
a given string.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Iad92c216262296096e30ca4a4c6b2765dfd3afaa
Reviewed-on: https://review.whamcloud.com/42107
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Sat, 20 Mar 2021 11:14:04 +0000 (05:14 -0600)]
EX-2872 pcc: mtime rule for 'lctl pcc add'

Add an "mtime>N" rule so that PCC-RO auto-attach only considers files
that were created or modified more than N seconds ago.  Otherwise,
files may be added to the PCC cache before they have finished being
written, or when they will be modified again soon after creation.
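
One way to read the rule is as an age check on the file's mtime, sketched
below. The function name and the strict "greater than" comparison are
assumptions made for illustration; the actual rule evaluation in the PCC
code may differ.

```c
#include <stdbool.h>
#include <time.h>

/*
 * Return true when the file was last modified more than min_age
 * seconds before 'now', i.e. when an "mtime>N" rule with N == min_age
 * would match under the interpretation assumed here.
 */
static bool mtime_rule_matches(time_t now, time_t mtime, time_t min_age)
{
	return now - mtime > min_age;
}
```

Under this reading, with N = 60 a file modified 100 seconds ago is
eligible for auto-attach, while one modified 10 seconds ago is left alone
because it may still be written to.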

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ibb99bff5b483717ae6e5b83f82f1bcd86c3ebbe5
Reviewed-on: https://review.whamcloud.com/42122
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Qian Yingjin [Thu, 25 Mar 2021 02:44:16 +0000 (10:44 +0800)]
EX-2860 pcc: test interoperability with 2.14.0

Against Lustre 2.14.0 servers, many of the subtests that are
PCC-RO specific fail.
In this patch, each subtest related to PCC-RO adds a connect
flag check and is skipped when run against old servers without
PCC-RO support.

Test-Parameters: serverversion=2.14 testlist=sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ie4fc41b2dc51a038027009fbcc6e86f9d61cd54f
Reviewed-on: https://review.whamcloud.com/43104
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Qian Yingjin [Wed, 31 Mar 2021 09:44:06 +0000 (17:44 +0800)]
EX-2873 pcc: don't fallback sync attach for EINPROGRESS error

When a file is being read-only attached into the PCC backend in the
background in asynchronous mode by a thread, other threads trying to
open attach the same file will get the -EINPROGRESS error code. This
error should be tolerated instead of falling back to synchronous
attach mode.

For asynchronous open attach, the Lustre file handle can not be
reused directly for data copy when the file is opened for read, as
the file position in the file handle can not be shared by the user
thread and the asynchronous attach thread running in the kernel in
the background. The file needs to be reopened without the O_DIRECT
flag, and the new Lustre file handle used to do the data copy from
Lustre OSTs to the PCC copy.

i_size_read(inode) without a stat() call sometimes returns zero
rather than the actual file size. This may result in the wrong
open attach action. It is also not known whether the lazysize is
always going to be set. Thus, this patch uses max(lazysize,
i_size_read(inode)) to determine whether to do open attach in the
background asynchronously.
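
The size decision can be sketched as follows. Only the max() of the two
size sources comes from the description above; the function names, the
async_threshold parameter, and the "at or above the threshold means
asynchronous" policy are hypothetical and used purely for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Use the larger of the two size sources, since either may be zero. */
static uint64_t demo_attach_size(uint64_t lazysize, uint64_t isize)
{
	return lazysize > isize ? lazysize : isize;
}

/*
 * Hypothetical policy: attach in the background once the best-known
 * size reaches async_threshold bytes.
 */
static bool demo_attach_async(uint64_t lazysize, uint64_t isize,
			      uint64_t async_threshold)
{
	return demo_attach_size(lazysize, isize) >= async_threshold;
}
```

Taking the maximum means a stale zero from either source cannot by itself
force the small-file (synchronous) path for a large file.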

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I80b88a8ba05af4af45433ba9be5b87854e116b10
Reviewed-on: https://review.whamcloud.com/43180
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Bobi Jam [Fri, 9 Apr 2021 04:53:07 +0000 (12:53 +0800)]
LU-14597 flr: allow multiple primary mirrors

Users can set the "prefer" flag on any mirror/component, so the IO should
not report an error if multiple such mirrors are encountered.

Rename lod_mirror_entry::lme_primary to lme_prefer to avoid confusion.

Lustre-change: https://review.whamcloud.com/43247
Lustre-commit: 93258b9d93611e75b79c30f3ddfc2c9c21f25917

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I45748e56e38985a0d9028792ba3d976a4e03efb8
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43535
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
John L. Hammond [Tue, 23 Feb 2021 15:40:08 +0000 (09:40 -0600)]
LU-14468 utils: improve 'lfs rmfid' error messages

In lfs_rmfid_and_show_errors(), convert the error messages printed by
'lfs rmfid' from the format
  rmfid([0x20001a9f5:0x159:0x0]): rc = -39
to
  lfs rmfid: cannot remove [0x20001a9f5:0x155:0x0]: Directory not empty

Simplify the logic and swap rc and rc2 to follow conventions.

Lustre-commit: 6560ae08a788b3779118640837f68b499a99ee8c
Lustre-change: https://review.whamcloud.com/41727

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Iccd9e1054ed8842fc4f65dd601077cfdeaa1320c
Reviewed-on: https://review.whamcloud.com/41727
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43452
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Thu, 25 Mar 2021 06:39:07 +0000 (00:39 -0600)]
LU-14550 libcfs: fix setting of debug_path

While it was possible to set "lctl set_param debug_path=path" or
"echo path > /sys/module/libcfs/parameters/libcfs_debug_file_path"
this change did not affect the path used to dump debug logs.

Connect these parameters to the pathname used for the debug log.

Lustre-commit: f7392c7c4a16bc1127ee448f937ba81c50dcdfd5
Lustre-change: https://review.whamcloud.com/43109

Test-Parameters: testlist=sanity env=ONLY=60f,ONLY_REPEAT=30
Fixes: 7092309f325 ("LU-8066 libcfs: migrate to debugfs")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic18b5b24d1ac939c09637e66a342f5e3622367c3
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43450
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Thu, 11 Mar 2021 05:47:34 +0000 (08:47 +0300)]
LU-13730 lod: don't confuse stale with primary flag

There can be a few in-sync replicas which are not primary.

Lustre-commit: 571f3cf1115973d0fdaf6d5244bfeee230b52989
Lustre-change: https://review.whamcloud.com/42003

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I8b984463a2665bc88f2f76247df5366a68d74ea6
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43448
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-13073 osp: don't block waiting for new objects
Alex Zhuravlev [Fri, 16 Oct 2020 16:09:04 +0000 (19:09 +0300)]
LU-13073 osp: don't block waiting for new objects

If an OST is down, a few threads trying to get an already
precreated object may get stuck. Even worse, all QoS-based
allocations are then serialized by a single semaphore, even
those that wouldn't try to allocate on the failed OST.

The patch introduces a noblock flag in the allocation hint
which is passed to OSP. The QoS code then tries to allocate
objects in a non-blocking manner.

Lustre-commit: 2112ccb3c48ccf86aaf2a61c9f040571a6323f9c
Lustre-change: https://review.whamcloud.com/40274

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I38e66d7569aefecf800dbc32f1049ac87853439e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/43148
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-14588 o2ib: make config script aware of the ofed symbols
Serguei Smirnov [Tue, 6 Apr 2021 22:54:01 +0000 (15:54 -0700)]
LU-14588 o2ib: make config script aware of the ofed symbols

LNet o2ib configuration script needs to be aware of the external
ofed dkms symbols when testing for availability of o2ib features
by building "conftest" kernel objects. If this is not done,
symbols from the core kernel are used by default which is
different from what is used when actually building LNet,
at least on Ubuntu. This patch adds the check for external symbols.

Lustre-commit: bcc5d784826d2d7a8eece28e96fab8b0fa02ab17
Lustre-change: https://review.whamcloud.com/43223

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Iea566f8a3feb86b8bef2f4501a3abc968d76451a
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43459
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-14506 hsm: correct default stripe offset in import
John L. Hammond [Wed, 10 Mar 2021 15:20:29 +0000 (09:20 -0600)]
LU-14506 hsm: correct default stripe offset in import

In lhsmtool_posix, when calling llapi_hsm_import(), pass a stripe
offset of -1 rather than 0 to select the default. Add sanity-hsm
test_11c() to check that a file may be imported to a directory with a
default striping specifying a pool that does not include OST0000.

Lustre-commit: ea964031d7bdc6f31fccb7f136591b682eb35087
Lustre-change: https://review.whamcloud.com/41978

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I40636c0620b2f9314eb13bf23a8cf6d02990f851
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/43457
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoRM-620 build: New tag 2.14.0-ddn3
Andreas Dilger [Wed, 5 May 2021 04:14:41 +0000 (22:14 -0600)]
RM-620 build: New tag 2.14.0-ddn3

New tag 2.14.0-ddn3

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia872cc5544e97a281a1854b138aae19acb3ebbe5

4 years agoLU-14405 mdt: read LMV with mdt_stripe_get()
Lai Siyao [Tue, 9 Feb 2021 14:09:09 +0000 (22:09 +0800)]
LU-14405 mdt: read LMV with mdt_stripe_get()

mdt_path_current() reads LMV into mdt_thread_info.mti_xattr_buf,
whose size is static, and returns -ERANGE if the LMV contains too
many stripes. It should call mdt_stripe_get() instead, which
allocates dynamic memory for the LMV.

Lustre-change: https://review.whamcloud.com/41452
Lustre-commit: 9dbfa36d3dd2434cfcffa13f76beb89fa3516586

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I1ed78f7a7f951fa5984e604a8773143a70b419e7
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/41966
Tested-by: jenkins <devops@whamcloud.com>
4 years agoLU-13440 utils: fix handling of lsa_stripe_off -1
Andreas Dilger [Tue, 4 May 2021 01:25:23 +0000 (19:25 -0600)]
LU-13440 utils: fix handling of lsa_stripe_off -1

Use LMV_OFFSET_DEFAULT instead of "-1" for parsing lfs_setdirstripe()
since parse_targets() will return "(__u32)-1" to the caller for the
stripe index, but lsa_stripe_off is a signed long long so it is
interpreted as 4294967295.  This causes the parsing to fail when
"lfs setdirstripe -i -1 --max-inherit-rr 1" is used.

Update sanity test_413a/413c to also specify "-i -1" to verify this.
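
The widening problem described above can be reproduced outside of lfs. The following is a minimal sketch in Python that models the C types with ctypes; it is not the lfs code itself. The value assigned to LMV_OFFSET_DEFAULT is an assumption (the commit names the constant but does not show its definition), and widen_u32_to_s64() is a hypothetical helper.

```python
import ctypes

# Assumption: LMV_OFFSET_DEFAULT is the unsigned 32-bit form of -1.
# The commit names the constant but does not show its definition.
LMV_OFFSET_DEFAULT = ctypes.c_uint32(-1).value


def widen_u32_to_s64(value):
    """Model storing a __u32 result in a signed 64-bit field."""
    as_u32 = ctypes.c_uint32(value).value    # what the parser hands back
    return ctypes.c_longlong(as_u32).value   # what a signed long long then holds


stripe_off = widen_u32_to_s64(-1)
print(stripe_off)                        # 4294967295
print(stripe_off == -1)                  # False: comparing with -1 misses it
print(stripe_off == LMV_OFFSET_DEFAULT)  # True: comparing with the constant works
```

Comparing against the unsigned constant rather than a literal -1 is what keeps the check correct once the value has been widened.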

Lustre-change: https://review.whamcloud.com/43530
Lustre-commit: TBD (from 792fa045a1975a1a18af0d72470134e5bf997d6a)

Fixes: 01d34a6b3b2e ("LU-13440 lmv: add default LMV inherit depth")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic934f859173155b1b2df56fcd315c8da633ebbe5
Reviewed-on: https://review.whamcloud.com/43524
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-13439 lmv: qos stay on current MDT if less full
Andreas Dilger [Sun, 25 Apr 2021 11:02:19 +0000 (05:02 -0600)]
LU-13439 lmv: qos stay on current MDT if less full

Keep "space balanced" subdirectories on the parent MDT if it is less
full than average, since it doesn't make sense to select another MDT
which may occasionally be *more* full.  This also reduces random
"MDT jumping" and needless remote directories.

Reduce the QOS threshold for space balanced LMV layouts, so that the
MDTs don't become too imbalanced before trying to fix the problem.

Change the LUSTRE_OP_MKDIR opcode to be 1 instead of 0, so it can
be seen that a valid opcode has been stored into the structure.

Lustre-change: https://review.whamcloud.com/43445
Lustre-commit: 3f6fc483013da443b1494d81efe2d271ac67f901

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iab34c7eade03d761aa16b08f409f7e5d69cd70bd
Reviewed-on: https://review.whamcloud.com/43431
Tested-by: jenkins <devops@whamcloud.com>
4 years agoLU-13440 lmv: add default LMV inherit depth
Lai Siyao [Mon, 15 Mar 2021 03:57:36 +0000 (11:57 +0800)]
LU-13440 lmv: add default LMV inherit depth

A new field "__u8 lum_max_inherit" is added into struct lmv_user_md,
which represents the inherit depth of default LMV. It will be
decreased by 1 for subdirectories.

The valid value of lum_max_inherit is [0, 255]:
* 0 means unlimited inherit.
* 1 means inherit end.
* 250 is the max inherit depth.
* [251, 254] are reserved.
* 255 means it's not set.

A new field "__u8 lum_max_inherit_rr" is added. If the default stripe
offset is -1, lum_max_inherit_rr is non-zero, and the system is
balanced, new directories are created in a round-robin manner.
Otherwise they are created on the MDT where their parents are located
to avoid creating remote directories. Similarly, this value will be
decreased by 1 for each level of subdirectories.

The valid value of lum_max_inherit_rr is different:
* 0 means not set.
* 1 means inherit end.
* 250 is the max inherit depth.
* [251, 254] are reserved.
* 255 means unlimited inherit.

However for the user interface of "lfs", the valid value is [-1, 250]:
* -1 means unlimited inherit.
* 0 means not set.
* others are the same.
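
The mapping between the "lfs" values and the stored values listed above can be sketched as follows. This is an illustration in Python only: the numeric values come from the lists in this message, while the constant and function names are hypothetical and need not match the Lustre headers.

```python
# Stored values for lum_max_inherit, as listed in the commit message.
INHERIT_UNLIMITED = 0
INHERIT_END = 1
INHERIT_MAX = 250
INHERIT_NONE = 255

# Stored values for lum_max_inherit_rr; "not set" and "unlimited" are swapped.
INHERIT_RR_NONE = 0
INHERIT_RR_UNLIMITED = 255


def lfs_to_max_inherit(arg):
    """Convert an lfs value in [-1, 250] to a stored lum_max_inherit."""
    if arg == -1:
        return INHERIT_UNLIMITED
    if arg == 0:
        return INHERIT_NONE
    if INHERIT_END <= arg <= INHERIT_MAX:
        return arg
    raise ValueError("max-inherit must be in [-1, 250]")


def lfs_to_max_inherit_rr(arg):
    """Convert an lfs value in [-1, 250] to a stored lum_max_inherit_rr."""
    if arg == -1:
        return INHERIT_RR_UNLIMITED
    if arg == 0:
        return INHERIT_RR_NONE
    if INHERIT_END <= arg <= INHERIT_MAX:
        return arg
    raise ValueError("max-inherit-rr must be in [-1, 250]")


print(lfs_to_max_inherit(-1), lfs_to_max_inherit(0))        # 0 255
print(lfs_to_max_inherit_rr(-1), lfs_to_max_inherit_rr(0))  # 255 0
```

The same user-facing value therefore maps to different stored bytes for the two fields, which is why they need separate conversions.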

Add sanity 413c.

Lustre-change: https://review.whamcloud.com/43131
Lustre-commit: 01d34a6b3b2e34f7414f627e4f87993322dafa78

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I98ccad8556a0469f83bd7d79f5086a2184d5b115
Reviewed-on: https://review.whamcloud.com/43429
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-14366 mdt: lfs mkdir should return -EEXIST if exists
Lai Siyao [Sat, 23 Jan 2021 10:28:26 +0000 (18:28 +0800)]
LU-14366 mdt: lfs mkdir should return -EEXIST if exists

'lfs setdirstripe' will try to restripe if the target exists. However,
it's confusing to get -ENOTSUPP or -EALREADY for 'lfs mkdir', which
invokes the same function as 'lfs setdirstripe'.

Pack the MDS_OPEN_CREAT flag in the request for 'lfs mkdir', and the MDT
won't try to restripe if it's set.

Add sanity 230s.

Lustre-change: https://review.whamcloud.com/41329
Lustre-commit: 65e3e4050ec5bb371c1c343fca49a605286a086e

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I7b7ed04ee0b150253ff4d13bbdf1fe847d8f577c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/43428
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-13440 obdclass: server qos penalty miscalculated
Lai Siyao [Wed, 21 Apr 2021 12:05:52 +0000 (20:05 +0800)]
LU-13440 obdclass: server qos penalty miscalculated

The server QoS penalty calculation uses the active target count, but
it should use the server count. Using the target count makes the
penalty larger than expected, so the weight of targets is often 0,
and as a result MDT0 is often chosen in QoS allocation.

Lustre-change: https://review.whamcloud.com/43385
Lustre-commit: 0ccce7ecb72f847f4235a513424d90119edad7ca

Fixes: 45222b2ef ("LU-12624 obdclass: lu_tgt_descs cleanup")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I1982363e4ff74c7344dd5e07d04e29214afa8a7f
Reviewed-on: https://review.whamcloud.com/43399
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-13212 osc: fall back to vmalloc for large RPCs
Andreas Dilger [Mon, 12 Apr 2021 18:53:07 +0000 (12:53 -0600)]
LU-13212 osc: fall back to vmalloc for large RPCs

For large RPC sizes (16MB+) the page array (4096 brw_page) can
become very large (128KB+ with fscrypt) and should fall back to
vmalloc() if kmalloc() fails due to memory fragmentation.
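
The figures above can be checked with a little arithmetic. This is a sketch in Python, assuming 4 KiB pages. The per-entry size is only what the quoted 128KB figure implies; it is not taken from the struct definition.

```python
PAGE_SIZE = 4096                  # assumption: 4 KiB pages
RPC_SIZE = 16 * 1024 * 1024       # 16MB RPC, from the commit message
ARRAY_SIZE = 128 * 1024           # 128KB page array, from the commit message

pages_per_rpc = RPC_SIZE // PAGE_SIZE
bytes_per_entry = ARRAY_SIZE // pages_per_rpc

print(pages_per_rpc)    # 4096
print(bytes_per_entry)  # 32
```

A physically contiguous allocation of that size is the kind that can fail on a fragmented system, which is the case the vmalloc() fallback covers.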

The mdc/mdt allocations are currently limited to 1MB for readdir
RPCs, but it doesn't hurt to prepare them for larger RPCs from
clients in the future if this limit is increased.

Lustre-commit: 037a9e2cf6d5b8d6fdbcde02c1c22e22272c5c07
Lustre-change: https://review.whamcloud.com/43281

Fixes: 51b32ac2b9b8 ("LU-7990 rpc: increase bulk size")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I56805f5701d6850412664ce0681a1456b9405580
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43460
Tested-by: jenkins <devops@whamcloud.com>
4 years agoEX-2992 tests: add sleep to verify lamigo and lpurge params
Jian Yu [Tue, 4 May 2021 06:23:36 +0000 (23:23 -0700)]
EX-2992 tests: add sleep to verify lamigo and lpurge params

This patch improves verify_one_lamigo_param() and
verify_one_lpurge_param() in hot-pools.sh to retry
several times while verifying lamigo and lpurge params,
in case there is a delay before the param(s) are
updated.

Test-Parameters: trivial testlist=hot-pools \
env=HOT_POOLS_EXCEPT="56"
Test-Parameters: trivial clientdistro=el8.3 \
testlist=hot-pools env=HOT_POOLS_EXCEPT="56"
Test-Parameters: trivial clientdistro=sles15sp2 \
testlist=hot-pools env=HOT_POOLS_EXCEPT="56"
Test-Parameters: trivial testgroup=review-dne-part-2 \
env=SANITY_LFSCK_EXCEPT="30",HOT_POOLS_EXCEPT="56"

Lustre-change: https://review.whamcloud.com/43256
Lustre-commit: b465db8b9b99e217d175f31230d04e10a9a17906

Change-Id: I0f4818baa1c2cd87920ff3189461b45b53871e90
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43529
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoEX-2718 tests: remove lpurge mds validation
John L. Hammond [Tue, 4 May 2021 06:04:05 +0000 (23:04 -0700)]
EX-2718 tests: remove lpurge mds validation

The lpurge changes for EX-2718 (use local mountpoint
for purge operations) landed with commit dfa0760e7d8.
However, the hot-pools.sh changes were somehow missed.
This patch adds those changes back to remove lpurge
mds validation.

Lustre-change: https://review.whamcloud.com/43140
Lustre-commit: 7b00329e09ef73335e33dd8e83bb7993c39990e9

Fixes: dfa0760e7d8 ("EX-2718 lpurge: use local mountpoint for purge operations")
Change-Id: I55a40dfd958e40859ddb5e98b3f76ad568d0095b
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43528
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoEX-3108 build: update kernel to -ddn13
Andreas Dilger [Tue, 4 May 2021 15:09:35 +0000 (09:09 -0600)]
EX-3108 build: update kernel to -ddn13

Update the kernel version to -ddn13 to match the version used
on b_es5_2 so that it is possible to just upgrade the Lustre
RPMs when moving from EXA5.2.2 to EXA6.0.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8845fa2c797769b94971e60dc92cdfb2c79bb570
Reviewed-on: https://review.whamcloud.com/43534
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: Minh Diep <mdiep@whamcloud.com>
4 years agoEX-1135 lipe: add support to build against centos8
Gu Zheng [Fri, 8 May 2020 07:16:42 +0000 (03:16 -0400)]
EX-1135 lipe: add support to build against centos8

There's a huge difference between CentOS 8 and the CentOS 7 series,
especially the strict distinction between python2 and python3; the
related Python RPMs and PyPI packages are affected in the same way.
The following changes are introduced to add support to build against CentOS 8:
1. restrict the Python platform to python2 (python2.7)
2. use 'pip2' instead of 'pip' for PyPI
3. improve the dependency package list (RPM and PyPI modules) so that it
works on both CentOS 7.x and CentOS 8.x
4. fix code style issues to make pylint/pep8 on CentOS 8 happy
5. set the encoding via the environment variable "PYTHONIOENCODING" if
sys.setdefaultencoding is gone (python2.7 on CentOS 8)
6. improve lipe.spec to support "make rpms" against CentOS 8

Change-Id: Id172e2a6aa29f382c4d12ff0d2e748e8b0cde444
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Reviewed-on: https://review.whamcloud.com/43483
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Minh Diep <mdiep@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Gaurang Tapase <gtapase@ddn.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
4 years agoEX-3082 lipe: posix scan cannot get projid
Lei Feng [Mon, 26 Apr 2021 08:28:57 +0000 (16:28 +0800)]
EX-3082 lipe: posix scan cannot get projid

Regular files and directories can have a projid. So if an entry is
neither a regular file nor a directory, set projid to 0.
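
The rule above can be sketched as follows. This is an illustration in Python, not the lipe scanner code; effective_projid() and its parameters are hypothetical names.

```python
import stat


def effective_projid(st_mode, raw_projid):
    """Return the project ID to record for an entry with the given mode.

    Only regular files and directories keep their projid; every other
    file type (symlink, FIFO, device, socket) is reported as 0.
    """
    if stat.S_ISREG(st_mode) or stat.S_ISDIR(st_mode):
        return raw_projid
    return 0


print(effective_projid(stat.S_IFREG | 0o644, 1000))  # 1000
print(effective_projid(stat.S_IFDIR | 0o755, 1000))  # 1000
print(effective_projid(stat.S_IFLNK | 0o777, 1000))  # 0
```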

Change-Id: Id9e7dd471513817ac1cb9d146563b369f9ebe2eb
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/43447
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoEX-3006 lipe: fix time calculation mistake
Lei Feng [Tue, 13 Apr 2021 01:18:09 +0000 (09:18 +0800)]
EX-3006 lipe: fix time calculation mistake

1ms = 1,000us = 1,000,000ns
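
The unit relationship above, written out as a small Python sketch. The commit message does not say which conversion was wrong, so the helper names here are hypothetical and only illustrate the factors.

```python
NSEC_PER_USEC = 1_000
USEC_PER_MSEC = 1_000
NSEC_PER_MSEC = NSEC_PER_USEC * USEC_PER_MSEC  # 1,000,000


def ms_to_ns(ms):
    """Convert milliseconds to nanoseconds."""
    return ms * NSEC_PER_MSEC


def ns_to_ms(ns):
    """Convert nanoseconds to whole milliseconds (truncating)."""
    return ns // NSEC_PER_MSEC


print(ms_to_ns(1))          # 1000000
print(ns_to_ms(2_500_000))  # 2
```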

Change-Id: Iab99f0190ca6d91178d10519b44bce989246d03d
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/43288
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
4 years agoLU-12142 clio: fix hang on urgent cached pages
Wang Shilong [Wed, 14 Oct 2020 02:49:49 +0000 (10:49 +0800)]
LU-12142 clio: fix hang on urgent cached pages

A few problems are addressed by this patch:

1) We try to reserve cl_pages in batch, but we don't do
that for append IO. There is no reason to skip that.

2) IO might not be page aligned, so calculate reserved pages
correctly for this case.

3) If we issue one large IO whose block size is larger
than max_cached_mb, the IO will never finish because
we don't have enough cl_pages to finish it. Split the IO
in this case.

4) Readahead should fail if we are short of LRU page
slots to avoid deadlock.

After the above adjustments, LRU slots are guaranteed for a normal
buffered write before IO starts. If the block size is too large
for the max LRU slots, the IO will be split.

For extra readahead, don't try hard and quit if we
are short of LRU pages. Since readahead can tolerate
errors, applications won't be aware of it.
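
Items 2) and 3) above can be illustrated with a small sketch. It is written in Python as a model of the arithmetic only, not the clio code; PAGE_SIZE, pages_spanned() and split_io() are assumptions introduced for the example.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def pages_spanned(offset, length):
    """Number of pages touched by the byte range [offset, offset + length)."""
    if length <= 0:
        return 0
    first = offset // PAGE_SIZE
    last = (offset + length - 1) // PAGE_SIZE
    return last - first + 1


def split_io(offset, length, max_pages):
    """Yield (offset, length) chunks that each touch at most max_pages pages."""
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    end = offset + length
    while offset < end:
        # A chunk may run up to the end of the max_pages-th page,
        # counting from the page that contains its first byte.
        limit = (offset // PAGE_SIZE + max_pages) * PAGE_SIZE
        chunk_end = min(end, limit)
        yield offset, chunk_end - offset
        offset = chunk_end


# An unaligned 4096-byte write touches two pages, not one.
print(pages_spanned(100, 4096))            # 2
print(list(split_io(100, 3 * 4096, 2)))    # [(100, 8092), (8192, 4196)]
```

Counting from the first and last page index, rather than dividing the length by the page size, is what makes the unaligned case come out right.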

Besides the newly added tests, the following command was run with
max_cached_mb set to 64M, and the client no longer hangs:

/usr/lib64/openmpi/bin/mpirun --allow-run-as-root -np 12
-wd /mnt/lustre ior -g -e -w -r -b 1g -T 10 -F -C -t 64m

Todo:
Performance benchmark for readahead

Lustre-commit: 2a34dc95bd100c181573e231047ff8976e296a36
Lustre-change: https://review.whamcloud.com/40237

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I5c85454a40daeefb4fb97609d6aa28df2eafb99c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/43456
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-14494 mdt: check object exists in mdt_close_handle_layouts()
John L. Hammond [Fri, 5 Mar 2021 18:47:43 +0000 (12:47 -0600)]
LU-14494 mdt: check object exists in mdt_close_handle_layouts()

In mdt_close_handle_layouts() the client supplied FID may not identify
an existing object. So check for this before calling lu_object_attr().

Lustre-commit: 075bea805efe8a7ef1a3aabd8dd2c166bb52115b
Lustre-change: https://review.whamcloud.com/41905

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ib1710ca4bf7587e0496b3a37a2afb65f81250455
Reviewed-on: https://review.whamcloud.com/41905
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43453
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-14522 ldlm: reprocess locks if enqueue failed
Alex Zhuravlev [Sun, 14 Mar 2021 04:29:11 +0000 (07:29 +0300)]
LU-14522 ldlm: reprocess locks if enqueue failed

If the export got disconnected during enqueue, ldlm_handle_enqueue0()
drops the lock but can skip reprocessing, and this way all subsequent
waiting locks conflicting with the dropped one may get stuck.

With the patch most of the racers succeed; otherwise 1/4 of runs get stuck.

Lustre-commit: 9cc7128b9b2bf444657dac6765decf9fb56aee8d
Lustre-change: https://review.whamcloud.com/42031

Fixes: 37932c4beb ("LU-10175 ldlm: IBITS lock convert instead of cancel")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I584b0de2656840da5dfa86a894fe02f138e1389d
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43451
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-14507 mdt: handle default stripe_count=-1 properly
Andreas Dilger [Wed, 10 Mar 2021 16:57:44 +0000 (09:57 -0700)]
LU-14507 mdt: handle default stripe_count=-1 properly

If the default LMV stripe_count=-1 print it as a signed value
instead of unsigned, to better match how it is set with "-c -1".

Lustre-commit: d9753b5ba6ad29fd8958a47b462d2fa594ba1145
Lustre-change: https://review.whamcloud.com/41983

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I106f266c33e2c2cf0f5bcc1491e4bc5ac93ebbe5
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43147
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoEX-2811 build: systemd missing from deb dkms
Minh Diep [Thu, 18 Mar 2021 22:48:42 +0000 (15:48 -0700)]
EX-2811 build: systemd missing from deb dkms

lnet.service is missing from lustre-client-utils
when built with dkms.

Change-Id: Ic52d41dea867f55c5bd8edd39057ba514ed7308a
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43449
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoEX-2882 tests: fix hot-pools.sh issues on single node
Jian Yu [Mon, 26 Apr 2021 17:53:49 +0000 (10:53 -0700)]
EX-2882 tests: fix hot-pools.sh issues on single node

This patch fixes the following issues while running hot-pools.sh
on single node:
- sh: warning: here-document at line 0 delimited by end-of-file
  (wanted `EOF')
- lfs changelog_clear: cannot open '/dev/changelog-lustre-MDT0000':
  No such file or directory (2)

Lustre-commit: ab4a750a5138fa9710adc5f196ac820634628c4d
Lustre-change: https://review.whamcloud.com/42140

Test-Parameters: trivial testlist=hot-pools,hot-pools
Test-Parameters: trivial testgroup=review-dne-part-2

Change-Id: Ie259438c737c9ef4c1fd7148a6cc918177b8fb47
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/43122
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-11289 ptlrpc: fix ASSERTION on scp_rqbd_posted
Yang Sheng [Mon, 8 Mar 2021 14:53:13 +0000 (22:53 +0800)]
LU-11289 ptlrpc: fix ASSERTION on scp_rqbd_posted

The request may still be referenced by another target even after the
threads of the service were stopped. This is caused by some portals
being shared among different services. Just wait for the request to be
released as a workaround.

LustreError: (service.c::ptlrpc_service_purge_all())
ASSERTION( list_empty(&svcpt->scp_rqbd_posted) ) failed:
LustreError: (service.c::ptlrpc_service_purge_all()) LBUG
Pid: 21, comm: umount 3.10.0 #1 SMP
Call Trace:
 [<a01c47dc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
 [<a01c488c>] lbug_with_loc+0x4c/0xa0 [libcfs]
 [<a0b534dd>] ptlrpc_unregister_service+0xced/0xd90 [ptlrpc]
 [<a005e122>] ost_cleanup+0x82/0x1b0 [ost]
 [<a08e0bfa>] class_free_dev+0x1ca/0x630 [obdclass]
 [<a08e1240>] class_export_put+0x1e0/0x2b0 [obdclass]
 [<a08e2cc5>] class_unlink_export+0x135/0x170 [obdclass]
 [<a08f8030>] class_decref+0x80/0x160 [obdclass]
 [<a08f8481>] class_detach+0x1b1/0x2e0 [obdclass]
 [<a08fef21>] class_process_config+0x1a91/0x2820 [obdclass]
 [<a08ffe90>] class_manual_cleanup+0x1e0/0x6d0 [obdclass]
 [<a092a115>] server_stop_servers+0xd5/0x160 [obdclass]
 [<a092f6c6>] server_put_super+0x126/0xca0 [obdclass]
 [<8121068a>] generic_shutdown_super+0x6a/0xf0
 [<81210a62>] kill_anon_super+0x12/0x20
 [<a09027e2>] lustre_kill_super+0x32/0x50 [obdclass]
 [<81210e59>] deactivate_locked_super+0x49/0x60
 [<812115a6>] deactivate_super+0x46/0x60
 [<8123019f>] cleanup_mnt+0x3f/0x80
 [<81230232>] __cleanup_mnt+0x12/0x20
 [<810ab085>] task_work_run+0xb5/0xf0
 [<8102ac12>] do_notify_resume+0x92/0xb0
 [<81783c83>] int_signal+0x12/0x17
 Kernel panic - not syncing: LBUG

Lustre-change: https://review.whamcloud.com/41936
Lustre-commit: b635a0435d13d8431a8344735322b84cb4613b68

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Idfb19df123ceae177a0e447e9344bac6861166bf
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/42048
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-14530 kernel: kernel update SLES12 SP5 [4.12.14-122.63.1]
Jian Yu [Sat, 3 Apr 2021 18:17:52 +0000 (11:17 -0700)]
LU-14530 kernel: kernel update SLES12 SP5 [4.12.14-122.63.1]

Update SLES12 SP5 kernel to 4.12.14-122.63.1 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 430c 817" testlist=sanity

Change-Id: I67ab524ff2dc94c649bc970c7bb1d83009828880
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43205
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoEX-2921: merge lipe changes from b_es5_2
John L. Hammond [Tue, 27 Apr 2021 14:38:30 +0000 (09:38 -0500)]
EX-2921: merge lipe changes from b_es5_2

Merge commit 'dfa0760e7d8c7f5f56ebb5ee2e766f0a05cc4e67' into b_es6_0:

$ git checkout b_es5_2
$ git subtree split --prefix=lipe
8251fae87b508e36caab6397b1063b308dcb2b05
$ git checkout b_es6_0
$ git subtree merge --prefix=lipe --squash 8251fae87b508e36caab6397b1063b308dcb2b05

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I864a5d014b8e0528a90052b291401db7ce203cc1

4 years agoSquashed 'lipe/' changes from 38f79e56ec..8251fae87b
John L. Hammond [Tue, 27 Apr 2021 14:38:30 +0000 (09:38 -0500)]
Squashed 'lipe/' changes from 38f79e56ec..8251fae87b

8251fae87b Update lipe version to 1.17.
87ee780007 EX-1613 scripts: Use ticket to start/stop hotpools
0ce8cdc011 EX-3078 lipe: quote FIDs in remote commands
dcd24fe01e EX-3034 lamigo: check for available agents early
0ff4506a1d EX-3043 lamigo: remove debugging leftover
22cf972d45 EX-3043 lamigo: simplify changelog cleaning check
de214cee10 EX-3009 lamigo: dump changelog status
8e7ee6cc80 EX-3017 lpurge: for stats for skipped objects
b0bce08ec5 EX-2768 lamigo: don't register a SIGCHLD handler
8981cfa704 EX-3036 lipe: version and revision support
1f562d542b EX-3020 lamigo: prevent out of order changelog clearing
d859e4bff2 EX-3021 lipe: refactor lipe_ssh context handling
4124346b0d EX-2718 lpurge: use local mountpoint for purge operations
7caf05b672 EX-3030 lipe: join multiple threads in lamigo_check_jobs()
911cf5018b EX-2994 lipe: update lpurge purged stats correctly
3d1af585e3 Update lipe version to 1.16.
842d8c5aad EX-2962 lipe: Fix config autodetect
df423e9539 EX-2948 build: less checks in lipe configuration
6745370c5b EX-2983 lamigo: reduce log level in lamigo_exec_cmd()
77e90c1df3 EX-2979 lamigo: do not count setprefer as replication
eda4f09711 EX-2608 scripts: Auto detect previous values
4ca909e2ea Update lipe version to 1.15.
2f80b12aaf EX-2770 lpurge: set lop_mdt_idx before spawning thread
370ba3a118 EX-2930 lipe: fix errno.h include
4f672e1346 EX-2921 lipe: merge tools/lipe to lipe subtree
17a2a63533 EX-2778 lipe: lipe.spec fixes

git-subtree-dir: lipe
git-subtree-split: 8251fae87b508e36caab6397b1063b308dcb2b05

4 years agoLU-14450 kernel: kernel update RHEL8.3 [4.18.0-240.15.1.el8_3]
Jian Yu [Mon, 22 Mar 2021 23:07:14 +0000 (16:07 -0700)]
LU-14450 kernel: kernel update RHEL8.3 [4.18.0-240.15.1.el8_3]

Update RHEL8.3 kernel to 4.18.0-240.15.1.el8_3.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.3 serverdistro=el8.3 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.3 serverdistro=el8.3 testlist=sanity

Change-Id: I92ca7769fac17221da376788cfe79887ecc4c19c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/42088
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-14430 mdd: don't assert on default ACL big buffer
Mikhail Pershin [Fri, 26 Feb 2021 14:48:36 +0000 (17:48 +0300)]
LU-14430 mdd: don't assert on default ACL big buffer

The previous patch may cause situations where the default ACL buffer
is bigger than the ACL buffer, so that the default ACL EA may fit
into the former but not into the latter, causing an assertion in
mdd_acl_init().

There is actually no need for the assertion; just return -ERANGE so
the ACL buffer will be re-allocated.

Lustre-commit: b66b530c18c910ded562e279c9db02fcdad42176
Lustre-change: https://review.whamcloud.com/41775

Fixes: f3d03bc38a3a ("LU-14430 mdd: fix inheritance of big default ACLs")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I8c0665ba693c60506812926a8372b61095d08f78
Reviewed-on: https://review.whamcloud.com/42059
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-14529 kernel: kernel update SLES15 SP2 [5.3.18-24.52.1]
Jian Yu [Sat, 3 Apr 2021 18:14:52 +0000 (11:14 -0700)]
LU-14529 kernel: kernel update SLES15 SP2 [5.3.18-24.52.1]

Update SLES15 SP2 kernel to 5.3.18-24.52.1 for Lustre client.

Test-Parameters: trivial \
env=SANITY_EXCEPT="100 130 136 817" \
clientdistro=sles15sp2 serverdistro=el7.9 \
testlist=sanity

Change-Id: Ifbcfdac3e7dedeb5bde9f4a31575ad5008518c80
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43204
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoEX-2933 tests: replace the newline with a space in $params
Jian Yu [Thu, 1 Apr 2021 18:49:29 +0000 (11:49 -0700)]
EX-2933 tests: replace the newline with a space in $params

This patch replaces the newline with a space in $params
passed to wait_import_state().

Lustre-change: https://review.whamcloud.com/43159
Lustre-commit: 94332d277e0d79cf0dd345533ba186e73a9e19af

Fixes: ab4a750a51 ("EX-2882 tests: fix hot-pools.sh issues on single node")
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I09c9b72bc4e59cf1ceaf8ae17c36c7f8c567c730
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43195
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-14577 ldiskfs: support Ubuntu 20.04 kernel 5.4.0-1007
Jian Yu [Thu, 1 Apr 2021 02:04:13 +0000 (19:04 -0700)]
LU-14577 ldiskfs: support Ubuntu 20.04 kernel 5.4.0-1007

While applying the 5.4.0-66-ubuntu20.series ldiskfs patches
to kernel 5.4.0-1007, there is a conflict in
ext4_update_dx_flag() in ubuntu2004/ext4-pdirop.patch.
It turns out the ext4_update_dx_flag() code in kernel
5.4.0-1007 is the same as that in kernel versions
earlier than 5.4.0-66. So, 5.4.0-42-ubuntu20.series works.

This patch fixes lustre-build-ldiskfs.m4 to detect
5.4.0-42-ubuntu20.series for kernel 5.4.0-1007.

Test-Parameters: trivial

Change-Id: I3cd932b8ae2d7c7f4f900b8b18647a4252d100b2
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43188
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoEX-2745 tests: limit debug log entries for hot-pools.sh
Jian Yu [Mon, 5 Apr 2021 16:27:08 +0000 (09:27 -0700)]
EX-2745 tests: limit debug log entries for hot-pools.sh

This patch adds the --since "$duration seconds ago" option
to journalctl so that the lamigo and lpurge service debug
logs are gathered only for the current run.
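
A minimal sketch of how such a command line could be assembled,
assuming the service logs are selected by systemd unit name. The
function name and the unit names used with it are hypothetical;
only the --since "$duration seconds ago" form comes from the
description above:

```python
def journalctl_since_cmd(unit, duration):
    """Build a journalctl argv limited to the last `duration` seconds."""
    # --unit and --since are standard journalctl options; the unit name
    # passed in by the caller is an assumption of this sketch.
    return ["journalctl", "--unit", unit, "--since", f"{duration} seconds ago"]
```

The returned list can be handed to subprocess.run() as-is, which
avoids shell quoting problems with the multi-word --since value.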

Test-Parameters: trivial testlist=hot-pools,hot-pools
Test-Parameters: trivial testgroup=review-dne-part-2

Change-Id: Idbb81bc4dc7fc669074c1e8f8c156627abe6c610
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43208
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoEX-2745 tests: improve hot-pools.sh to gather debug logs
Jian Yu [Fri, 26 Mar 2021 23:02:06 +0000 (16:02 -0700)]
EX-2745 tests: improve hot-pools.sh to gather debug logs

This patch improves hot-pools.sh to gather debug logs
for lamigo and lpurge.

Lustre-change: https://review.whamcloud.com/43111
Lustre-commit: 075ed14d944b0078fcd32ce06aa868ecaabb3adb

Test-Parameters: trivial testlist=hot-pools,hot-pools
Test-Parameters: trivial testgroup=review-dne-part-2

Change-Id: I65f23d00744499853ab099e7f097161e5e1dd66a
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43146
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-13239 ldiskfs: pass inode timestamps at initial creation
Shaun Tancheff [Thu, 1 Apr 2021 02:35:01 +0000 (19:35 -0700)]
LU-13239 ldiskfs: pass inode timestamps at initial creation

A previous patch https://github.com/Cray/lustre/commit/6d4fb6694
"LUS-4880 osd-ldiskfs: pass uid/gid/xtime directly to ldiskfs"
was intended to be ported to upstream lustre but was lost.

The patch https://review.whamcloud.com/34685/
"LU-12151 osd-ldiskfs: pass owner down rather than transfer it"
passed the inode UID and GID down to ldiskfs at inode allocation
time to avoid the overhead of transferring quota from the inode
(initially created as root) over to the actual user of the file.

The two patches differed slightly in that LUS-4880 also passed
the a/m/ctimes from osd-ldiskfs to ldiskfs at inode creation
time, which avoids the overhead of setting the timestamps afterward.

Benchmarks using MDTEST:
  mdtest -f 32 -l 32 -n 16384 -i 5 -p 120 -t -u -v -d mdtest

                            master                 patched
   Operation                  Mean    Std Dev         Mean   Std Dev
   ---------                  ----    -------         ----   -------
   Directory creation:   17008.593     72.700    17099.863   155.461
   Directory stat    :  170513.269   1456.002   170105.207  2349.934
   Directory removal :   80796.147   2633.832    84480.222   892.536
   File creation     :   39227.419   7014.539    40429.900  6643.868
   File stat         :  101761.395   2979.802   103818.800  1146.689
   File read         :   86583.370    871.982    85725.254   965.862
   File removal      :   74923.504    761.048    75075.180   723.966
   Tree creation     :     588.570    244.534      608.332   123.939
   Tree removal      :      39.874      1.873       44.357     2.350

This patch also reorganizes the ldiskfs patch series in
order to accommodate struct iattr being added to
ldiskfs_create_inode.
All supported server platforms (RHEL 7.5+, SUSE 12+ and
Ubuntu 18+) are affected.

Lustre-change: https://review.whamcloud.com/37556
Lustre-commit: 5bb641fa61175fd0fe63e830219d88304b5162c3

HPE-bug-id: LUS-7378, LUS-4880, LUS-8042, LUS-9157, LUS-8772, LUS-8769
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I87e9c792b5240820bfd3a7268e477970ebac8465
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43189
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-14462 gss: remove HAVE_SETNS from lgss_keyring
Sebastien Buisson [Tue, 9 Mar 2021 16:11:44 +0000 (17:11 +0100)]
LU-14462 gss: remove HAVE_SETNS from lgss_keyring

For the sake of simplification, a previous patch removed the config
check that sets HAVE_SETNS, due to the fact that in kernels 3.10+
function setns() necessarily exists.
In this case, all #ifdef on HAVE_SETNS are erroneous because it is
not set whereas the function is actually available.
So remove all references to HAVE_SETNS in the code.

Lustre-change: https://review.whamcloud.com/41967
Lustre-commit: 9d347ae1d6aa642a86b710452b1978ea303dea09

Fixes: 8e88bbfef5 ("LU-12477 lustre: remove obsolete config checks")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iab0726c3e847a210185cc8c9353a79976acb1381
Reviewed-on: https://review.whamcloud.com/43166
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-14527 kernel: kernel update RHEL7.9 [3.10.0-1160.21.1.el7]
Jian Yu [Thu, 18 Mar 2021 22:35:23 +0000 (15:35 -0700)]
LU-14527 kernel: kernel update RHEL7.9 [3.10.0-1160.21.1.el7]

Update RHEL7.9 kernel to 3.10.0-1160.21.1.el7.

Test-Parameters: clientdistro=el7.9 serverdistro=el7.9

Change-Id: I1a46fe492d280b19c0f93458aaac975a4c873caf
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/42090
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-14538 gss: make namespace optional in lgss_keyring
Sebastien Buisson [Fri, 19 Mar 2021 14:46:58 +0000 (15:46 +0100)]
LU-14538 gss: make namespace optional in lgss_keyring

Introduce a new tunable 'sptlrpc.gss.gss_check_upcall_ns' to
make namespace support optional in lgss_keyring.
By default it is set to 1, which keeps the standard behavior of
checking the caller's namespace and switching namespace
if necessary.
When the tunable is set to 0, lgss_keyring sticks to the current
namespace.

Lustre-change: https://review.whamcloud.com/42112
Lustre-commit: 3f8a6fd7d6d5969560157e37abe1a7d9307cc53f

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ib9d4e47935a718d4aae31fbb0d13f6bc8a4005a5
Reviewed-on: https://review.whamcloud.com/43218
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoEX-2782 build: build lipe using lbuild
Minh Diep [Fri, 5 Mar 2021 17:24:50 +0000 (09:24 -0800)]
EX-2782 build: build lipe using lbuild

Lustre-change: https://review.whamcloud.com/41904
Lustre-commit: 849db551a86a8c707d7bb5b83eebf639f2e453e9

Change-Id: I1b63a0378b76984ad24f14af89553bb00f659d35
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/43124
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoEX-2984 build: fix build opa in lbuild
Minh Diep [Thu, 8 Apr 2021 20:39:14 +0000 (13:39 -0700)]
EX-2984 build: fix build opa in lbuild

Add a missing call to build_opa.

Test-Parameters: trivial
Fixes: 8f467a03e3b9 ("EX-2439 build: Add opa-src option to lbuild")
Change-Id: I857696f7099deb80d70855188be8628d678148f9
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43243
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-14479 ssk: explicitly set perm on key
Sebastien Buisson [Mon, 8 Mar 2021 14:20:00 +0000 (15:20 +0100)]
LU-14479 ssk: explicitly set perm on key

When an SSK key is loaded, either via the lgss_sk command or via the
skpath mount option, try to set permissions on the key.
This is to avoid a 'Permission denied' error when a Lustre client or
server wants to make use of the key later on.

Lustre-change: https://review.whamcloud.com/41929
Lustre-commit: f265033840996dcdffb2f05a64b51b51391a273c

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I1ed712ae4d07be306cc76b4e59fab303437558bb
Reviewed-on: https://review.whamcloud.com/43164
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-14534 gss: do not refresh context for LDLM callback
Sebastien Buisson [Thu, 18 Mar 2021 16:17:31 +0000 (17:17 +0100)]
LU-14534 gss: do not refresh context for LDLM callback

If the request to be sent is an LDLM callback, do not try to
refresh context.
An LDLM callback is sent by a server to a client in order to make
it release a lock, on a communication channel that uses a reverse
context. It cannot be refreshed on its own, as it is the 'reverse'
(server-side) representation of a client context.
We do not care if the reverse context is expired, and want to send
the LDLM callback anyway. Once the client receives the AST, it is
its job to refresh its own context if it has expired, hence
refreshing the associated reverse context on server side, before
being able to send the LDLM_CANCEL requested by the server.

Lustre-change: https://review.whamcloud.com/42076
Lustre-commit: 1769f262b96745b61b21fd1450cc4c0386a41b95

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ic8f4fe203f16ed5cfafd3da355c78cf58d96c3eb
Reviewed-on: https://review.whamcloud.com/43173
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoEX-2778 lipe: lipe.spec fixes
John L. Hammond [Fri, 5 Mar 2021 14:55:39 +0000 (08:55 -0600)]
EX-2778 lipe: lipe.spec fixes

In lipe.spec.in, call install without specifying file ownership. Fixup
some bogus changelog dates.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I14946251ef9b39a8bab9f9c53a46d3c544ded240
Reviewed-on: https://review.whamcloud.com/43158
Tested-by: jenkins <devops@whamcloud.com>
4 years agoEX-2930 lipe: fix includes
John L. Hammond [Mon, 29 Mar 2021 14:11:53 +0000 (09:11 -0500)]
EX-2930 lipe: fix includes

In lipe/src/lipe_expression_test.c, include the headers we need and
replace <debug.h> with "debug.h".

Test-Parameters: trivial
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I1c416da5bb61b3219025d93706dfb6e798fccc1c
Reviewed-on: https://review.whamcloud.com/43156
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoEX-2930 lipe: fix errno.h include
John L. Hammond [Fri, 26 Mar 2021 14:20:10 +0000 (09:20 -0500)]
EX-2930 lipe: fix errno.h include

In lipe/src/lustre_ea.c, include "errno.h" rather than <errno.h>.

Test-Parameters: trivial
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I91c7106502cb5f1da04cdf27071584233473f469
Reviewed-on: https://review.whamcloud.com/43133
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43143

4 years agoRM-620 build: New tag 2.14.0-ddn2
Andreas Dilger [Sat, 27 Mar 2021 06:58:02 +0000 (00:58 -0600)]
RM-620 build: New tag 2.14.0-ddn2

New tag 2.14.0-ddn2

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I86c51faa9eb69723465332cc9e132a4e2915bc14

4 years agoLU-14499 revert: LU-13368 lnet: discard the callback
Serguei Smirnov [Mon, 8 Mar 2021 17:46:03 +0000 (09:46 -0800)]
LU-14499 revert: LU-13368 lnet: discard the callback

The changes introduced by LU-13368 have been shown to cause
the o2iblnd shutdown procedure to hang on lustre_rmmod
as it waits indefinitely for peers to disconnect. Revert it.
This reverts commit babf0232273467b7199ec9a7c36047b1968913df.

Lustre-change: https://review.whamcloud.com/41937
Lustre-commit: TBD (from 9a1b64724bdb9452a6c3e14a92c7ef341173d19b)

Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I489ae4af445b18df852ec35adc958c4fac33de09
Reviewed-on: https://review.whamcloud.com/42117
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoEX-2932 llapi: fix '%llu' type mismatch on ppc64le
Minh Diep [Fri, 26 Mar 2021 21:14:58 +0000 (14:14 -0700)]
EX-2932 llapi: fix '%llu' type mismatch on ppc64le

The ppc64le architecture unfortunately defines "__u64" as
"unsigned long" rather than "unsigned long long", which causes a
type mismatch with the '%llu' format.

Change-Id: I0941f0345df101031cdd44c3ac77220ff6b4cc5b
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43144
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
4 years agoRM-620 build: New tag 2.14.0-ddn1
Andreas Dilger [Sun, 7 Mar 2021 06:52:23 +0000 (23:52 -0700)]
RM-620 build: New tag 2.14.0-ddn1

New tag 2.14.0-ddn1

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic113f1c81a31b132da5ed2dcf0378d47553ebbe5

4 years agoLU-10499 pcc: introducing OBD_CONNECT2_PCCRO flag
Qian Yingjin [Mon, 30 Nov 2020 02:08:17 +0000 (10:08 +0800)]
LU-10499 pcc: introducing OBD_CONNECT2_PCCRO flag

Add a new connection flag OBD_CONNECT2_PCCRO to keep access
consistent with old clients that lack PCC-RO support.

Lustre-change: https://review.whamcloud.com/40791
Lustre-commit: TBD (from d9ac6b2e7eaaad892a2ecd0460b0f6915216c1cd)

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I19716e94a86e53353c1628d414c92e61e084dfc9
Reviewed-on: https://review.whamcloud.com/43105
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoEX-2873 pcc: async attach in the background for PCC-RO file
Qian Yingjin [Mon, 22 Mar 2021 09:16:15 +0000 (17:16 +0800)]
EX-2873 pcc: async attach in the background for PCC-RO file

Currently, PCC may incur a long delay while the whole file is
being copied into the cache before it can be used. This delay on
the first file access is significant if the file is large,
which wastes valuable computing time. Shortening the time to
first access may help application efficiency.

This patch adds a tuning parameter "async_threshold", the size
threshold that determines whether the PCC-RO attach is done
asynchronously in the background.

When the file size is smaller than the threshold, the PCC attach
during open() is performed synchronously.
Otherwise, the client starts a dedicated kernel thread to
copy data from Lustre OSTs to the PCC copy in the background, and
reads can fall back to the normal Lustre I/O path from Lustre
OSTs until the file is fully cached.

This may double the reads to the Lustre filesystem initially if
the file is not read sequentially, but avoids the high
latency for data access. There may be some cache sharing (avoiding
double reads) if the PCC copy and the application both share
the filesystem cached pages on the client.

The tuning parameter "llite.*.pcc_async_threshold" defaults to
256MiB.
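
The threshold decision described above can be sketched as follows.
This is an illustration in Python, not the kernel code: the function
and constant names are hypothetical, and only the 256MiB default and
the "smaller than the threshold means synchronous" rule come from the
description.

```python
MIB = 1024 * 1024
DEFAULT_ASYNC_THRESHOLD = 256 * MIB  # default value stated in the description

def attach_mode(file_size, threshold=DEFAULT_ASYNC_THRESHOLD):
    """Return "sync" for files below the threshold, "async" otherwise."""
    # Files strictly smaller than the threshold are attached during open();
    # everything else would be copied by a background thread.
    return "sync" if file_size < threshold else "async"
```

Note that a file whose size equals the threshold takes the
asynchronous path in this sketch, following the "smaller than"
wording.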

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ia80992e9050cc6e4c7f61949fc4013dec303e150
Reviewed-on: https://review.whamcloud.com/42125
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-12358 pcc: add project quota support on PCC backend
Qian Yingjin [Sun, 6 Sep 2020 08:52:04 +0000 (16:52 +0800)]
LU-12358 pcc: add project quota support on PCC backend

Currently, PCC can enforce a quota limit on the capacity usage
of each user and group to provide cache isolation. An admin
can specify the quota enforcement on the local PCC file system.

Users can perform PCC-cached I/O on files until they receive a
return value of -ENOSPC or -EDQUOT, which means that they hit the
quota limit or that there is no free capacity left on the local
PCC backend fs during I/O or the attach process. At this time,
I/O will fall back to the normal I/O path.
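
The fallback behavior can be sketched in user-space Python as below.
This is only an illustration under stated assumptions: the function
write_with_fallback and its callable arguments are hypothetical
stand-ins for the PCC and normal Lustre I/O paths, not Lustre APIs.

```python
import errno

# Errors that mean "quota hit" or "no space left" on the cache backend.
FALLBACK_ERRNOS = {errno.ENOSPC, errno.EDQUOT}

def write_with_fallback(pcc_write, lustre_write, data):
    """Try the cached path first; on ENOSPC/EDQUOT use the normal path."""
    try:
        return "pcc", pcc_write(data)
    except OSError as exc:
        if exc.errno not in FALLBACK_ERRNOS:
            raise  # unrelated errors are not masked by the fallback
        return "lustre", lustre_write(data)
```

Only the two quota and capacity errors trigger the fallback here;
any other error is propagated to the caller.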

This patch adds project quota on the PCC backend file system
along with user/group quota.

With this feature, a single client can have multiple PCC backends
with different caching rules, so we can define up front
how much of the client FS can be used for each cache.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ib93da953d4a3a7091f62094f8175bde91e819895
Reviewed-on: https://review.whamcloud.com/41928
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
4 years agoEX-2455 pcc: get PCC state for a file without opening itself
Qian Yingjin [Thu, 25 Feb 2021 12:43:58 +0000 (20:43 +0800)]
EX-2455 pcc: get PCC state for a file without opening itself

Originally, to get the PCC state of a given file, the user needed to
open the file, get the current PCC state of the file via
the file handle, and then close the file.

If the file meets the predefined condition for auto prefetching
into PCC at open time, the "lfs pcc state" command on the file
will attach the file into the PCC cache. This may not be the
intention of the user.

In this patch, we rework the "lfs pcc state" command. It always
opens the parent directory, and then does the lookup by name/FID
without opening the file itself to get the PCC state.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I310a7e73dc6c0f4318dc27df2e02ecf6559ee5b4
Reviewed-on: https://review.whamcloud.com/41927
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoEX-2455 pcc: check first before set PCC-RO on a file
Qian Yingjin [Fri, 5 Feb 2021 03:48:26 +0000 (11:48 +0800)]
EX-2455 pcc: check first before set PCC-RO on a file

In this patch, the MDT first takes a CR layout lock on the file
object to check whether the file is already PCC-RO cached. If so,
it returns immediately; otherwise, it takes an EX lock on the file
to update the FLR PCC-RO state accordingly. This check
avoids heavy lock contention and unnecessary revocation of the
layout lock granted to other clients when multiple processes
from many clients perform read-only attach on a shared file
simultaneously.

It also adds the layout intent write (LAYOUT_INTENT_PCCRO_SET
and LAYOUT_INTENT_PCCRO_CLEAR) with the FMODE_WRITE flag, so that
the conflicting lock can be revoked via the ELC strategy, avoiding
unnecessary lock traffic.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Change-Id: Id01ea69335ad8ad46bade356327644e0dfb571cc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/41926
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoEX-2921 lipe: add tools/lipe as lipe subtree
John L. Hammond [Thu, 25 Mar 2021 16:45:34 +0000 (11:45 -0500)]
EX-2921 lipe: add tools/lipe as lipe subtree

Merge commit 'e2a8a03f3599c42f85955d9c0339e9cc6a570214' as 'lipe'

git remote add tools/lipe ssh://review.whamcloud.com:29418/tools/lipe
git fetch --no-tags tools/lipe
git subtree add --prefix=lipe --squash tools/lipe/master

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I6e4e6e5349ea42beee3cc202a9bdaf7e29fb5b12

4 years agoSquashed 'lipe/' content from commit 38f79e56ec
John L. Hammond [Thu, 25 Mar 2021 16:45:34 +0000 (11:45 -0500)]
Squashed 'lipe/' content from commit 38f79e56ec

git-subtree-dir: lipe
git-subtree-split: 38f79e56ec2816cefda2e6d8d3e1f56f1992549d

4 years agoLU-12373 pcc: delete stale PCC copy when remove PCC backend
Qian Yingjin [Thu, 22 Oct 2020 08:22:45 +0000 (16:22 +0800)]
LU-12373 pcc: delete stale PCC copy when remove PCC backend

By default, when removing a PCC backend from a client, the action
is to scan the PCC backend FS and uncache (detach and remove) all
scanned PCC copies from PCC by FID.

However, during testing, we found that some old stale PCC copies
are not removed when an administrator runs "lctl pcc del|clear".
The reason is that these PCC copies are already detached from PCC
when the commands are run.

This patch fixes this bug: when removing a PCC backend from a
client, it also deletes all non-cached PCC copies from the PCC
backend to free up the space.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Id829abe7e6cb1294e6baea76452f4a9178711451
Reviewed-on: https://review.whamcloud.com/41925
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-14003 pcc: convert mapping pagecache for mmap
Qian Yingjin [Thu, 22 Oct 2020 01:29:12 +0000 (09:29 +0800)]
LU-14003 pcc: convert mapping pagecache for mmap

In the PCC mmap implementation, mmap() replaces the mapping of
the PCC copy with that of the Lustre file, so that
the mmapped region (vma) links into the mapping of the
Lustre file rather than the mapping of the PCC copy.
In the old design, the pagecache in the original
mapping of the PCC copy is then simply dropped, as the mapping of
each page is different after the replacement of the mapping.

This may have a negative impact on mmap performance.
The reason is that PCC attach writes the data from
Lustre into the PCC copy in buffered I/O mode, so the data stays
in the pagecache, managed by the mapping of the PCC copy, if there
is enough system memory. For a later mmap, the page fault
could then read data directly from the pagecache to speed up the
mmap operation.
If these pagecache pages are dropped because of the different
mapping, the page fault must read the page from disk, which may
result in bad performance.

To make full use of the pagecache of the PCC copy, the mmap
call can first remove each page from the original mapping of
the PCC copy, and then convert and add it into the mapping of the
Lustre file. In this way, all pagecache pages are converted and can
be reused by later page faults.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I1591937543d7d31b8811ec62088accd0070d7d37
Reviewed-on: https://review.whamcloud.com/41924
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-14003 pcc: rework PCC mmap implementation
Qian Yingjin [Wed, 30 Sep 2020 03:00:43 +0000 (11:00 +0800)]
LU-14003 pcc: rework PCC mmap implementation

The old PCC mmap implementation replaces the vm_file with
the file of the PCC copy, calls ->fault() or
->page_mkwrite() on the PCC copy, and then restores the vm_file
to that of the Lustre file.
This design is problematic because a mmapped region (vma) could be
faulted concurrently by multiple child threads (each child
thread can clone the VM of the parent process). There is no
atomicity guarantee for replacing and restoring the vm_file while
calling ->fault() or ->page_mkwrite().

This patch reworks the mmap() implementation for PCC.
In the new design, PCC mmap replaces the inode mapping of the PCC
copy on the PCC backend filesystem with that of the Lustre file.
In this way, the mmapped region (vma) links into the mapping of
the Lustre inode rather than the mapping of the PCC copy.
It keeps using vm_file with the file handle of the PCC copy until
the PCC cached file is detached or unmapped.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Icc5019a691dfb04b5e1fdd580d83915cfe590158
Reviewed-on: https://review.whamcloud.com/41923
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-13881 pcc: comparator support for PCC rules
Qian Yingjin [Thu, 6 Aug 2020 08:29:21 +0000 (16:29 +0800)]
LU-13881 pcc: comparator support for PCC rules

There are increasing requirements for PCC rules to add comparator
support:
- File data larger or smaller than a certain threshold should not
  be auto-cached in PCC (e.g. larger than the capacity of the PCC
  backend on a client).
- Users can specify a range of UID/GID/ProjID for auto caching on
  PCC when defining a rule.

In addition to the original equal (=) operator, this patch also
adds greater than (>) and less than (<) comparison operators.

The following rule expressions are supported:
- "projid={100}&size>{1M}&size<{500G}"
- "projid>{100}&projid<{110}"
- "uid<{1500}&uid>{1000}"
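
To make the rule syntax concrete, here is a small Python sketch that
parses and evaluates expressions of the shape listed above. It is an
illustration, not the Lustre parser: it handles only '&'-joined terms
with a single value per brace, and it assumes that the K/M/G/T
suffixes are binary multiples. The names parse_rule and rule_matches
are hypothetical.

```python
import re

# Assumption: size suffixes are powers of 1024.
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}
_TERM = re.compile(r"^(\w+)([=<>])\{(\d+)([KMGT]?)\}$")
_OPS = {
    "=": lambda a, b: a == b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
}

def parse_rule(rule):
    """Parse 'field OP {value}' terms joined by '&' into (field, op, int) tuples."""
    terms = []
    for term in rule.split("&"):
        m = _TERM.match(term.strip())
        if m is None:
            raise ValueError(f"unsupported term: {term!r}")
        field, op, number, unit = m.groups()
        terms.append((field, op, int(number) * _UNITS[unit]))
    return terms

def rule_matches(rule, attrs):
    """Return True when every term of the rule holds for the file attributes."""
    return all(_OPS[op](attrs[field], value)
               for field, op, value in parse_rule(rule))
```

Combining '>' and '<' terms on the same field, as in the second and
third expressions above, gives an open range check.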

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I9f024eb6903f5652ba3cf04fa289456803493b2c
Reviewed-on: https://review.whamcloud.com/41920
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-12373 pcc: uncache the pcc copies when remove a PCC backend
Qian Yingjin [Fri, 14 Jun 2019 09:29:55 +0000 (05:29 -0400)]
LU-12373 pcc: uncache the pcc copies when remove a PCC backend

Currently, removing a PCC backend from a client does not
do any special handling of previously cached files.
Users can still use the PCC caching service for these files. This
may not be what users want, for the following reasons:

1) For RW-PCC cached files, the data is not restored back
into the Lustre OSTs of the main filesystem. The PCC
backend does fall back to a traditional HSM storage solution,
since the lhsmtool_posix copytool is still running on this
client. But this is dangerous, and likely to cause user data
to be lost if the PCC device becomes permanently unavailable.

2) The space used by these PCC cached files may not be released.

In this patch, when removing a PCC backend from a client, the
default action is to scan the PCC backend fs and uncache
(detach and remove) each PCC copy from PCC by FID.

We also add an option "--keep|-k" for PCC backend removal.
It behaves as before: it just removes the PCC backend, but
retains the data in the cache.

This patch also introduces a common library to scan the HSM
backend.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ib4db36137c025fd78c7022c8b8c39b63e3b9ad4d
Reviewed-on: https://review.whamcloud.com/41919
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-10918 pcc: auto RO-PCC caching when O_RDONLY open files
Qian Yingjin [Wed, 22 Aug 2018 13:19:48 +0000 (21:19 +0800)]
LU-10918 pcc: auto RO-PCC caching when O_RDONLY open files

During the file open() operation, if the file is being opened with
O_RDONLY flags, and the file matches the predefined rule, it will
be prefetched and attached into RO-PCC automatically.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ib2c2ab51d67aed84eb7676c8df191faa33dfad39
Reviewed-on: https://review.whamcloud.com/41918
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-10499 pcc: add readonly mode for PCC
Qian Yingjin [Mon, 23 Jul 2018 14:19:25 +0000 (22:19 +0800)]
LU-10499 pcc: add readonly mode for PCC

Readonly Persistent Client Cache (RO-PCC) shares the same framework
with Readwrite Persistent Client Cache, except that no HSM mechanism
is used in the readonly mode of PCC. Instead, RO-PCC adds a new flag
named LOV_PATTERN_F_RDONLY in the file object's layout to
indicate that the file is in PCC read-only state. It is protected
by the layout lock.

After introducing the readonly feature for the layout, the I/O path
has some changes. For reads, if the file is validly RO-PCC
cached, the file data can be read from PCC directly; otherwise, it
is read from the OSTs using the normal I/O path. Data-modifying
operations (write or truncate) must first clear the readonly flag of
the layout on the MDT (which invalidates the RO-PCC cached state on
clients via the layout lock blocking callback) before performing
I/O.

For RO-PCC, as the PCC cached file is actually a replica of the
Lustre file, a failed data read on PCC can be tolerated
by falling back to the normal read path: read data from OSTs.
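
The read and write paths described above can be modeled with a toy
Python class. This is a sketch of the control flow only, with no
locking or networking; the class RoPccFile and its attributes are
hypothetical, and the boolean readonly_flag merely stands in for
LOV_PATTERN_F_RDONLY.

```python
class RoPccFile:
    """Toy model of a file that may have a read-only PCC copy."""

    def __init__(self, ost_data, pcc_read=None):
        self.ost_data = ost_data          # contents held on the OSTs
        self.pcc_read = pcc_read          # callable returning the PCC copy
        self.readonly_flag = pcc_read is not None

    def read(self):
        if self.readonly_flag:
            try:
                return self.pcc_read()    # fast path: read from the PCC copy
            except OSError:
                pass                      # tolerate the error, fall back below
        return self.ost_data              # normal path: read from OSTs

    def write(self, data):
        # Data-modifying operations clear the read-only flag first, which
        # invalidates the cached state, and only then perform the I/O.
        self.readonly_flag = False
        self.ost_data = data
```

After a write, reads in this model always go to the OST data, since
the cached copy is no longer considered valid.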

This patch also combines PCC-RO with FLR. Similar to plain
layouts, a PCC-RO layout is a kind of HSM non-composite layout and
can be treated as a basic mirror component in FLR layouts.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I6badd72e00a106a0f68950621ce6f82471731a95
Reviewed-on: https://review.whamcloud.com/41917
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-14503 o2iblnd: clean up zombie connections on shutdown
Serguei Smirnov [Thu, 18 Mar 2021 04:00:28 +0000 (21:00 -0700)]
LU-14503 o2iblnd: clean up zombie connections on shutdown

Clean up zombie connections on net shutdown in o2iblnd.
Wake up connd threads and wait for them to do the clean-up
before proceeding.

Lustre-change: https://review.whamcloud.com/42068
Lustre-commit: 016029d97a8af446452b9934f4a01d4ea800ea7e

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ib094e2f480077034e78fe90e2aec9b1349f7e708
Reviewed-on: https://review.whamcloud.com/42069
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>