Whamcloud - gitweb
fs/lustre-release.git
3 years ago EX-3889 lipe: lamigo/lpurge error reporting
Alexandre Ioffe [Wed, 6 Oct 2021 02:37:20 +0000 (19:37 -0700)]
EX-3889 lipe: lamigo/lpurge error reporting

Replace the LAMIGO_{FATAL,ERROR,WARN,INFO,DEBUG}() macros with the
more generally named LX_{FATAL,ERROR,WARN,INFO,DEBUG}() macros and
use them for both lamigo and lpurge. From now on lipe will not use
llapi_printf(), only LX_{FATAL,ERROR,WARN,INFO,DEBUG}() and
llapi_error().
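
As an illustration of the idea of one shared family of level-based logging macros, here is a minimal userspace sketch. The macro names follow the commit message, but their arguments, the lx_log() helper, the level enum, and the output format are all assumptions made for this example, not the definitions used in lipe.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical severity levels; the real LX_* definitions are not shown here. */
enum lx_level { LX_L_FATAL, LX_L_ERROR, LX_L_WARN, LX_L_INFO, LX_L_DEBUG };

static enum lx_level lx_max_level = LX_L_INFO; /* higher levels are dropped */
static char lx_last_msg[512];                  /* last emitted message */

/* Format one message, remember it, print it to stderr. Returns 1 if emitted. */
static int lx_log(enum lx_level level, const char *tool, const char *fmt, ...)
{
    static const char *const names[] = {
        "FATAL", "ERROR", "WARN", "INFO", "DEBUG"
    };
    char body[200];
    va_list ap;

    if (level > lx_max_level)
        return 0;

    va_start(ap, fmt);
    vsnprintf(body, sizeof(body), fmt, ap);
    va_end(ap);

    snprintf(lx_last_msg, sizeof(lx_last_msg), "%s: %s: %s",
             tool, names[level], body);
    fprintf(stderr, "%s\n", lx_last_msg);
    return 1;
}

/* Assumed macro shape: the tool name is passed explicitly in this sketch. */
#define LX_ERROR(tool, ...) lx_log(LX_L_ERROR, tool, __VA_ARGS__)
#define LX_DEBUG(tool, ...) lx_log(LX_L_DEBUG, tool, __VA_ARGS__)
```

Both tools can then share the same macros, for example LX_ERROR("lamigo", "cannot open %s", path) and LX_ERROR("lpurge", ...), and differ only in the tool name they pass.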

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: I4516bb737ec8a308b6e39be2767fd5e03e8b3c61
Reviewed-on: https://review.whamcloud.com/45131
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46096

3 years ago EX-3889 lipe: lamigo error reporting and signal handling
John L. Hammond [Wed, 22 Sep 2021 19:15:52 +0000 (14:15 -0500)]
EX-3889 lipe: lamigo error reporting and signal handling

Add new macros LAMIGO_{FATAL,ERROR,WARN,INFO,DEBUG}() to replace the
existing calls to llapi_error() and llapi_printf(). Replace almost all
open coded calls to exit() with LAMIGO_FATAL(). Handle signals
(SIGTERM, SIGUSR1, SIGUSR2) from a dedicated thread. Add
x{malloc,calloc,strdup}() macros that call LAMIGO_FATAL() on OOM
conditions. In main() replace the while (!stop) loop with a
non-breaking while (1) loop.
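
The x{malloc,strdup}() idea described above can be sketched as follows. SKETCH_FATAL() is a hypothetical stand-in for LAMIGO_FATAL(), and the helper names and message format are assumptions for illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for LAMIGO_FATAL(): report the problem and exit. */
#define SKETCH_FATAL(...)              \
    do {                               \
        fprintf(stderr, __VA_ARGS__);  \
        exit(EXIT_FAILURE);            \
    } while (0)

/* malloc() that never returns NULL for a non-zero size. */
static void *xmalloc_impl(size_t size, const char *file, int line)
{
    void *ptr = malloc(size);

    if (ptr == NULL && size != 0)
        SKETCH_FATAL("%s:%d: cannot allocate %zu bytes\n", file, line, size);
    return ptr;
}

/* strdup() built on top of the checked allocator. */
static char *xstrdup_impl(const char *s, const char *file, int line)
{
    size_t len = strlen(s) + 1;
    char *copy = xmalloc_impl(len, file, line);

    memcpy(copy, s, len);
    return copy;
}

/* Macros record the call site so the fatal message points at the caller. */
#define xmalloc(size) xmalloc_impl((size), __FILE__, __LINE__)
#define xstrdup(s)    xstrdup_impl((s), __FILE__, __LINE__)
```

Callers can then use the result without a NULL check, since an out-of-memory condition ends the process inside the wrapper.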

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Idc31da6eca847305ca16b9992a7fb22aa4d0f112
Reviewed-on: https://review.whamcloud.com/45026
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-on: https://review.whamcloud.com/45210

3 years ago LU-14781 osp: osp_object_free access NULL pointer
Bobi Jam [Tue, 2 Nov 2021 07:14:52 +0000 (15:14 +0800)]
LU-14781 osp: osp_object_free access NULL pointer

If an osp_object is created by multiple threads at the same time,
lu_object_find_at() can allocate an osp_object without calling
osp_object_init(). If, before inserting the object into the hash, it
finds that another thread has already created and inserted the same
object, it frees the uninitialized osp_object, and osp_object_free()
then accesses an uninitialized list_head (opo_xattr_list).

Initialize the osp_object fields in osp_object_alloc() to avoid this.
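
The pattern of the fix, initializing list heads at allocation time so that the free path is safe even when the init step never ran, can be sketched in userspace as below. struct sketch_object and its helpers are hypothetical, much reduced stand-ins and not the osp code.

```c
#include <assert.h>
#include <stdlib.h>

struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Hypothetical, much reduced stand-in for struct osp_object. */
struct sketch_object {
    struct list_head xattr_list; /* plays the role of opo_xattr_list */
    int initialized;             /* set only by the separate init step */
};

/* Allocation leaves every field in a state the free path can handle. */
static struct sketch_object *sketch_object_alloc(void)
{
    struct sketch_object *obj = malloc(sizeof(*obj));

    if (obj == NULL)
        return NULL;
    init_list_head(&obj->xattr_list);
    obj->initialized = 0;
    return obj;
}

static void sketch_object_init(struct sketch_object *obj)
{
    obj->initialized = 1;
}

/* Walks the list before freeing; returns the number of entries seen. */
static int sketch_object_free(struct sketch_object *obj)
{
    struct list_head *pos;
    int entries = 0;

    for (pos = obj->xattr_list.next; pos != &obj->xattr_list; pos = pos->next)
        entries++;
    free(obj);
    return entries;
}
```

A thread that loses the creation race can call sketch_object_free() directly after sketch_object_alloc(), and the list walk terminates immediately on the empty, self-linked head.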

Call trace:
            lu_object_free.isra.30+0xf2/0x170 [obdclass]
            lu_object_find_at+0x496/0x930 [obdclass]
            lod_initialize_objects+0x3e4/0xba0 [lod]
            lod_parse_striping+0x693/0xc20 [lod]
            lod_striping_load+0x2b2/0x660 [lod]
            lod_declare_destroy+0x12b/0x600 [lod]
            mdd_declare_finish_unlink+0x91/0x210 [mdd]
            mdd_unlink+0x48f/0xab0 [mdd]
            mdt_reint_unlink+0xc32/0x1550 [mdt]
            mdt_reint_rec+0x83/0x210 [mdt]
            mdt_reint_internal+0x6e1/0xb00 [mdt]
            mdt_reint+0x67/0x140 [mdt]
            tgt_request_handle+0xaee/0x15f0 [ptlrpc]
            ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
            ptlrpc_main+0xb34/0x1470 [ptlrpc]
            kthread+0xd1/0xe0

Lustre-commit: TBD (from 20dde5a8d428b3f9bf2d0421b333a09545be1c65)
Lustre-change: https://review.whamcloud.com/45442

Fixes: 226fd401f9d ("LU-7660 dne: support fs default stripe")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ib86aca5b41e94a1758f177655ea3a0f680335e0f
Reviewed-on: https://review.whamcloud.com/46094
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
3 years ago EX-4334 tests: disable sanity test_428 temporarily
Andreas Dilger [Fri, 14 Jan 2022 06:38:33 +0000 (23:38 -0700)]
EX-4334 tests: disable sanity test_428 temporarily

sanity test_428 is crashing regularly (about 1/15 runs) on b_es6_0.
Disable it until it is fixed.

Test-Parameters: trivial testlist=sanity env=ONLY=428,ONLY_REPEAT=180
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id3c7722d5f4c4d084bf1dab83733aae8f9d8366f
Reviewed-on: https://review.whamcloud.com/46109
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
3 years ago EX-4189 lov: include FID in some lov asserts
John L. Hammond [Thu, 4 Nov 2021 16:12:57 +0000 (11:12 -0500)]
EX-4189 lov: include FID in some lov asserts

Include the file FID in the assertions in lov_entry() and
lov_mirror_entry(). Use these two functions more consistently in the
lov layer.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I65978fe409842289c158021fb1b8042916d90e23
Reviewed-on: https://review.whamcloud.com/46093
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years ago EX-3046 lipe: remove lamigo_init_vars()
John L. Hammond [Mon, 20 Sep 2021 14:48:42 +0000 (09:48 -0500)]
EX-3046 lipe: remove lamigo_init_vars()

Initialize static lists in their declarations. Remove the unused ssh
global variable. Remove the then unnecessary function
lamigo_init_vars().

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ia5c36e9d2b8f9b0d9467d55c7a2b9d3e7b9f2cf1
Reviewed-on: https://review.whamcloud.com/43398
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45209
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years ago EX-3843 lipe: lamigo signal handling
John L. Hammond [Mon, 20 Sep 2021 14:16:52 +0000 (09:16 -0500)]
EX-3843 lipe: lamigo signal handling

In lamigo, add a SIGTERM handler that calls psignal() and exits with
status 0. Remove lamigo_null_handle() and replace with SIG_IGN. Set
SA_RESTART on the handlers for SIGUSR1 and SIGUSR2.

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ia329a17836cedb1e0d951a67619b828a63c12e67
Reviewed-on: https://review.whamcloud.com/44987
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45208
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years ago EX-4552 lipe: use version-gen.sh
John L. Hammond [Thu, 13 Jan 2022 14:57:29 +0000 (08:57 -0600)]
EX-4552 lipe: use version-gen.sh

Now that b_es6_0 has a lipe tag we can uncomment the code in version-gen.sh.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Iebf497282197add8893f68b19e3bed113f388208
Reviewed-on: https://review.whamcloud.com/46095
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
3 years ago EX-1873 iokit: fix the obsolete usage of cfg_device
Hongchao Zhang [Wed, 14 Oct 2020 01:46:00 +0000 (09:46 +0800)]
EX-1873 iokit: fix the obsolete usage of cfg_device

The LCTL command "cfg_device" is obsolete and some operations
(such as "cleanup", "detach") don't support it anymore.
In mds_survey and lfsck-performance it causes the echo client
device not to be destroyed and causes an LBUG when unmounting the
related Lustre device.

Lustre-change: https://review.whamcloud.com/40227
Lustre-commit: 2e6342a7365825091d9c7b25418033c02ecfbb12

Change-Id: If7f6eff080906e395023289652fcd2a78dfb6fb7
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40227
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45879
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-14895 osd-ldiskfs: combine checksum functions
Andreas Dilger [Wed, 4 Aug 2021 09:42:37 +0000 (03:42 -0600)]
LU-14895 osd-ldiskfs: combine checksum functions

Reduce code duplication for nearly-identical checksum calculations.
The osd_dif_type1_generate() and osd_dif_type3_generate() were nearly
the same, as were osd_dif_type1_verify() and osd_dif_type3_verify().
Combine these functions to share the code, and handle the difference
between T10-PI type 1 and type 3 with an argument.
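
A rough sketch of what sharing one generate routine between the two protection types can look like. All names here are hypothetical, byte-order handling is omitted, and the ref tag convention (low 32 bits of the sector for type 1, zero for type 3) is an assumption based on common T10-PI practice, not taken from the osd-ldiskfs code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_dif_type { SKETCH_DIF_TYPE1 = 1, SKETCH_DIF_TYPE3 = 3 };

/* Simplified protection tuple; on-disk tuples are big-endian. */
struct sketch_pi_tuple {
    uint16_t guard_tag;
    uint16_t app_tag;
    uint32_t ref_tag;
};

/* Bitwise CRC16 using the T10-DIF polynomial 0x8BB7, initial value 0. */
static uint16_t sketch_crc_t10dif(const uint8_t *data, size_t len)
{
    uint16_t crc = 0;
    size_t i;
    int bit;

    for (i = 0; i < len; i++) {
        crc ^= (uint16_t)(data[i] << 8);
        for (bit = 0; bit < 8; bit++) {
            if (crc & 0x8000)
                crc = (uint16_t)((crc << 1) ^ 0x8BB7);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* One routine for both types: only the ref tag handling differs. */
static void sketch_dif_generate(enum sketch_dif_type type,
                                const uint8_t *sector_data, size_t sector_size,
                                uint64_t sector, struct sketch_pi_tuple *pi)
{
    pi->guard_tag = sketch_crc_t10dif(sector_data, sector_size);
    pi->app_tag = 0;
    if (type == SKETCH_DIF_TYPE1)
        pi->ref_tag = (uint32_t)(sector & 0xffffffffu);
    else
        pi->ref_tag = 0;
}
```

A verify routine could take the same type argument and skip the ref tag comparison for type 3, which keeps the guard tag logic in a single place.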

Lustre-change: https://review.whamcloud.com/44656
Lustre-commit: 7fdd664b3518e5e8d8a243898d48d9c62c22e18a

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I40afb15fd80577ef6de918c90e4111e775ce7057
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45828
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-15069 llite: Add start_idx debug
Patrick Farrell [Wed, 15 Dec 2021 17:08:32 +0000 (12:08 -0500)]
LU-15069 llite: Add start_idx debug

When readahead is triggered, current readahead debug
prints the page the user requested which triggered
readahead and the number of pages read by readahead.

However, readahead does not necessarily start reading from
the user requested page, so it's important to also print
the page where readahead starts.

Test-Parameters: trivial

lustre-change: https://review.whamcloud.com/45674/
lustre-commit: ca2bea3659e43649c5f229d7db3f850964b035c6 (tbd)

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ie474811f3b0076f4f914fae7f74496e96ddb31da
Reviewed-on: https://review.whamcloud.com/45865
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years ago LU-15317 llite: Add D_IOTRACE
Patrick Farrell [Wed, 15 Dec 2021 17:07:59 +0000 (12:07 -0500)]
LU-15317 llite: Add D_IOTRACE

When looking into performance problems, it's very important
to be able to trace the I/O patterns from userspace into
Lustre, and also understand the key basics of how Lustre
handles that I/O (readahead, RPC generation).

This is best done with a dedicated debug flag - No
userspace tool can provide all this information, and
existing debug flags collect a huge number of unrelated
pieces of, well, debug information.

The goal is for customers to be able to quickly gather log
files of a reasonable size which contain the necessary
information and which can easily be interpreted by
engineering.  This is not possible if the information is
spread out across a number of heavyweight debug flags.

This is a first pass at adding the flag and the debug
required to track basic data I/O.  One significant
omission in the first patch is RPC generation - I have not
decided how best to do that yet.  That will be added in a
future patch.

lustre-change: https://review.whamcloud.com/#/c/45752/
lustre-commit: e77ef62eb25195ddc4ef63c75dbe7342ddb2b3f5 (tbd)

test-parameters: trivial

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I0ed003ec1488e1c267b194c871f64b34f6dc6025
Reviewed-on: https://review.whamcloud.com/45864
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years ago LU-15317 libcfs: Remove D_TTY
Patrick Farrell [Wed, 15 Dec 2021 17:06:42 +0000 (12:06 -0500)]
LU-15317 libcfs: Remove D_TTY

The D_TTY flag is almost entirely unused and certainly not
needed.  Remove it so we have a spare flag to use for
iotrace.

test-parameters: trivial

lustre-change: https://review.whamcloud.com/45751/
lustre-commit: 8317690ae36918109594208811c3c6358fe46e18 (tbd)

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I1127cbcf6ee51adc07d560a8827fa1e32d16c90c
Reviewed-on: https://review.whamcloud.com/45863
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years ago LU-15137 socklnd: decrement connection counters on close
Serguei Smirnov [Sat, 30 Oct 2021 18:39:26 +0000 (11:39 -0700)]
LU-15137 socklnd: decrement connection counters on close

To gracefully handle potential race with delayed connection create,
decrement connection counters per type as connections are being
closed.

Lustre-change: https://review.whamcloud.com/45422
Lustre-commit: 7e26413aa85fdc931721cde36bae3bf2bb97e63f

Test-Parameters: trivial testlist=sanity-lnet
Fixes: cbf740d0 ("LU-12815 socklnd: add conns_per_peer parameter")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ieb3b44701e4999ea1fe63234162dd5878d65958a
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46051
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years ago LU-15137 socklnd: expect two control connections maximum
Serguei Smirnov [Thu, 4 Nov 2021 18:35:43 +0000 (11:35 -0700)]
LU-15137 socklnd: expect two control connections maximum

As a result of connecting to ourselves, e.g. pinging own nid,
two control type connections are established vs. just one
in case of connecting externally.
Fix the control connection counter to be able to handle that.

Lustre-change: https://review.whamcloud.com/45461
Lustre-commit: ee9a03d8308c5918a17e2e45fd59ee5a4c38acaf

Test-Parameters: trivial testlist=sanity-lnet
Fixes: cbf740d0 ("LU-12815 socklnd: add conns_per_peer parameter")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Idce01d81e3924226b5b163d2472cbcd4f6eb5819
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46050
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
3 years ago LU-14138 ptlrpc: move more members in PTLRPC request into pill
Qian Yingjin [Tue, 17 Nov 2020 15:12:44 +0000 (23:12 +0800)]
LU-14138 ptlrpc: move more members in PTLRPC request into pill

Some data members in the data structure @ptlrpc_request can be
moved into the data structure @req_capsule:
	/** Request message - what client sent */
	struct lustre_msg *rq_reqmsg;
	/** Reply message - server response */
	struct lustre_msg *rq_repmsg;
	/** Fields that help to see if request and reply were swabbed */
	__u32 rq_req_swab_mask;
	__u32 rq_rep_swab_mask;

After these data structures are restructured, @req_capsule can be
used more generally, and it becomes easier to pack and unpack
sub-requests in a batch PtlRPC request for the coming batch
metadata processing.

Lustre-change: https://review.whamcloud.com/40669
Lustre-commit: f75d2a1fc9b17b384bbcbc13bcb80ba10412cf29

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ib6d942b79ebf1a444d63b55ad4bc94813cf947c7
Reviewed-on: https://review.whamcloud.com/46029
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-13055 doc: update changelog manpages
Mikhail Pershin [Thu, 17 Jun 2021 14:11:51 +0000 (17:11 +0300)]
LU-13055 doc: update changelog manpages

Add lctl-changelog_register.8 and lctl-changelog_deregister.8
manpages and update lctl.8 manpage to refer to them.

Lustre-change: https://review.whamcloud.com/44022
Lustre-commit: 393885c027793d27ec948fd4fccb47aa530d2bf8

Fixes: 15305c3c3fe7 ("LU-12214 build: fix build without lustre_utils")
Test-Parameters: trivial
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ie41db630c72f61a884cd8000e0a4aeeb42ca60eb
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46007
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-5369 mdt: check lock handle instead assert
Yang Sheng [Mon, 13 Sep 2021 21:04:00 +0000 (05:04 +0800)]
LU-5369 mdt: check lock handle instead assert

The lock handle could be NULL in some corner cases.
We should check it instead of triggering an LBUG.

Lustre-change: https://review.whamcloud.com/44905
Lustre-commit: 5e4411e99cd7d0ccf4e51fac1442673844626639

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I1afa7f8c129c104b012ae23141318365c388c503
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46019
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-14474 llog: don't destroy next llog
Alex Zhuravlev [Tue, 21 Sep 2021 12:23:56 +0000 (15:23 +0300)]
LU-14474 llog: don't destroy next llog

Do not destroy an empty llog if it's referenced as
the next one in a catalog.

Lustre-change: https://review.whamcloud.com/44998
Lustre-commit: 4521f6af35d1dc20b531b87ff3633d89dbac86ec

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I78bfeb90435aaee2b8536b647aa3acec56642ea0
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45892
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-15168 osd: use large allocation for idc cache
Alex Zhuravlev [Wed, 27 Oct 2021 05:48:03 +0000 (08:48 +0300)]
LU-15168 osd: use large allocation for idc cache

as in some cases (e.g. ofd precreate) the cache can grow to dozens
of kilobytes (sizeof(struct idc_map_cache)=40 * 1024).

Lustre-commit: a3aa2eefd3d4708ce7094ed644c30b784c39eb2c
Lustre-change: https://review.whamcloud.com/45382

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Id9e0996a7a1d07065f4a50c1d5be5051e756559a
Reviewed-on: https://review.whamcloud.com/46040
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-14959 ldlm: Check return value of ldlm_resource_get()
Oleg Drokin [Tue, 24 Aug 2021 03:44:45 +0000 (23:44 -0400)]
LU-14959 ldlm: Check return value of ldlm_resource_get()

Fix the comment to properly indicate it returns ERR_PTR on
error and fix osc_req_attr_set() and mdc_get_lock_handle()
to actually check the return value before passing it on and
causing an unintended crash.
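
The error-pointer convention mentioned above can be illustrated with a small userspace imitation. The sketch_* helpers mimic the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() idea; the resource type and the lookup function are hypothetical and stand in for the real ldlm code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace imitations of the kernel's error-pointer helpers. */
#define SKETCH_MAX_ERRNO 4095

static inline void *sketch_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

static inline long sketch_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static inline int sketch_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)(-SKETCH_MAX_ERRNO);
}

struct sketch_resource { int refcount; };

/* Hypothetical lookup that returns an error pointer, never NULL, on failure. */
static struct sketch_resource *
sketch_resource_get(struct sketch_resource *table, int have_memory)
{
    if (!have_memory)
        return sketch_err_ptr(-ENOMEM);
    table->refcount++;
    return table;
}

/* Caller checks the result before use and propagates the error code. */
static long sketch_use_resource(struct sketch_resource *table, int have_memory)
{
    struct sketch_resource *res = sketch_resource_get(table, have_memory);

    if (sketch_is_err(res))
        return sketch_ptr_err(res);
    return res->refcount;
}
```

A plain NULL check would miss such a failure, because an error pointer is non-NULL; dereferencing it is what leads to the kind of crash the commit describes.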

Lustre-change: https://review.whamcloud.com/44738
Lustre-commit: 3e0aa9ca6e0a9a6981b9a3ad5f556cd6554a6b5b

Change-Id: Ib85a62140a39744e85989c9a9c8aa2ed771d70d1
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46016
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago EX-4270 kernel: increase kernel version to ddn16
Andreas Dilger [Sat, 8 Jan 2022 06:17:20 +0000 (23:17 -0700)]
EX-4270 kernel: increase kernel version to ddn16

Increase kernel build version to -ddn16 due to new kernel patch.

Lustre-change: https://review.whamcloud.com/45869
Lustre-commit: 9cb39cdf470f444decaf183af7b4b6f6a79f80bf

Fixes: afd8b0df0aba ("EX-4270 snapshot: avoid call quota op recursively")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Icf0d404ea5ebfb1009078a286585d837b37417ea
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/46023
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago EX-4270 snapshot: avoid call quota op recursively
Hongchao Zhang [Tue, 30 Nov 2021 10:11:01 +0000 (18:11 +0800)]
EX-4270 snapshot: avoid call quota op recursively

In ext4_snapshot_test_and_cow, if a quota call is already in
progress, a deadlock can occur if the snapshot code recursively
calls a quota function to allocate space.

[only the change to snapshot-jbd2-rhel7.7.patch]

Lustre-change: https://review.whamcloud.com/45680
Lustre-commit: 4722f1a0ca9d24bf6fa2678659ccf2cb1be5cdf1

Test-Parameters: trivial
Change-Id: Iac354744fcee8955d8e41020f9cee6d433f38e80
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46009
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
3 years ago LU-14793 hsm: record index for further HSM action scanning
Qian Yingjin [Fri, 25 Jun 2021 08:22:35 +0000 (16:22 +0800)]
LU-14793 hsm: record index for further HSM action scanning

There is contention between HSM archive requests and the "hsm_cdtr"
kernel thread:
->mdt_hsm_request()
  ->mdt_hsm_add_actions()
    ->mdt_hsm_register_hal()
      ->mdt_agent_record_add()
        ->down_write(&cdt->cdt_llog_lock)
        ->llog_cat_add()
        ->up_write(&cdt->cdt_llog_lock)

->mdt_coordinator()
  ->cdt_llog_process()
    ->down_write(&cdt->cdt_llog_lock);
    ->llog_cat_process()
    ->up_write(&cdt->cdt_llog_lock);

HSM archive requests and the HSM catalog llog scan in the kernel
daemon "hsm_cdtr" both contend for the llog write lock to add to or
update the "hsm_actions" llog.

In testing, max_requests = 1000000 was used.
With the current implementation, this means the kernel daemon thread
"hsm_cdtr" needs to scan nearly the whole "hsm_actions" llog from the
beginning with the llog write lock held.
This slows down the HSM archive requests, which are contending
for the llog write lock.

As the llog is append-only, we record the latest handled position in
the llog, so that the next scan can start from the previously recorded
position (llog index) and does not need to start from the beginning.

Another way to mitigate this problem is:
when the llog scanner finds that other processes are
contending for the llog lock, it stops the llog scan and
releases the llog write lock for incoming HSM archive requests.

After applying this patch, with 200000 HSM actions in the llog, the
time to queue 10000 HSM archive requests is reduced from 10 seconds
to 4 seconds.
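
The resume-from-recorded-index idea can be sketched with a toy append-only log. Everything here is hypothetical: the structure, the function names, and the `budget` parameter, which merely stands in for "stop early when someone else wants the lock". Locking itself is left out.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_LOG_CAPACITY 64

/* Hypothetical append-only log; records are never removed or reordered. */
struct sketch_log {
    int records[SKETCH_LOG_CAPACITY];
    size_t count;      /* number of records appended so far */
    size_t last_index; /* first index the scanner has not handled yet */
};

static int sketch_log_append(struct sketch_log *log, int record)
{
    if (log->count == SKETCH_LOG_CAPACITY)
        return -1;
    log->records[log->count++] = record;
    return 0;
}

/*
 * Scan only the records added since the previous call and stop early once
 * 'budget' records were visited. Returns the number of records visited.
 */
static size_t sketch_log_scan(struct sketch_log *log, size_t budget)
{
    size_t visited = 0;

    while (log->last_index < log->count && visited < budget) {
        /* a real scanner would handle log->records[log->last_index] here */
        log->last_index++;
        visited++;
    }
    return visited;
}
```

With the index remembered between calls, the cost of a scan is proportional to the number of new records rather than to the total length of the log.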

Lustre-change: https://review.whamcloud.com/44077
Lustre-commit: a15a5432f8063e3a04a87d74eafac0060a8f9d26

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I2e92daf34844605ee648787daf859143335c68bf
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46013
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-14724 nrs: TBF rule list broken when change rule rank
Qian Yingjin [Fri, 28 May 2021 03:56:12 +0000 (11:56 +0800)]
LU-14724 nrs: TBF rule list broken when change rule rank

When changing the rank of two adjacent rules in the TBF rule list in
@nrs_tbf_rule_change_rank():
list_move(&rule->tr_linkage, next_rule->tr_linkage.prev);

The previous pointer of @next_rule is @rule, so using list_move
directly will break the rule list.
This patch uses list_del + list_add to replace list_move and
avoid breaking the TBF rule list.
It also adds a test case, sanityn test_77o, for this bug.
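
To make the adjacency problem concrete, here is a minimal userspace doubly linked list, written for this note and not the kernel's list.h. The key point in move_before() is that the insertion position next->prev is read only after the item has been unlinked, so it can never be the item itself.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Insert 'item' right after 'pos'. */
static void list_add_after(struct list_head *item, struct list_head *pos)
{
    item->next = pos->next;
    item->prev = pos;
    pos->next->prev = item;
    pos->next = item;
}

/* Unlink 'item'; its former neighbours now point at each other. */
static void list_unlink(struct list_head *item)
{
    item->prev->next = item->next;
    item->next->prev = item->prev;
}

/*
 * Move 'item' so that it sits immediately before 'next'.
 * 'next->prev' is evaluated after the unlink; if it were evaluated first
 * and 'item' was already adjacent, the item would be re-inserted after
 * itself and the list would be corrupted.
 */
static void move_before(struct list_head *item, struct list_head *next)
{
    list_unlink(item);
    list_add_after(item, next->prev);
}
```

In the adjacent case the item ends up in the same place it started, with all forward and backward links intact.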

Lustre-change: https://review.whamcloud.com/43925
Lustre-commit: e688f29275deeadc0ef4faa01f166986bade301f

Fixes: aa14b0b9a152 ("LU-8006 ptlrpc: specify ordering of TBF policy rules")
Change-Id: Ica30d3329f07914657ac2c4089d66f934021b763
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46017
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-14713 llite: mend the trunc_sem_up_write()
Bobi Jam [Tue, 3 Aug 2021 06:38:46 +0000 (14:38 +0800)]
LU-14713 llite: mend the trunc_sem_up_write()

The original lli_trunc_sem replace change (commit e5914a61ac) fixed a
lock scenario:

  t1 (page fault)          t2 (dio read)              t3 (truncate)
|- vm_mmap_pgoff()       |- vvp_io_read_start()     |- vvp_io_setattr
 |- down_write(mmap_sem)  |- down_read(trunc_sem)            _start()
  |- do_map()              |- ll_direct_IO_impl()
   |- vvp_io_fault_start    |- ll_get_user_pages()

                                                     |- down_write(
                             |- down_read(mmap_sem)        trunc_sem)
    |- down_read(trunc_sem)

t1 waits for the read semaphore of trunc_sem, which is blocked by t3,
since t3 is waiting for the write semaphore while t2 holds its read
semaphore; t2 is waiting for mmap_sem, which has been taken by t1,
and a deadlock ensues.

commit e5914a61ac changes the down_read(trunc_sem) to
trunc_sem_down_read_nowait() in page fault path, to make it ignore
that there is a down_write(trunc_sem) waiting, just takes the read
semaphore if no writer has taken the semaphore, and breaks the
deadlock.

But there is a subtlety in using wake_up_var(): wake_up_var()->
__wake_up_bit()->waitqueue_active() locklessly tests for waiters on the
queue, and if it's called without an explicit smp_mb() it's possible for
the waitqueue_active() to get hoisted before the condition store, such
that we'll observe an empty wait list while the waiter might not
observe the condition, and the waiter won't get woken up thereafter.

Lustre-change: https://review.whamcloud.com/43844
Lustre-commit: 39745c8b5493159bbca62add54ca9be7cac6564f

Fixes: e5914a61ac ("LU-12460 llite: replace lli_trunc_sem")
Change-Id: Ifdda2c1c8a4171466be1723923c136e84de8ce0e
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46014
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
3 years ago LU-14854 mdd: proper handle error in mdd_swap_layouts()
Bobi Jam [Thu, 15 Jul 2021 18:22:38 +0000 (02:22 +0800)]
LU-14854 mdd: proper handle error in mdd_swap_layouts()

Only restore object's HSM xattr on error if it's for
SWAP_LAYOUTS_MDS_HSM.

Lustre-change: https://review.whamcloud.com/44319
Lustre-commit: 7648c1c905b0976fc789cfd9c6bac382389385ee

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I9d4c58cd3107c3900e72a0946d0ec7d7286dd43f
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46021
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
3 years ago LU-14895 brw: log T10 GRD tags during checksum calcs
Andreas Dilger [Wed, 4 Aug 2021 08:08:12 +0000 (02:08 -0600)]
LU-14895 brw: log T10 GRD tags during checksum calcs

Log the T10 guard tags during checksum calculation on the client and
target to help identify where checksum errors are being introduced.
The added debugging is only active on RPC resend, so will not add
overhead during the normal IO path.

Lustre-change: https://review.whamcloud.com/44655
Lustre-commit: 75ebfb994fb0bce8a0f0400429f04127ead50ea4

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia4f14f2f2296da096acf629c74558386e7ce7057
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46053
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-14598 ofd: fix for IDIF sequence at ofd_preprw_write
Alexander Boyko [Thu, 8 Apr 2021 08:23:54 +0000 (04:23 -0400)]
LU-14598 ofd: fix for IDIF sequence at ofd_preprw_write

During recovery, a write operation could create and load a sequence
if it arrives before the creation request from MDT0. ofd_preprw_write()
uses the wrong logic for obtaining the sequence for IDIF FIDs. If the
oid overflows 32 bits and takes up part of the IDIF sequence, the write
request loads the wrong ofd sequence, which is then used for other IO.
The next create from MDT0 causes an error:
Too many FIDs to precreate OST replaced or reformatted...

Test 122b reproduces the issue where the OST uses a wrong sequence for
the MDT0 IDIF. This error requires object IDs greater than 32 bits and
a write request during recovery that is processed before a create
request from MDT0.
For a visible error on the console, the last object ID should be
1<<32 + (OST_MAX_PRECREATE * 5). The error is
lustre-OST0000: Too many FIDs to precreate OST replaced or
    reformatted: LFSCK will clean up
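
For orientation, a sketch of how an object ID wider than 32 bits can spill into the sequence part of an IDIF FID. The constants and the bit layout below are assumptions recalled from Lustre's FID headers and should be verified against the source; the sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; verify against the Lustre headers before relying on them. */
#define SKETCH_FID_SEQ_IDIF     0x100000000ULL
#define SKETCH_FID_SEQ_IDIF_MAX 0x1ffffffffULL

/* Assumed layout: IDIF base | OST index << 16 | bits 32..47 of the object ID. */
static uint64_t sketch_idif_seq(uint64_t object_id, uint32_t ost_index)
{
    return SKETCH_FID_SEQ_IDIF |
           ((uint64_t)(ost_index & 0xffff) << 16) |
           ((object_id >> 32) & 0xffff);
}

/* The low 32 bits of the object ID become the oid part of the FID. */
static uint32_t sketch_idif_oid(uint64_t object_id)
{
    return (uint32_t)object_id;
}

static int sketch_seq_is_idif(uint64_t seq)
{
    return seq >= SKETCH_FID_SEQ_IDIF && seq <= SKETCH_FID_SEQ_IDIF_MAX;
}
```

Under these assumptions, two objects on the same OST whose IDs differ only above bit 31 map to different sequence values, so code that derives the sequence must take the high bits of the object ID into account.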

Lustre-change: https://review.whamcloud.com/43248
Lustre-commit: 747fed818be5a4e09281ab1d9fd5b3a13763ab40

HPE-bug-id: LUS-9595
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I09e6f88b1f0d03fec59b24ef096cbc7baa5388ae
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46015
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
3 years ago LU-14951 llite: protect fd_{lease_}och
Bobi Jam [Wed, 18 Aug 2021 13:32:21 +0000 (21:32 +0800)]
LU-14951 llite: protect fd_{lease_}och

Access to ll_file_data::fd_och and fd_lease_och needs
lli_och_mutex protection.

Lustre-change: https://review.whamcloud.com/44700
Lustre-commit: b275ccd9787753b9cbf4368d8611c2ac94726e2e

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ie9136aa345c6bf015aa73067acdaecf1a765b9f6
Reviewed-on: https://review.whamcloud.com/46030
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
3 years ago LU-15156 kernel: back port patch for rwsem issue
Yang Sheng [Tue, 11 Jan 2022 17:06:05 +0000 (01:06 +0800)]
LU-15156 kernel: back port patch for rwsem issue

RHEL7 included a defect in rwsem. It can cause a
thread to hang indefinitely waiting on an rwsem. Backport
commit 5c1ec49b60cdb31e51010f8a647f3189b774bddf
to fix this issue.

Lustre-commit: 85362faed8f5ee94ffee1f3f6330beee57ea9284
Lustre-change: https://review.whamcloud.com/45383

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Ic5c469ce744ad5882c13163a9bfe14faef8fd446
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46041
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-14734 ldiskfs: improve message for large_dir
Andreas Dilger [Tue, 11 Jan 2022 17:49:41 +0000 (09:49 -0800)]
LU-14734 ldiskfs: improve message for large_dir

Make it more clear that the large_dir feature has already been
enabled, rather than making the admin think that they need to
enable the feature themselves.

Lustre-change: https://review.whamcloud.com/45046
Lustre-commit: 2a24b6ec67da9224e1cb6226166cde3a9c95431d

Test-Parameters: trivial
Fixes: f5967b06aac5 ("LU-14734 osd-ldiskfs: enable large_dir automatically")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ica59d3370148ed277d3541c05be065c4638daf8d
Reviewed-on: https://review.whamcloud.com/46045
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
3 years ago LU-13397 llite: support fallocate() on selected mirror
Mikhail Pershin [Sun, 22 Aug 2021 19:41:33 +0000 (22:41 +0300)]
LU-13397 llite: support fallocate() on selected mirror

- add ability to do fallocate() on designated mirror in
  FLR file
- add missing FALLOC_FL_KEEP_SIZE flag to fallocate() call
  in llapi_hole_punch(). Without that flag it was silently
  not working
- add corresponding test_50d in sanity-flr.sh

Lustre-change: https://review.whamcloud.com/44721
Lustre-commit: 89736d502cc99f095237dde7520fc4ca86191882

Fixes: 4126fbb30c ("LU-13397 lfs: mirror resync to keep sparseness")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I8d700fce904c84458a50650f1d3cb09d23989eba
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46032
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
3 years agoEX-3626 build: build ptlrpc_gss during ubuntu dkms
Minh Diep [Mon, 9 Aug 2021 19:45:45 +0000 (12:45 -0700)]
EX-3626 build: build ptlrpc_gss during ubuntu dkms

include ptlrpc_gss in dkms.conf

Lustre-change: https://review.whamcloud.com/44539

Change-Id: I952a7019b2bc5687507fdb1f274c100152dae6cd
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46018
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
3 years agoLU-14807 lfsck: fix race in lfsck_pos_fill
Hongchao Zhang [Sun, 27 Jun 2021 21:00:20 +0000 (05:00 +0800)]
LU-14807 lfsck: fix race in lfsck_pos_fill

There is a race for lfsck->li_di_dir between lfsck_di_dir_put and
lfsck_pos_fill, which could cause lfsck_pos_fill to use freed
lfsck->li_di_dir (struct osd_it_ea) and trigger GPF.

Lustre-change: https://review.whamcloud.com/44130
Lustre-commit: 911f638bd6c547591e784fcec668fe9811916e21

Change-Id: Iedadf03ac15d128bb051aea8aafa24dbcd2704fb
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46020
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
3 years agoLU-15244 llite: set ra_pages of backing_dev_info with 0
Qian Yingjin [Wed, 5 Jan 2022 00:46:35 +0000 (16:46 -0800)]
LU-15244 llite: set ra_pages of backing_dev_info with 0

The latest RHEL8.5 kernel sets initial @ra_pages of
backing_dev_info with VM_READAHEAD_PAGES:
struct backing_dev_info *bdi_alloc(int node_id)
{
        ...
        bdi->ra_pages = VM_READAHEAD_PAGES;
        bdi->io_pages = VM_READAHEAD_PAGES;
        ...
}

As a result, @ra_pages in the file readahead state is initialized
from @bdi->ra_pages, which takes readahead out of Lustre's control
and wrongly triggers the readahead logic in the Linux kernel. This
causes sanity test 101j to fail.

In this patch, we force @ra_pages of backing_dev_info to 0 after
setting up the backing device info. This disables kernel readahead
for the super block.

This patch also cleans up the unnecessary setting of @ra_pages in
llite "file.c" and "vvp_io.c".

Lustre-change: https://review.whamcloud.com/45712
Lustre-commit: 878561880d2aba038db95e199f82b186f22daa45

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: If6468109620269c1e76abe3a1cd73c3b40a417a8
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45971
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
3 years agoRM-620 build: New tag 2.14.0-ddn28
Minh Diep [Tue, 11 Jan 2022 17:56:08 +0000 (09:56 -0800)]
RM-620 build: New tag 2.14.0-ddn28

Change-Id: I1a1d9c767cfab91a833dece3b4663ec89b2c759b

3 years agoEX-4052 tests: use stack_trap within a subtest in sanity-lipe
Jian Yu [Mon, 10 Jan 2022 08:37:25 +0000 (00:37 -0800)]
EX-4052 tests: use stack_trap within a subtest in sanity-lipe

Trap in a subshell is handled differently across bash versions.
This patch moves stack_trap into subtests to make them work
reliably.

Test-Parameters: trivial clientdistro=el7.9 testlist=sanity-lipe
Test-Parameters: trivial clientdistro=el8.4 testlist=sanity-lipe

Lustre-change: https://review.whamcloud.com/45889
Lustre-commit: db8e16fd8434651ed6102a6be79828346775f87e

Change-Id: I00eac5a8cf9511c8e1e531eb54f52ce443e5f77e
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46026
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-15417 build: build MOFED 5.5
Minh Diep [Thu, 6 Jan 2022 21:02:39 +0000 (13:02 -0800)]
LU-15417 build: build MOFED 5.5

The path to the MOFED header files has changed to
/usr/src/ofa_kernel/x86_64/<kernel>
so we cannot assume it is /usr/src/ofa_kernel/default

Test-Parameters: trivial
Change-Id: I10f375b459f04b84003e70951e4e423295001f40
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46004
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
3 years agoRM-620 build: New tag 2.14.0-ddn27
Andreas Dilger [Thu, 16 Dec 2021 08:37:59 +0000 (01:37 -0700)]
RM-620 build: New tag 2.14.0-ddn27

New tag 2.14.0-ddn27

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I2ec7b94cf7b8ea4d90b0e8ff1f2301e48f4d3b0e

3 years agoLU-15337 kernel: kernel update SLES15 SP3 [5.3.18-59.37.2]
Jian Yu [Wed, 8 Dec 2021 08:25:29 +0000 (00:25 -0800)]
LU-15337 kernel: kernel update SLES15 SP3 [5.3.18-59.37.2]

Update SLES15 SP3 kernel to 5.3.18-59.37.2 for Lustre client.

Test-Parameters: trivial clientdistro=sles15sp3 \
env=SANITY_EXCEPT="103 125 154" \
testlist=sanity

Change-Id: Ie89a1d805460420b79bb7f345918b299e08de853
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45787
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-15196 kernel: kernel update RHEL8.4 [4.18.0-305.25.1.el8_4]
Jian Yu [Sun, 5 Dec 2021 09:20:45 +0000 (01:20 -0800)]
LU-15196 kernel: kernel update RHEL8.4 [4.18.0-305.25.1.el8_4]

Update RHEL8.4 kernel to 4.18.0-305.25.1.el8_4 for Lustre client.

Test-Parameters: trivial clientdistro=el8.4 testlist=sanity

Change-Id: Ic70f7330f90a36646bb36e0c6015ea22882b20b9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45530
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
3 years agoLU-14690 kernel: RHEL 8.4 server support
Jian Yu [Thu, 30 Sep 2021 19:05:35 +0000 (12:05 -0700)]
LU-14690 kernel: RHEL 8.4 server support

This patch makes changes to support RHEL 8.4 release with
kernel 4.18.0-305.19.1.el8_4 for Lustre server.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.4 serverdistro=el8.4 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.4 serverdistro=el8.4 testlist=sanity

Lustre-change: https://review.whamcloud.com/43791
Lustre-commit: 644a14196810f0c6b663957720414e042d2ae965

Change-Id: I484af80c4764367b40b28ce459a6ff9d87edf3a8
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44061
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-13783 ldiskfs: Add support for mainline 5.8 kernel
Mr NeilBrown [Sat, 11 Sep 2021 07:11:51 +0000 (00:11 -0700)]
LU-13783 ldiskfs: Add support for mainline 5.8 kernel

Various changes needed for 5.8 over 5.4:
 - ext4_mark_inode_dirty is now a macro, so export
     __ext4_mark_inode_dirty instead
 - procfs additions need to use 'struct proc_ops'
 - inode-test.c is a new C file that we MUST NOT build
 - various ordinary conflicts

Lustre-change: https://review.whamcloud.com/40373
Lustre-commit: 849e93f4091a3003706668076864f086b9d59238

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I681ab26c60fb35a1ef5f518ee7cac8766e6fde47
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44361
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-15136 socklnd: default conns_per_peer to 0
Serguei Smirnov [Thu, 21 Oct 2021 02:09:06 +0000 (19:09 -0700)]
LU-15136 socklnd: default conns_per_peer to 0

Setting conns_per_peer to 0 triggers socklnd to choose the
(heuristically) optimal setting for the interface given its speed.
Make 0 the default for socklnd conns_per_peer.

Lustre-change: https://review.whamcloud.com/45319
Lustre-commit: 30a028e2ee2b3eead94abd6657edc3880ec89434

Test-parameters: trivial testlist=sanity-lnet

Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Fixes: c44afcfb72 ("LU-12815 socklnd: set conns_per_peer based on link speed")
Change-Id: Ie6e76eaee8693472384cce362b394b216142884e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45744
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-12815 socklnd: set conns_per_peer based on link speed
Serguei Smirnov [Wed, 28 Jul 2021 21:47:39 +0000 (14:47 -0700)]
LU-12815 socklnd: set conns_per_peer based on link speed

Specifying conns_per_peer=0 for a ni is now used to set
the conns_per_peer as a function of the corresponding link speed
as follows:
conns_per_peer = (ilog2(Gbps) / 2 + 1)

Listed below are the resulting defaults for common link speeds:
        100Gbps, 200Gbps -> 4
        50Gbps           -> 3
        5Gbps, 10Gbps    -> 2
        less than 4Gbps  -> 1
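
The formula and the defaults listed above can be checked with a short
userspace sketch. The helper names ilog2_u32() and
conns_per_peer_for_speed() are hypothetical stand-ins, not the socklnd
code, and treating speeds below 1Gbps as one connection is an assumption.

```c
#include <assert.h>

/* floor(log2(v)) for v >= 1; a userspace stand-in for the kernel's ilog2(). */
static unsigned int ilog2_u32(unsigned int v)
{
	unsigned int bits = 0;

	assert(v >= 1);
	while (v >>= 1)
		bits++;
	return bits;
}

/*
 * Hypothetical helper: conns_per_peer = ilog2(Gbps) / 2 + 1, using
 * integer division as in the commit message. Speeds below 1Gbps are
 * mapped to 1 (assumption; the message only says "less than 4Gbps").
 */
static unsigned int conns_per_peer_for_speed(unsigned int gbps)
{
	if (gbps < 1)
		return 1;
	return ilog2_u32(gbps) / 2 + 1;
}
```

For example, 100Gbps gives ilog2(100) = 6, so 6 / 2 + 1 = 4, and 50Gbps
gives ilog2(50) = 5, so 5 / 2 + 1 = 3 with integer division.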

Lustre-change: https://review.whamcloud.com/44417
Lustre-commit: c44afcfb72a1c2fd8392bfab3143c3835b146be6

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ief2b33a796c180d8669bd5796b3e35ec748423a5
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45742
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-15150 tests: sanity-lnet removes testsuite log on failure
Chris Horn [Fri, 22 Oct 2021 01:34:23 +0000 (01:34 +0000)]
LU-15150 tests: sanity-lnet removes testsuite log on failure

cleanup_testsuite() needs to be more selective when removing files
created by sub-tests.

Lustre-change: https://review.whamcloud.com/45342
Lustre-commit: 29918b2db487e7ec8b0bdf785b0a436332824db6

Test-Parameters: trivial testlist=sanity-lnet
Fixes: aa739144551 ("LU-13569 tests: Check LNet Health recovery logic")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ic17a68ff2aa552594a0f1ea470c39177abe985fc
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45743
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-12815 socklnd: allow dynamic setting of conns_per_peer
Serguei Smirnov [Mon, 2 Aug 2021 14:48:35 +0000 (10:48 -0400)]
LU-12815 socklnd: allow dynamic setting of conns_per_peer

Modify lnetctl and associated code to allow dynamic setting
of conns_per_peer lnd parameter per ni.

The parameter can be set for a specific active nid:
        lnetctl net set --nid 192.168.122.10@tcp --conns-per-peer=4

Or when adding a new net, taking effect on the new nid:
        lnetctl net add --net tcp --if eth0 --conns-per-peer=1

By default, the conns_per_peer value specified as the module parameter
is used.

Lustre-change: https://review.whamcloud.com/41463
Lustre-commit: a5cbe7883db6d77b82fbd83ad4c662499421d229

LU-15089 tests: allow enough time to create tcp connections

Allow enough time to create tcp connections before counting them
when testing socklnd conns_per_peer setting in sanity-lnet test_230

Lustre-change: https://review.whamcloud.com/45331
Lustre-commit: 5c766b005bf3e0bca0efa9d87ccf230e7cba97cc

LU-14991 tests: Correct whitespace in sanity-lnet test_101/102

sanity-lnet.sh test_100 and test_101 use tab characters in the
expected yaml output, but yaml syntax does not allow tab characters.

Lustre-change: https://review.whamcloud.com/44856
Lustre-commit: 38b18436f220931924210c9019028ea8589adc1d

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I11625b9ad61f0311c294001a38b7855465491aaf
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45741
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14990 tests: Detect correct LNet interface for sanity-lnet
Chris Horn [Tue, 7 Sep 2021 15:24:14 +0000 (10:24 -0500)]
LU-14990 tests: Detect correct LNet interface for sanity-lnet

Determine the names of the interfaces used for LNet by parsing the
NIDs configured after calling load_modules(). Tests which reference
eth0 are modified to use the interface associated with the primary
NID (i.e. first NID output by lctl list_nids).

Lustre-change: https://review.whamcloud.com/44857
Lustre-commit: f9669c4d3092d44cbc2e2d3c225aee6ebaf268e9

Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-10385
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Id715aa3e5470d9c110f6248620b1a83920875e7b
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45760
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-15171 osd-ldiskfs: xattr_sem locking missing dquot_transfer
Andrew Perepechko [Sun, 31 Oct 2021 20:03:30 +0000 (23:03 +0300)]
LU-15171 osd-ldiskfs: xattr_sem locking missing dquot_transfer

Kernel commit 7a9ca53ae (~v4.13) added the requirement for xattr_sem
locking when calling *dquot_transfer. As of now, in rare cases, it is
possible that we can modify inode xattrs and perform their consistency
checks in parallel, which can fail.

Lustre-change: https://review.whamcloud.com/45424
Lustre-commit: e6c7fcdaf40b130c39af2e3ee8b108c6e31a8ca8

Change-Id: I041694e30ce6c8398864c0ad57671df0bffd2f52
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
HPE-bug-id: LUS-10549
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45750
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
3 years agoLU-15245 mdc: GET(X)ATTR to READPAGE portal
Patrick Farrell [Wed, 17 Nov 2021 20:11:45 +0000 (15:11 -0500)]
LU-15245 mdc: GET(X)ATTR to READPAGE portal

Send the MDS_GETATTR and MDS_GETXATTR RPCs to the
MDS_READPAGE_PORTAL instead of the default portal to avoid
deadlocks with other MDS_REINT RPCs that may block all of
the MDS service threads on that portal.

This deadlock occurs with MDS_GETXATTR when selinux is
enabled, because getxattr becomes part of lookup, so it
takes a reference on a lock used for lookup.  However, all
of the MDS service threads on the default portal can be
consumed by threads waiting for that lock, resulting in
a deadlock when the getxattr can't be processed.

Lustre-change: https://review.whamcloud.com/45593
Lustre-commit: ebb035756eb059b255d4c8245d42bc5d5b96bab9 (tbd)

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I4fbae266022ee9fa38f3196acb1443df5056fe5e
Reviewed-on: https://review.whamcloud.com/45594
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoEX-4052 tests: skip sanity-lipe test_8/9
Andreas Dilger [Thu, 9 Dec 2021 00:37:51 +0000 (17:37 -0700)]
EX-4052 tests: skip sanity-lipe test_8/9

These tests fail 100% of the time and need to be skipped until fixed.

Test-Parameters: trivial testlist=sanity-lipe
Test-Parameters: testlist=sanity-lipe mdscount=2 mdtcount=4
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I7b29c2f147652ae99522f71f7d156e0934a48d8a
Reviewed-on: https://review.whamcloud.com/45802
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
3 years agoLU-13076 dne: dir migrate in QOS mode
Lai Siyao [Tue, 7 Sep 2021 09:33:21 +0000 (05:33 -0400)]
LU-13076 dne: dir migrate in QOS mode

Support "lfs migrate -m -1 ..." to migrate a directory to MDTs chosen
by space and inode usage. If the system is balanced, the target MDT
is chosen in round-robin mode; otherwise the less full MDTs are
chosen and the fullest MDT is avoided.

Another minor change: if a directory is migrated to specific MDTs
and the target stripe count is more than 1, its subdirs may not be
migrated to the MDT specified in the command, but to the MDT where
the parent stripe is located (the subdir will be striped too),
which avoids unnecessary remote directories. NB: for a command like
"lfs migrate -m 0,1,2 ...", though the subdir may be located on
any of MDT0, MDT1 or MDT2, its stripes will be striped over these
three MDTs, but for a command like "lfs migrate -m 0 -c 3 ...", the
subdir may be striped on other MDTs if the subdir is not located on
MDT0.

Add sanity 230u.

Lustre-change: https://review.whamcloud.com/44886
Lustre-commit: 378c7567876b430d06031f7d380112b9bdb15166

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I6e9c3d75bfc240b21c65ba27cd5e4bcca7058325
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45478
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-6142 lustre: remove non-static 'inline' markings.
Mr NeilBrown [Thu, 15 Oct 2020 00:10:47 +0000 (11:10 +1100)]
LU-6142 lustre: remove non-static 'inline' markings.

There is rarely any point in marking a non-static function as
'inline'.  The result is to compile a stand-alone function that other
files can refer to, and also to inline the code where it is used in
the same file.

In many cases the non-static inline functions are not used in the same
file, so the 'inline' marking has no effect.  In other cases it may
have an effect, but it can only be needed in highly performance
critical situations where a function call must be avoided, and that
doesn't seem likely in any of these cases.

So just remove the "inline".

Lustre-change: https://review.whamcloud.com/40289
Lustre-commit: f0736a6a52ed95814d2cac875caf34f7fc233bf3

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ic3243ee80f9bfd75a67dd8c89ea07d08dc36425c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45727
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-15126 kernel: new kernel [RHEL 8.5 4.18.0-348.2.1.el8_5]
Jian Yu [Wed, 1 Dec 2021 17:58:21 +0000 (09:58 -0800)]
LU-15126 kernel: new kernel [RHEL 8.5 4.18.0-348.2.1.el8_5]

This patch makes changes to support new RHEL 8.5 release
for Lustre client.

Test-Parameters: trivial env=SANITY_EXCEPT="101j" \
clientdistro=el8.5

Lustre-change: https://review.whamcloud.com/45285
Lustre-commit: 951f31789f76295d182f56bef1fa8d92f69e7e2a

Change-Id: I068f091817126fffc14402254f45dcd75ba7f3fc
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45542
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-15184 llite: properly detect SELinux disabled case
Sebastien Buisson [Tue, 30 Nov 2021 00:23:35 +0000 (16:23 -0800)]
LU-15184 llite: properly detect SELinux disabled case

Usually, security_dentry_init_security() returns -EOPNOTSUPP when
SELinux is disabled. But on some kernels (e.g. rhel 8.5) it returns
0 when SELinux is disabled, and in this case the security context is
empty.
So in both cases make sure the security context name is not set, which
means "SELinux is disabled" for the rest of the code.
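
The two "disabled" cases described above can be modelled as a small
predicate. This is a sketch only: selinux_ctx_usable() and its parameters
are hypothetical names, and treating every other error as "not usable" is
an assumption rather than something the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of the decision (not the llite code).
 * rc:      return value of the security context initialisation call
 * ctx_len: length of the security context it returned
 * Returns true only when a non-empty context was obtained, i.e. when
 * the security context name should be kept.
 */
static bool selinux_ctx_usable(int rc, size_t ctx_len)
{
	/* Usual "SELinux disabled" signal. */
	if (rc == -EOPNOTSUPP)
		return false;

	/* Some kernels return 0 with an empty context instead. */
	if (rc == 0 && ctx_len == 0)
		return false;

	/* Assumption: any other error also leaves the name unset. */
	return rc == 0;
}

/* Usage: both "disabled" variants lead to the same answer. */
static void selinux_ctx_examples(void)
{
	assert(!selinux_ctx_usable(-EOPNOTSUPP, 0));
	assert(!selinux_ctx_usable(0, 0));
	assert(selinux_ctx_usable(0, 16));
}
```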

Lustre-change: https://review.whamcloud.com/45501
Lustre-commit: 42661f7ba106b7d2e02f85a65880061585ca6ccb

Test-Parameters: trivial

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I3b9608f9768288de89570c158e8429560fa0213f
Reviewed-on: https://review.whamcloud.com/45541
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-15292 kernel: kernel update RHEL7.9 [3.10.0-1160.49.1.el7]
Jian Yu [Thu, 2 Dec 2021 20:44:41 +0000 (12:44 -0800)]
LU-15292 kernel: kernel update RHEL7.9 [3.10.0-1160.49.1.el7]

Update RHEL7.9 kernel to 3.10.0-1160.49.1.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: I356b8a8345a4a91d6d1c1a4a9b4eab4bb5afe75b
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45716
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoEX-2659 tests: allow multiple MDTs in sanity-lipe.sh
Jian Yu [Tue, 28 Sep 2021 18:45:13 +0000 (11:45 -0700)]
EX-2659 tests: allow multiple MDTs in sanity-lipe.sh

This patch improves sanity-lipe.sh to support
multiple MDTs.

Test-Parameters: trivial testlist=sanity-lipe
Test-Parameters: trivial testlist=sanity-lipe facet=mds1
Test-Parameters: trivial mdscount=2 mdtcount=4 \
testlist=sanity-lipe

Test-Parameters: trivial env=LIPE_FIND=lipe_find2 \
testlist=sanity-lipe
Test-Parameters: trivial env=LIPE_FIND=lipe_find2 \
testlist=sanity-lipe facet=mds1
Test-Parameters: trivial env=LIPE_FIND=lipe_find2 \
mdscount=2 mdtcount=4 testlist=sanity-lipe

Change-Id: I9db6f01e810e8c40e419dcfad409741a3334687c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44588
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
3 years agoRM-620 build: New tag 2.14.0-ddn26
Andreas Dilger [Sun, 14 Nov 2021 03:18:43 +0000 (20:18 -0700)]
RM-620 build: New tag 2.14.0-ddn26

New tag 2.14.0-ddn26

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I20db5a76274fd9dfbb59032da633798e1878cded

3 years agoLU-15127 llite: Remove path from discard_warn
Patrick Farrell [Fri, 12 Nov 2021 18:18:05 +0000 (13:18 -0500)]
LU-15127 llite: Remove path from discard_warn

It is unfortunately not safe to get the path from inside
dirty page discard warn.  It results in us getting and then
putting a bunch of dentries, and if the reference taken by our
'dget' on the file is the last one, we deadlock like this:
ptlrpc_check_set
brw_interpret
osc_extent_finish
osc_ap_completion
cl_page_completion
vvp_page_completion_write
ll_dirty_page_discard_warn
dput
dentry_kill
__dentry_kill
evict
ll_delete_inode
cl_sync_file_range
cl_io_loop
cl_io_start
lov_io_call
cl_io_start
osc_io_fsync_start
osc_cache_writeback_range
osc_cache_wait_range
osc_extent_wait

ll_delete_inode is calling back in to *this file*, which
we are already working on, so this thread ends up waiting
for itself.

This is particularly common if the discard warn is racing
with an unmount, which will be destroying all the inodes
(not deleting them - just removing them from the local
VFS).

There is no way to safely get the path from this location.
If we are deeply committed to the functionality, it would
be possible to rewrite osc_extent_finish + brw_interpret
so they could attempt path lookup *after* the extent has
been completed.

This patch fixes the deadlock; any rewrite is left for
later.

Lustre-change: https://review.whamcloud.com/45550/
Lustre-commit: d04a1929b3adb776173b02e0f6b82d396046dd14 (tbd)

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I537fd0d2e110c180a1369a9a3b1a644e613b18e4
Reviewed-on: https://review.whamcloud.com/45555
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-15207 libcfs: reset hs_rehash_bits
Alex Zhuravlev [Thu, 11 Nov 2021 08:19:46 +0000 (11:19 +0300)]
LU-15207 libcfs: reset hs_rehash_bits

If the rehash work is cancelled, nobody resets
hs_rehash_bits, and the first iterator then asserts
at LASSERT(!cfs_hash_is_rehashing(hs)) in
cfs_hash_for_each_relax().

Lustre-change: https://review.whamcloud.com/45533
Lustre-commit: TBD (from 0c51e83b1345059c7f6847ea394e589ebffd0121)

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I1a567f6be77ca6c45e5d4f256722206b12588554
Reviewed-on: https://review.whamcloud.com/45557
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-15216 lmv: improve MDT QOS space balance
Lai Siyao [Sat, 6 Nov 2021 19:16:49 +0000 (15:16 -0400)]
LU-15216 lmv: improve MDT QOS space balance

When MDTs are not balanced, QOS code tries to keep subdirectory
creation local to the same MDT when it is deep in the directory
tree, to avoid creating too many remote directories, but the
existing weight to stay on the parent MDT until 50% of other MDTs
is too radical, and causes mkdirs to be "stuck" on the same MDT.

* remove "lq_threshold_rr" from above calculation because the check
  in ltd_qos_is_usable() handles this, so use only "dir_depth".
* the factor is changed to "16 / (dir_depth + 10)", then it's less
  likely to stick to the parent MDT for top levels, while more
  likely to stay on the parent MDT for low levels:
  depth=0 -> 160%, depth=4 -> 114%, depth=6 -> 100%,
  depth=8 -> 88%, depth=12 -> 72%
* rename lli_depth to lli_dir_depth to make usage more clear.
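
The percentages quoted in the list above follow from the factor with
integer truncation. The sketch below only reproduces that arithmetic; the
function name parent_mdt_factor_pct() is hypothetical, and how the factor
is applied to the MDT weights is not shown because the commit message does
not describe it.

```c
#include <assert.h>

/*
 * Hypothetical helper: the factor 16 / (dir_depth + 10) expressed as an
 * integer percentage, truncated toward zero.
 */
static unsigned int parent_mdt_factor_pct(unsigned int dir_depth)
{
	return 1600 / (dir_depth + 10);
}

/* Usage: the values listed in the commit message. */
static void parent_mdt_factor_examples(void)
{
	assert(parent_mdt_factor_pct(0) == 160);
	assert(parent_mdt_factor_pct(4) == 114);
	assert(parent_mdt_factor_pct(6) == 100);
	assert(parent_mdt_factor_pct(8) == 88);
	assert(parent_mdt_factor_pct(12) == 72);
}
```

The factor crosses 100% at depth 6, which matches the stated intent: less
stickiness to the parent MDT near the top of the tree and more at deeper
levels.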

Lustre-change: https://review.whamcloud.com/45544
Lustre-commit: TBD (from 95398b056f7a88ec7830da353170e8993cecf036)

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Iec6b77919b630d4baee6d54bee7bdb8ca9fb8574
Reviewed-on: https://review.whamcloud.com/45556
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoRM-620 build: New tag 2.14.0-ddn25
Andreas Dilger [Wed, 10 Nov 2021 18:08:02 +0000 (11:08 -0700)]
RM-620 build: New tag 2.14.0-ddn25

New tag 2.14.0-ddn25

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I344f62b9984e7b2338feaae34ddb310ee1a3bc17

3 years agoLU-15098 tests: sanity-sec 27a exec commands on right node
Sebastien Buisson [Tue, 19 Oct 2021 15:59:33 +0000 (17:59 +0200)]
LU-15098 tests: sanity-sec 27a exec commands on right node

In nodemap_exercise_fileset called from sanity-sec test 27a,
make sure all commands are executed on first client, as we are
testing properties of nodemaps 'default' and 'c0'.
And make sure 'default' nodemap has admin and trusted properties
set to 1, as we are carrying out operations as root.

Lustre-change: https://review.whamcloud.com/45293
Lustre-commit: b45169276ce1ab09dae7a733859f89a6c92808e5

Test-Parameters: trivial
Test-Parameters: testlist=sanity-sec clientcount=2 env=ONLY=27a
Fixes: 0daeebcbdc ("LU-14797 nodemap: map project id")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Idd9f391db60475721f3a3856b5e3bee1a18bbbca
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45488
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14797 nodemap: map project id
Sebastien Buisson [Wed, 30 Jun 2021 16:30:57 +0000 (18:30 +0200)]
LU-14797 nodemap: map project id

Add calls to nodemap_map_id() in order to map project IDs from
client ID to server ID and conversely.
Also extend nodemap_can_setquota() to allow setquota on project
only if ID is not squashed or deny_unknown is not set.
Update sanity-sec test_27a to exercise the feature.

Lustre-change: https://review.whamcloud.com/44119
Lustre-commit: 0daeebcbdc4e89d59221299f2687cfd3c4f00b5b

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Id66458550d312404b1993ead8940c3d12eaadccd
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45487
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14797 sec: add projid to nodemap
Sebastien Buisson [Tue, 29 Jun 2021 15:54:59 +0000 (17:54 +0200)]
LU-14797 sec: add projid to nodemap

Add the ability to create id maps of a new type, projid. This also
requires adding a new value to map_mode, projid_only. Finally, a new
property named squash_projid is used to map all project IDs to a
default one.
Update lctl man pages to mention these additions.
Update sanity-sec test_12 and test_15 to exercise projid mapping and
squash_projid property.

Lustre-change: https://review.whamcloud.com/44108
Lustre-commit: 8a770616a5ad21360ecba63c3643cadd245a2a50

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I63eba8b0d33feaa7ece8c1788cb587fcb330357a
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45486
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-15141 quota: optimize capability check for root squash
Sebastien Buisson [Thu, 21 Oct 2021 06:56:44 +0000 (08:56 +0200)]
LU-15141 quota: optimize capability check for root squash

On client side, checking for owner/group quota can be directly
bypassed if this is for root and there is no root squash.

Lustre-change: https://review.whamcloud.com/45322
Lustre-commit: TBD (15aa2e9264f0604b185ce280df4b34ea5a280b3f)

Change-Id: If29eca428d8748df412a717615e4d0a4886ddd04
Fixes: 0b057b7179 ("LU-14739 quota: nodemap squashed root cannot bypass quota")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/45321
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-14739 quota: fix quota with root squash enabled
Wang Shilong [Tue, 20 Jul 2021 02:36:31 +0000 (10:36 +0800)]
LU-14739 quota: fix quota with root squash enabled

This patch tries to fix several problems:

1. OSD will ignore quota if IO comes from client
cache or root. However, since the following change:

LU-12687 osc: consume grants for direct I/O

DIO now consumes grant too, so the following check for
sync IO is now wrong:

(lnb[i].lnb_flags & (OBD_BRW_FROM_GRANT | OBD_BRW_SYNC))
        == OBD_BRW_FROM_GRANT)

This was originally added to support 1.8 clients. The release is
now approaching 2.15, so let's remove this broken check.

2. Server side will clear OBD_BRW_NOQUOTA if root squash
is enabled, which reverts the fixes from:

"LU-13228 clio: mmap write when overquota"

We need to separate @ci_noquota and @oi_cap_sys_resource cases,
introduce a new flag OBD_BRW_SYS_RESOURCE, and extend test_75
to cover this case.

3. LU-14739 missed the case where DoM quota should be considered
as well.

4. If EDQUOT is returned for root, we check the new root squash
flag OBD_FL_ROOT_SQUASH from server side. If this flag is not set,
we bypass quota for root, otherwise all root writes become sync
writes.

5. Fix a leftover problem with LU-9671 for DOM
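
The expression quoted in item 1 can be examined in isolation. The sketch
below uses placeholder bit values (DEMO_BRW_*) chosen for illustration
only; the real OBD_BRW_* definitions live in the Lustre headers. It shows
that the expression is true exactly when the grant flag is set and the
sync flag is clear, so it can no longer tell the two cases apart if, as
the message implies, direct I/O pages now carry the grant flag as well.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bits for illustration; not the real OBD_BRW_* values. */
enum {
	DEMO_BRW_SYNC       = 1 << 0,
	DEMO_BRW_FROM_GRANT = 1 << 1,
};

/* The removed check: FROM_GRANT set and SYNC clear. */
static bool old_check_matches(unsigned int lnb_flags)
{
	return (lnb_flags & (DEMO_BRW_FROM_GRANT | DEMO_BRW_SYNC)) ==
	       DEMO_BRW_FROM_GRANT;
}

/* Usage: the full truth table of the expression. */
static void old_check_examples(void)
{
	assert(old_check_matches(DEMO_BRW_FROM_GRANT));
	assert(!old_check_matches(DEMO_BRW_FROM_GRANT | DEMO_BRW_SYNC));
	assert(!old_check_matches(DEMO_BRW_SYNC));
	assert(!old_check_matches(0));
}
```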

Lustre-change: https://review.whamcloud.com/44347
Lustre-commit: bbfdc7c1670c92747a8f98d39e1e43dc39e59e30

Fixes: a4fbe7341baf12 ("LU-14739 quota: nodemap squashed root cannot bypass quota")
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I3fd23da7d56acb5b485540333208e5d5b0b48023
Reviewed-on: https://review.whamcloud.com/45310
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-14739 quota: nodemap squashed root cannot bypass quota
Sebastien Buisson [Fri, 11 Jun 2021 14:49:47 +0000 (16:49 +0200)]
LU-14739 quota: nodemap squashed root cannot bypass quota

When root on client is squashed via a nodemap's squash_uid/squash_gid,
its IOs must not bypass quota enforcement as it normally does without
squashing.
So on client side, do not set OBD_BRW_FROM_GRANT for every page being
used by root. And on server side, check if root is squashed via a
nodemap and remove OBD_BRW_NOQUOTA.
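
The server-side half of this can be sketched in Python as follows.
This is only a model: the flag value is a placeholder and
server_brw_flags() is a hypothetical name, not a Lustre function.

```python
OBD_BRW_NOQUOTA = 0x100  # placeholder value, not the real definition


def server_brw_flags(flags, root_squashed_by_nodemap):
    """Drop the quota-bypass bit when the nodemap squashes root."""
    if root_squashed_by_nodemap:
        flags &= ~OBD_BRW_NOQUOTA
    return flags
```

All other bits pass through unchanged, so only the quota bypass is
affected for a squashed root.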

Lustre-change: https://review.whamcloud.com/43988
Lustre-commit: a4fbe7341baf12c00c6048bb290f8aa26c05cbac

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I95b31277273589e363193cba8b84870f008bb07a
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/45485
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14929 gss: detect libkeyutils dependency
Sebastien Buisson [Wed, 11 Aug 2021 15:44:08 +0000 (17:44 +0200)]
LU-14929 gss: detect libkeyutils dependency

When building GSS support, gss_keyring requires libkeyutils.
So make sure this dependency is properly detected at configure time,
and include keyutils.h only when required.

Lustre-change: https://review.whamcloud.com/44597
Lustre-commit: 15998eb78e279f1bfa5059f0f65087f7851d40ff

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9fa5750f4609250ecdc1c47f68b97bff9be13ace
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45484
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-15070 llite: update default LMV upon any change
Lai Siyao [Tue, 12 Oct 2021 22:15:37 +0000 (18:15 -0400)]
LU-15070 llite: update default LMV upon any change

max_inherit and max_inherit_rr were newly added, and they are missing
in lsm_md_eq(), therefore client may not update default LMV when
either of these two fields is changed.
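
A small Python model of the problem, for illustration only. The
DefaultLmv class and its first two fields are hypothetical stand-ins;
only max_inherit and max_inherit_rr are named in this message, and the
real comparison is done by lsm_md_eq() in C.

```python
from dataclasses import dataclass


@dataclass
class DefaultLmv:
    stripe_count: int    # hypothetical field
    stripe_offset: int   # hypothetical field
    max_inherit: int
    max_inherit_rr: int


def eq_before_fix(a, b):
    """Equality that forgets the two newly added fields."""
    return (a.stripe_count, a.stripe_offset) == (b.stripe_count, b.stripe_offset)


def eq_after_fix(a, b):
    """Equality that also compares max_inherit and max_inherit_rr."""
    return eq_before_fix(a, b) and (
        (a.max_inherit, a.max_inherit_rr) == (b.max_inherit, b.max_inherit_rr)
    )


def needs_update(cached, incoming, eq):
    """The client refreshes its cached default LMV only if eq() says they differ."""
    return not eq(cached, incoming)
```

With eq_before_fix(), a change that touches only max_inherit compares
as equal, so the cached default LMV is never refreshed.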

Add sanityn 112.

Lustre-change: https://review.whamcloud.com/45237
Lustre-commit: f3314706b4e5c21f14908650decd92a30fdc1db9

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Iac71b530b3702105c4213715826b1782c6aba7ca
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45496
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-15070 mdt: revoke remote LOOKUP lock for default LMV
Lai Siyao [Tue, 12 Oct 2021 22:20:21 +0000 (18:20 -0400)]
LU-15070 mdt: revoke remote LOOKUP lock for default LMV

Setting the default LMV revokes the LOOKUP lock, but if the directory
is a remote directory, its LOOKUP lock is on the MDT where its parent
is located.

Lustre-change: https://review.whamcloud.com/45236
Lustre-commit: b4645b5469c0722fdf66697379be878c071839cf

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I9f079a0bcff530603725ce72cd89c14935ba913b
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45495
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoRM-620 build: New tag 2.14.0-ddn24
Andreas Dilger [Thu, 4 Nov 2021 17:26:18 +0000 (11:26 -0600)]
RM-620 build: New tag 2.14.0-ddn24

New tag 2.14.0-ddn24

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie7c61bd7bbc50ccfa4e39ec82403895215548627

3 years agoLU-15154 kernel: kernel update SLES15 SP3 [5.3.18-59.27.1]
Jian Yu [Sat, 23 Oct 2021 01:19:31 +0000 (18:19 -0700)]
LU-15154 kernel: kernel update SLES15 SP3 [5.3.18-59.27.1]

Update SLES15 SP3 kernel to 5.3.18-59.27.1 for Lustre client.

Test-Parameters: trivial

Change-Id: Ie3c369a8e93a75b4afbde55489bd3819bb39e1de
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45350
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-14996 lov: prefer mirrors on non-rotational OSTs
Alex Zhuravlev [Thu, 9 Sep 2021 08:16:41 +0000 (11:16 +0300)]
LU-14996 lov: prefer mirrors on non-rotational OSTs

Consider non-rotational OSTs as preferred unless an explicit prefer
flag is set on a mirror.
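
One way to read this rule, sketched in Python. The Mirror class, its
field names and the fallback to "all mirrors" are assumptions made for
the sketch; the actual selection is done in the lov code.

```python
from dataclasses import dataclass


@dataclass
class Mirror:
    mirror_id: int
    prefer_flag: bool     # explicit prefer flag set on the mirror
    non_rotational: bool  # mirror is on non-rotational OSTs


def preferred_mirrors(mirrors):
    """Explicit prefer flags win; otherwise non-rotational mirrors are preferred."""
    explicit = [m for m in mirrors if m.prefer_flag]
    if explicit:
        return explicit
    flash = [m for m in mirrors if m.non_rotational]
    # Assumed fallback: with nothing to prefer, every mirror is a candidate.
    return flash or list(mirrors)
```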

Lustre-change: https://review.whamcloud.com/44883
Lustre-commit: 8507472dd37ebc07bf7eb1b772c2ff619009c233

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I787bcba0b5e45842c9d4762c7f97a8f44a4fc9cb
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45339
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoEX-4157 lipe: comment out ldiskfs functions for client-only lpcc_purge
Lei Feng [Thu, 28 Oct 2021 09:06:36 +0000 (05:06 -0400)]
EX-4157 lipe: comment out ldiskfs functions for client-only lpcc_purge

Sometimes the client system does not have ldiskfs libs, so lpcc_purge
fails to start on it because some ldiskfs symbols cannot be found.
So comment out this code for client-only builds.

Change-Id: I4a38f1128b9e66d495f94ef7ebd91f26ea052b67
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=sanity-pcc env=ONLY=200-202,ONLY_REPEAT=50 \
                 clientextra_install_params="--packages lipe-lpcc"
Reviewed-on: https://review.whamcloud.com/45394
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-15170 llite: Switch pcc to lookup_one_len
Patrick Farrell [Tue, 26 Oct 2021 22:52:47 +0000 (18:52 -0400)]
LU-15170 llite: Switch pcc to lookup_one_len

Using kern_path to lookup files in the PCC cache means we
are subject to user namespaces, so the PCC volume must be
mapped in to a container or the cached files cannot be
found.

One solution is to switch to using lookup_one_len - this is
what the code which *creates* PCC files does.  This
manually walks the path from the root, which avoids
namespace issues.

This is appropriate because PCC is kernel functionality -
the user should not be able to directly access the volume,
but it should be accessible as a cache.

Lustre-change: https://review.whamcloud.com/45436
Lustre-commit: 96c90859e14f3960b57eae54b3886aeef62f6f40 (tbd)

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Idd15574ace29543bed1a9937cb35404781714791
Reviewed-on: https://review.whamcloud.com/45380
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoRM-620 build: New tag 2.14.0-ddn23
Andreas Dilger [Sat, 30 Oct 2021 06:05:55 +0000 (00:05 -0600)]
RM-620 build: New tag 2.14.0-ddn23

New tag 2.14.0-ddn23

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I115d1266b69a6abd9cba775ce6f4a4aa0f7cc1cb

3 years agoEX-4158 lipe: keep cache data when PCC is stopped by lpcc service
Lei Feng [Thu, 28 Oct 2021 11:09:12 +0000 (07:09 -0400)]
EX-4158 lipe: keep cache data when PCC is stopped by lpcc service

If a PCC backend is managed by lpcc service, keep the cache data
in PCC backend when it is stopped by lpcc service.

Change-Id: I15e80d28ff017573b8f7b24449979072256ab6b2
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=sanity-pcc env=ONLY=200-202,ONLY_REPEAT=10 \
                 clientextra_install_params="--packages lipe-lpcc"
Reviewed-on: https://review.whamcloud.com/45396
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoEX-4006 doc: further improvements to lfs-pcc man pages
Andreas Dilger [Wed, 13 Oct 2021 06:11:18 +0000 (00:11 -0600)]
EX-4006 doc: further improvements to lfs-pcc man pages

Describe fields in lfs-pcc-state.1.  Add lfs-pcc-delete.1 page.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0c165c12e674af0edbb5e84c2e5f8aeed73ebbe5
Reviewed-on: https://review.whamcloud.com/45338
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng, Lei <flei@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
3 years agoEX-4152 revert: "LU-14177 pcc: clear PCC-RO cache from old client access"
Andreas Dilger [Fri, 29 Oct 2021 17:24:03 +0000 (17:24 +0000)]
EX-4152 revert: "LU-14177 pcc: clear PCC-RO cache from old client access"

This reverts commit c4d7cc7b871688ebdc631e907938dce2b5c10503
because it causes 2.12 clients to hang in some cases.

Change-Id: I895914b7e1204ecf308650988fa91d634d951550
Test-Parameters: trivial testlist=sanity-pfl clientversion=2.12.6-ddn42
Reviewed-on: https://review.whamcloud.com/45412
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoRM-620 build: New tag 2.14.0-ddn22
Andreas Dilger [Fri, 22 Oct 2021 17:06:56 +0000 (11:06 -0600)]
RM-620 build: New tag 2.14.0-ddn22

New tag 2.14.0-ddn22

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic1d3a194f5434e667e9b8562887888c10b1f7161

3 years agoLU-15010 mdc: add support for grant shrink
Alex Zhuravlev [Thu, 16 Sep 2021 08:20:18 +0000 (11:20 +0300)]
LU-15010 mdc: add support for grant shrink

Just re-use the existing mechanism used in OSC.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I4cdca057d35eaff6493d047127f1fe5eee9e9620
Reviewed-on: https://review.whamcloud.com/45177
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-15043 lod: check for spilling loops
Alex Zhuravlev [Mon, 18 Oct 2021 14:18:27 +0000 (17:18 +0300)]
LU-15043 lod: check for spilling loops

Check for spilling loops at setting time to avoid possible confusion.

Lustre-change: https://review.whamcloud.com/45083
Lustre-commit: TBD (from 1d4502e7ef3288f575849268232aca0086342822)

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I901b08f614c162607b1b5c6a992aa5b188fd8e75
Reviewed-on: https://review.whamcloud.com/45104
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoEX-4055 pcc: command to remove PCC mirror component
Qian Yingjin [Wed, 20 Oct 2021 04:07:14 +0000 (12:07 +0800)]
EX-4055 pcc: command to remove PCC mirror component

This patch adds a command "lfs pcc delete $FILE" to delete the
PCC foreign mirror layout component.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I3f56fb8134bd1e7673ef8e04dff9b8482f0e32c3
Reviewed-on: https://review.whamcloud.com/45305
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14177 pcc: clear PCC-RO cache from old client access
Qian Yingjin [Thu, 3 Dec 2020 09:08:38 +0000 (17:08 +0800)]
LU-14177 pcc: clear PCC-RO cache from old client access

For compatibility and interoperability, we have added a PCC-RO
connection flag.

To avoid inconsistent data access, MDT does not (try to) grant
layout lock to the client at the time of getattr() and open().
When an old client without PCC-RO support requests a layout lock
via an intent lock request on the file in LCM_FL_PCC_RDONLY state,
MDT needs to clear the LCM_FL_PCC_RDONLY flag on the layout first
which will invalidate all PCC-RO caches on the clients, and then
return the layout to the old client.

Lustre-change: https://review.whamcloud.com/40850
Lustre-commit: TBD (from 0c76ae7f3cb6fc3a9f70d1398f773d8afffa50f1)

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I69707d1ac53decaddd32bcf231b15d3565fb200f
Reviewed-on: https://review.whamcloud.com/45269
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-15038 mgc: release cl_mgc_mutex on error
Andreas Dilger [Mon, 27 Sep 2021 18:29:58 +0000 (12:29 -0600)]
LU-15038 mgc: release cl_mgc_mutex on error

If local_oid_storage_init() returns an error, the cl_mgc_mutex()
should be released.

Lustre-change: https://review.whamcloud.com/45063
Lustre-commit: 7cf10b90d62256aa4d177486ff13bd61dfb9a5ff

Fixes: 3e38436dc09 ("LU-2059 llog: MGC to use OSD API for backup logs")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I921dde4e9202733874d8e7f980e95af23739a655
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45330
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoEX-4099 revert: "LU-14739 quota: nodemap squashed root cannot bypass quota"
Andreas Dilger [Wed, 20 Oct 2021 17:42:05 +0000 (17:42 +0000)]
EX-4099 revert: "LU-14739 quota: nodemap squashed root cannot bypass quota"

This reverts commit 0b057b71796cd901813f7dbc08d9459efa266740
due to 20% write performance regression in IOR.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I2a05f426795fc671691e916c92b62fa107ee5620
Reviewed-on: https://review.whamcloud.com/45314
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
3 years agoEX-4056 revert LU-12019 build: Recognize Debian Kernel and set KMP dir
Minh Diep [Thu, 21 Oct 2021 19:17:52 +0000 (12:17 -0700)]
EX-4056 revert LU-12019 build: Recognize Debian Kernel and set KMP dir

This reverts commit 230d4500d5a9dfada392199d77fc413382f24750.

This caused MOFED modules to use in-kernel modules, causing
lustre to fail to load.

Test-Parameters: trivial
Change-Id: I205368dccc5fd18ed0ee096c1d85e140c3de5d6d
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45327
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-15093 libcfs: Check if param_set_uint_minmax is provided
Chris Horn [Mon, 27 Sep 2021 20:48:02 +0000 (15:48 -0500)]
LU-15093 libcfs: Check if param_set_uint_minmax is provided

Linux kernel v5.15 commit 2a14c9ae15a38148484a128b84bff7e9ffd90d68
moved param_set_uint_minmax to common code.

Lustre-change: https://review.whamcloud.com/45214/
Lustre-commit: 8bc83a6a9e9558e78c11351f6698d06d29e3dac1

HPE-bug-id: LUS-10469
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ifd1d72ae531f0f6c7cd96cc28fbc07c8a8b70886
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45324
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoEX-4101 lipe: fix default POOL name
Minh Diep [Wed, 20 Oct 2021 18:21:52 +0000 (11:21 -0700)]
EX-4101 lipe: fix default POOL name

* set default values before using them

Test-Parameters: trivial

Change-Id: I69e3176b8f469f1bb0510e10e88e7f2843ee98b3
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45315
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoEX-4101 lipe: fix stratagem-hp-config.sh typo
John L. Hammond [Wed, 20 Oct 2021 11:04:07 +0000 (06:04 -0500)]
EX-4101 lipe: fix stratagem-hp-config.sh typo

In stratagem-hp-config.sh, change is_valid_precent to
is_valid_percent.

Fixes: dcc46813c0c3 ("EX-3890 lipe: raise pool spilling threshold")
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ic0ae2c24238fcf18ee9b6760c0f5c067aaccd84e
Reviewed-on: https://review.whamcloud.com/45307
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-14331 tests: add version check for sanity-pfl 0b
James Nunez [Tue, 19 Oct 2021 19:40:07 +0000 (13:40 -0600)]
LU-14331 tests: add version check for sanity-pfl 0b

sanity-pfl test 0b was modified to not shrink the number
of stripes due to size constraints (LOV_MAX_STRIPE_COUNT).
The modified test 0b landed in Lustre 2.13.57.36 and we
should skip this test if the server version is less than
2.13.57.36.
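
The comparison can be sketched in Python as below. This is an
illustration only: the test framework itself is shell, and
version_tuple() and should_skip_0b() are hypothetical names.

```python
import re

MIN_SERVER = (2, 13, 57, 36)


def version_tuple(text):
    """Parse '2.12.6-ddn42' or '2.13.57.36' into four ints, ignoring any suffix."""
    m = re.match(r"\d+(?:\.\d+)*", text)
    if m is None:
        raise ValueError("unparsable version: %r" % (text,))
    parts = [int(p) for p in m.group(0).split(".")]
    parts += [0] * (4 - len(parts))  # pad short versions with zeros
    return tuple(parts[:4])


def should_skip_0b(server_version):
    """Skip sanity-pfl 0b when the server predates the modified test."""
    return version_tuple(server_version) < MIN_SERVER
```

Comparing tuples of integers avoids the string-comparison trap where
"2.9" would sort after "2.13".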

Fixes: a58cdc9196f ("LU-14191 lod: comp stripe count limit check")

Test-Parameters: trivial serverversion=2.12.6-ddn42 serverdistro=el7.9 testlist=sanity-pfl env=ONLY=0b

Change-Id: I60de0e227ed566835476579dbca0d745f89245bf
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45301
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoEX-4098 lipe: lpcc uses class only in python3
Lei Feng [Wed, 20 Oct 2021 00:55:09 +0000 (20:55 -0400)]
EX-4098 lipe: lpcc uses class only in python3

FileNotFoundError is only available from python3.
Change it to the matching class in python2.
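
This message does not say which class was chosen. One common pattern
that runs on both python2 and python3 is sketched below; the
read_or_none() helper is hypothetical and is not taken from lpcc.

```python
import errno


def read_or_none(path):
    """Return the file contents, or None if the file does not exist."""
    try:
        with open(path) as f:
            return f.read()
    except (IOError, OSError) as e:
        # python2 raises IOError here; python3 raises FileNotFoundError,
        # which is a subclass of OSError (and IOError is an alias of it).
        if e.errno == errno.ENOENT:
            return None
        raise
```

Checking errno keeps other failures, such as permission errors,
visible instead of silently treating them as "file missing".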

Change-Id: I63676ef8ff6a5461a7af6e9177d6bc76e39c0bc5
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=sanity-pcc env=ONLY=200-202,ONLY_REPEAT=50 \
                 clientextra_install_params="--packages lipe-lpcc"
Reviewed-on: https://review.whamcloud.com/45304
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
3 years agoRM-620 build: New tag 2.14.0-ddn21
Andreas Dilger [Fri, 15 Oct 2021 23:05:18 +0000 (17:05 -0600)]
RM-620 build: New tag 2.14.0-ddn21

New tag 2.14.0-ddn21

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If1d6bdec0b847b1ff6be9b8f3cdc61e2cfa66d61

3 years agoUpdate lipe version to 2.20.
John L. Hammond [Thu, 16 Sep 2021 21:15:54 +0000 (16:15 -0500)]
Update lipe version to 2.20.

Update lipe version to 2.20. Tag to be created at a later date.

Test-Parameters: trivial
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I7d4c4c07bb38aae809229087ee66b57c6c128dd6
Reviewed-on: https://review.whamcloud.com/45206
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoEX-3390 lipe: add lipe_purge script
Jian Yu [Sun, 29 Aug 2021 07:45:09 +0000 (00:45 -0700)]
EX-3390 lipe: add lipe_purge script

Usage: lipe_purge [OPTION]... --client-mount=CLIENT_MOUNT --pool=POOL -- DEVICE EXPRESSION
Scan DEVICE for files matching EXPRESSION and purge the matching files from the POOL.

Mandatory arguments to long options are mandatory for short options too.
  -c CLIENT_MOUNT, --client-mount=CLIENT_MOUNT Lustre client mount point
  -d, --debug                                  display debug information
  --dry-run                                    only display what would be purged
  -p POOL, --pool=POOL                         OST pool name
  -t THREAD_COUNT, --threads=THREAD_COUNT      count of the scanning thread
  --no-convert                                 do not convert the EXPRESSION
  -h, --help                                   display this help message and exit
  -v, --version                                display version information and exit

For expression details see lipe_find(1).

For example:

$ lipe_purge --dry-run -c /mnt/lustre -p hdd  -- /dev/mapper/mds1_flakey -pool hdd
lfs mirror split --destroy --pool=hdd /mnt/lustre/.lustre/fid/0x200000414:0x2672:0x0
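
The shape of the dry-run line above can be reproduced with a short
Python sketch. purge_command() is a hypothetical helper written for
illustration; how lipe_purge assembles the command internally is not
shown in this message.

```python
def purge_command(client_mount, pool, fid):
    """Build an argument list matching the dry-run output format above."""
    path = "%s/.lustre/fid/%s" % (client_mount.rstrip("/"), fid)
    return ["lfs", "mirror", "split", "--destroy", "--pool=%s" % pool, path]


if __name__ == "__main__":
    # Values taken from the example above.
    print(" ".join(purge_command("/mnt/lustre", "hdd", "0x200000414:0x2672:0x0")))
```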

Test-Parameters: trivial

Change-Id: I29284dadef5677f48ed6977cf40d021e38dfcaa8
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44543
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44972

3 years agoEX-3389 lipe: add lipe_delete script
Jian Yu [Sun, 29 Aug 2021 07:36:25 +0000 (00:36 -0700)]
EX-3389 lipe: add lipe_delete script

Usage: lipe_delete [OPTION]... --client-mount=CLIENT_MOUNT -- DEVICE EXPRESSION
Scan DEVICE for files matching EXPRESSION and delete the matching files.

Mandatory arguments to long options are mandatory for short options too.
  -c CLIENT_MOUNT, --client-mount=CLIENT_MOUNT Lustre client mount point
  -d, --debug                                  display debug information
  --dry-run                                    only display what would be deleted
  -t THREAD_COUNT, --threads=THREAD_COUNT      count of the scanning thread
  --no-convert                                 do not convert the EXPRESSION
  -h, --help                                   display this help message and exit
  -v, --version                                display version information and exit

For expression details see lipe_find(1).

For example:

$ lipe_delete --dry-run -c /mnt/lustre -- /dev/mapper/mds1_flakey -fid '*'
lfs rmfid /mnt/lustre 0x200000408:0x14245:0x0 0x200000408:0x14246:0x0

Test-Parameters: trivial

Change-Id: I9ed5247992c807e81ac4445986a07ac0d2196de3
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44421
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44971

3 years agoEX-3198 lipe: add lipe_find2 script
Jian Yu [Thu, 26 Aug 2021 07:13:32 +0000 (00:13 -0700)]
EX-3198 lipe: add lipe_find2 script

Add a simplified lipe_find2 script to wrap lipe_convert_expr
and lipe_scan2.

Usage: lipe_find2 [OPTION]... -- DEVICE [EXPRESSION]
  --client-mount=MOUNT  use the Lustre client at MOUNT for FID to path
  --print-fid           print FID of each file matching expression
  --print-json[=ATTRS]  print a JSON object with attributes specified by ATTRS
                        describing each file matching expression. ATTRS must be
                        a comma separated list of attribute names.
                        use 'lipe_find2 --list-attrs' to see available attribute names
  --print-path[=WHICH]  print path(s) of each file matching expression
                        WHICH must be 'one' (default) or 'all'
  --absolute-paths      prefix paths with MOUNT/ (for --print-path)
                        prefix FIDs with MOUNT/.lustre/fid/ (for --print-fid)
  --delimiter=DELIM     use DELIM instead of newline to delimit matches
  --null                use a NUL byte instead of newline to delimit matches
  --threads=COUNT       use COUNT scanning threads
  -h,--help             display this help text and exit
  --debug               display debug information
  --list-attrs          list available attribute names
  --no-convert          do not convert the EXPRESSION
  --version             output version information and exit

If --absolute-paths or --print-path is used then --client-mount is
required. It is also required if --print-json is used and ATTRS
includes paths.
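
The rule above can be written as a small predicate. This is a Python
sketch for illustration; the function name is hypothetical, and it
assumes the attribute is literally named "paths" in ATTRS.

```python
def client_mount_required(absolute_paths=False, print_path=False,
                          print_json_attrs=None):
    """True when --client-mount must be supplied, per the rule above."""
    # ATTRS is a comma separated list of attribute names.
    attrs = print_json_attrs.split(",") if print_json_attrs else []
    return bool(absolute_paths or print_path or "paths" in attrs)
```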

For expression details see lipe_find(1).

Test-Parameters: trivial testlist=sanity-lipe
Test-Parameters: trivial env=LIPE_FIND=lipe_find2 testlist=sanity-lipe
Change-Id: Iacca61f9f5d28d17a45596b9d96f958dc50ca57f
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44963
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
3 years agoEX-3687 osp: do force disconnect if import is not ready
Mikhail Pershin [Wed, 25 Aug 2021 17:03:47 +0000 (20:03 +0300)]
EX-3687 osp: do force disconnect if import is not ready

Send OSP_DISCONNECT only on a healthy import. Otherwise,
force local disconnect for unhealthy imports.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Icd9f171271f4e17a65503fcc710ad3aaa2b84e1e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45253
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>