Whamcloud - gitweb
Patrick Farrell [Wed, 15 Dec 2021 17:07:59 +0000 (12:07 -0500)]
LU-15317 llite: Add D_IOTRACE
When looking into performance problems, it's very important
to be able to trace the I/O patterns from userspace into
Lustre, and also to understand the key basics of how Lustre
handles that I/O (readahead, RPC generation).
This is best done with a dedicated debug flag. No
userspace tool can provide all this information, and
existing debug flags collect a huge amount of unrelated
debug information.
The goal is for customers to be able to quickly gather log
files of a reasonable size which contain the necessary
information and which can easily be interpreted by
engineering. This is not possible if the information is
spread out across a number of heavyweight debug flags.
This is a first pass at adding the flag and the debug
required to track basic data I/O. One significant
omission in the first patch is RPC generation - I have not
decided how best to do that yet. That will be added in a
future patch.
lustre-change: https://review.whamcloud.com/#/c/45752/
lustre-commit: e77ef62eb25195ddc4ef63c75dbe7342ddb2b3f5 (tbd)
test-parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I0ed003ec1488e1c267b194c871f64b34f6dc6025
Reviewed-on: https://review.whamcloud.com/45864
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Wed, 15 Dec 2021 17:06:42 +0000 (12:06 -0500)]
LU-15317 libcfs: Remove D_TTY
The D_TTY flag is almost entirely unused and certainly not
needed. Remove it so we have a spare flag to use for
iotrace.
test-parameters: trivial
lustre-change: https://review.whamcloud.com/45751/
lustre-commit: 8317690ae36918109594208811c3c6358fe46e18 (tbd)
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I1127cbcf6ee51adc07d560a8827fa1e32d16c90c
Reviewed-on: https://review.whamcloud.com/45863
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Serguei Smirnov [Sat, 30 Oct 2021 18:39:26 +0000 (11:39 -0700)]
LU-15137 socklnd: decrement connection counters on close
To gracefully handle a potential race with delayed connection
creation, decrement the per-type connection counters as connections
are being closed.
Lustre-change: https://review.whamcloud.com/45422
Lustre-commit: 7e26413aa85fdc931721cde36bae3bf2bb97e63f
Test-Parameters: trivial testlist=sanity-lnet
Fixes: cbf740d0 ("LU-12815 socklnd: add conns_per_peer parameter")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ieb3b44701e4999ea1fe63234162dd5878d65958a
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46051
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Serguei Smirnov [Thu, 4 Nov 2021 18:35:43 +0000 (11:35 -0700)]
LU-15137 socklnd: expect two control connections maximum
As a result of connecting to ourselves, e.g. pinging own nid,
two control type connections are established vs. just one
in case of connecting externally.
Fix the control connection counter to be able to handle that.
Lustre-change: https://review.whamcloud.com/45461
Lustre-commit: ee9a03d8308c5918a17e2e45fd59ee5a4c38acaf
Test-Parameters: trivial testlist=sanity-lnet
Fixes: cbf740d0 ("LU-12815 socklnd: add conns_per_peer parameter")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Idce01d81e3924226b5b163d2472cbcd4f6eb5819
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46050
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Qian Yingjin [Tue, 17 Nov 2020 15:12:44 +0000 (23:12 +0800)]
LU-14138 ptlrpc: move more members in PTLRPC request into pill
Some data members in the data structure @ptlrpc_request can be
moved into the data structure @req_capsule:
	/** Request message - what client sent */
	struct lustre_msg *rq_reqmsg;
	/** Reply message - server response */
	struct lustre_msg *rq_repmsg;
	/** Fields that help to see if request and reply were swabbed */
	__u32 rq_req_swab_mask;
	__u32 rq_rep_swab_mask;
After these data structures are reorganized, @req_capsule can
be used more generally, and it becomes easier to pack and unpack
sub-requests in a batch PtlRPC request for the upcoming batch
metadata processing.
Lustre-change: https://review.whamcloud.com/40669
Lustre-commit: f75d2a1fc9b17b384bbcbc13bcb80ba10412cf29
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ib6d942b79ebf1a444d63b55ad4bc94813cf947c7
Reviewed-on: https://review.whamcloud.com/46029
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mikhail Pershin [Thu, 17 Jun 2021 14:11:51 +0000 (17:11 +0300)]
LU-13055 doc: update changelog manpages
Add lctl-changelog_register.8 and lctl-changelog_deregister.8
manpages and update lctl.8 manpage to refer to them.
Lustre-change: https://review.whamcloud.com/44022
Lustre-commit: 393885c027793d27ec948fd4fccb47aa530d2bf8
Fixes: 15305c3c3fe7 ("LU-12214 build: fix build without lustre_utils")
Test-Parameters: trivial
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ie41db630c72f61a884cd8000e0a4aeeb42ca60eb
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46007
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Yang Sheng [Mon, 13 Sep 2021 21:04:00 +0000 (05:04 +0800)]
LU-5369 mdt: check lock handle instead assert
The lock handle could be NULL in some corner cases.
We should check it instead of triggering an LBUG.
Lustre-change: https://review.whamcloud.com/44905
Lustre-commit: 5e4411e99cd7d0ccf4e51fac1442673844626639
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I1afa7f8c129c104b012ae23141318365c388c503
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46019
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Tue, 21 Sep 2021 12:23:56 +0000 (15:23 +0300)]
LU-14474 llog: don't destroy next llog
Do not destroy an empty llog if it's referenced as
the next one in a catalog.
Lustre-change: https://review.whamcloud.com/44998
Lustre-commit: 4521f6af35d1dc20b531b87ff3633d89dbac86ec
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I78bfeb90435aaee2b8536b647aa3acec56642ea0
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45892
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Wed, 27 Oct 2021 05:48:03 +0000 (08:48 +0300)]
LU-15168 osd: use large allocation for idc cache
Use a large allocation for the idc cache, as in some cases (e.g. ofd
precreate) the cache can grow to dozens of kilobytes
(sizeof(struct idc_map_cache)=40 * 1024).
Lustre-commit: a3aa2eefd3d4708ce7094ed644c30b784c39eb2c
Lustre-change: https://review.whamcloud.com/45382
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Id9e0996a7a1d07065f4a50c1d5be5051e756559a
Reviewed-on: https://review.whamcloud.com/46040
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Oleg Drokin [Tue, 24 Aug 2021 03:44:45 +0000 (23:44 -0400)]
LU-14959 ldlm: Check return value of ldlm_resource_get()
Fix the comment to properly indicate it returns ERR_PTR on
error and fix osc_req_attr_set() and mdc_get_lock_handle()
to actually check the return value before passing it on and
causing an unintended crash.
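The check described above can be sketched in userspace as follows. This is only an illustration: the ERR_PTR/IS_ERR helpers are simplified stand-ins for the kernel's linux/err.h, and resource_get_sketch() and use_resource_sketch() are hypothetical names, not the ldlm functions touched by the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's ERR_PTR helpers (linux/err.h). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical resource type, for illustration only. */
struct resource_sketch {
	int refcount;
};

/* Hypothetical lookup: returns ERR_PTR(-ENOMEM) on failure, never NULL. */
static struct resource_sketch *
resource_get_sketch(struct resource_sketch *pool, int fail)
{
	if (fail)
		return ERR_PTR(-ENOMEM);
	pool->refcount++;
	return pool;
}

/* The caller checks IS_ERR() before dereferencing the result. */
static int use_resource_sketch(struct resource_sketch *pool, int fail)
{
	struct resource_sketch *res = resource_get_sketch(pool, fail);

	if (IS_ERR(res))
		return (int)PTR_ERR(res);

	return res->refcount;
}
```

A NULL test alone would not catch the failure here, because an ERR_PTR value is a non-NULL pointer; that is why the return convention has to be documented correctly.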
Lustre-change: https://review.whamcloud.com/44738
Lustre-commit: 3e0aa9ca6e0a9a6981b9a3ad5f556cd6554a6b5b
Change-Id: Ib85a62140a39744e85989c9a9c8aa2ed771d70d1
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46016
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Sat, 8 Jan 2022 06:17:20 +0000 (23:17 -0700)]
EX-4270 kernel: increase kernel version to ddn16
Increase kernel build version to -ddn16 due to new kernel patch.
Lustre-change: https://review.whamcloud.com/45869
Lustre-commit: 9cb39cdf470f444decaf183af7b4b6f6a79f80bf
Fixes: afd8b0df0aba ("EX-4270 snapshot: avoid call quota op recursively")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Icf0d404ea5ebfb1009078a286585d837b37417ea
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/46023
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Hongchao Zhang [Tue, 30 Nov 2021 10:11:01 +0000 (18:11 +0800)]
EX-4270 snapshot: avoid call quota op recursively
In ext4_snapshot_test_and_cow, if a quota call is already in
progress, it could cause a deadlock if the snapshot code calls a
quota function recursively to allocate space.
[only the change to snapshot-jbd2-rhel7.7.patch]
Lustre-change: https://review.whamcloud.com/45680
Lustre-commit: 4722f1a0ca9d24bf6fa2678659ccf2cb1be5cdf1
Test-Parameters: trivial
Change-Id: Iac354744fcee8955d8e41020f9cee6d433f38e80
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46009
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Qian Yingjin [Fri, 25 Jun 2021 08:22:35 +0000 (16:22 +0800)]
LU-14793 hsm: record index for further HSM action scanning
There is contention between HSM archive requests and the "hsm_cdtr"
kernel thread:
->mdt_hsm_request()
  ->mdt_hsm_add_actions()
    ->mdt_hsm_register_hal()
      ->mdt_agent_record_add()
        ->down_write(&cdt->cdt_llog_lock)
        ->llog_cat_add()
        ->up_write(&cdt->cdt_llog_lock)
->mdt_coordinator()
  ->cdt_llog_process()
    ->down_write(&cdt->cdt_llog_lock);
    ->llog_cat_process()
    ->up_write(&cdt->cdt_llog_lock);
HSM archive requests and HSM cat llog scanning in the kernel daemon
"hsm_cdtr" are both contending for the write llog lock to add to or
update the "hsm_actions" llog.
The testing uses max_requests = 1000000.
In the current implementation, this means the kernel daemon thread
"hsm_cdtr" needs to scan nearly the whole "hsm_actions" llog from the
beginning with the write llog lock held.
This slows down the HSM archive requests, which are contending
for the write llog lock.
As the llog is append-only, we record the latest handled position in
the llog, so that the next scan can start from the previously recorded
position (llog index) and does not need to start from the beginning.
Another way to mitigate this problem is:
when the llog scanner finds that other processes are contending
for the llog lock, it stops the llog scanning and releases the
write llog lock properly for incoming HSM archive requests.
After applying this patch, with 200000 HSM actions in the llog, the
time to queue 10000 HSM archive requests is reduced from 10 seconds
to 4 seconds.
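The idea of resuming an append-only scan from a recorded index can be sketched as below. This is a simplified userspace model under stated assumptions: struct action_log_sketch, its fields and scan_log_sketch() are hypothetical and are not the llog API; a record value of 1 stands for an action that is still pending.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical append-only log of actions; 1 = still pending. */
struct action_log_sketch {
	const int *records;
	size_t count;		/* number of records appended so far */
	size_t resume_idx;	/* first index not yet known to be handled */
};

/*
 * Scan from the recorded index instead of from record 0.
 * Returns the number of records visited, i.e. the work done while
 * the (imaginary) write lock would be held.
 */
static size_t scan_log_sketch(struct action_log_sketch *log, size_t *pending)
{
	size_t visited = 0;
	size_t i;
	int seen_pending = 0;

	*pending = 0;
	for (i = log->resume_idx; i < log->count; i++) {
		visited++;
		if (log->records[i]) {
			(*pending)++;
			seen_pending = 1;
		} else if (!seen_pending) {
			/* Everything up to here is handled: skip it next time. */
			log->resume_idx = i + 1;
		}
	}
	return visited;
}
```

In this model the resume index only advances past a leading run of handled records, so a later scan never skips a pending one; the lock hold time shrinks with the number of records skipped.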
Lustre-change: https://review.whamcloud.com/44077
Lustre-commit: a15a5432f8063e3a04a87d74eafac0060a8f9d26
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I2e92daf34844605ee648787daf859143335c68bf
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46013
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Qian Yingjin [Fri, 28 May 2021 03:56:12 +0000 (11:56 +0800)]
LU-14724 nrs: TBF rule list broken when change rule rank
When changing the rank of two adjacent rules in the TBF rule list in
@nrs_tbf_rule_change_rank():
	list_move(&rule->tr_linkage, next_rule->tr_linkage.prev);
the previous pointer of @next_rule is @rule itself, so using list_move
directly will break the rule list.
This patch uses list_del + list_add to replace list_move to
avoid breaking the TBF rule list.
It also adds a test case, sanityn test_77o, for this bug.
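The breakage can be reproduced with a minimal userspace copy of the kernel's circular list. This is a sketch only: the list helpers below mirror the steps of the kernel's list_add/list_del/list_move but are re-implemented here, and move_before_buggy(), move_before_fixed() and list_count() are hypothetical names. The "fixed" variant is one reading of "list_del + list_add"; the actual patch may differ in detail.

```c
#include <assert.h>

/* Minimal userspace copy of the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert @entry right after @head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	struct list_head *next = head->next;

	next->prev = entry;
	entry->next = next;
	entry->prev = head;
	head->next = entry;
}

/* Unlink @entry; its own pointers are left untouched. */
static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Same steps as list_move(): unlink, then add after @head. */
static void list_move(struct list_head *entry, struct list_head *head)
{
	list_del(entry);
	list_add(entry, head);
}

/* Pattern from the commit message: breaks when next->prev == entry,
 * because @entry is then re-added after itself. */
static void move_before_buggy(struct list_head *entry, struct list_head *next)
{
	list_move(entry, next->prev);
}

/* list_del + list_add: next->prev is read only after @entry is unlinked. */
static void move_before_fixed(struct list_head *entry, struct list_head *next)
{
	list_del(entry);
	list_add(entry, next->prev);
}

/* Walk forward and count nodes; returns -1 if the ring is inconsistent. */
static int list_count(const struct list_head *head, int limit)
{
	const struct list_head *pos;
	int n = 0;

	for (pos = head->next; pos != head; pos = pos->next) {
		if (pos->prev->next != pos || pos->next->prev != pos ||
		    ++n > limit)
			return -1;
	}
	return n;
}
```

With a list of head -> rule -> next_rule, the buggy variant leaves rule pointing at itself and next_rule with a stale prev pointer, while the fixed variant leaves the ring intact.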
Lustre-change: https://review.whamcloud.com/43925
Lustre-commit: e688f29275deeadc0ef4faa01f166986bade301f
Fixes: aa14b0b9a152 ("LU-8006 ptlrpc: specify ordering of TBF policy rules")
Change-Id: Ica30d3329f07914657ac2c4089d66f934021b763
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46017
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Bobi Jam [Tue, 3 Aug 2021 06:38:46 +0000 (14:38 +0800)]
LU-14713 llite: mend the trunc_sem_up_write()
The original lli_trunc_sem replace change (commit e5914a61ac) fixed a
deadlock scenario:
t1 (page fault)              t2 (dio read)                t3 (truncate)
|- vm_mmap_pgoff()           |- vvp_io_read_start()       |- vvp_io_setattr_start()
   |- down_write(mmap_sem)      |- down_read(trunc_sem)
   |- do_map()                  |- ll_direct_IO_impl()
      |- vvp_io_fault_start        |- ll_get_user_pages()
                                                             |- down_write(trunc_sem)
                                      |- down_read(mmap_sem)
         |- down_read(trunc_sem)
t1 waits for the read semaphore of trunc_sem, which is blocked by t3,
since t3 is waiting for the write semaphore while t2 holds the read
semaphore; t2 in turn is waiting for mmap_sem, which has been taken
by t1, and a deadlock ensues.
Commit e5914a61ac changes the down_read(trunc_sem) to
trunc_sem_down_read_nowait() in the page fault path, so that it ignores
a waiting down_write(trunc_sem) and just takes the read semaphore if
no writer has taken the semaphore, which breaks the deadlock.
But there is a subtlety in using wake_up_var(): wake_up_var()->
__wake_up_bit()->waitqueue_active() locklessly tests for waiters on the
queue, and if it's called without an explicit smp_mb() it's possible for
the waitqueue_active() check to get hoisted before the condition store,
such that we'll observe an empty wait list while the waiter might not
observe the condition, and the waiter will never be woken up.
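The ordering requirement can be sketched with C11 atomics in userspace. This is a model under stated assumptions, not the llite code: struct sem_sketch and both functions are hypothetical, a plain counter stands in for the wait queue, and atomic_thread_fence(memory_order_seq_cst) plays the role of smp_mb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a semaphore with a lockless waiter check. */
struct sem_sketch {
	atomic_int writers;	/* the "condition" that waiters sleep on */
	atomic_int waiters;	/* stands in for a non-empty wait queue */
};

/*
 * Waker side: publish the condition, issue a full barrier, then test
 * for waiters. Without the fence, the waiters load could be reordered
 * before the store and miss a waiter that is about to sleep.
 * Returns true if a wake-up must be issued.
 */
static bool sem_sketch_up_write(struct sem_sketch *s)
{
	atomic_store_explicit(&s->writers, 0, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* role of smp_mb() */
	return atomic_load_explicit(&s->waiters, memory_order_relaxed) > 0;
}

/*
 * Waiter side: register as a waiter, full barrier, then re-check the
 * condition before sleeping. Returns true if the caller must sleep.
 */
static bool sem_sketch_prepare_wait(struct sem_sketch *s)
{
	atomic_fetch_add_explicit(&s->waiters, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&s->writers, memory_order_relaxed) != 0;
}
```

With a full fence on both sides, at least one of the two threads sees the other's store: either the waker sees the registered waiter, or the waiter sees the cleared condition and does not sleep.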
Lustre-change: https://review.whamcloud.com/43844
Lustre-commit: 39745c8b5493159bbca62add54ca9be7cac6564f
Fixes: e5914a61ac ("LU-12460 llite: replace lli_trunc_sem")
Change-Id: Ifdda2c1c8a4171466be1723923c136e84de8ce0e
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46014
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Bobi Jam [Thu, 15 Jul 2021 18:22:38 +0000 (02:22 +0800)]
LU-14854 mdd: proper handle error in mdd_swap_layouts()
Only restore object's HSM xattr on error if it's for
SWAP_LAYOUTS_MDS_HSM.
Lustre-change: https://review.whamcloud.com/44319
Lustre-commit: 7648c1c905b0976fc789cfd9c6bac382389385ee
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I9d4c58cd3107c3900e72a0946d0ec7d7286dd43f
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46021
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Andreas Dilger [Wed, 4 Aug 2021 08:08:12 +0000 (02:08 -0600)]
LU-14895 brw: log T10 GRD tags during checksum calcs
Log the T10 guard tags during checksum calculation on the client and
target to help identify where checksum errors are being introduced.
The added debugging is only active on RPC resend, so will not add
overhead during the normal IO path.
Lustre-change: https://review.whamcloud.com/44655
Lustre-commit: 75ebfb994fb0bce8a0f0400429f04127ead50ea4
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia4f14f2f2296da096acf629c74558386e7ce7057
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46053
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alexander Boyko [Thu, 8 Apr 2021 08:23:54 +0000 (04:23 -0400)]
LU-14598 ofd: fix for IDIF sequence at ofd_preprw_write
During recovery, a write operation could create and load a sequence
if it arrives before the create request from MDT0. ofd_preprw_write()
uses the wrong logic to pick the sequence for IDIF FIDs: if the oid
overflows 32 bits and occupies part of the IDIF sequence, the write
request loads the wrong ofd sequence, which is then used for other
I/O. The next create from MDT0 causes an error:
Too many FIDs to precreate OST replaced or reformatted...
Test 122b reproduces the issue where the OST uses a wrong sequence
for MDT0 IDIF. The error requires object IDs greater than 32 bits and
a write request during recovery that is processed before a create
request from MDT0.
For the error to be visible on the console, the last object id should
be 1<<32 + (OST_MAX_PRECREATE * 5). The error is:
lustre-OST0000: Too many FIDs to precreate OST replaced or
reformatted: LFSCK will clean up
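How an object id above 32 bits spills into the IDIF sequence can be sketched as follows. The bit layout used here (bit 32 set, OST index in bits 16-31, upper object id bits in bits 0-15) is an assumption stated for illustration and is not taken from this commit message; the names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed IDIF base sequence value, for illustration only. */
#define FID_SEQ_IDIF_SKETCH 0x100000000ULL

/* Sequence part: the bits of @id above bit 31 land in the low 16 bits. */
static uint64_t idif_seq_sketch(uint64_t id, uint32_t ost_idx)
{
	return FID_SEQ_IDIF_SKETCH | ((uint64_t)ost_idx << 16) |
	       ((id >> 32) & 0xffff);
}

/* Object id part: only the low 32 bits of @id. */
static uint32_t idif_oid_sketch(uint64_t id)
{
	return (uint32_t)id;
}
```

Under this assumed layout, object ids 5 and (1<<32) + 5 on the same OST share the same 32-bit oid but map to different sequence values, which is the situation in which picking the sequence incorrectly matters.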
Lustre-change: https://review.whamcloud.com/43248
Lustre-commit: 747fed818be5a4e09281ab1d9fd5b3a13763ab40
HPE-bug-id: LUS-9595
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I09e6f88b1f0d03fec59b24ef096cbc7baa5388ae
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46015
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Bobi Jam [Wed, 18 Aug 2021 13:32:21 +0000 (21:32 +0800)]
LU-14951 llite: protect fd_{lease_}och
Access to ll_file_data::fd_och and fd_lease_och needs lli_och_mutex
protection.
Lustre-change: https://review.whamcloud.com/44700
Lustre-commit: b275ccd9787753b9cbf4368d8611c2ac94726e2e
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ie9136aa345c6bf015aa73067acdaecf1a765b9f6
Reviewed-on: https://review.whamcloud.com/46030
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Yang Sheng [Tue, 11 Jan 2022 17:06:05 +0000 (01:06 +0800)]
LU-15156 kernel: back port patch for rwsem issue
RHEL7 included a defect in rwsem that can cause a thread to hang
on rwsem waiting indefinitely. Backport commit
5c1ec49b60cdb31e51010f8a647f3189b774bddf to fix this issue.
Lustre-commit: 85362faed8f5ee94ffee1f3f6330beee57ea9284
Lustre-change: https://review.whamcloud.com/45383
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Ic5c469ce744ad5882c13163a9bfe14faef8fd446
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46041
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Tue, 11 Jan 2022 17:49:41 +0000 (09:49 -0800)]
LU-14734 ldiskfs: improve message for large_dir
Make it more clear that the large_dir feature has already been
enabled, rather than making the admin think that they need to
enable the feature themselves.
Lustre-change: https://review.whamcloud.com/45046
Lustre-commit: 2a24b6ec67da9224e1cb6226166cde3a9c95431d
Test-Parameters: trivial
Fixes: f5967b06aac5 ("LU-14734 osd-ldiskfs: enable large_dir automatically")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ica59d3370148ed277d3541c05be065c4638daf8d
Reviewed-on: https://review.whamcloud.com/46045
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Mikhail Pershin [Sun, 22 Aug 2021 19:41:33 +0000 (22:41 +0300)]
LU-13397 llite: support fallocate() on selected mirror
- add ability to do fallocate() on designated mirror in
FLR file
- add the missing FALLOC_FL_KEEP_SIZE flag to the fallocate() call
in llapi_hole_punch(). Without that flag the call silently did
not work
- add corresponding test_50d in sanity-flr.sh
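The flag combination from the second item can be sketched as below. This is a minimal illustration of the Linux fallocate(2) call, which requires FALLOC_FL_PUNCH_HOLE to be ORed with FALLOC_FL_KEEP_SIZE; punch_hole_sketch() is a hypothetical helper and not the body of llapi_hole_punch().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* needed for the fallocate() declaration in glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/types.h>

/*
 * Deallocate [offset, offset + len) without changing the file size.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int punch_hole_sketch(int fd, off_t offset, off_t len)
{
	return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
			 offset, len);
}
```

Callers should still check the return value, since not every filesystem supports hole punching.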
Lustre-change: https://review.whamcloud.com/44721
Lustre-commit: 89736d502cc99f095237dde7520fc4ca86191882
Fixes: 4126fbb30c ("LU-13397 lfs: mirror resync to keep sparseness")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I8d700fce904c84458a50650f1d3cb09d23989eba
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46032
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Minh Diep [Mon, 9 Aug 2021 19:45:45 +0000 (12:45 -0700)]
EX-3626 build: build ptlrpc_gss during ubuntu dkms
include ptlrpc_gss in dkms.conf
Lustre-change: https://review.whamcloud.com/44539
Change-Id: I952a7019b2bc5687507fdb1f274c100152dae6cd
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46018
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Hongchao Zhang [Sun, 27 Jun 2021 21:00:20 +0000 (05:00 +0800)]
LU-14807 lfsck: fix race in lfsck_pos_fill
There is a race for lfsck->li_di_dir between lfsck_di_dir_put and
lfsck_pos_fill, which could cause lfsck_pos_fill to use freed
lfsck->li_di_dir (struct osd_it_ea) and trigger GPF.
Lustre-change: https://review.whamcloud.com/44130
Lustre-commit: 911f638bd6c547591e784fcec668fe9811916e21
Change-Id: Iedadf03ac15d128bb051aea8aafa24dbcd2704fb
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46020
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Qian Yingjin [Wed, 5 Jan 2022 00:46:35 +0000 (16:46 -0800)]
LU-15244 llite: set ra_pages of backing_dev_info with 0
The latest RHEL8.5 kernel sets the initial @ra_pages of
backing_dev_info to VM_READAHEAD_PAGES:
struct backing_dev_info *bdi_alloc(int node_id)
{
	...
	bdi->ra_pages = VM_READAHEAD_PAGES;
	bdi->io_pages = VM_READAHEAD_PAGES;
	...
}
This causes @ra_pages of the file readahead state to be set from
@bdi->ra_pages, which takes readahead out of Lustre's control
and wrongly triggers the readahead logic in the Linux kernel. It
results in the failure of sanity 101j.
In this patch, we force @ra_pages of backing_dev_info to 0 after
setting up the backing device info. This disables kernel readahead
in the super block.
This patch also cleans up the unnecessary setting of @ra_pages in
llite "file.c" and "vvp_io.c".
Lustre-change: https://review.whamcloud.com/45712
Lustre-commit: 878561880d2aba038db95e199f82b186f22daa45
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: If6468109620269c1e76abe3a1cd73c3b40a417a8
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45971
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Minh Diep [Tue, 11 Jan 2022 17:56:08 +0000 (09:56 -0800)]
RM-620 build: New tag 2.14.0-ddn28
Change-Id: I1a1d9c767cfab91a833dece3b4663ec89b2c759b
Jian Yu [Mon, 10 Jan 2022 08:37:25 +0000 (00:37 -0800)]
EX-4052 tests: use stack_trap within a subtest in sanity-lipe
Trap in a subshell is handled differently across bash versions.
This patch moves stack_trap into subtests to make them work
reliably.
Test-Parameters: trivial clientdistro=el7.9 testlist=sanity-lipe
Test-Parameters: trivial clientdistro=el8.4 testlist=sanity-lipe
Lustre-change: https://review.whamcloud.com/45889
Lustre-commit: db8e16fd8434651ed6102a6be79828346775f87e
Change-Id: I00eac5a8cf9511c8e1e531eb54f52ce443e5f77e
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46026
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Minh Diep [Thu, 6 Jan 2022 21:02:39 +0000 (13:02 -0800)]
LU-15417 build: build MOFED 5.5
The path to the MOFED header files has changed to
/usr/src/ofa_kernel/x86_64/<kernel>
so we cannot assume it's /usr/src/ofa_kernel/default
Test-Parameters: trivial
Change-Id: I10f375b459f04b84003e70951e4e423295001f40
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46004
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Andreas Dilger [Thu, 16 Dec 2021 08:37:59 +0000 (01:37 -0700)]
RM-620 build: New tag 2.14.0-ddn27
New tag 2.14.0-ddn27
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I2ec7b94cf7b8ea4d90b0e8ff1f2301e48f4d3b0e
Jian Yu [Wed, 8 Dec 2021 08:25:29 +0000 (00:25 -0800)]
LU-15337 kernel: kernel update SLES15 SP3 [5.3.18-59.37.2]
Update SLES15 SP3 kernel to 5.3.18-59.37.2 for Lustre client.
Test-Parameters: trivial clientdistro=sles15sp3 \
env=SANITY_EXCEPT="103 125 154" \
testlist=sanity
Change-Id: Ie89a1d805460420b79bb7f345918b299e08de853
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45787
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Sun, 5 Dec 2021 09:20:45 +0000 (01:20 -0800)]
LU-15196 kernel: kernel update RHEL8.4 [4.18.0-305.25.1.el8_4]
Update RHEL8.4 kernel to 4.18.0-305.25.1.el8_4 for Lustre client.
Test-Parameters: trivial clientdistro=el8.4 testlist=sanity
Change-Id: Ic70f7330f90a36646bb36e0c6015ea22882b20b9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45530
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Jian Yu [Thu, 30 Sep 2021 19:05:35 +0000 (12:05 -0700)]
LU-14690 kernel: RHEL 8.4 server support
This patch makes changes to support RHEL 8.4 release with
kernel 4.18.0-305.19.1.el8_4 for Lustre server.
Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.4 serverdistro=el8.4 testlist=sanity
Test-Parameters: trivial fstype=zfs \
clientdistro=el8.4 serverdistro=el8.4 testlist=sanity
Lustre-change: https://review.whamcloud.com/43791
Lustre-commit: 644a14196810f0c6b663957720414e042d2ae965
Change-Id: I484af80c4764367b40b28ce459a6ff9d87edf3a8
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44061
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Mr NeilBrown [Sat, 11 Sep 2021 07:11:51 +0000 (00:11 -0700)]
LU-13783 ldiskfs: Add support for mainline 5.8 kernel
Various changes needed for 5.8 over 5.4:
- ext4_mark_inode_dirty is now a macro, so export
__ext4_mark_inode_dirty instead
- procfs additions need to use 'struct proc_ops'
- inode-test.c is a new C file that we MUST NOT build
- various ordinary conflicts
Lustre-change: https://review.whamcloud.com/40373
Lustre-commit: 849e93f4091a3003706668076864f086b9d59238
Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I681ab26c60fb35a1ef5f518ee7cac8766e6fde47
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44361
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Serguei Smirnov [Thu, 21 Oct 2021 02:09:06 +0000 (19:09 -0700)]
LU-15136 socklnd: default conns_per_peer to 0
Setting conns_per_peer to 0 triggers socklnd to choose the
(heuristically) optimal setting for the interface given its speed.
Make 0 the default for socklnd conns_per_peer.
Lustre-change: https://review.whamcloud.com/45319
Lustre-commit: 30a028e2ee2b3eead94abd6657edc3880ec89434
Test-parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Fixes: c44afcfb72 ("LU-12815 socklnd: set conns_per_peer based on link speed")
Change-Id: Ie6e76eaee8693472384cce362b394b216142884e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45744
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Serguei Smirnov [Wed, 28 Jul 2021 21:47:39 +0000 (14:47 -0700)]
LU-12815 socklnd: set conns_per_peer based on link speed
Specifying conns_per_peer=0 for a ni is now used to set
the conns_per_peer as a function of the corresponding link speed
as follows:
conns_per_peer = (ilog2(Gbps) / 2 + 1)
Listed below are the resulting defaults for common link speeds:
100Gbps, 200Gbps -> 4
50Gbps -> 3
5Gbps, 10Gbps -> 2
less than 4Gbps -> 1
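The formula above can be checked with a small userspace sketch. ilog2_sketch() is a hypothetical helper that rounds down like the kernel's ilog2(), and the division is assumed to be integer division; neither function name comes from the patch.

```c
#include <assert.h>

/* Integer log2, rounding down, like the kernel's ilog2(); 0 for input 0. */
static int ilog2_sketch(unsigned int v)
{
	int log = 0;

	while (v >>= 1)
		log++;
	return log;
}

/* conns_per_peer = ilog2(Gbps) / 2 + 1, using integer division. */
static int conns_per_peer_sketch(unsigned int gbps)
{
	return ilog2_sketch(gbps) / 2 + 1;
}
```

For example, 50Gbps gives ilog2(50) = 5, and 5 / 2 + 1 = 3, matching the listed default.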
Lustre-change: https://review.whamcloud.com/44417
Lustre-commit: c44afcfb72a1c2fd8392bfab3143c3835b146be6
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ief2b33a796c180d8669bd5796b3e35ec748423a5
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45742
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Chris Horn [Fri, 22 Oct 2021 01:34:23 +0000 (01:34 +0000)]
LU-15150 tests: sanity-lnet removes testsuite log on failure
cleanup_testsuite() needs to be more selective when removing files
created by sub-tests.
Lustre-change: https://review.whamcloud.com/45342
Lustre-commit: 29918b2db487e7ec8b0bdf785b0a436332824db6
Test-Parameters: trivial testlist=sanity-lnet
Fixes: aa739144551 ("LU-13569 tests: Check LNet Health recovery logic")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ic17a68ff2aa552594a0f1ea470c39177abe985fc
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45743
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Serguei Smirnov [Mon, 2 Aug 2021 14:48:35 +0000 (10:48 -0400)]
LU-12815 socklnd: allow dynamic setting of conns_per_peer
Modify lnetctl and associated code to allow dynamic setting
of conns_per_peer lnd parameter per ni.
The parameter can be set for a specific active nid:
lnetctl net set --nid 192.168.122.10@tcp --conns-per-peer=4
Or when adding a new net, taking effect on the new nid:
lnetctl net add --net tcp --if eth0 --conns-per-peer=1
By default, conns_per_peer value specified as the module parameter
shall be used.
Lustre-change: https://review.whamcloud.com/41463
Lustre-commit: a5cbe7883db6d77b82fbd83ad4c662499421d229
LU-15089 tests: allow enough time to create tcp connections
Allow enough time to create tcp connections before counting them
when testing socklnd conns_per_peer setting in sanity-lnet test_230
Lustre-change: https://review.whamcloud.com/45331
Lustre-commit: 5c766b005bf3e0bca0efa9d87ccf230e7cba97cc
LU-14991 tests: Correct whitespace in sanity-lnet test_101/102
sanity-lnet.sh test_100 and test_101 use tab characters in the
expected yaml output, but yaml syntax does not allow tab characters.
Lustre-change: https://review.whamcloud.com/44856
Lustre-commit: 38b18436f220931924210c9019028ea8589adc1d
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I11625b9ad61f0311c294001a38b7855465491aaf
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45741
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Chris Horn [Tue, 7 Sep 2021 15:24:14 +0000 (10:24 -0500)]
LU-14990 tests: Detect correct LNet interface for sanity-lnet
Determine the names of the interfaces used for LNet by parsing the
NIDs configured after calling load_modules(). Tests which reference
eth0 are modified to use the interface associated with the primary
NID (i.e. first NID output by lctl list_nids).
Lustre-change: https://review.whamcloud.com/44857
Lustre-commit:
f9669c4d3092d44cbc2e2d3c225aee6ebaf268e9
Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-10385
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Id715aa3e5470d9c110f6248620b1a83920875e7b
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45760
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andrew Perepechko [Sun, 31 Oct 2021 20:03:30 +0000 (23:03 +0300)]
LU-15171 osd-ldiskfs: xattr_sem locking missing dquot_transfer
Kernel commit
7a9ca53ae (~v4.13) added the requirement for xattr_sem
locking when calling *dquot_transfer. As of now, in rare cases, it is
possible that we can modify inode xattrs and perform their consistency
checks in parallel, which can fail.
Lustre-change: https://review.whamcloud.com/45424
Lustre-commit:
e6c7fcdaf40b130c39af2e3ee8b108c6e31a8ca8
Change-Id: I041694e30ce6c8398864c0ad57671df0bffd2f52
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
HPE-bug-id: LUS-10549
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45750
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Patrick Farrell [Wed, 17 Nov 2021 20:11:45 +0000 (15:11 -0500)]
LU-15245 mdc: GET(X)ATTR to READPAGE portal
Send the MDS_GETATTR and MDS_GETXATTR RPCs to the
MDS_READPAGE_PORTAL instead of the default portal to avoid
deadlocks with other MDS_REINT RPCs that may block all of
the MDS service threads on that portal.
This deadlock occurs with MDS_GETXATTR when selinux is
enabled, because getxattr becomes part of lookup, so it
takes a reference on a lock used for lookup. However, all
of the MDS service threads on the default portal can be
consumed by threads waiting for that lock, resulting in
a deadlock when the getxattr can't be processed.
Lustre-change: https://review.whamcloud.com/45593
Lustre-commit:
ebb035756eb059b255d4c8245d42bc5d5b96bab9 (tbd)
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I4fbae266022ee9fa38f3196acb1443df5056fe5e
Reviewed-on: https://review.whamcloud.com/45594
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Thu, 9 Dec 2021 00:37:51 +0000 (17:37 -0700)]
EX-4052 tests: skip sanity-lipe test_8/9
Tests are failing 100% of the time and need to be skipped until fixed.
Test-Parameters: trivial testlist=sanity-lipe
Test-Parameters: testlist=sanity-lipe mdscount=2 mdtcount=4
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I7b29c2f147652ae99522f71f7d156e0934a48d8a
Reviewed-on: https://review.whamcloud.com/45802
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Lai Siyao [Tue, 7 Sep 2021 09:33:21 +0000 (05:33 -0400)]
LU-13076 dne: dir migrate in QOS mode
Support "lfs migrate -m -1 ..." to migrate a directory to MDTs by
space and inode usage. If the system is balanced, the target MDT is
chosen in round-robin mode; otherwise the less full MDTs are
chosen, and the fullest MDT is avoided.
Another minor change: if directory is migrated to specific MDTs,
and the target stripe count is more than 1, its subdirs may not be
migrated to the specified MDT in the command, but migrated to the
MDT where its parent stripe is located (subdir will be striped too),
which avoids unnecessary remote directories. NB, for a command like
"lfs migrate -m 0,1,2 ...", though the subdir may be located on
either MDT0, MDT1 or MDT2, its stripes will be striped over these
three MDTs, but for command like "lfs migrate -m 0 -c 3...", the
subdir may be striped on other MDTs if the subdir is not located on
MDT0.
Add sanity 230u.
Lustre-change: https://review.whamcloud.com/44886
Lustre-commit:
378c7567876b430d06031f7d380112b9bdb15166
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I6e9c3d75bfc240b21c65ba27cd5e4bcca7058325
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45478
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mr NeilBrown [Thu, 15 Oct 2020 00:10:47 +0000 (11:10 +1100)]
LU-6142 lustre: remove non-static 'inline' markings.
There is rarely any point in marking a non-static function as
'inline'. The result is to compile a stand-alone function that other
files can refer to, and also to inline the code where it is used in
the same file.
In many cases the non-static inline functions are not used in the same
file, so the 'inline' marking has no effect. In other cases it may
have an effect, but it can only be needed in highly performance
critical situations where a function call must be avoided, and that
doesn't seem likely in any of these cases.
So just remove the "inline".
Lustre-change: https://review.whamcloud.com/40289
Lustre-commit:
f0736a6a52ed95814d2cac875caf34f7fc233bf3
Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ic3243ee80f9bfd75a67dd8c89ea07d08dc36425c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45727
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Wed, 1 Dec 2021 17:58:21 +0000 (09:58 -0800)]
LU-15126 kernel: new kernel [RHEL 8.5 4.18.0-348.2.1.el8_5]
This patch makes changes to support new RHEL 8.5 release
for Lustre client.
Test-Parameters: trivial env=SANITY_EXCEPT="101j" \
clientdistro=el8.5
Lustre-change: https://review.whamcloud.com/45285
Lustre-commit:
951f31789f76295d182f56bef1fa8d92f69e7e2a
Change-Id: I068f091817126fffc14402254f45dcd75ba7f3fc
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45542
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Tue, 30 Nov 2021 00:23:35 +0000 (16:23 -0800)]
LU-15184 llite: properly detect SELinux disabled case
Usually, security_dentry_init_security() returns -EOPNOTSUPP when
SELinux is disabled. But on some kernels (e.g. rhel 8.5) it returns
0 when SELinux is disabled, and in this case the security context is
empty.
So in both cases make sure the security context name is not set, which
means "SELinux is disabled" for the rest of the code.
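The two cases described above can be sketched as follows. This is a Python illustration under stated assumptions, not the llite code: `selinux_disabled`, `rc`, and `ctx` are hypothetical names standing in for the return code and the security context obtained from security_dentry_init_security().

```python
import errno


def selinux_disabled(rc, ctx):
    """Treat both reported behaviors as 'SELinux is disabled'."""
    # Usual case: the hook reports that the operation is not supported.
    if rc == -errno.EOPNOTSUPP:
        return True
    # Some kernels return success but leave the security context empty.
    if rc == 0 and not ctx:
        return True
    return False


print(selinux_disabled(-errno.EOPNOTSUPP, b""))
print(selinux_disabled(0, b""))
print(selinux_disabled(0, b"some-context"))
```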
Lustre-change: https://review.whamcloud.com/45501
Lustre-commit:
42661f7ba106b7d2e02f85a65880061585ca6ccb
Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I3b9608f9768288de89570c158e8429560fa0213f
Reviewed-on: https://review.whamcloud.com/45541
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Thu, 2 Dec 2021 20:44:41 +0000 (12:44 -0800)]
LU-15292 kernel: kernel update RHEL7.9 [3.10.0-1160.49.1.el7]
Update RHEL7.9 kernel to 3.10.0-1160.49.1.el7.
Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9
Change-Id: I356b8a8345a4a91d6d1c1a4a9b4eab4bb5afe75b
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45716
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Tue, 28 Sep 2021 18:45:13 +0000 (11:45 -0700)]
EX-2659 tests: allow multiple MDTs in sanity-lipe.sh
This patch improves sanity-lipe.sh to support
multiple MDTs.
Test-Parameters: trivial testlist=sanity-lipe
Test-Parameters: trivial testlist=sanity-lipe facet=mds1
Test-Parameters: trivial mdscount=2 mdtcount=4 \
testlist=sanity-lipe
Test-Parameters: trivial env=LIPE_FIND=lipe_find2 \
testlist=sanity-lipe
Test-Parameters: trivial env=LIPE_FIND=lipe_find2 \
testlist=sanity-lipe facet=mds1
Test-Parameters: trivial env=LIPE_FIND=lipe_find2 \
mdscount=2 mdtcount=4 testlist=sanity-lipe
Change-Id: I9db6f01e810e8c40e419dcfad409741a3334687c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44588
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Andreas Dilger [Sun, 14 Nov 2021 03:18:43 +0000 (20:18 -0700)]
RM-620 build: New tag 2.14.0-ddn26
New tag 2.14.0-ddn26
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I20db5a76274fd9dfbb59032da633798e1878cded
Patrick Farrell [Fri, 12 Nov 2021 18:18:05 +0000 (13:18 -0500)]
LU-15127 llite: Remove path from discard_warn
It is unfortunately not safe to get the path from inside
dirty page discard warn. It results in us getting and then
putting a bunch of dentries, and if the 'dget' we do on our
file is the last reference on it, we deadlock like this:
ptlrpc_check_set
brw_interpret
osc_extent_finish
osc_ap_completion
cl_page_completion
vvp_page_completion_write
ll_dirty_page_discard_warn
dput
dentry_kill
__dentry_kill
evict
ll_delete_inode
cl_sync_file_range
cl_io_loop
cl_io_start
lov_io_call
cl_io_start
osc_io_fsync_start
osc_cache_writeback_range
osc_cache_wait_range
osc_extent_wait
ll_delete_inode is calling back in to *this file*, which
we are already working on, so this thread ends up waiting
for itself.
This is particularly common if the discard warn is racing
with an unmount, which will be destroying all the inodes
(not deleting them - just removing them from the local
VFS).
There is no way to safely get the path from this location.
If we are deeply committed to the functionality, it would
be possible to rewrite osc_extent_finish + brw_interpret
so they could attempt path lookup *after* the extent has
been completed.
This patch fixes the deadlock, any rewrite is left for
later.
Lustre-change: https://review.whamcloud.com/45550/
Lustre-commit:
d04a1929b3adb776173b02e0f6b82d396046dd14 (tbd)
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I537fd0d2e110c180a1369a9a3b1a644e613b18e4
Reviewed-on: https://review.whamcloud.com/45555
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Thu, 11 Nov 2021 08:19:46 +0000 (11:19 +0300)]
LU-15207 libcfs: reset hs_rehash_bits
if rehash work is cancelled, then nobody resets
hs_rehash_bits and the first iterator asserts
at LASSERT(!cfs_hash_is_rehashing(hs)) in
cfs_hash_for_each_relax().
Lustre-change: https://review.whamcloud.com/45533
Lustre-commit: TBD (from
0c51e83b1345059c7f6847ea394e589ebffd0121)
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I1a567f6be77ca6c45e5d4f256722206b12588554
Reviewed-on: https://review.whamcloud.com/45557
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Lai Siyao [Sat, 6 Nov 2021 19:16:49 +0000 (15:16 -0400)]
LU-15216 lmv: improve MDT QOS space balance
When MDTs are not balanced, QOS code tries to keep subdirectory
creation local to the same MDT when it is deep in the directory
tree, to avoid creating too many remote directories, but the
existing weight to stay on the parent MDT until 50% of other MDTs
is too radical, and causes mkdirs to be "stuck" on the same MDT.
* remove "lq_threshold_rr" from above calculation because the check
in ltd_qos_is_usable() handles this, so use only "dir_depth".
* the factor is changed to "16 / (dir_depth + 10)", then it's less
likely to stick to the parent MDT for top levels, while more
likely to stay on the parent MDT for low levels:
depth=0 -> 160%, depth=4 -> 114%, depth=6 -> 100%,
depth=8 -> 88%, depth=12 -> 72%
* rename lli_depth to lli_dir_depth to make usage more clear.
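As a quick check of the percentages listed above, the factor can be recomputed with integer arithmetic. This is only an illustrative sketch in Python, not the patch's C code; the function name `stay_factor_percent` is hypothetical, and truncating integer division is an assumption chosen because it reproduces the listed values.

```python
def stay_factor_percent(dir_depth):
    """Return 16 / (dir_depth + 10) expressed as a truncated percentage."""
    return 16 * 100 // (dir_depth + 10)


for depth in (0, 4, 6, 8, 12):
    print(f"depth={depth} -> {stay_factor_percent(depth)}%")
```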
Lustre-change: https://review.whamcloud.com/45544
Lustre-commit: TBD (from
95398b056f7a88ec7830da353170e8993cecf036)
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Iec6b77919b630d4baee6d54bee7bdb8ca9fb8574
Reviewed-on: https://review.whamcloud.com/45556
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Wed, 10 Nov 2021 18:08:02 +0000 (11:08 -0700)]
RM-620 build: New tag 2.14.0-ddn25
New tag 2.14.0-ddn25
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I344f62b9984e7b2338feaae34ddb310ee1a3bc17
Sebastien Buisson [Tue, 19 Oct 2021 15:59:33 +0000 (17:59 +0200)]
LU-15098 tests: sanity-sec 27a exec commands on right node
In nodemap_exercise_fileset called from sanity-sec test 27a,
make sure all commands are executed on first client, as we are
testing properties of nodemaps 'default' and 'c0'.
And make sure 'default' nodemap has admin and trusted properties
set to 1, as we are carrying out operations as root.
Lustre-change: https://review.whamcloud.com/45293
Lustre-commit:
b45169276ce1ab09dae7a733859f89a6c92808e5
Test-Parameters: trivial
Test-Parameters: testlist=sanity-sec clientcount=2 env=ONLY=27a
Fixes:
0daeebcbdc ("LU-14797 nodemap: map project id")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Idd9f391db60475721f3a3856b5e3bee1a18bbbca
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45488
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Wed, 30 Jun 2021 16:30:57 +0000 (18:30 +0200)]
LU-14797 nodemap: map project id
Add calls to nodemap_map_id() in order to map project IDs from
client ID to server ID and conversely.
Also extend nodemap_can_setquota() to allow setquota on project
only if ID is not squashed or deny_unknown is not set.
Update sanity-sec test_27a to exercise the feature.
Lustre-change: https://review.whamcloud.com/44119
Lustre-commit:
0daeebcbdc4e89d59221299f2687cfd3c4f00b5b
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Id66458550d312404b1993ead8940c3d12eaadccd
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45487
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Tue, 29 Jun 2021 15:54:59 +0000 (17:54 +0200)]
LU-14797 sec: add projid to nodemap
Add the ability to create id maps of a new type, projid. This also
requires adding a new value to map_mode, projid_only. Finally, a new
property named squash_projid is used to map all project IDs to a
default one.
Update lctl man pages to mention these additions.
Update sanity-sec test_12 and test_15 to exercise projid mapping and
squash_projid property.
Lustre-change: https://review.whamcloud.com/44108
Lustre-commit:
8a770616a5ad21360ecba63c3643cadd245a2a50
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I63eba8b0d33feaa7ece8c1788cb587fcb330357a
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45486
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Thu, 21 Oct 2021 06:56:44 +0000 (08:56 +0200)]
LU-15141 quota: optimize capability check for root squash
On client side, checking for owner/group quota can be directly
bypassed if this is for root and there is no root squash.
Lustre-change: https://review.whamcloud.com/45322
Lustre-commit: TBD (
15aa2e9264f0604b185ce280df4b34ea5a280b3f)
Change-Id: If29eca428d8748df412a717615e4d0a4886ddd04
Fixes:
0b057b7179 ("LU-14739 quota: nodemap squashed root cannot bypass quota")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/45321
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Wang Shilong [Tue, 20 Jul 2021 02:36:31 +0000 (10:36 +0800)]
LU-14739 quota: fix quota with root squash enabled
This patch tries to fix several problems:
1. OSD will ignore quota if IO comes from client
cache or root, however since following change:
LU-12687 osc: consume grants for direct I/O
DIO now consumes grant too, following check for
sync IO is wrong now:
(lnb[i].lnb_flags & (OBD_BRW_FROM_GRANT | OBD_BRW_SYNC))
== OBD_BRW_FROM_GRANT)
This was originally added to support 1.8 client, it is
going to be 2.15 now, so let's remove this broken check.
2. Server side will clear OBD_BRW_NOQUOTA if root squash
is enabled, this will revert fixes from:
"LU-13228 clio: mmap write when overquota"
We need to separate @ci_noquota and @oi_cap_sys_resource cases,
introduce a new flag OBD_BRW_SYS_RESOURCE, and extend test_75
to cover this case.
3. LU-14739 missed the case that DoM quota should be considered
as well.
4. If EDQUOT is returned for root, we check the new root squash
flag OBD_FL_ROOT_SQUASH from server side. If this flag is not set,
we bypass quota for root, otherwise all root writes become sync
writes.
5. Fix a leftover problem with LU-9671 for DOM
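The condition quoted in item 1 is a mask-and-compare: it holds only when OBD_BRW_FROM_GRANT is set and OBD_BRW_SYNC is clear. A minimal Python sketch of that behavior follows. The bit values are placeholders chosen for illustration (the real definitions are in the Lustre headers and are not reproduced here), and `matches_removed_check` is a hypothetical name.

```python
# Placeholder bit values, for illustration only.
OBD_BRW_FROM_GRANT = 0x1
OBD_BRW_SYNC = 0x2


def matches_removed_check(flags):
    """True only when FROM_GRANT is set and SYNC is clear."""
    mask = OBD_BRW_FROM_GRANT | OBD_BRW_SYNC
    return (flags & mask) == OBD_BRW_FROM_GRANT


# Enumerate all four combinations of the two bits.
for flags in (0, OBD_BRW_FROM_GRANT, OBD_BRW_SYNC,
              OBD_BRW_FROM_GRANT | OBD_BRW_SYNC):
    print(f"flags={flags:#x} -> {matches_removed_check(flags)}")
```

If direct I/O pages now carry the grant flag as well, a check of this form can no longer separate cached writes from synchronous ones by that bit alone, which is consistent with the reason given for removing it.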
Lustre-change: https://review.whamcloud.com/44347
Lustre-commit:
bbfdc7c1670c92747a8f98d39e1e43dc39e59e30
Fixes:
a4fbe7341baf12 ("LU-14739 quota: nodemap squashed root cannot bypass quota")
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I3fd23da7d56acb5b485540333208e5d5b0b48023
Reviewed-on: https://review.whamcloud.com/45310
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Fri, 11 Jun 2021 14:49:47 +0000 (16:49 +0200)]
LU-14739 quota: nodemap squashed root cannot bypass quota
When root on client is squashed via a nodemap's squash_uid/squash_gid,
its IOs must not bypass quota enforcement as it normally does without
squashing.
So on client side, do not set OBD_BRW_FROM_GRANT for every page being
used by root. And on server side, check if root is squashed via a
nodemap and remove OBD_BRW_NOQUOTA.
Lustre-change: https://review.whamcloud.com/43988
Lustre-commit:
a4fbe7341baf12c00c6048bb290f8aa26c05cbac
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I95b31277273589e363193cba8b84870f008bb07a
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/45485
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Wed, 11 Aug 2021 15:44:08 +0000 (17:44 +0200)]
LU-14929 gss: detect libkeyutils dependency
When building GSS support, gss_keyring requires libkeyutils.
So make sure this dependency is properly detected at configure time,
and include keyutils.h only when required.
Lustre-change: https://review.whamcloud.com/44597
Lustre-commit:
15998eb78e279f1bfa5059f0f65087f7851d40ff
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9fa5750f4609250ecdc1c47f68b97bff9be13ace
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45484
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Tue, 12 Oct 2021 22:15:37 +0000 (18:15 -0400)]
LU-15070 llite: update default LMV upon any change
max_inherit and max_inherit_rr were newly added, and they are missing
in lsm_md_eq(), therefore client may not update default LMV when
either of these two fields is changed.
Add sanityn 112.
Lustre-change: https://review.whamcloud.com/45237
Lustre-commit:
f3314706b4e5c21f14908650decd92a30fdc1db9
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Iac71b530b3702105c4213715826b1782c6aba7ca
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45496
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Tue, 12 Oct 2021 22:20:21 +0000 (18:20 -0400)]
LU-15070 mdt: revoke remote LOOKUP lock for default LMV
Setting default LMV revokes the LOOKUP lock, but if the dir is a
remote dir, its LOOKUP lock is on the MDT where its parent is located.
Lustre-change: https://review.whamcloud.com/45236
Lustre-commit:
b4645b5469c0722fdf66697379be878c071839cf
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I9f079a0bcff530603725ce72cd89c14935ba913b
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45495
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Thu, 4 Nov 2021 17:26:18 +0000 (11:26 -0600)]
RM-620 build: New tag 2.14.0-ddn24
New tag 2.14.0-ddn24
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie7c61bd7bbc50ccfa4e39ec82403895215548627
Jian Yu [Sat, 23 Oct 2021 01:19:31 +0000 (18:19 -0700)]
LU-15154 kernel: kernel update SLES15 SP3 [5.3.18-59.27.1]
Update SLES15 SP3 kernel to 5.3.18-59.27.1 for Lustre client.
Test-Parameters: trivial
Change-Id: Ie3c369a8e93a75b4afbde55489bd3819bb39e1de
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45350
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Thu, 9 Sep 2021 08:16:41 +0000 (11:16 +0300)]
LU-14996 lov: prefer mirrors on non-rotational OSTs
consider non-rotational OSTs as preferred unless explicit prefer
flag is set on a mirror.
Lustre-change: https://review.whamcloud.com/44883
Lustre-commit:
8507472dd37ebc07bf7eb1b772c2ff619009c233
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I787bcba0b5e45842c9d4762c7f97a8f44a4fc9cb
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45339
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lei Feng [Thu, 28 Oct 2021 09:06:36 +0000 (05:06 -0400)]
EX-4157 lipe: comment out ldiskfs functions for client-only lpcc_purge
Sometimes the client system does not have ldiskfs libs, and lpcc_purge
fails to start on it because some ldiskfs symbols cannot be found.
So comment out this code for client-only builds.
Change-Id: I4a38f1128b9e66d495f94ef7ebd91f26ea052b67
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=sanity-pcc env=ONLY=200-202,ONLY_REPEAT=50 \
clientextra_install_params="--packages lipe-lpcc"
Reviewed-on: https://review.whamcloud.com/45394
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Tue, 26 Oct 2021 22:52:47 +0000 (18:52 -0400)]
LU-15170 llite: Switch pcc to lookup_one_len
Using kern_path to lookup files in the PCC cache means we
are subject to user namespaces, so the PCC volume must be
mapped in to a container or the cached files cannot be
found.
One solution is to switch to using lookup_one_len - this is
what the code which *creates* PCC files does. This
manually walks the path from the root, which avoids
namespace issues.
This is appropriate because PCC is kernel functionality -
the user should not be able to directly access the volume,
but it should be accessible as a cache.
Lustre-change: https://review.whamcloud.com/45436
Lustre-commit:
96c90859e14f3960b57eae54b3886aeef62f6f40 (tbd)
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Idd15574ace29543bed1a9937cb35404781714791
Reviewed-on: https://review.whamcloud.com/45380
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Sat, 30 Oct 2021 06:05:55 +0000 (00:05 -0600)]
RM-620 build: New tag 2.14.0-ddn23
New tag 2.14.0-ddn23
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I115d1266b69a6abd9cba775ce6f4a4aa0f7cc1cb
Lei Feng [Thu, 28 Oct 2021 11:09:12 +0000 (07:09 -0400)]
EX-4158 lipe: keep cache data when PCC is stopped by lpcc service
If a PCC backend is managed by lpcc service, keep the cache data
in PCC backend when it is stopped by lpcc service.
Change-Id: I15e80d28ff017573b8f7b24449979072256ab6b2
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=sanity-pcc env=ONLY=200-202,ONLY_REPEAT=10 \
clientextra_install_params="--packages lipe-lpcc"
Reviewed-on: https://review.whamcloud.com/45396
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Wed, 13 Oct 2021 06:11:18 +0000 (00:11 -0600)]
EX-4006 doc: further improvements to lfs-pcc man pages
Describe fields in lfs-pcc-state.1. Add lfs-pcc-delete.1 page.
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0c165c12e674af0edbb5e84c2e5f8aeed73ebbe5
Reviewed-on: https://review.whamcloud.com/45338
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng, Lei <flei@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Andreas Dilger [Fri, 29 Oct 2021 17:24:03 +0000 (17:24 +0000)]
EX-4152 revert: "LU-14177 pcc: clear PCC-RO cache from old client access"
This reverts commit
c4d7cc7b871688ebdc631e907938dce2b5c10503
because it causes 2.12 clients to hang in some cases.
Change-Id: I895914b7e1204ecf308650988fa91d634d951550
Test-Parameters: trivial testlist=sanity-pfl clientversion=2.12.6-ddn42
Reviewed-on: https://review.whamcloud.com/45412
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Fri, 22 Oct 2021 17:06:56 +0000 (11:06 -0600)]
RM-620 build: New tag 2.14.0-ddn22
New tag 2.14.0-ddn22
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic1d3a194f5434e667e9b8562887888c10b1f7161
Alex Zhuravlev [Thu, 16 Sep 2021 08:20:18 +0000 (11:20 +0300)]
LU-15010 mdc: add support for grant shrink
just re-use existing mechanism used in OSC
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I4cdca057d35eaff6493d047127f1fe5eee9e9620
Reviewed-on: https://review.whamcloud.com/45177
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Mon, 18 Oct 2021 14:18:27 +0000 (17:18 +0300)]
LU-15043 lod: check for spilling loops
at setting to avoid possible confusion.
Lustre-change: https://review.whamcloud.com/45083
Lustre-commit: TBD (from
1d4502e7ef3288f575849268232aca0086342822)
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I901b08f614c162607b1b5c6a992aa5b188fd8e75
Reviewed-on: https://review.whamcloud.com/45104
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Wed, 20 Oct 2021 04:07:14 +0000 (12:07 +0800)]
EX-4055 pcc: command to remove PCC mirror component
This patch adds a command "lfs pcc delete $FILE" to delete the
PCC foreign mirror layout component.
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I3f56fb8134bd1e7673ef8e04dff9b8482f0e32c3
Reviewed-on: https://review.whamcloud.com/45305
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Qian Yingjin [Thu, 3 Dec 2020 09:08:38 +0000 (17:08 +0800)]
LU-14177 pcc: clear PCC-RO cache from old client access
For the purpose of compatibility and interoperability, we have
added a PCC-RO connection flag.
To avoid inconsistent data access, MDT does not (try to) grant
layout lock to the client at the time of getattr() and open().
When an old client without PCC-RO support requests a layout lock
via an intent lock request on the file in LCM_FL_PCC_RDONLY state,
MDT needs to clear the LCM_FL_PCC_RDONLY flag on the layout first
which will invalidate all PCC-RO caches on the clients, and then
return the layout to the old client.
Lustre-change: https://review.whamcloud.com/40850
Lustre-commit: TBD (from
0c76ae7f3cb6fc3a9f70d1398f773d8afffa50f1)
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I69707d1ac53decaddd32bcf231b15d3565fb200f
Reviewed-on: https://review.whamcloud.com/45269
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Mon, 27 Sep 2021 18:29:58 +0000 (12:29 -0600)]
LU-15038 mgc: release cl_mgc_mutex on error
If local_oid_storage_init() returns an error, the cl_mgc_mutex()
should be released.
Lustre-change: https://review.whamcloud.com/45063
Lustre-commit:
7cf10b90d62256aa4d177486ff13bd61dfb9a5ff
Fixes:
3e38436dc09 ("LU-2059 llog: MGC to use OSD API for backup logs")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I921dde4e9202733874d8e7f980e95af23739a655
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45330
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Wed, 20 Oct 2021 17:42:05 +0000 (17:42 +0000)]
EX-4099 revert: "LU-14739 quota: nodemap squashed root cannot bypass quota"
This reverts commit
0b057b71796cd901813f7dbc08d9459efa266740
due to 20% write performance regression in IOR.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I2a05f426795fc671691e916c92b62fa107ee5620
Reviewed-on: https://review.whamcloud.com/45314
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Minh Diep [Thu, 21 Oct 2021 19:17:52 +0000 (12:17 -0700)]
EX-4056 revert LU-12019 build: Recognize Debian Kernel and set KMP dir
This reverts commit
230d4500d5a9dfada392199d77fc413382f24750.
This caused MOFED modules to use in-kernel modules, causing
Lustre to fail to load.
Test-Parameters: trivial
Change-Id: I205368dccc5fd18ed0ee096c1d85e140c3de5d6d
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45327
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Chris Horn [Mon, 27 Sep 2021 20:48:02 +0000 (15:48 -0500)]
LU-15093 libcfs: Check if param_set_uint_minmax is provided
Linux kernel v5.15 commit
2a14c9ae15a38148484a128b84bff7e9ffd90d68
moved param_set_uint_minmax to common code.
Lustre-change: https://review.whamcloud.com/45214/
Lustre-commit:
8bc83a6a9e9558e78c11351f6698d06d29e3dac1
HPE-bug-id: LUS-10469
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ifd1d72ae531f0f6c7cd96cc28fbc07c8a8b70886
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45324
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Minh Diep [Wed, 20 Oct 2021 18:21:52 +0000 (11:21 -0700)]
EX-4101 lipe: fix default POOL name
* set default values before using them
Test-Parameters: trivial
Change-Id: I69e3176b8f469f1bb0510e10e88e7f2843ee98b3
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45315
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
John L. Hammond [Wed, 20 Oct 2021 11:04:07 +0000 (06:04 -0500)]
EX-4101 lipe: fix stratagem-hp-config.sh typo
In stratagem-hp-config.sh, change is_valid_precent to
is_valid_percent.
Fixes:
dcc46813c0c3 ("EX-3890 lipe: raise pool spilling threshold")
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ic0ae2c24238fcf18ee9b6760c0f5c067aaccd84e
Reviewed-on: https://review.whamcloud.com/45307
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
James Nunez [Tue, 19 Oct 2021 19:40:07 +0000 (13:40 -0600)]
LU-14331 tests: add version check for sanity-pfl 0b
sanity-pfl test 0b was modified to not shrink the number
of stripes due to size constraints (LOV_MAX_STRIPE_COUNT).
The modified test 0b landed to Lustre 2.13.57.36 and we
should skip this test if the server version is less than
2.13.57.36.
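The comparison behind such a skip can be sketched as follows. The real test is a shell script using the test framework's own helpers; version_tuple() and should_skip_0b() here are hypothetical names used only to illustrate the logic.

```python
import re


def version_tuple(version):
    """Turn '2.12.6-ddn42' into (2, 12, 6, 0); missing fields count as 0."""
    numeric = version.split("-", 1)[0]  # drop a vendor suffix such as -ddn42
    parts = [int(p) for p in re.findall(r"\d+", numeric)][:4]
    return tuple(parts + [0] * (4 - len(parts)))


# Version in which the modified test 0b landed, per the commit message.
MIN_SERVER = version_tuple("2.13.57.36")


def should_skip_0b(server_version):
    """True if the server predates the modified sanity-pfl test 0b."""
    return version_tuple(server_version) < MIN_SERVER
```

With the server version from the Test-Parameters line below, should_skip_0b("2.12.6-ddn42") is true, so the test would be skipped.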
Fixes:
a58cdc9196f ("LU-14191 lod: comp stripe count limit check")
Test-Parameters: trivial serverversion=2.12.6-ddn42 serverdistro=el7.9 testlist=sanity-pfl env=ONLY=0b
Change-Id: I60de0e227ed566835476579dbca0d745f89245bf
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45301
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lei Feng [Wed, 20 Oct 2021 00:55:09 +0000 (20:55 -0400)]
EX-4098 lipe: lpcc uses class only in python3
FileNotFoundError is only available in python3.
Change it to the matching class in python2.
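One common way to handle a missing file on both interpreters is to catch the broader exception classes and check errno. This is a minimal sketch, not the lpcc code; read_optional() is a hypothetical helper.

```python
import errno


def read_optional(path):
    """Return the contents of path, or None if the file does not exist."""
    try:
        with open(path) as f:
            return f.read()
    except (IOError, OSError) as e:
        # Python 2 has no FileNotFoundError; a missing file is reported as
        # IOError/OSError with errno set to ENOENT. In Python 3 IOError is
        # an alias of OSError, so this clause works there too.
        if e.errno == errno.ENOENT:
            return None
        raise
```

Any other error (for example a permission problem) is re-raised rather than treated as a missing file.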
Change-Id: I63676ef8ff6a5461a7af6e9177d6bc76e39c0bc5
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=sanity-pcc env=ONLY=200-202,ONLY_REPEAT=50 \
clientextra_install_params="--packages lipe-lpcc"
Reviewed-on: https://review.whamcloud.com/45304
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Andreas Dilger [Fri, 15 Oct 2021 23:05:18 +0000 (17:05 -0600)]
RM-620 build: New tag 2.14.0-ddn21
New tag 2.14.0-ddn21
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If1d6bdec0b847b1ff6be9b8f3cdc61e2cfa66d61
John L. Hammond [Thu, 16 Sep 2021 21:15:54 +0000 (16:15 -0500)]
Update lipe version to 2.20.
Update lipe version to 2.20. Tag to be created at a later date.
Test-Parameters: trivial
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I7d4c4c07bb38aae809229087ee66b57c6c128dd6
Reviewed-on: https://review.whamcloud.com/45206
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Sun, 29 Aug 2021 07:45:09 +0000 (00:45 -0700)]
EX-3390 lipe: add lipe_purge script
Usage: lipe_purge [OPTION]... --client-mount=CLIENT_MOUNT --pool=POOL -- DEVICE EXPRESSION
Scan DEVICE for files matching EXPRESSION and purge the matching files from the POOL.
Mandatory arguments to long options are mandatory for short options too.
-c CLIENT_MOUNT, --client-mount=CLIENT_MOUNT Lustre client mount point
-d, --debug display debug information
--dry-run only display what would be purged
-p POOL, --pool=POOL OST pool name
-t THREAD_COUNT, --threads=THREAD_COUNT count of the scanning thread
--no-convert do not convert the EXPRESSION
-h, --help display this help message and exit
-v, --version display version information and exit
For expression details see lipe_find(1).
For example:
$ lipe_purge --dry-run -c /mnt/lustre -p hdd -- /dev/mapper/mds1_flakey -pool hdd
lfs mirror split --destroy --pool=hdd /mnt/lustre/.lustre/fid/0x200000414:0x2672:0x0
Test-Parameters: trivial
Change-Id: I29284dadef5677f48ed6977cf40d021e38dfcaa8
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44543
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44972
Jian Yu [Sun, 29 Aug 2021 07:36:25 +0000 (00:36 -0700)]
EX-3389 lipe: add lipe_delete script
Usage: lipe_delete [OPTION]... --client-mount=CLIENT_MOUNT -- DEVICE EXPRESSION
Scan DEVICE for files matching EXPRESSION and delete the matching files.
Mandatory arguments to long options are mandatory for short options too.
-c CLIENT_MOUNT, --client-mount=CLIENT_MOUNT Lustre client mount point
-d, --debug display debug information
--dry-run only display what would be deleted
-t THREAD_COUNT, --threads=THREAD_COUNT count of the scanning thread
--no-convert do not convert the EXPRESSION
-h, --help display this help message and exit
-v, --version display version information and exit
For expression details see lipe_find(1).
For example:
$ lipe_delete --dry-run -c /mnt/lustre -- /dev/mapper/mds1_flakey -fid '*'
lfs rmfid /mnt/lustre 0x200000408:0x14245:0x0 0x200000408:0x14246:0x0
Test-Parameters: trivial
Change-Id: I9ed5247992c807e81ac4445986a07ac0d2196de3
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44421
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44971
Jian Yu [Thu, 26 Aug 2021 07:13:32 +0000 (00:13 -0700)]
EX-3198 lipe: add lipe_find2 script
Add a simplified lipe_find2 script to wrap lipe_convert_expr
and lipe_scan2.
Usage: lipe_find2 [OPTION]... -- DEVICE [EXPRESSION]
--client-mount=MOUNT use the Lustre client at MOUNT for FID to path
--print-fid print FID of each file matching expression
--print-json[=ATTRS] print a JSON object with attributes specified by ATTRS
describing each file matching expression. ATTRS must be
a comma separated list of attribute names.
use 'lipe_find2 --list-attrs' to see available attributes names
--print-path[=WHICH] print path(s) of each file matching expression
WHICH must be 'one' (default) or 'all'
--absolute-paths prefix paths with MOUNT/ (for --print-path)
prefix FIDs with MOUNT/.lustre/fid/ (for --print-fid)
--delimiter=DELIM use DELIM instead of newline to delimit matches
--null use a NUL byte instead of newline to delimit matches
--threads=COUNT use COUNT scanning threads
-h,--help display this help text and exit
--debug display debug information
--list-attrs list available attribute names
--no-convert do not convert the EXPRESSION
--version output version information and exit
If --absolute-paths or --print-path is used then --client-mount is
required. It is also required if --print-json is used and ATTRS
includes paths.
For expression details see lipe_find(1).
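A consumer of --null output could split it as sketched below. This assumes the matches arrive as raw bytes separated or terminated by NUL; split_matches() is a hypothetical helper and not part of lipe.

```python
def split_matches(output, delimiter=b"\0"):
    """Split raw NUL-delimited output (bytes) into a list of match strings."""
    # A trailing delimiter yields an empty final field, which is dropped.
    return [m.decode("utf-8", "surrogateescape")
            for m in output.split(delimiter) if m]
```

NUL delimiting keeps paths that contain newlines intact, which newline-delimited output cannot do.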
Test-Parameters: trivial testlist=sanity-lipe
Test-Parameters: trivial env=LIPE_FIND=lipe_find2 testlist=sanity-lipe
Change-Id: Iacca61f9f5d28d17a45596b9d96f958dc50ca57f
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44963
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Mikhail Pershin [Wed, 25 Aug 2021 17:03:47 +0000 (20:03 +0300)]
EX-3687 osp: do force disconnect if import is not ready
Send OSP_DISCONNECT only on a healthy import. Otherwise,
force local disconnect for unhealthy imports.
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Icd9f171271f4e17a65503fcc710ad3aaa2b84e1e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45253
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Fri, 15 Oct 2021 18:12:56 +0000 (12:12 -0600)]
RM-620 build: New tag 2.14.0-ddn20
New tag 2.14.0-ddn20
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I211d330581f8998fc190add692d3d73cc5800d41
Sergey Cheremencev [Mon, 12 Apr 2021 23:44:34 +0000 (02:44 +0300)]
LU-14300 quota: avoid nested lqe lookup
lqe_locate, when called from qmt_pool_lqes_lookup for an lqe
that has no entry on disk, calls qmt_lqe_set_default.
This may call qmt_set_id_notify->qmt_pool_lqes_spec
and overwrite the lqes already added to the qti. The overwritten
lqes may trigger an assertion:
LustreError: 5072:0:(qmt_pool.c:838:qmt_pool_lqes_lookup())
ASSERTION( (((qmt_info(env)->qti_lqes_num) > 16 ?
qmt_info(env)->qti_lqes : qmt_info(env)->qti_lqes_small)[(
qmt_info(env)->qti_glbl_lqe_idx)])->lqe_is_global ) failed:
LustreError: 5072:0:(qmt_pool.c:838:qmt_pool_lqes_lookup()) LBUG
Pid: 5072, comm: mdt_rdpg00_003 3.10.0-957.1.3957.1.3.x4.1.15.x86_64
Call Trace:
[<ffffffffc046f62c>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[<ffffffffc046f94c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[<ffffffffc0e4ae38>] qmt_pool_lqes_lookup+0x798/0x8f0 [lquota]
[<ffffffffc0e3b0ce>] qmt_intent_policy+0x86e/0xe00 [lquota]
[<ffffffffc109d53d>] mdt_intent_opc+0x3bd/0xb40 [mdt]
[<ffffffffc10a5134>] mdt_intent_policy+0x1a4/0x360 [mdt]
[<ffffffffc0a7bedb>] ldlm_lock_enqueue+0x3cb/0xad0 [ptlrpc]
[<ffffffffc0aa4a46>] ldlm_handle_enqueue0+0xa56/0x1610 [ptlrpc]
[<ffffffffc0b304b2>] tgt_enqueue+0x62/0x210 [ptlrpc]
[<ffffffffc0b3753a>] tgt_request_handle+0x7ea/0x1750 [ptlrpc]
or a deadlock (two identical lqes in the qti_lqes array):
call_rwsem_down_write_failed+0x17/0x30
qti_lqes_write_lock+0xb1/0x1b0 [lquota]
qmt_dqacq0+0x2ee/0x1ac0 [lquota]
qmt_intent_policy+0xbfe/0xe00 [lquota]
mdt_intent_opc+0x3ba/0xb50 [mdt]
mdt_intent_policy+0x1a1/0x360 [mdt]
ldlm_lock_enqueue+0x3d6/0xaa0 [ptlrpc]
ldlm_handle_enqueue0+0xa76/0x1620 [ptlrpc]
tgt_enqueue+0x62/0x210 [ptlrpc]
tgt_request_handle+0x96a/0x1680 [ptlrpc]
kthread+0xd1/0xe0
The patch adds sanity-quota_73b to check that the issue
no longer occurs.
Lustre-change: https://review.whamcloud.com/43326
Lustre-commit:
188112fc806c8c61d536ba3230b8d50f65e4f8fc
Change-Id: Ib1ebe82c3b6e819b2538f30af08930060bd659ae
HPE-bug-id: LUS-9902
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45183
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sergey Cheremencev [Thu, 10 Sep 2020 12:48:01 +0000 (15:48 +0300)]
LU-13952 quota: default OST Pool Quotas
Patch adds the ability to set default quota
limits per OST pool.
Patch also adds sanity-quota_73.
Lustre-change: https://review.whamcloud.com/39873
Lustre-commit:
25a70a88c9eb35b7e43347c0d8220e334591d25e
Test-Parameters: testlist=sanity-quota
HPE-bug-id: LUS-9133
Change-Id: I9e49def231aeeed4588e5e3fbcd29fdd62a35855
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-on: https://review.whamcloud.com/45182
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Qian Yingjin [Wed, 22 Sep 2021 03:39:19 +0000 (11:39 +0800)]
EX-3880 pcc: add pcc_async_affinity for async PCC attach
This patch adds a tunable parameter "llite.*.pcc_async_affinity"
that enables or disables the CPT selection in PCC-RO asynchronous
attach for testing.
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I1473a7547555a2d6c615d37182b6cc359194aae0
Reviewed-on: https://review.whamcloud.com/45011
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Mr NeilBrown [Tue, 28 Sep 2021 06:44:40 +0000 (16:44 +1000)]
LU-6142 lod: return pools_hash_params to being static.
A recent patch changes pools_hash_params in lod_pool.c to no longer
be 'static'. This is not ideal.
rhashtable interfaces are mostly 'static inlines' which contain a lot
of code which is mostly optimised away providing that the 'params'
structure is const and locally visible. When these interfaces are
called with a params structure in another file, the code produced is
quite inefficient and wasteful.
It is generally cleaner to provide accessor functions which can be
exported to other compilation units. It is even beneficial to do that
within the one file.
This patch introduces
lod_pool_exists()
and
lod_pool_find()
The first is 'extern' and thus 'pools_hash_params' can now be static.
The second is used in several places in lod_pool.c, improving code
quality and maintainability.
Lustre-change: https://review.whamcloud.com/45070
Lustre-commit:
9ec5e2329bf3d7e38fa899a259221aa58fb48cd4
Fixes:
0a998f4723f5 ("LU-14825 lod: pool spilling")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ieafe2f23fe5cc71d9bdce73cbe7360f5cb540edf
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45185
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Tue, 20 Jul 2021 13:30:24 +0000 (16:30 +0300)]
LU-15011 lod: count all spilling events
Count each event where the target pool is used because the original
pool does not have enough space.
lctl get_param lod.*.pool.<poolname>.spill_hit can be used to get the counter.
Lustre-change: https://review.whamcloud.com/44947
Lustre-commit: TBD (from
6348594defc0b1a414b45abe309a8a18b1da303e)
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I6d54a2b910705da182b5f4118e535d196cdab004
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44948
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
James Nunez [Tue, 12 Oct 2021 23:35:17 +0000 (17:35 -0600)]
EX-4041 tests: look for lamigo log on MDS
hot-pools tests 7, 8 and 10 grep the lamigo log from a
client node looking for job completion or other lamigo
information. The lamigo log is written to a location on the
MDS. Change the tests to cat the log file on the MDS
and pipe that to grep.
Test-Parameters: trivial testlist=hot-pools
Change-Id: I78813d6188216b2529e1f6e41252bc6fe1fc9514
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45218
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Thu, 14 Oct 2021 20:01:30 +0000 (14:01 -0600)]
LU-15106 ofd: quiet deprecated param warning
There are a number of obdfilter parameter files that report a
warning even when they are read, which is confusing for users
if there is a tool that is scraping all available parameters:
# lctl get_param obdfilter.*.*
ofd: 'obdfilter.*.read_cache_enabled' is deprecated,
use 'osd-*.read_cache_enabled' instead
ofd: 'obdfilter.*.readcache_max_filesize' is deprecated,
use 'osd-*.readcache_max_filesize' instead
ofd: 'obdfilter.*.sync_on_lock_cancel' is deprecated,
use 'obdfilter.*.sync_lock_cancel' instead
ofd: 'obdfilter.*.writethrough_cache_enabled' is deprecated,
use 'osd-*.writethrough_cache_enabled' instead
It should only print a message if the parameters are actually written.
Also fix the messages to reference the correct parameter names.
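The intended behaviour, silent on read and a warning on write, can be modelled as below. This is only a Python illustration under stated assumptions; the actual change is in the ofd kernel code, and DeprecatedAlias is a hypothetical class.

```python
import warnings


class DeprecatedAlias:
    """Old parameter name forwarding to a new one; warns only when written."""

    def __init__(self, store, old_name, new_name):
        self.store = store        # dict holding the real parameter values
        self.old_name = old_name
        self.new_name = new_name

    def read(self):
        # Reads are silent so tools that scrape every parameter stay quiet.
        return self.store[self.new_name]

    def write(self, value):
        # Only an actual write through the old name triggers the warning.
        warnings.warn(
            "'%s' is deprecated, use '%s' instead"
            % (self.old_name, self.new_name),
            DeprecationWarning,
        )
        self.store[self.new_name] = value
```

Reading through the alias returns the value stored under the new name without emitting anything, while writing updates the new parameter and warns once.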
Most of these parameter links were added in 2.4 with the addition of
osd-ldiskfs. However, the deprecation warnings were only added in
2.12.53 and slated for removal in 2.15, but were not backported to
2.12 LTS, and there hasn't been an LTS release since then, so it is
better to bump the removal so that the upcoming 2.15 LTS release includes them.
Fix the test scripts to only use the new parameter names, to avoid
spurious warning messages. We don't test interop with 2.3 anymore.
Lustre-change: https://review.whamcloud.com/45246
Lustre-commit: TBD (from
cf05bfb35f00c72f27ec36619259cc77eb56186d)
Test-Parameters: trivial
Fixes:
7df7347b7b18 ("LU-12967 ofd: restore sync_on_lock_cancel tunable")
Fixes:
493cd8088388 ("LU-8066 osd: migrate from proc to sysfs")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie548e5b6af5463959fb4774e31996097373ebbe5
Reviewed-on: https://review.whamcloud.com/45247
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Jian Yu [Wed, 13 Oct 2021 16:55:54 +0000 (09:55 -0700)]
LU-15099 kernel: kernel update RHEL7.9 [3.10.0-1160.45.1.el7]
Update RHEL7.9 kernel to 3.10.0-1160.45.1.el7.
Test-Parameters: trivial \
env=SANITY_EXCEPT="415 418" \
clientdistro=el7.9 serverdistro=el7.9 \
testlist=sanity
Change-Id: I11c307bfd6a6b353bc7b6fe40bb5d604bc9b3fdc
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45228
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Thu, 16 Sep 2021 18:55:04 +0000 (12:55 -0600)]
NVDA-86 pcc: print kern_path() error on failure
When kern_path() returns an error while looking up a PCC cache pathname,
print the pathname and the returned error before it is ignored.
Test-Parameters: trivial testlist=sanity-pcc
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9ef59ffedc2c6d11cc7ab1abb4098c56f23ebbe5
Reviewed-on: https://review.whamcloud.com/44962
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Andreas Dilger [Wed, 13 Oct 2021 20:05:38 +0000 (14:05 -0600)]
LU-13997 tests: fix sanity test_418 lock cancellation
Use "do_nodes" directly to cancel DLM locks, rather than
"do_rpc_nodes", since that is very heavy to use in a loop
(each call takes 3s, but the loop delay is only 0.2s).
Due to DoM reserving grant space for the DoM files, the "avail"
space shown by "df" may be smaller in the aggregate returned by
the MDT compared to the individual values from "lfs df".
Skip this part of the check until MDC grant cancel is fixed.
Lustre-change: https://review.whamcloud.com/45231
Lustre-commit: TBD (from
e4bb910bba91199af21199080c5f3eb1070c59c9)
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I83d989688ce671f0ff9c62ebdf3144746a3ebbe5
Reviewed-on: https://review.whamcloud.com/45232
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>