Whamcloud - gitweb
Andreas Dilger [Fri, 28 May 2021 21:15:10 +0000 (15:15 -0600)]
LU-14489 utils: fix 'lfs find --mdt-count'
Running "lfs find --mdt-count" causes the find to exit if there
is no directory striping, rather than continuing to the next item.
If cb_get_dirstripe() receives ENODATA then it should consider
that directory as not having any striping and move on, rather
than returning this error to the caller.
Don't crash in cb_get_dirstripe() if it is called with a NULL
directory pointer or no directory is opened.
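The behaviour described above can be sketched roughly as follows. This is a simplified userspace illustration, not the code from the patch: the names dirstripe_cb_sketch, get_stripe_fn, fake_unstriped and fake_io_error are hypothetical, and returning 0 for a missing directory handle is an assumption, since the message only says the callback must not crash in that case.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the call that fetches directory striping;
 * returns 0 on success or a negative errno. */
typedef int (*get_stripe_fn)(const char *path);

/* Returns 0 when the walk should continue, negative errno on real errors. */
static int dirstripe_cb_sketch(const char *path, const int *dirfd,
			       get_stripe_fn get_stripe)
{
	int rc;

	/* No directory pointer or nothing opened: do not dereference.
	 * Returning 0 here is an assumption made for this sketch. */
	if (dirfd == NULL || *dirfd < 0)
		return 0;

	rc = get_stripe(path);
	/* -ENODATA only means "this directory has no striping". */
	if (rc == -ENODATA)
		return 0;

	return rc;
}

/* Test doubles that simulate an unstriped directory and a real failure. */
static int fake_unstriped(const char *path) { (void)path; return -ENODATA; }
static int fake_io_error(const char *path) { (void)path; return -EIO; }
```

With this shape, an unstriped directory no longer ends the traversal, while genuine errors such as -EIO are still passed back to the caller.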
Lustre-change: https://review.whamcloud.com/43866
Lustre-commit: baba1fd07a977a62295482919e9218f877c0535a
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If8dd135a86a6a8911bf804542132b2e7a3ce7057
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-on: https://review.whamcloud.com/44945
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Yang Sheng [Fri, 10 Jul 2020 15:31:17 +0000 (23:31 +0800)]
LU-11776 utils: add support lfs find with mdt hash flag
lfs find can now use the MDT hash flag as a condition. Also
allow it to match more than one MDT hash type.
Lustre-change: https://review.whamcloud.com/39340
Lustre-commit: 00141b1a746d4733c2f52c7a7edec36da4cedcac
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I599bb1a3cc2c9ea2a523f50f119bd93a5520d213
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44944
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Bobi Jam [Fri, 4 Jun 2021 03:58:29 +0000 (11:58 +0800)]
LU-14780 llite: failed ASSERTION(ldlm_has_layout(lock))
When setting the layout under the layout lock, the lock could lose its
layout bits, and we then try to fetch the layout lock again.
Lustre-change: https://review.whamcloud.com/44054
Lustre-commit: 1b166d6dd6a2f39dfe35b60be169b288665d0283
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I10f96e4cb03cfe228d3c1ea1500b1a8d8e4e5e54
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44934
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
James Nunez [Mon, 13 Sep 2021 21:16:19 +0000 (15:16 -0600)]
EX-3342 tests: correct Lustre version in test skip checks
Many patches land to the EXAScaler branches as ports from
other branches. Sometimes the tests that are included with
the ported patches check the version of Lustre to ensure
that the feature it tests exists in this version of Lustre.
These version values are not always changed when patches
are ported from one branch to another.
Change Lustre test suite version checks to be relative to
this branch.
Fixes: 5cfcd52d (“LU-13417 mdd: set default LMV on ROOT”)
Fixes: c75e68e5 (“LU-14804 nodemap: do not return error for improper ACL”)
Fixes: f2d1c4ee (“LU-14647 flr: mmap write/punch does not stale other mirrors”)
Fixes: 86847243 (“LU-13730 lod: don't confuse stale with primary flag”)
Test-Parameters: trivial env=ONLY="0 432" testlist=sanity
Test-Parameters: env=ONLY="50 207" testlist=sanity-flr
Change-Id: I9f1f0c1b89d5df7082e4fcaee385b724a453f331
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44906
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Fri, 27 Aug 2021 05:42:56 +0000 (08:42 +0300)]
LU-14967 obdclass: EAGAIN after rhashtable_walk_next()
rhashtable_walk_next() can return -EAGAIN when a concurrent resize
has happened, so callers should check for this error and simply
call rhashtable_walk_next() again.
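The retry pattern can be illustrated with a small userspace model. This is only a sketch: walker_sketch, walk_next_sketch and walk_all_sketch are hypothetical names, the walker returns plain integer codes rather than the kernel's pointer-encoded errors, and the simulated resize does not restart or repeat entries as a real table walk might.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical iterator that mimics a walker which may ask to be retried. */
struct walker_sketch {
	const int *items;
	size_t count;
	size_t pos;
	int pending_eagain;	/* number of simulated resize events left */
};

/* Returns 1 and stores an item, 0 at the end of the table, or -EAGAIN. */
static int walk_next_sketch(struct walker_sketch *w, int *out)
{
	if (w->pending_eagain > 0) {
		w->pending_eagain--;
		return -EAGAIN;
	}
	if (w->pos >= w->count)
		return 0;
	*out = w->items[w->pos++];
	return 1;
}

/* Sum all items, repeating the call whenever -EAGAIN is returned. */
static long walk_all_sketch(struct walker_sketch *w)
{
	long sum = 0;
	int item, rc;

	while ((rc = walk_next_sketch(w, &item)) != 0) {
		if (rc == -EAGAIN)
			continue;	/* table was resized: just call again */
		sum += item;
	}
	return sum;
}
```

The point is that -EAGAIN is neither the end of the walk nor a hard failure, so the loop neither stops nor treats the result as an entry.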
Lustre-change: https://review.whamcloud.com/44766
Lustre-commit: 96aa615f91cd25b04c393f16f122e33f6744fdc9
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I15ba2cdf16c2678e18836b4f16b56a3b8bfdacd0
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44937
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Tue, 7 Sep 2021 20:05:40 +0000 (13:05 -0700)]
LU-14934 kernel: kernel update SLES12 SP5 [4.12.14-122.83.1]
Update SLES12 SP5 kernel to 4.12.14-122.83.1 for Lustre client.
Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 160h 430c 817" testlist=sanity
Change-Id: I2b35d129550b895324bb3e2e61910ad10e846f03
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44865
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Thu, 16 Sep 2021 02:45:37 +0000 (10:45 +0800)]
EX-3814 pcc: print help msg more clearly for detach
This patch prints the help message for the detach_fid and detach
commands more clearly when required parameters, such as the mount
point or FIDs, are not given.
It also ignores the -EINPROGRESS error if the file is being
attached, i.e. while data is being copied from Lustre OSTs into PCC.
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I547e80e5c9c213b159039b9b79da176cdb91c4bc
Reviewed-on: https://review.whamcloud.com/44954
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Fri, 30 Jul 2021 21:36:19 +0000 (15:36 -0600)]
LU-14895 client: allow case-insensitive checksum types
The current t10ip4K and t10crc4K checksum types use an upper-case 'K'
in the name, unlike the other checksum types which are all lower-case.
This distinction is difficult to see in some fonts, and can cause
usage errors. Accept upper-case variants of the checksum type names.
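A case-insensitive lookup of this kind might look like the sketch below. It is an illustration under assumptions, not the patch itself: cksum_names_sketch and cksum_lookup_sketch are hypothetical, and the table lists only a few names for demonstration rather than the full set of supported checksum types.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>

/* Illustrative subset of checksum type names; not the complete list. */
static const char *const cksum_names_sketch[] = {
	"crc32c", "adler", "t10ip4K", "t10crc4K",
};

/* Return the index of the matching name, ignoring case, or -1 if unknown. */
static int cksum_lookup_sketch(const char *name)
{
	size_t n = sizeof(cksum_names_sketch) / sizeof(cksum_names_sketch[0]);
	size_t i;

	for (i = 0; i < n; i++)
		if (strcasecmp(name, cksum_names_sketch[i]) == 0)
			return (int)i;
	return -1;
}
```

Comparing with strcasecmp() keeps the canonical spelling in the table while accepting "t10ip4k" or "T10CRC4K" from the user.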
Lustre-change: https://review.whamcloud.com/44530
Lustre-commit: TBD (from 48a8218fdd0d0ed876fb39d29542fd1751c2e341)
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I97673ffa98cf8e5fc601ac7df5aaafb24b3ebbe5
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44940
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Serguei Smirnov [Mon, 5 Jul 2021 18:23:33 +0000 (11:23 -0700)]
LU-14806 o2iblnd: clear fatal error on successful failover
In an IB bonding configuration, a link-down event causes the fatal
error flag to be set on the bonded interface, so it is not selected by
LNet for tx, e.g. when just one of the two cables is pulled.
This change allows for the interface status to be restored on
successful failover.
Lustre-change: https://review.whamcloud.com/44139
Lustre-commit: 4668283cd13079dd6d86482704aef593f5c01dff
Test-Parameters: testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ifd55b141e73d01a187c02ede3f021f0eab18e0bb
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44933
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mike Marciniszyn [Wed, 7 Jul 2021 19:16:01 +0000 (15:16 -0400)]
LU-14733 o2iblnd: Avoid double posting invalidate
When the kib_tx is provisioned during kiblnd_fmr_pool_map(), spare
WRs in the kib_fast_reg_descriptor are set up and the mapping of
pages is given to the mr.
kiblnd_post_tx_locked() then posts the spare WRs from the
kib_fast_reg_descriptor.
	if (rc == 0)
		return 0;
The code returns and the kib_fast_reg_descriptor still contains
the spare WRs. The next time the kib_tx is used, the
now obsolete WRs will be inadvertently posted. For rdmavt, the
obsolete invalidate will cause an -EINVAL to be returned from
the post send.
Fix by adding a state variable frd_posted to kib_fast_reg_descriptor.
The variable is set to false in kiblnd_fmr_pool_unmap().
kiblnd_post_tx_locked() is adjusted to avoid prepending the
kib_fast_reg_descriptor WRs when frd_posted is true. After
the post succeeds, the frd_posted is set to true.
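The role of the state variable can be modelled in a few lines. This is a userspace sketch only: frd_sketch, post_tx_sketch and pool_unmap_sketch are hypothetical stand-ins for the structures and functions named above, the post is assumed to always succeed, and WRs are reduced to simple counts.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for kib_fast_reg_descriptor; only the flag matters. */
struct frd_sketch {
	bool frd_posted;	/* spare WRs were already handed to the device */
	int  spare_wrs;		/* number of registration/invalidate WRs */
};

/* Returns how many WRs were posted, given tx_wrs data WRs. */
static int post_tx_sketch(struct frd_sketch *frd, int tx_wrs)
{
	int posted = tx_wrs;

	/* Prepend the spare WRs only on the first post after a map. */
	if (!frd->frd_posted)
		posted += frd->spare_wrs;

	/* The post is assumed to succeed here; mark the WRs as consumed. */
	frd->frd_posted = true;
	return posted;
}

/* Unmapping resets the flag so the next mapping posts fresh spare WRs. */
static void pool_unmap_sketch(struct frd_sketch *frd)
{
	frd->frd_posted = false;
}
```

Reusing the same descriptor for a second post then sends only the data WRs, and the spare WRs are posted again only after an unmap.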
Lustre-change: https://review.whamcloud.com/44190
Lustre-commit: 5930576791e864529e6ef9b46f3e09cc4b635fc2
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Change-Id: I426dd05e635392e75d1aa48808782a229e83ce5f
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44932
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mike Marciniszyn [Wed, 7 Jul 2021 19:16:00 +0000 (15:16 -0400)]
LU-14733 o2iblnd: Move racy NULL assignment
kiblnd_fmr_pool_unmap() can race map and subsequent processing
because of this flaw in unmap:
	if (frd) {
		frd->frd_valid = false;
		spin_lock(&fps->fps_lock);
		list_add_tail(&frd->frd_list, &fpo->fast_reg.fpo_pool_list);
		spin_unlock(&fps->fps_lock);
		fmr->fmr_frd = NULL;
	}
The fmr can be pulled off the list in kiblnd_fmr_pool_unmap() on
another CPU and fmr_frd could be in a state of flux and
potentially be seen incorrectly later on as the kib_tx is processed.
Fix by moving the fmr_frd assignment to before the fmr is added to the
list.
Lustre-change: https://review.whamcloud.com/44189
Lustre-commit: 023113fb8946f3565529e7327fdcd90ab9db3ba3
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Change-Id: Ibddf132a363ecfe9db3cc06287cec873c021d2fb
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44931
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Tue, 15 Jun 2021 14:47:39 +0000 (17:47 +0300)]
LU-12577 llog: protect partial updates from readers
llog_osd_write_rec() adds a record in a few steps: the header is
updated first, then the record itself is appended. A per-loghandle
semaphore is used, but remote readers allocate a new separate
loghandle for every access (header reading, blocks), so the
readers can't use the loghandle's semaphore to avoid accessing partial
updates. Use object-based locking [censored] to serialize the writer
vs the readers.
Lustre-change: https://review.whamcloud.com/43589
Lustre-commit: 03dd1bb036d426a692584d73f66bcdb221658d79
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ie4e4d4a1e9a6fcdea9fcca7d80b0da920e786424
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44935
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Oleg Drokin [Wed, 21 Jul 2021 20:03:10 +0000 (16:03 -0400)]
LU-14877 llite: Remove inode locking in ll_fsync
It does not appear to be necessary
Lustre-change: https://review.whamcloud.com/44368
Lustre-commit: e8d76d1090e912ee5d916284ca5c8ba9195ddd9b
Change-Id: I0142a9dca4ecc6893521275b69a0a46012eab0b0
Fixes: 8f3ef1e961 ("LU-812 llite: 3.0+ kernel fsync should call write")
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44921
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
James Simmons [Thu, 10 Jun 2021 16:53:57 +0000 (12:53 -0400)]
LU-14752 obdclass: handle EBUSY returned for lu_object hashtable
When the rhashtable grows to a certain size it will be rescaled.
During rescaling, an insertion can return an ENOMEM or EBUSY error.
This was reported as:
LustreError: 3594004:0:(lu_object.c:2472:lu_object_assign_fid()) ASSERTION( rc == 0 ) failed: failed hashtable insertion: rc = -16
LustreError: 3594004:0:(lu_object.c:2472:lu_object_assign_fid()) LBUG
Pid: 3594004, comm: mdt01_020 4.18.0-240.22.1.1toss.t4.x86_64 #1 SMP Tue Apr 13 17:18:40 PDT 2021
Call Trace TBD:
Kernel panic - not syncing: LBUG
...
Call Trace:
dump_stack+0x5c/0x80
panic+0xe7/0x2a9
lbug_with_loc.cold.10+0x18/0x18 [libcfs]
lu_object_assign_fid+0x3b8/0x3c0 [obdclass]
Add handling of the EBUSY case for our lu_object hash.
Lustre-change: https://review.whamcloud.com/43968
Lustre-commit: 285a29d3b5e47f63a94c0682040ddbf09614f130
Fixes: aff14dbc522 ("LU-8130 lu_object: convert lu_object cache to rhashtable")
Change-Id: Id85f32633117e02850b799e8d95e3e35d982cbd4
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44926
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Wed, 1 Sep 2021 08:54:04 +0000 (11:54 +0300)]
LU-13997 tests: sanity/418 to cancel all client locks
verify idea about dirty client's data
Lustre-change: https://review.whamcloud.com/44803
Lustre-commit: TBD (from 42a090928b36bc14d8cb9a73a7f5b719eff38a2e)
Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ifef58a98b26c7790274d2a57aa52e4475e923dd0
Reviewed-on: https://review.whamcloud.com/44936
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Tue, 14 Sep 2021 08:40:03 +0000 (16:40 +0800)]
EX-2859 pcc: keep mtime unchange when attach file into PCC
Modifying the timestamps of the files for the attach will cause
problems for the cache manager, since all files will appear new
at the time they are imported into the cache.
And that may also confuse applications if the file mtime has
changed just because of attach.
This patch keeps the file mtime unchanged when attaching it
into PCC.
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I26f0c19c7b6cc1af0d62c192931d0042c9614993
Reviewed-on: https://review.whamcloud.com/44909
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Wed, 15 Sep 2021 18:40:10 +0000 (21:40 +0300)]
EX-3804 lod: use kstrtoint_from_user()
which checks that the address is present in memory
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I573ac593a8886caf6ee49c285674aed870eb6b2f
Reviewed-on: https://review.whamcloud.com/44928
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Wed, 15 Sep 2021 18:48:29 +0000 (11:48 -0700)]
EX-2240 lipe: remove fake lustre headers
The fake lustre headers should be removed and we should include
normal lustre headers instead. The two headers are fake_lustre_idl.h
and fake_lustre_disk.h. Including the fake headers makes it harder
to include other lustre headers due to conflicts.
Test-Parameters: testlist=sanity-lipe
Change-Id: I95d6b50c9f9fd9f675eceba00464053124b7279c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44897
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Mikhail Pershin [Tue, 27 Jul 2021 10:37:01 +0000 (13:37 +0300)]
LU-13055 changelog: use default mask if server has no mask
When registering a new maskless user and the server has no specific
mask set, the effective mask is set to the DEFAULT value.
Lustre-commit: 1c91131941b3c02c60c2dc852b23490dce3f2485
Lustre-change: https://review.whamcloud.com/44404
Fixes: a15eb4f132 ("LU-13055 mdd: per-user changelog names and mask")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: If799cb5cc29c60cce6ef6c987f2e493145e00e31
Reviewed-on: https://review.whamcloud.com/44411
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Qian Yingjin [Thu, 9 Sep 2021 06:47:47 +0000 (14:47 +0800)]
EX-3764 pcc: avoid panic in asynchronous attach thread
When PCC attaches a file asynchronously, it wrongly uses
@pccx->pccx_file after putting the file in @pcc_attach_context_free().
This may cause a panic in pcc_readonly_attach_thread().
This patch fixes it by using @pccx->pccx_file before putting it.
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Iaa93403e4db7497923033d327e689627790fa6a0
Reviewed-on: https://review.whamcloud.com/44881
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Wed, 16 Jun 2021 20:48:33 +0000 (14:48 -0600)]
LU-14767 utils: mkfs.lustre allow lazy_itable_init=1
When "lazy_itable_init=0" was added to the mke2fs options the call
to append_unique() to see whether "lazy_itable_init" was already
listed in the mke2fs options was incorrect. It checks to see if
"lazy_itable_init=0" is already present in the options, and doesn't
match "lazy_itable_init=1" if it was specified on the command-line.
Separate the key and value passed to append_unique() so that it can
check if any form of the key is present in the existing options.
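A key-aware check along these lines could be sketched as follows. The function append_unique_sketch and its signature are hypothetical and do not reproduce the real append_unique(); the sketch assumes a comma-separated option string and a buffer large enough for the result, and it does not report truncation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append "key" or "key=val" to a comma-separated option string unless an
 * option with the same key is already present. Returns 1 if appended. */
static int append_unique_sketch(char *buf, size_t buflen,
				const char *key, const char *val)
{
	size_t keylen = strlen(key);
	size_t used = strlen(buf);
	const char *p = buf;

	/* Look for key at an option boundary, followed by '=', ',' or end. */
	while ((p = strstr(p, key)) != NULL) {
		char next = p[keylen];

		if ((p == buf || p[-1] == ',') &&
		    (next == '=' || next == ',' || next == '\0'))
			return 0;	/* some form of the key exists */
		p += keylen;
	}

	/* Not present: append, adding a separator if buf is not empty. */
	snprintf(buf + used, buflen - used, "%s%s%s%s",
		 used > 0 ? "," : "", key, val ? "=" : "", val ? val : "");
	return 1;
}
```

Because only the key is matched, a user-supplied "lazy_itable_init=1" prevents the default "lazy_itable_init=0" from being appended after it.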
Lustre-change: https://review.whamcloud.com/44019
Lustre-commit: a81c093a935c62b9e4586ae930aab7439948d538
Test-Parameters: trivial testlist=conf-sanity
Fixes: 701cc249594e ("LU-13533 utils: ext4lazyinit should be disabled")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic7a6dbb81f004dd35f0f1c5f5ddec0fb363ebbe5
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Reviewed-on: https://review.whamcloud.com/44920
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Minh Diep [Thu, 5 Aug 2021 03:57:08 +0000 (20:57 -0700)]
EX-3587 build: Use explicit python2
Use explicit python2 since lipe is not
ready to move to python3 yet
Change-Id: I289775f522d0f0e284fc8cfba3ca1737f3e27c79
Test-Parameters: trivial
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44503
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Jian Yu [Sat, 11 Sep 2021 07:31:51 +0000 (00:31 -0700)]
LU-8837 utils: move lustre_disk_data back to lustre_disk.h
This patch moves struct lustre_disk_data from mount_utils.h
back to lustre_disk.h so that it can be used in other code
without including mount_utils.h.
Lustre-change: https://review.whamcloud.com/44829
Lustre-commit: TBD
Fixes: d62efba975d2 ("LU-8837 utils: make tools lightweight for lustre clients")
Change-Id: I589da2710e3cbe7d93a59928143f2b5cac955e6e
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44892
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Lei Feng [Wed, 18 Aug 2021 09:19:46 +0000 (17:19 +0800)]
EX-3050 lipe: lpcc_purge supports version and revision
Print unified lipe version and revision in lpcc_purge --version command
and stats dumpfile.
Change-Id: I78e500d4f765b638662a90f21f4e5d7ebd2209e2
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/44699
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lei Feng [Wed, 18 Aug 2021 01:47:54 +0000 (09:47 +0800)]
EX-3541 lipe: aggregate LPCC information in lpcc status command
Aggregate LPCC backend information in the lpcc status command, including
the configuration of LPCC and lpcc_purge, and the stats of lpcc_purge.
Change-Id: I6fd394038b0b9b6279a592bd324b76f90585808e
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/44696
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lei Feng [Fri, 6 Aug 2021 02:40:21 +0000 (10:40 +0800)]
EX-3554 lipe: improve log messages for lpcc_purge
The lpcc service handles the log_level option for lpcc_purge.
Add more log messages for lpcc_purge.
Change-Id: I1fd41d32cc6add00acea60c38985774cd5b7071a
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/44519
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Sat, 11 Sep 2021 06:53:22 +0000 (00:53 -0600)]
RM-620 build: New tag 2.14.0-ddn14
New tag 2.14.0-ddn14
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I96f78051cb1425adc5b6188a7c458e91678134f0
Minh Diep [Sat, 11 Sep 2021 03:53:01 +0000 (20:53 -0700)]
EX-3780 lipe: clean up specfile
Clean up specfile error due to previous merge
Change-Id: I1dc33310027286860c446b9eb0cc01c16b2c5407
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44891
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Andreas Dilger [Sat, 11 Sep 2021 02:44:06 +0000 (20:44 -0600)]
RM-620 build: New tag 2.14.0-ddn13
New tag 2.14.0-ddn13
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iaa7a9b6e6dab2fbd4704a4bb8b737921270a5fe7
Jian Yu [Thu, 9 Sep 2021 00:05:22 +0000 (17:05 -0700)]
LU-14993 kernel: kernel update RHEL8.4 [4.18.0-305.17.1.el8_4]
Update RHEL8.4 kernel to 4.18.0-305.17.1.el8_4 for Lustre client.
Test-Parameters: trivial clientdistro=el8.4
Change-Id: I95e97e1b39e8c49a80f12c9c1b2076553c3dcd49
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44874
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Thu, 9 Sep 2021 00:56:32 +0000 (17:56 -0700)]
LU-14994 kernel: kernel update RHEL7.9 [3.10.0-1160.42.2.el7]
Update RHEL7.9 kernel to 3.10.0-1160.42.2.el7.
Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9
Change-Id: I377ea5d1e28c50b1087dfca7cb32f44afb9bf5f5
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44879
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
John L. Hammond [Sat, 11 Sep 2021 01:28:33 +0000 (20:28 -0500)]
EX-2921 lipe: merge lipe changes from b_es5_2
Merge commit 'd4e8316dafcd4c3cdb5092dc6f2a857dc28065fa' into b_es6_0
$ git checkout b_es5_2
$ git subtree split --prefix=lipe
6085b19ae7daa054857bf14d05740ff1224aef01
$ git checkout b_es6_0
$ git subtree merge --prefix=lipe --squash 6085b19ae7daa054857bf14d05740ff1224aef01
Change-Id: I4b8bfa69d312bfe93ad37da1737df9025c9ed0b5
John L. Hammond [Sat, 11 Sep 2021 01:23:34 +0000 (20:23 -0500)]
Squashed 'lipe/' changes from b7b776f968..6085b19ae7
6085b19ae7 EX-3738 hotpools: Strict ordering of client and lamigo
a33bdd5d61 EX-3441 lipe: add pool spilling to stratagem-hp-{config,convert}.sh
4c16ba7e00 EX-3092 lipe: remove the lipe-hsm RPM
ef76926047 EX-3725 lipe: fix json.h include
git-subtree-dir: lipe
git-subtree-split: 6085b19ae7daa054857bf14d05740ff1224aef01
Nathaniel Clark [Wed, 1 Sep 2021 14:39:54 +0000 (10:39 -0400)]
EX-3738 hotpools: Strict ordering of client and lamigo
Ensure strict ordering for start and stop of client mount
and lamigo/lpurge.
Ensure enough of a timeout for start and stop.
Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I7c50abc5ff82f0cc3fd117fd961f7421ad2df9be
Reviewed-on: https://review.whamcloud.com/44807
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Gaurang Tapase <gtapase@ddn.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Gaurang Tapase [Tue, 10 Aug 2021 13:02:02 +0000 (18:32 +0530)]
EX-3430: Improvement in HP config for client mount
Use custom lustre-client resource agent for client
mount/unmount which in turn uses the pumount utility
to unmount. This makes sure the client is unmounted
gracefully while stopping the resource.
Remove stratagem-hp-convert.sh as it is no longer used.
Test-Parameters: trivial
Change-Id: Idc5fc2a59e4783a4011fc65a0ae3d69281b1a20f
Signed-off-by: Gaurang Tapase <gtapase@ddn.com>
Reviewed-on: https://review.whamcloud.com/44547
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Sebastien Buisson [Mon, 28 Jun 2021 18:32:16 +0000 (20:32 +0200)]
LU-14677 sec: do not expose security.c to listxattr/getxattr
security.c xattr, which contains encryption context, should not be
exposed by the xattr-related system calls such as listxattr() and
getxattr() because of its special semantics.
Update sanity-sec test_57 to test this.
Lustre-change: https://review.whamcloud.com/44101
Lustre-commit: TBD (db49ec2ae2c96a09fae054e5fcb3d1959e26f83d)
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I919f5cbafc53f5745fbfb5b9d2d7316e892d8c9f
Reviewed-on: https://review.whamcloud.com/44183
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Fri, 9 Jul 2021 13:41:34 +0000 (15:41 +0200)]
LU-14677 llite: move env contexts to ll_inode_info level
Contrary to file, inode is always available, so move the list of
env contexts from the file data to the ll_inode_info level.
This is needed because we will have to handle env properties in
ll_get_context() and ll_xattr_list()/ll_listxattr().
This also requires changing lli_lock from a spinlock to an rwlock.
Lustre-change: https://review.whamcloud.com/44198
Lustre-commit: TBD (932007c91333117b7b0905ce5601aafc9b3bdd4e)
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I478d2a8eabfcb09074ba52601f05840d047a6da2
Reviewed-on: https://review.whamcloud.com/44199
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Fri, 28 May 2021 16:11:53 +0000 (18:11 +0200)]
LU-14677 sec: migrate/extend/split on encrypted file
lfs migrate/extend/split makes use of volatile files to swap layouts.
When the operation is carried out on an encrypted file, the volatile file
must be assigned the same encryption context as the original file, so
that data moved/copied to different OSTs is identical to the original
file's.
Also update sanity-sec test_52 to exercise these commands.
Lustre-change: https://review.whamcloud.com/43878
Lustre-commit: 09c558d16f0a80f436522edde89367c088fe2055
Change-Id: I3878b5e9e6d3738dfee0ce0f89a3646e6a7b976f
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/43879
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Wed, 1 Sep 2021 13:45:47 +0000 (21:45 +0800)]
EX-3741 pcc: add pcc_mode parameter for permission check
This patch introduces an "llite.*.pcc_mode" parameter for PCC.
With this parameter, the administrator can determine what file access
permissions should be allowed to bring files into the PCC device for
caching.
This parameter is set to 0 by default.
Add sanity-pcc test_46 to verify it.
This patch also ignores the EEXIST error when the file is found
to be already attached into PCC during a manual attach.
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I1e006e4f723c1c177ae84c64ad32c6049a57110f
Reviewed-on: https://review.whamcloud.com/44804
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Sat, 28 Aug 2021 08:00:21 +0000 (02:00 -0600)]
LU-13798 llite: fix LL_SBI_FLAGS array declaration
Fix the LL_SBI_FLAGS to string mapping array. On master this
has "foreign_symlink" and "foreign_symlink_upcall" before the
"parallel_dio" option, but those do not exist on b_es6_0.
Instead, there is "snapshot" in one of those slots, and the
second is unused.
Since these are in-memory flags only, the actual values are
not critical, and there is a patch in-flight to clean up this
code to be more robust.
In the meantime, what is important is that LL_SBI_PARALLEL_DIO
has the proper "parallel_dio" string in the right spot.
Test-Parameters: trivial
Fixes: 00152903a180 ("LU-13798 llite: parallelize direct i/o issuance")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie7134c051e85a5a2a90dbeb3145e8d8c09f6d24e
Reviewed-on: https://review.whamcloud.com/44776
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Jian Yu [Wed, 1 Sep 2021 17:30:01 +0000 (10:30 -0700)]
LU-14977 kernel: kernel update RHEL7.9 [3.10.0-1160.41.1.el7]
Update RHEL7.9 kernel to 3.10.0-1160.41.1.el7.
Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9
Change-Id: Ib57c3ad3b750f4af93cecb372fa5547ccee68fee
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44812
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Mon, 6 Sep 2021 07:51:55 +0000 (15:51 +0800)]
EX-3752 pcc: show attaching state for PCC state output
When llite.*.pcc_async_threshold=0 is set, the client will do the PCC
attach asynchronously.
When the file is large, attaching the file into PCC may take some
time.
This patch improves the output of the PCC command
"lfs pcc state" to show that the file is in the PCC attaching state
while the file is still being copied from Lustre OSTs
to PCC.
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I101d87638f5afac41fb4f55b4aaf95d938bc8ccd
Reviewed-on: https://review.whamcloud.com/44852
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
James Simmons [Wed, 14 Jul 2021 17:07:59 +0000 (13:07 -0400)]
LU-14844 tests: make sure mgc_requeue_timeout_min exist.
The module parameter mgc_requeue_timeout_min was introduced to reduce
testing times. Currently the test framework always tries to set this
value but it doesn't exist in earlier Lustre versions which breaks
interop testing. Set the module parameter only if it exists.
Lustre-change: https://review.whamcloud.com/44215
Lustre-commit: dfeb63f2ee3701ef731ffcea3f79fb70d513a9dc
Test-Parameters: trivial
Change-Id: I64f62e3d6e2faeba99ced98363d241083f95d92e
Fixes: 04b2da6180d ("LU-14516 mgc: configurable wait-to-reprocess time")
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44789
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Fri, 12 Mar 2021 09:00:37 +0000 (12:00 +0300)]
LU-14516 mgc: configurable wait-to-reprocess time
Make the MGC wait-to-reprocess time configurable so it can be set
shorter, for testing purposes at least. To change the minimal wait
time, use the MGC module option 'mgc_requeue_timeout_min' (in
seconds). Additionally, a random value up to
mgc_requeue_timeout_min is added to avoid a flood of config re-read
requests from clients. If mgc_requeue_timeout_min is set to 0,
the random part will be up to 1 second.
ost-pools: before: 5840s, after: 3474s
sanity-flr: before: 1575s, after: 1381s
sanity-quota: before: 10679s, after: 9703s
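A rough model of the delay described above, assuming the random part is
drawn uniformly (the message does not say which distribution is used).
The function name requeue_delay() is hypothetical and this is not the
MGC code itself:

```python
import random
from typing import Optional


def requeue_delay(timeout_min: float,
                  rng: Optional[random.Random] = None) -> float:
    """Return the wait in seconds before re-reading the config.

    The delay is the configured minimum plus a random part of up to
    that same minimum. With a minimum of 0, the random part is up to
    1 second, so clients still do not all retry at the same instant.
    """
    rng = rng or random.Random()
    jitter_max = timeout_min if timeout_min > 0 else 1.0
    return timeout_min + rng.uniform(0.0, jitter_max)
```

With a minimum of 5 the result falls between 5 and 10 seconds; with a
minimum of 0 it falls between 0 and 1 second.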
Lustre-change: https://review.whamcloud.com/42020
Lustre-commit:
04b2da6180d3c8eda21f7ab36c676462be041b74
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Iff7dad4ba14d687b7e891a1c346397e4c370800d
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44788
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Wed, 14 Jul 2021 09:09:39 +0000 (12:09 +0300)]
LU-14825 lod: pool spilling
To avoid the problem of the fast pool becoming full, this patch
introduces so-called pool spilling: for every OST pool a target
pool can be assigned which will be used instead of the original one
if the original pool's usage is over the specified threshold:
lctl set_param lod.*.pool.pool1.spill_target=pool2
lctl set_param lod.*.pool.pool1.spill_threshold_pct=80
i.e. once pool1 is 80+% used, new files will be created on
pool2.
A chain (up to 10 at the moment) can be configured using
settings like the above, in which case the different OST pools
are considered one by one.
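The chain walk can be pictured with the sketch below. It is an
illustration only: resolve_pool() and its dictionaries are hypothetical
names, "80+%" is read as "usage >= threshold", and the limit of 10 is
modelled as a maximum number of hops.

```python
from typing import Dict

MAX_SPILL_HOPS = 10  # chain limit quoted in the message above


def resolve_pool(pool: str,
                 used_pct: Dict[str, float],
                 spill_target: Dict[str, str],
                 spill_threshold_pct: Dict[str, float]) -> str:
    """Follow spill targets until a pool below its threshold is found."""
    for _ in range(MAX_SPILL_HOPS):
        target = spill_target.get(pool)
        threshold = spill_threshold_pct.get(pool)
        if target is None or threshold is None:
            return pool  # no spilling configured for this pool
        if used_pct.get(pool, 0.0) < threshold:
            return pool  # still below the threshold, keep using it
        pool = target    # over the threshold, consider the next pool
    return pool          # hop limit reached; also stops on cycles
```

Bounding the number of hops keeps the lookup finite even if two pools
are configured to spill into each other.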
Lustre-change: https://review.whamcloud.com/43989
Lustre-commit: TBD (from
be958a7bde7351856db6632d06e72e23ce916b13)
Test-Parameters: testlist=ost-pools
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I7f6dd4931ba64f3db8a7ae6a3b185f942a629ed7
Reviewed-on: https://review.whamcloud.com/44303
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Bobi Jam [Thu, 26 Aug 2021 14:02:43 +0000 (22:02 +0800)]
LU-14965 ldiskfs: hold inode mutex for ldiskfs_orphan_add()
See following warning:
ldiskfs/namei.c:3331 ldiskfs_orphan_add+0x11e/0x290 [ldiskfs]
Call Trace:
dump_stack+0x19/0x1b
__warn+0xd8/0x100
warn_slowpath_null+0x1d/0x20
ldiskfs_orphan_add+0x11e/0x290 [ldiskfs]
ldiskfs_xattr_inode_orphan_add+0xbb/0x110 [ldiskfs]
ldiskfs_xattr_delete_inode+0x5c/0x350 [ldiskfs]
ldiskfs_evict_inode+0x1a8/0x630 [ldiskfs]
evict+0xb4/0x180
iput+0xfc/0x190
osd_object_delete+0x1f8/0x370 [osd_ldiskfs]
lu_object_free.isra.27+0xb8/0x1c0 [obdclass]
lu_object_put+0xa5/0x460 [obdclass]
mdt_object_put+0x30/0x110 [mdt]
mdt_reint_unlink+0x8e0/0x1890 [mdt]
mdt_reint_rec+0x83/0x210 [mdt]
mdt_reint_internal+0x720/0xaf0 [mdt]
mdt_reint+0x67/0x140 [mdt]
tgt_request_handle+0x7ea/0x1750 [ptlrpc]
ptlrpc_server_handle_request+0x256/0xb10 [ptlrpc]
ptlrpc_main+0xb3c/0x14e0 [ptlrpc]
kthread+0xd1/0xe0
ret_from_fork_nospec_begin+0x21/0x21
Need to hold inode mutex on the external EA for ldiskfs_orphan_add()
to soothe the warning.
Lustre-change: https://review.whamcloud.com/44754
Lustre-commit: TBD (from
047a859723d3df090af5b1db44adf1f191a6c77c)
Fixes:
f64e9f19f68e ("LU-12977 ldiskfs: properly take inode_lock() for truncates")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I3a1abfde3289c0bbd46e0d5a5b9d2ff7d7cf9273
Reviewed-on: https://review.whamcloud.com/44771
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Tue, 7 Sep 2021 19:50:39 +0000 (12:50 -0700)]
LU-14986 kernel: kernel update SLES15 SP2 [5.3.18-24.78.1]
Update SLES15 SP2 kernel to 5.3.18-24.78.1 for Lustre client.
Test-Parameters: trivial \
env=SANITY_EXCEPT="100 130 136 817" \
clientdistro=sles15sp2 serverdistro=el7.9 \
testlist=sanity
Change-Id: I2778f5bdacf243da95b7c8c74881ab4d20ad3d91
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44862
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Tue, 7 Sep 2021 17:46:15 +0000 (11:46 -0600)]
EX-3750 tests: fix hot-pools and sanity-lipe merges
Restore missing changes in lustre/tests/hot-pools and .../sanity-lipe.
These were not included in the lipe/ subtree merge since they were
outside that dir.
Fix hot-pools test_4 to scan the lamigo log on the MDS.
Add tests for lipe_scan2
Test-Parameters: trivial testlist=hot-pools,sanity-lipe
Fixes:
2f05a3e06928 ("EX-2453 lipe: fixup striped directory paths")
Fixes:
3eee8554c0fb ("EX-2453 lipe: add SoM handling to lipe_scan2")
Fixes:
fa19bceb68e0 ("EX-1325 lamigo: improve debug/error messages")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I4c34a3a8f48edf81b5231d208f1a0ff3d9a811f8
Reviewed-on: https://review.whamcloud.com/44859
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Fri, 3 Sep 2021 07:07:49 +0000 (01:07 -0600)]
EX-3676 tests: skip conf-sanity test_5a
Skip conf-sanity test_5a since it is causing constant failures since
/sbin/umount.lustre was added.
Test-Parameters: trivial testlist=conf-sanity env=ONLY=1-5 \
austeroptions=-H
Fixes:
6d62073950ac ("EX-3209 lipe: add lpcc util and service")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I77531407d02c5accc78fc239f65c4d05d995502f
Reviewed-on: https://review.whamcloud.com/44839
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Jian Yu [Thu, 2 Sep 2021 16:39:02 +0000 (09:39 -0700)]
EX-1135 lipe: build lipe_convert_expr against RHEL 8
This patch changes the python platform to python2 to
resolve the following build failure against RHEL 8:
*** ERROR: ambiguous python shebang in
/usr/bin/lipe_convert_expr: #!/usr/bin/python -u.
Change it to python3 (or python2) explicitly.
Test-Parameters: trivial
Change-Id: Id889846573b0c6e4998ef4502616e827cd69f1fa
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44825
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Andreas Dilger [Thu, 2 Sep 2021 16:19:20 +0000 (10:19 -0600)]
RM-620 build: New tag 2.14.0-ddn12
New tag 2.14.0-ddn12
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I7aaf690a1452923202f5441d7de8aa2b92e8eefd
Andreas Dilger [Thu, 2 Sep 2021 03:22:34 +0000 (21:22 -0600)]
EX-3409 revert: "pcc: add owner capacity check for open attach"
This reverts commit 7cce3772e267afee328d63da9367875c63e6ad43,
since it prevented users with read permission on a file from
auto-attaching or manually attaching the file into the local cache.
Test-Parameters: trivial testlist=sanity-pcc
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib562f2c62acf3ed564309fa8ae56ba21bd31577c
Reviewed-on: https://review.whamcloud.com/44816
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Andreas Dilger [Thu, 2 Sep 2021 04:54:32 +0000 (22:54 -0600)]
RM-620 build: New tag 2.14.0-ddn11
New tag 2.14.0-ddn11
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I45b0e969517a5875cba100c0ab2920911fc0d9c1
Serguei Smirnov [Mon, 23 Aug 2021 19:58:51 +0000 (12:58 -0700)]
LU-14954 socklnd: fix link state detection
Due to matching only the device index, link detection implemented
in LU-14742 has issues with confusing the link events for the
virtual interfaces with the link events for the interface that
LNet was actually configured to use. Fix this by improving
the identification of the event source: use both device name and
device index.
Also, to make sure the link fatal state is cleared only when
the device is bound to the IP address used at NI creation,
subscribe to inetaddr events in addition to the netdev events.
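The stricter event matching can be sketched as follows. The NetDev type
and event_is_for_ni() are hypothetical names for illustration and are
not the socklnd data structures:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetDev:
    name: str   # interface name, e.g. "eth0"
    index: int  # device index reported with the event


def event_is_for_ni(event_dev: NetDev, ni_dev: NetDev) -> bool:
    """Accept a link event only if both the name and the index match."""
    return (event_dev.index == ni_dev.index
            and event_dev.name == ni_dev.name)
```

An event whose index matches but whose name differs is ignored, which
is the case the index-only check got wrong.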
Lustre-change: https://review.whamcloud.com/44732
Lustre-commit: TBD (from
d4dbbf3cfd692ed548c82e2dda9fdcadae052a62)
Test-Parameters: trivial
Fixes:
b842fb6fd5 ("LU-14742: detect link state to set fatal error")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ib1996c66a8ae2596970d66e3d920702190851e3f
Reviewed-on: https://review.whamcloud.com/44787
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
James Nunez [Wed, 1 Sep 2021 15:14:03 +0000 (15:14 +0000)]
EX-3686 revert: "LU-13799 llite: Adjust dio refcounting"
This reverts commit 8a31964534358dd1a5db6cf86b9c6014d3c98d48
("LU-13799 llite: Adjust dio refcounting")
This patch is causing several tests to crash with messages
similar to the following:
BUG: Bad page state in process ptlrpcd_01_01
BUG: Bad page map in process iozone
page:ffffdafec7f35640 count:0 mapcount:-1 mapping: (null) index:0x7f5
page flags: 0x6fffff00080018(uptodate|dirty|swapbacked)
page dumped because: bad pte
addr:7f5524800000 vm_flags:8100073 anon_vma:ffff9e8465843fa0 mapping: (null) index:7f5524800
WARNING: CPU: 1 PID: 9325 at lib/list_debug.c:62 list_del corruption
Test-Parameters: trivial testlist=sanity
Test-Parameters: testlist=sanity-lfsck
Test-Parameters: testlist=sanity-dom
Test-Parameters: testlist=sanity-flr
Test-Parameters: testlist=replay-single
Change-Id: I97b77b671ff0dea4cd13428f700b7643d9e94f09
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44806
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Tue, 31 Aug 2021 09:11:42 +0000 (17:11 +0800)]
EX-3636 pcc: set invalid cache state for fallback I/O
When I/O falls back to Lustre (not the PCC backend), the cache
state should be set correctly (with *cached = false).
Fixes:
c3cf63c830 ("EX-3636 pcc: reset file mmaping for the file once mmaped")
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I4e7dad93c589ac8062fe6c08423bf209f08432b6
Reviewed-on: https://review.whamcloud.com/44793
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Tue, 31 Aug 2021 03:45:25 +0000 (11:45 +0800)]
EX-3730 pcc: add test for concurrent read from 2 clients
This patch adds a test case with concurrent read access from 2
clients.
The purpose is to verify, according to the PCC attach stats, that
the client does not re-attach an already attached file into the
PCC backend when the file is read concurrently from 2 mount
points on a client.
This patch also fixes the help message for PCC-RO.
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ibb038bd3a74f43031b6fab4e65565620c416909e
Reviewed-on: https://review.whamcloud.com/44791
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Fri, 27 Aug 2021 03:46:13 +0000 (11:46 +0800)]
EX-3715 pcc: add stats for attach|detach|auto_attach
In this patch, we add stats for PCC attach, detach and
auto_attach.
With this feature, we verify that PCC can auto-attach the file
into PCC cache without having to re-fetch the data of the whole
file.
Add sanity-pcc test_44.
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ia0c1cd6b414998e72859aaf34c125b5a4e4e743c
Reviewed-on: https://review.whamcloud.com/44764
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
John L. Hammond [Wed, 28 Apr 2021 18:43:51 +0000 (13:43 -0500)]
LU-14693 mdt: skip DLM when opening volatile files
In mdt_reint_open(), when opening a volatile file skip taking a
MDS_INODELOCK_UPDATE lock on the parent directory.
Lustre-change: https://review.whamcloud.com/43742
Lustre-commit:
8c04afb5236f8d130419aa0bf5aaf0f52a2ad297
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I8ee89710f52e8097e1412897de91159702560e4a
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44551
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
John L. Hammond [Mon, 30 Aug 2021 13:21:22 +0000 (08:21 -0500)]
EX-3725 lipe: fix json.h include
Fix json.h include in lipe_scan2.c.
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I2547f14de3fce9ccd91cb32ea1eeb8116566692d
Andreas Dilger [Sat, 28 Aug 2021 02:06:55 +0000 (20:06 -0600)]
RM-620 build: New tag 2.14.0-ddn10
New tag 2.14.0-ddn10
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie920dc93c5f27a0fc43a76b9643067e263f12020
Mikhail Pershin [Wed, 11 Aug 2021 14:30:48 +0000 (17:30 +0300)]
LU-14930 mdt: abort_recov_mdt shouldn't abort client recovery
When abort_recov_mdt is set to abort MDT-MDT recovery, the
abort_recovery flag is also set inside the
target_stop_recovery_thread() call, which aborts not just
MDT-MDT recovery but also client-MDT recovery.
Lustre-commit:
6fd75f264c5f5c186bbfe559e1a98fb3769d8128
Lustre-change: https://review.whamcloud.com/44610
Fixes:
dd9e79b64d ("LU-12546 mdt: abort recovery between MDTs")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ibda05e91a2da90156e2b6c9fdcb2169cdbd50fe4
Reviewed-on: https://review.whamcloud.com/44669
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Wed, 18 Aug 2021 02:03:23 +0000 (10:03 +0800)]
EX-3663 pcc: auto attach should skip if already attached
When trying to auto attach a file into PCC, if the file is found
to be already attached into PCC, the auto attach processing
should be skipped. Otherwise, it will result in a wrong PCC inode
refcount when multiple threads try to auto attach a file at the
same time.
For a file once mmapped into PCC and detached due to layout lock
shrinking or a manual detach command, if the file is found to be
still validly cached (attached into PCC again by another thread)
in @pcc_mmap_io_init(), the mapping of the PCC copy should be set
to that of the Lustre file again.
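The "skip if already attached" rule can be modelled with a lock-guarded
check. The sketch below is a user-space illustration with hypothetical
names (CachedInode, try_auto_attach) and is not the llite/PCC code:

```python
import threading


class CachedInode:
    """Toy stand-in for an inode that can be attached to a cache."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.attached = False
        self.refcount = 0

    def try_auto_attach(self) -> bool:
        """Attach once; later or concurrent callers skip the attach."""
        with self._lock:
            if self.attached:
                # Already attached by another thread: do not take a
                # second reference.
                return False
            self.attached = True
            self.refcount += 1
            return True
```

Because the check and the reference are taken under the same lock, any
number of racing callers leaves the reference count at exactly one.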
Test-Parameters: testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I5f049ca7d6db8708712e79e9ad459fc60b80f2be
Reviewed-on: https://review.whamcloud.com/44697
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Wed, 11 Aug 2021 09:32:06 +0000 (17:32 +0800)]
EX-3636 pcc: reset file mmaping for the file once mmaped
For a file once mmaped and cached on PCC, a new open will set the
mapping for the file handle of PCC copy (@file->f_mapping) with
the one of the Lustre file handle. When the file is detached from
PCC due to manual detach or layout lock shrinking, the normal I/O
(read/write) will auto-attach the file into PCC again during I/O
as the layout version is unchanged. However, it still needs to
reset the file mapping (@pcc_file->f_mapping) with the mapping of
the PCC copy. Otherwise it will cause panic as follows:
[ 935.516823] RIP: 0010:_raw_read_lock+0xa/0x20
[ 935.517077] ll_cl_find+0x19/0x60 [lustre]
[ 935.517098] ll_readpage+0x51/0x820 [lustre]
[ 935.517110] read_pages+0x122/0x190
[ 935.517119] __do_page_cache_readahead+0x1c1/0x1e0
[ 935.517131] ondemand_readahead+0x1f9/0x2c0
[ 935.517142] pagecache_get_page+0x30/0x2c0
[ 935.517165] generic_file_buffered_read+0x556/0xa00
[ 935.517189] pcc_try_auto_attach+0x3ac/0x400 [lustre]
[ 935.517552] pcc_io_init+0x146/0x560 [lustre]
[ 935.517906] pcc_file_read_iter+0x24d/0x2b0 [lustre]
[ 935.518259] ll_file_read_iter+0x74/0x2e0 [lustre]
[ 935.518604] new_sync_read+0x121/0x170
[ 935.518937] vfs_read+0x8a/0x140
This patch adds sanity-pcc test_98 to verify it.
I/O for a file opened before it was attached into PCC, or
opened while in the ATTACHING state, will fall back to Lustre OSTs.
For a later mmap() on the file, the mmap() I/O also needs to
fall back to Lustre OSTs and cannot read directly from the local
valid cached PCC copy until all fallback file handles are closed,
because the mapping of the PCC copy is replaced with that of the
Lustre file when a file is mmapped.
Add sanity-pcc test_97 to verify it.
We also forbid auto attach of a file which is still under
mmapped I/O.
This patch disables "mmap_conv" by default.
Test-Parameters: testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I11195b0bdb6fb1d0d68d0b0cd02a0af8ee1fc297
Reviewed-on: https://review.whamcloud.com/44592
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Wed, 11 Aug 2021 18:33:49 +0000 (11:33 -0700)]
LU-14925 kernel: kernel update RHEL8.4 [4.18.0-305.12.1.el8_4]
Update RHEL8.4 kernel to 4.18.0-305.12.1.el8_4 for Lustre client.
Test-Parameters: trivial clientdistro=el8.4
Change-Id: Ic7a782270dd350a211094c18411faa60a35013e9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44603
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
John L. Hammond [Wed, 25 Aug 2021 22:56:16 +0000 (17:56 -0500)]
Squashed 'lipe/' changes from 2a015e67c4..b7b776f968
b7b776f968 Fix json.h include.
2ba146383f Update lipe version to 1.19.
f007d85105 EX-2453 lipe: fixup striped directory paths
64847ef214 EX-2453 lipe: add SoM handling to lipe_scan2
75045a94c5 EX-3701 lipe: lamigo ssh logging improvements
984381d796 DDN-2223 lipe: add ngc_exp_remv and tests
9e5f54e3e7 EX-3588 lamigo: use POSIX redirection syntax
a6240ef3bf EX-3701 lipe: lamigo lamigo log message improvements
git-subtree-dir: lipe
git-subtree-split:
b7b776f9687f084300eabe4b5cebdc20f316d8e0
John L. Hammond [Wed, 25 Aug 2021 22:56:16 +0000 (17:56 -0500)]
EX-2921 lipe: merge lipe changes from b_es5_2
Merge commit '72399232bb9bbc85fb509fa48e12d88cc7471724' into b_es6_0
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I1624187cffbaba64d914dbe517d0b7e16878006a
John L. Hammond [Mon, 16 Aug 2021 03:14:39 +0000 (22:14 -0500)]
EX-3658 utils: add pumount
Add a utility ('pumount') to lazily unmount a filesystem and kill
remaining users. Add a test script (sanity-pumount.sh).
Test-Parameters: testlist=sanity-pumount clientextra_install_params="--packages pumount"
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Iaa937d51ca0003c92d1d63608e5d0f4f67ca92fb
Reviewed-on: https://review.whamcloud.com/44736
Tested-by: jenkins <devops@whamcloud.com>
John L. Hammond [Mon, 23 Aug 2021 22:28:27 +0000 (17:28 -0500)]
EX-2921 lipe: merge lipe changes from b_es6_0
Merge commit '7d665cba3b245f3d49166de48c39fd9df9843633' into b_es6_0
$ git checkout b_es5_2
$ git subtree split --prefix=lipe
2a015e67c4d3cdc7802178f1ad0fd85fc251fb0a
$ git checkout b_es6_0
$ git subtree merge --prefix=lipe --squash 2a015e67c4d3cdc7802178f1ad0fd85fc251fb0a
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ie8fdc48bad7dba428a247350cf0662524225660f
John L. Hammond [Mon, 23 Aug 2021 22:23:19 +0000 (17:23 -0500)]
Squashed 'lipe/' changes from 39ab2a0..2a015e6
2a015e6 EX-2453 lipe: add lipe_scan2 to RPM
989583f EX-2453 lipe: add xattr_name() test
d9cff4d EX-2453 lipe: lipe_scan2 attribute handling
8c46fbb EX-2453 lipe: defer getting object paths
3fad488 EX-2453 lipe: remove 'all_inode' parameters
6445f44 EX-2453 lipe: remove watch_fid parameters
c40e612 EX-2453 lipe: add lipe_scan2
8128fe0 EX-1325 lamigo: improve debug/error messages
7055a64 EX-2453 lipe: add paths handling to lipe_scan()
307dbcb EX-2453 lipe: move lipe_scan2() wrapper to policy.c
788a15c EX-3623 lipe: include errno.h in lipe_expression_test.c
288b18b EX-3476 lipe-scripts: Add --now to hp stop
1f01cf2 EX-2797 lpurge: initial support for DoM
2327aad EX-2853 lamigo: initial support for DoM
e524af6 EX-3198 lipe: add lipe_convert_expr
3981e3b EX-2453 lipe: add LIPE_OBJECT_ATTR_PATHS
d87813e EX-2453 lipe: add json formatting to lipe_object_attrs
071e07b EX-2600 lipe: parse link xattr once
f1c8142 EX-2453 lipe: add lipe_scan2()
61f28b5 EX-2453 lipe: reduce surface area of results and counters
git-subtree-dir: lipe
git-subtree-split:
2a015e67c4d3cdc7802178f1ad0fd85fc251fb0a
Li Xi [Tue, 17 Aug 2021 16:32:45 +0000 (00:32 +0800)]
RM-620 build: New tag 2.14.0-ddn9
Automatic new tag 2.14.0-ddn9
Change-Id: Icbbe25b7e6aa84e2f71e3fb3c43e05fc98e4d904
Signed-off-by: Li Xi <lixi@ddn.com>
Patrick Farrell [Tue, 17 Aug 2021 15:54:01 +0000 (11:54 -0400)]
LU-13799 llite: Adjust dio refcounting
We get a page reference in cl_page_find, then immediately
add another for cl_2queue_add and remove the first
reference. This is pretty silly, since the life cycle is
the same on these.
This improves DIO/AIO page submission by around 2%.
This patch reduces i/o time in ms/GiB by:
Write: 2 ms/GiB
Read: 2 ms/GiB
Totals:
Write: 170 ms/GiB
Read: 162 ms/GiB
mpirun -np 1 $IOR -w -r -t 64M -b 64G -o ./iorfile --posix.odirect
With previous patches in series:
write 5955 MiB/s
read 6218 MiB/s
Plus this patch:
write 6028 MiB/s
read 6305 MiB/s
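The ms/GiB figures above are consistent with a direct conversion of the
IOR bandwidth numbers, as in this small sketch (the helper name
ms_per_gib() is made up for illustration):

```python
def ms_per_gib(mib_per_s: float) -> float:
    """Convert a bandwidth in MiB/s to the time per GiB in milliseconds."""
    return 1024.0 / mib_per_s * 1000.0


# Totals with this patch applied (write, then read).
print(round(ms_per_gib(6028)), round(ms_per_gib(6305)))  # 170 162

# Reduction relative to the previous patches in the series.
print(round(ms_per_gib(5955) - ms_per_gib(6028)))  # 2
print(round(ms_per_gib(6218) - ms_per_gib(6305)))  # 2
```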
Lustre-change: https://review.whamcloud.com/39447
Lustre-commit:
1e4d10af3909452b0eee1f99010d80aeb01d42a7
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I228eca6d48c6007bbf2c8caae5e477b7d40521d1
Reviewed-on: https://review.whamcloud.com/44446
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Li Xi <lixi@ddn.com>
Patrick Farrell [Fri, 7 May 2021 15:50:51 +0000 (11:50 -0400)]
LU-13799 llite: Modify AIO/DIO reference counting
For DIO pages, it's enough to have a reference on the
cl_object associated with the AIO. This saves taking a
reference on the cl_object for each page, which saves about
5% of the time when doing DIO/AIO.
This is possible because the lifecycle of the aio struct is
always greater than that of the associated pages.
This patch reduces i/o time in ms/GiB by:
Write: 6 ms/GiB
Read: 1 ms/GiB
Totals:
Write: 198 ms/GiB
Read: 197 ms/GiB
mpirun -np 1 $IOR -w -r -t 64M -b 64G -o ./iorfile --posix.odirect
With previous patches in series:
write 5030 MiB/s
read 5174 MiB/s
Plus this patch:
write 5183 MiB/s
read 5200 MiB/s
Lustre-change: https://review.whamcloud.com/39442
Lustre-commit:
b3de247b76b410101e166b024d65e2bf23d401ba
Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: I970cda20417265b4b66a8eed6e74440e5d3373b8
Reviewed-on: https://review.whamcloud.com/44443
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Patrick Farrell [Fri, 2 Jul 2021 17:24:48 +0000 (13:24 -0400)]
LU-14805 llite: No locked parallel DIO
If we are doing locked DIO, the OSC & LDLM locks are
released at the end of cl_io_loop, ie, before we wait for
parallel DIO at the llite layer.
This is problematic because the locks are released before
i/o done using them is complete; this can lead to data
inconsistencies. (And at least one LBUG, see LU-14805.)
The easiest solution for now is only do parallel DIO when
working lockless (which is the default; DIO only switches
to locked to manage conflicts with buffered i/o).
This problem & fix apply to AIO as well as parallel DIO.
Lustre-change: https://review.whamcloud.com/44131
Lustre-commit:
0f8db7e06abbc341e1ecc6ae164fca7b4a040c4a
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If98a0551d6dde54220b406b26e978e284a6b1ebf
Reviewed-on: https://review.whamcloud.com/44131
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44442
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Patrick Farrell [Fri, 13 Aug 2021 15:25:08 +0000 (11:25 -0400)]
LU-13798 llite: parallelize direct i/o issuance
Currently, the direct i/o code issues an i/o to a given
stripe, and then waits for that i/o to complete. (This is
for i/os from a single process.) This forces DIO to send
only one RPC at a time, serially.
In the case of multi-stripe files and larger i/os from
userspace, this means that i/o is serialized - so single
thread/single process direct i/o doesn't see any benefit
from the combination of extra stripes & larger i/os.
Using part of the AIO support, it is possible to move this
waiting up a level, so it happens after all the i/o is
issued. (See LU-4198 for AIO support.)
This means we can issue many RPCs and then wait,
dramatically improving performance vs waiting for each RPC
serially.
This is referred to as 'parallel dio'.
Notes:
AIO is not supported on pipes, so we fall back to the old
sync behavior if the source or destination is a pipe.
Error handling is similar to buffered writes: We do not
wait for individual chunks, so we can get an error on an RPC
in the middle of an i/o. The solution is to return an
error in this case, because we cannot know how many bytes
were written contiguously. This is similar to buffered i/o
combined with fsync().
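The "issue everything, then wait" idea and its error handling can be
sketched in user space as below. This is only an analogy using a thread
pool: submit_all_then_wait() and send_chunk are hypothetical names, and
the real change works on RPCs inside llite rather than on threads.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence


def submit_all_then_wait(chunks: Sequence[bytes],
                         send_chunk: Callable[[bytes], int],
                         max_in_flight: int = 8) -> int:
    """Issue every chunk first, then wait for all of them.

    Returns the total number of bytes written. If any chunk fails, an
    error is raised instead of a byte count, because a failure in the
    middle means the contiguous amount written is unknown.
    """
    first_error = None
    total = 0
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # Submission does not wait for earlier chunks to complete.
        futures = [pool.submit(send_chunk, chunk) for chunk in chunks]
        # The waiting happens only after everything has been issued.
        for future in futures:
            try:
                total += future.result()
            except OSError as err:
                if first_error is None:
                    first_error = err
    if first_error is not None:
        raise first_error
    return total
```

Waiting once at the end is what lets several requests be in flight at
the same time instead of one after another.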
The performance improvement from this is dramatic, and
greater at larger sizes.
lfs setstripe -c 8 -S 4M .
mpirun -np 1 $IOR -w -r -t 64M -b 64G -o ./iorfile --posix.odirect
Without the patch:
write 764.85 MiB/s
read 682.87 MiB/s
With patch:
write 4030 MiB/s
read 4468 MiB/s
Lustre-change: https://review.whamcloud.com/39436
Lustre-commit:
cba07b68f9386b6169788065c8cba1974cb7f712
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I7e8df7d16b131b55a235f57c3280509559f94476
Reviewed-on: https://review.whamcloud.com/39436
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44324
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Li Xi [Tue, 17 Aug 2021 12:16:01 +0000 (20:16 +0800)]
RM-620 build: New tag 2.14.0-ddn8
Automatic new tag 2.14.0-ddn8
Change-Id: Idbff1e325c1a11386addf6019064347dd1250ce8
Signed-off-by: Li Xi <lixi@ddn.com>
Lei Feng [Tue, 27 Jul 2021 07:37:11 +0000 (15:37 +0800)]
EX-3209 lipe: add lpcc util and service
Create lpcc daemon/cli and systemd service to manage all
PCC devices and services. Create umount.lustre to hook the
umounting and stop PCC in advance. Remove unused lpcc_test
and lpcc_cleanup. Fix stats mistake for purge_objs. Add
--pidfile for lpcc_purge.
Change-Id: I941d07b61906e4d5ebee13dab2a8015e43ecf676
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/44103
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Patrick Farrell [Fri, 7 May 2021 19:42:20 +0000 (15:42 -0400)]
LU-13799 lov: Improve DIO submit
Skip some unnecessary looping in page submission for the
DIO case.
This gives about a 2% improvement for AIO/DIO page
submission.
This patch reduces i/o time in ms/GiB by:
Write: 2 ms/GiB
Read: 2 ms/GiB
Totals:
Write: 172 ms/GiB
Read: 165 ms/GiB
mpirun -np 1 $IOR -w -r -t 64M -b 64G -o ./iorfile --posix.odirect
With previous patches in series:
write 7726 MiB/s
read 5899 MiB/s
Plus this patch:
write 5954 MiB/s
read 6217 MiB/s
Lustre-change: https://review.whamcloud.com/39446
Lustre-commit:
d31647c017a390c9553a38d82c02fe7001a33a05
Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: Iedad978438ee3f1f3290d990311532626cba9e2d
Reviewed-on: https://review.whamcloud.com/44445
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Fri, 7 May 2021 19:51:32 +0000 (15:51 -0400)]
LU-13799 clio: Skip prep for transients
The work done by cpo_prep() (etc) is unnecessary for
transient pages. This gives only a minimal performance
boost and is better seen as a step towards removing the
cl_page abstraction for transient pages.
But, it does consistently give around 1% better
performance.
This patch reduces i/o time in ms/GiB by:
Write: 1 ms/GiB
Read: 1 ms/GiB
Totals:
Write: 169 ms/GiB
Read: 161 ms/GiB
mpirun -np 1 $IOR -w -r -t 64M -b 64G -o ./iorfile --posix.odirect
With previous patches in series:
write 6028 MiB/s
read 6305 MiB/s
Plus this patch:
write 6071 MiB/s
read 6355 MiB/s
Lustre-change: https://review.whamcloud.com/39448
Lustre-commit:
b8553978789ad3dd0776c0543dea4641804d0ac5
Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: Ib94f57cde468c9aaea952e1bb89db8fcf4b35e07
Reviewed-on: https://review.whamcloud.com/44447
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Sat, 29 May 2021 01:32:43 +0000 (21:32 -0400)]
LU-13799 llite: Remove transient page counting
Transient page counting is not used for anything, as
already noted in the commit message, but costs something
like 4% of the time in DIO page submission.
Remove it.
mpirun -np 1 $IOR -w -r -t 64M -b 64G -o ./iorfile --posix.odirect
This patch reduces i/o time in ms/GiB by:
Write: 6 ms/GiB
Read: 11 ms/GiB
Totals:
Write: 174 ms/GiB
Read: 167 ms/GiB
With previous patches in series:
write 5703 MiB/s
read 5756 MiB/s
Plus this patch:
write 5900 MiB/s
read 6136 MiB/s
Lustre-change: https://review.whamcloud.com/39441
Lustre-commit:
587e5aa8342980f761930235714add1cca80686b
Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: I825de4f1b5d1dd1476a4a711bfa51e7d24b5027a
Reviewed-on: https://review.whamcloud.com/44444
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Fri, 7 May 2021 15:37:40 +0000 (11:37 -0400)]
LU-13799 clio: Implement real list splice
Lustre's list_splice is actually just a slightly
depressing list_for_each; let's use a real list_splice.
This saves significant time in AIO/DIO page submission,
giving a performance boost of several percent.
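The difference between a real splice and a per-entry loop can be shown
on a toy circular doubly linked list. The sketch below uses hypothetical
Python names modelled on the kernel list helpers and is not the clio
code:

```python
class Node:
    """List head or entry of a circular doubly linked list."""

    def __init__(self, value=None):
        self.value = value
        self.prev = self
        self.next = self


def list_add_tail(new: Node, head: Node) -> None:
    """Insert one entry before head, i.e. at the tail of the list."""
    last = head.prev
    new.prev, new.next = last, head
    last.next = new
    head.prev = new


def list_splice_tail_init(src: Node, dst: Node) -> None:
    """Move all entries of src to the tail of dst with O(1) pointer
    updates, leaving src empty."""
    if src.next is src:
        return  # nothing to move
    first, last = src.next, src.prev
    tail = dst.prev
    tail.next, first.prev = first, tail
    last.next, dst.prev = dst, last
    src.next = src.prev = src  # reinitialise the now-empty source


def to_values(head: Node) -> list:
    """Collect the entry values in list order (for checking only)."""
    out, node = [], head.next
    while node is not head:
        out.append(node.value)
        node = node.next
    return out
```

A loop that moves entries one at a time touches every entry, while the
splice updates four links regardless of the list length.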
This patch reduces i/o time in ms/GiB by:
Write: 16 ms/GiB
Read: 14 ms/GiB
Totals:
Write: 220 ms/GiB
Read: 209 ms/GiB
mpirun -np 1 $IOR -w -r -t 64M -b 64G -o ./iorfile --posix.odirect
With previous patches in series:
write 4326 MiB/s
read 4587 MiB/s
With this patch:
write 4647 MiB/s
read 4888 MiB/s
Lustre-change: https://review.whamcloud.com/39439
Lustre-commit:
dfe2d225b86d4215cbd09e863e8de7547f0d4dae
Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: Icfd4a3d9dd6f162b011b402a1c88d7dae53eff40
Reviewed-on: https://review.whamcloud.com/39439
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44440
Lai Siyao [Wed, 4 Aug 2021 02:33:41 +0000 (10:33 +0800)]
LU-14792 llite: enable filesystem-wide default LMV
This change includes three parts:
1. save dir depth to ROOT after lookup on client side.
2. once space balanced default LMV is set on ROOT, and
max-inherit/max-inherit-rr is unlimited or not less than directory
depth, new directory will be created in QOS or roundrobin mode.
3. set ROOT default LMV max-inherit to unlimited and max-inherit-rr
to 3, and increase the ratio of subdirectories created on the local
MDT with the directory depth from ROOT, so that new directories are
created by space usage, and the deeper a directory is located the
more likely it is to be created on the local MDT; the top 3 levels
are created in roundrobin mode if the system is balanced.
Set default LMV in mkdir_on_mdt() to make sure its subdirectories are
created on the same MDT. Add sanity 413d.
Create a test directory on MDT0 for pjdfstest, because cross-MDT
rename of symlink will migrate symlink to target MDT, which will cause
inode change (LU-11631).
Lustre-change: https://review.whamcloud.com/44090
Lustre-commit:
b9c4dc3c33fe87ecaa79a290190524ea21b7fa8a
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ib3a133ac99655ca04443b9498e6618033f6b88b9
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44464
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Thu, 5 Aug 2021 22:32:13 +0000 (16:32 -0600)]
LU-13705 utils: improve llstat/llobdstat usability
Allow llstat to work on the client with minimal effort, by allowing
client OBD device types like "llstat -i 1 llite" or "llstat -i 1 osc",
in addition to the existing "mdt" or "ost_io" RPC-level shortcuts.
Allow specifying stats with the same syntax as "lctl get_param" and
"lctl list_param" like "llstat -i 5 myth-OST0002" or similar specific
stats.
Make the code and usage between llstat and llobdstat more uniform,
so that it is easier to switch between using one and the other.
Fix the display of llstat to fit into 80 columns by default, but
allow it to detect if the terminal is wider and print more columns
(e.g. stddev) in that case. llobdstat always fits within 80 columns,
so this is not necessary there.
Fix llite.*.read_ahead_stats to be usable by llstat.
Lustre-change: https://review.whamcloud.com/39178
Lustre-commit:
3e0d994fbf4c4f2e5c51b2be5669ad97aa02f840
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I09c714018925e8d0a17a63eee673ae18512540e5
Reviewed-on: https://review.whamcloud.com/44517
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Patrick Farrell [Fri, 7 May 2021 15:38:07 +0000 (11:38 -0400)]
LU-13799 osc: Simplify clipping for transient pages
The combination of page clip and page flag setting for
transient pages takes up several % of the time when
submitting them for async DIO.
But neither is required - Transient pages do not change
after creation except in limited cases, and in any case,
they are only accessible from the submitting thread -
there is no possibility of parallel access.
So we can set the page flags, etc, at init time.
This patch improves i/o time in ms/GiB by:
Write: 17 ms/GiB
Read: 22 ms/GiB
Totals:
Write: 204 ms/GiB
Read: 198 ms/GiB
mpirun -np 1 $IOR -w -r -t 64M -b 64G -o ./iorfile --posix.odirect
With previous patches in series:
write 4647 MiB/s
read 4888 MiB/s
Plus this patch:
write 5030 MiB/s
read 5174 MiB/s
Lustre-change: https://review.whamcloud.com/39440
Lustre-commit:
b64b9646f17b771c415e4b39cb8babcdc7541b30
Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: I974ebb0f55734a8628f1f7e1c01092eb2ce5f83b
Reviewed-on: https://review.whamcloud.com/39440
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44441
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Lai Siyao [Thu, 8 Jul 2021 08:09:01 +0000 (16:09 +0800)]
LU-13417 test: mkdir_on_mdt0() in more tests
Replace mkdir with mkdir_on_mdt0() in several tests.
Update recovery-small test_110k() in case there are open files on
MDT1, which would cause umount to stall.
Lustre-change: https://review.whamcloud.com/44315
Lustre-commit:
618625af42b9ff0427b096996ddf07a327689ec8
Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=recovery-small.sh
Test-Parameters: mdscount=2 mdtcount=4 testlist=replay-single.sh
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity-lfsck.sh
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity-pfl.sh
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity-quota.sh
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity-scrub.sh
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity-selinux.sh
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity.sh
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanityn.sh
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Iebc32568b7fc146b658f47c5f5053fd3db24432f
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44648
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Wed, 28 Apr 2021 15:02:23 +0000 (23:02 +0800)]
LU-13417 mdd: set default LMV on ROOT
To balance MDT usage, set default LMV on ROOT if it's not set. The
default stripe offset is "-1", and the default stripe count is "1".
Directories created by "mkdir" under ROOT will then be spread across
all MDTs by usage.
Add sanity 0e.
Lustre-change: https://review.whamcloud.com/38553
Lustre-commit:
3e04b0fd6c3dd36372f33c54ea5f401c27485d60
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: I7a6c752256225b8d065b2c304c4725268df28045
Reviewed-on: https://review.whamcloud.com/44463
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Tue, 3 Aug 2021 19:03:06 +0000 (03:03 +0800)]
EX-3640 test: mkdir on MDT0 in hot-pools.sh
To mkdir on MDT0 in hot-pools.sh by default, disable default LMV on
ROOT in init_hot_pools_env().
Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=hot-pools.sh
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ie34b35a6d79fa4b0f2c1c5a58777cf6291cd8d27
Reviewed-on: https://review.whamcloud.com/44590
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lai Siyao [Thu, 29 Apr 2021 03:46:21 +0000 (11:46 +0800)]
LU-13417 test: use mkdir_on_mdt0() in misc tests
Replace mkdir with mkdir_on_mdt0() if the directory needs to be created
on MDT0 in the following tests:
* conf-sanity
* lustre-rsync-test
* ost-pools
* replay-ost-single
* replay-single
* replay-vbr
* sanity-hsm
* sanity-pcc
* sanity-quota
* sanity-sec
Lustre-change: https://review.whamcloud.com/43491
Lustre-commit:
de62c8c7ef5d627da872260686d9279cbb60736e
Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=conf-sanity
Test-Parameters: mdscount=2 mdtcount=4 testlist=lustre-rsync-test
Test-Parameters: mdscount=2 mdtcount=4 testlist=ost-pools
Test-Parameters: mdscount=2 mdtcount=4 testlist=replay-ost-single
Test-Parameters: mdscount=2 mdtcount=4 testlist=replay-single,replay-vbr
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity-hsm,sanity-pcc
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity-quota
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity-sec
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I96369f25982558a1dac7f4f7fe80a95bc1c0207d
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44461
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Tue, 4 May 2021 01:25:23 +0000 (19:25 -0600)]
LU-13440 utils: update sanity 413a, 413b and 413c
In sanity tests 413a, 413b and 413c, create the "qos" directory on the
most full directory, so that its subdirectories won't be created on the
same MDT.
Lustre-change: https://review.whamcloud.com/43530
Lustre-commit:
1dbe63301b8c5cb7f7d0fe9960cafd3cd0e45534
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Change-Id: Ia8061ee48ac219e6948d667269c3ad80f6198401
Reviewed-on: https://review.whamcloud.com/44542
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Bobi Jam [Fri, 2 Apr 2021 04:47:32 +0000 (12:47 +0800)]
LU-14579 flr: mirror unlink and split race
- protect lod_object::ldo_comp_entries during
lod_obj_for_each_stripe(), since another thread could change
ldo_comp_entries at the same time.
- protect LOD in-memory layout during layout change
layout_{add|set|del} and purge_mirror.
- fix lock-tx order in mdd_unlink: start the transaction and then
take locks. (introduced in commit
55d5235354d49aee0a330ad64beef4ed9004a27f)
- Add test case for mirror split and unlink race.
Lustre-commit:
bd7a2f9938a7edf09afd133601ca4181e109a7d0
Lustre-change: https://review.whamcloud.com/43369
Fixes:
55d5235354 ("LU-14579 flr: GPF in lod_sub_declare_destroy")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ic54245c8755f660087fce46d1cad0ef7fa091245
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44257
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Mon, 26 Jul 2021 06:18:06 +0000 (09:18 +0300)]
LU-14098 obdclass: try to skip corrupted llog records
If an llog's header or record is found to be corrupted, then
ignore the remaining records and try with the next one.
Lustre-commit:
910eb97c1b43a44a9da2ae14c3b83e28ca6342fc
Lustre-change: https://review.whamcloud.com/40754
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I86a682a8874a2184e8891ff0ee8a68414d232a79
Reviewed-on: https://review.whamcloud.com/44397
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lai Siyao [Tue, 20 Jul 2021 01:24:36 +0000 (09:24 +0800)]
LU-13417 test: generate uneven MDTs early for sanity 413
Fill MDT early to generate uneven MDTs for sanity test_413, and
add test_413z to unlink these directories.
Lustre-change: https://review.whamcloud.com/44384
Lustre-commit:
233344d451e567c71726bcb071f45cf8f1c6ef3e
Test-Parameters: trivial
Test-Parameters: testgroup=review-dne-part-1
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I84e3670bb40c3666488139d6a272f29188b0dfae
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44506
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Patrick Farrell [Fri, 7 May 2021 15:35:28 +0000 (11:35 -0400)]
LU-13799 osc: Don't get time for each page
Getting the time when each batch of pages starts is
sufficiently accurate, and ktime_get() is several % of the
CPU time when doing AIO + DIO.
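As a rough illustration of the idea (not the Lustre code), the userspace sketch below reads the clock once per batch and stamps every page with that value. `struct sketch_page`, `now_ns()` and `stamp_batch()` are hypothetical names, and `clock_gettime(CLOCK_MONOTONIC)` stands in for the kernel's `ktime_get()`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical page record; not the Lustre structure. */
struct sketch_page {
	uint64_t submit_ns;
};

/* Counts clock reads so the saving is visible. */
static unsigned int clock_reads;

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	clock_reads++;
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Read the clock once, then stamp every page in the batch with it. */
static void stamp_batch(struct sketch_page *pages, size_t count)
{
	uint64_t batch_start;
	size_t i;

	assert(pages != NULL || count == 0);
	batch_start = now_ns();
	for (i = 0; i < count; i++)
		pages[i].submit_ns = batch_start;
}
```

The trade-off is that all pages in a batch share one timestamp, which the commit message considers sufficiently accurate.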
This relies on previous patches in this series.
Measuring this in milliseconds/gigabyte lets us measure the
improvement in absolute terms, rather than just relative
terms.
This patch reduces i/o time in ms/GiB by:
Write: 17 ms/GiB
Read: 6 ms/GiB
Totals:
Write: 237 ms/GiB
Read: 223 ms/GiB
IOR:
mpirun -np 1 $IOR -w -r -t 64M -b 64G -o ./iorfile --posix.odirect
Without the patch:
write 4030 MiB/s
read 4468 MiB/s
With patch:
write 4326 MiB/s
read 4587 MiB/s
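The ms/GiB figures appear to follow directly from the IOR throughput, assuming 1 GiB = 1024 MiB: ms/GiB = 1024 / (MiB/s) * 1000. A small sketch of that conversion, with hypothetical helper names:

```c
#include <assert.h>

/* Milliseconds needed to move one GiB at the given throughput in MiB/s. */
static double ms_per_gib(double mib_per_s)
{
	assert(mib_per_s > 0.0);
	return 1024.0 / mib_per_s * 1000.0;
}

/* Round a non-negative value to the nearest integer. */
static long round_to_long(double value)
{
	return (long)(value + 0.5);
}
```

With the numbers above, 4326 MiB/s corresponds to about 237 ms/GiB and 4587 MiB/s to about 223 ms/GiB, and the differences from the unpatched figures (4030 and 4468 MiB/s) round to 17 and 6 ms/GiB.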
Lustre-change: https://review.whamcloud.com/39437
Lustre-commit:
485976ab451dd6708d4d46bce3bbed9991f5d356
Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: I02897bf810683bc77a7d09156cdb83ba1d25ebf1
Reviewed-on: https://review.whamcloud.com/39437
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44439
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Sebastien Buisson [Fri, 9 Jul 2021 12:52:40 +0000 (14:52 +0200)]
LU-14833 sec: quiet spurious gss_init_svc_upcall() message
Switch from CWARN to CDEBUG(D_SEC) for message printed by
gss_init_svc_upcall():
Init channel is not opened by lsvcgssd, following request might be
dropped until lsvcgssd is active
Indeed, this message is printed whether GSS is enabled or not, and
we do not have any way to check this by the time the kernel module
is loaded.
Lustre-change: https://review.whamcloud.com/44197
Lustre-commit:
6a4be282bbbd5c6d92787abe9ae316e3c702192c
Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I66c8c2a16e58ca75973226c80e0f4a92c90b4025
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44399
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mikhail Pershin [Sun, 28 Feb 2021 09:24:12 +0000 (12:24 +0300)]
LU-14430 mdt: fix maximum ACL handling
Having a maximum-size ACL causes a big reply buffer, and in that case
the server could return -ERANGE in mdt_pack_acl2body(), expecting
the client to resend the RPC with a bigger buffer. The problem is
that even in that case the server can return -ERANGE, causing
the userspace tool to get this error after all.
Instead of estimating reply sizes in mdt_pack_acl2body(),
let's just rely on the mdt_fix_reply() code, which grows the buffer
when it is needed:
- add more credits for osd_create in ldiskfs because it
also copies default ACLs during create
- remove code returning -ERANGE in mdt_pack_acl2body() and
rely on mdt_fix_reply() to grow reply buffers
- add a test that creates as many ACLs as possible
Lustre-change: https://review.whamcloud.com/42013
Lustre-commit:
aa92caa21fa2a4473dce5889de7fcd17e171c1a0
Test-Parameters: env=ONLY=103e testlist=sanity
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: If7af5c61f89ee1220d7982d4c61a7357051a811c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44424
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Qian Yingjin [Fri, 30 Jul 2021 08:47:55 +0000 (16:47 +0800)]
EX-3571 pcc: disable PCC for encrypted files
When files are encrypted in Lustre using fscrypt, they should
normally not be accessible to users without the proper encryption
key. However, if a user has the encryption key loaded when they
read a file, it may be decrypted in memory and saved to the PCC
backend in unencrypted form.
For this reason, we just disable PCC caching for encrypted
files.
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I6c363dcad7a6bc8520350c0295f6e221bec3abb0
Reviewed-on: https://review.whamcloud.com/44433
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Thu, 17 Dec 2020 09:15:50 +0000 (12:15 +0300)]
LU-14262 utils: lfs to set component flags by pool name
This makes it easy to set flags (like prefer) on the components
residing on specific OSTs identified by pool.
Lustre-commit:
0354fa98966eef9874b3fe6818c2c6f1a2433297
Lustre-change: https://review.whamcloud.com/41024
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I733f92fe186682dc8d34512edf75b49e565c457f
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43458
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Patrick Farrell [Tue, 6 Jul 2021 15:20:56 +0000 (11:20 -0400)]
LU-14814 osc: osc: Do not flush on lockless cancel
The cancellation of an OSC lock without an LDLM lock
(a 'lockless' OSC lock) should not flush pages. Only
direct i/o is allowed to use a lockless OSC lock, and
direct i/o does not create flushable pages.
DIO pages are not flushable because:
A) they are all synced ASAP, and
B) the OSC extents created for them are not added to the
extent tree which is used to track these pages.
Instead, flushing on lockless cancel has the effect of trying to
flush pages from ongoing buffered i/o. This can lead to crashes like
the following:
osc_cache_writeback_range()) ASSERTION(hp == 0 && discard == 0) failed
This assert essentially says the lock cancellation
(hp == 1) found an active i/o (an extent in the OES_ACTIVE
state).
This is not allowed because the flushing code assumes an
LDLM lock is being cancelled, which will only start once
there is no active i/o. Because the OSC lock being
cancelled is not associated with an LDLM lock, this is not
true, and nothing prevents active i/o under a different
lock, leading to this assert.
The solution is simply to not flush pages when cancelling a
no-LDLM-lock OSC lock.
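The shape of that fix can be sketched as follows. This is a hypothetical model, not the actual OSC code: `struct sketch_osc_lock`, its fields and both functions are invented here purely to show the decision being made at cancel time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an OSC lock; field names are illustrative. */
struct sketch_osc_lock {
	bool has_ldlm_lock;	/* false for a "lockless" OSC lock */
	unsigned int flushes;	/* flushes performed on behalf of this lock */
};

/* Placeholder for the real page flush; here it only counts calls. */
static void sketch_flush_pages(struct sketch_osc_lock *lock)
{
	lock->flushes++;
}

/*
 * Cancel a lock. Pages are flushed only when an LDLM lock backs the
 * OSC lock; a lockless OSC lock skips the flush entirely.
 * Returns true if a flush was performed.
 */
static bool sketch_cancel(struct sketch_osc_lock *lock)
{
	assert(lock != NULL);
	if (!lock->has_ldlm_lock)
		return false;

	sketch_flush_pages(lock);
	return true;
}
```

In this model a lockless cancel never touches pages that belong to buffered i/o running under a different lock, which is the situation that triggered the assertion.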
Additional note:
New lockless OSC locks cannot be created if they are
blocked by a regular OSC lock, but a new regular lock can
be created if there is a lockless lock present.
Thus, the sequence is something like this:
Direct i/o creates lockless OSC lock
Buffered i/o creates OSC and LDLM lock on the same range
Direct i/o finishes, starts cancelling its OSC lock
Buffered i/o is still ongoing, with extents in OES_ACTIVE
This results in the above crash during the OSC lock
cancellation.
Note it would be possible to resolve this issue by not
allowing lockless OSC locks to match regular OSC locks, but
this is not necessary, since there's no reason for lockless
locks to flush pages on cancellation.
Lustre-change: https://review.whamcloud.com/44152
Lustre-commit:
6717c573ed90da9175e3c93c19759ea2dcd38bec
Test-Parameters: env=ONLY=398b,ONLY_REPEAT=200 testlist=sanity
Test-Parameters: env=ONLY=77,ONLY_REPEAT=100 testlist=sanityn
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iceb1747b66232cad3f7e90ec271310a13a687a33
Reviewed-on: https://review.whamcloud.com/44438
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Fri, 9 Jul 2021 20:13:36 +0000 (16:13 -0400)]
LU-14838 osc: Remove client contention support
Lockless buffered i/o and contention detection don't work:
lockless buffered i/o is unfixable, and contention detection
is broken enough that it will have to be rewritten.
Let's remove both. This patch starts the removal by
pulling the client side support.
Lustre-change: https://review.whamcloud.com/44205
Lustre-commit:
5ad00e36eca11a1469588bd7b7b4d8df1c32eb27
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If8583eff176bddb33e197befb967d229f8ca5688
Reviewed-on: https://review.whamcloud.com/44437
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Fri, 9 Jul 2021 20:13:09 +0000 (16:13 -0400)]
LU-14838 osc: Remove lockless truncate
Lockless truncate does not work and cannot be made to work.
Fundamentally, it has no means of ensuring consistency
across clients because it can't force them all to drop
cached data without locking.
It's been off for years - let's just get rid of it.
Lustre-change: https://review.whamcloud.com/44204
Lustre-commit:
6335dba83995765c1ffcd7993eb8958c162913e1
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ia2979fb6b31a61da6d4833e9f463fcd5b6dbd718
Reviewed-on: https://review.whamcloud.com/44436
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>