Whamcloud - gitweb
Lai Siyao [Thu, 29 Apr 2021 03:46:21 +0000 (11:46 +0800)]
LU-13417 test: use mkdir_on_mdt0() in misc tests
Replace mkdir with mkdir_on_mdt0() if directory needs to be created
on MDT0 in following tests:
* conf-sanity
* lustre-rsync-test
* ost-pools
* replay-ost-single
* replay-single
* replay-vbr
* sanity-hsm
* sanity-pcc
* sanity-quota
* sanity-sec
Lustre-change: https://review.whamcloud.com/43491
Lustre-commit: de62c8c7ef5d627da872260686d9279cbb60736e
Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=conf-sanity
Test-Parameters: mdscount=2 mdtcount=4 testlist=lustre-rsync-test
Test-Parameters: mdscount=2 mdtcount=4 testlist=ost-pools
Test-Parameters: mdscount=2 mdtcount=4 testlist=replay-ost-single
Test-Parameters: mdscount=2 mdtcount=4 testlist=replay-single,replay-vbr
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity-hsm,sanity-pcc
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity-quota
Test-Parameters: mdscount=2 mdtcount=4 testlist=sanity-sec
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I96369f25982558a1dac7f4f7fe80a95bc1c0207d
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44461
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Tue, 4 May 2021 01:25:23 +0000 (19:25 -0600)]
LU-13440 utils: update sanity 413a, 413b and 413c
In sanity tests 413a, 413b and 413c, create the "qos" directory on the most
full MDT, so that its subdirectories won't be created on the
same MDT.
Lustre-change: https://review.whamcloud.com/43530
Lustre-commit: 1dbe63301b8c5cb7f7d0fe9960cafd3cd0e45534
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Change-Id: Ia8061ee48ac219e6948d667269c3ad80f6198401
Reviewed-on: https://review.whamcloud.com/44542
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Bobi Jam [Fri, 2 Apr 2021 04:47:32 +0000 (12:47 +0800)]
LU-14579 flr: mirror unlink and split race
- protect lod_object::ldo_comp_entries during
lod_obj_for_each_stripe(), since another thread could change
ldo_comp_entries at the same time.
- protect LOD in-memory layout during layout change
layout_{add|set|del} and purge_mirror.
- fix lock-tx order in mdd_unlink: start the transaction and then
take locks. (introduced in commit 55d5235354d49aee0a330ad64beef4ed9004a27f)
- Add test case for mirror split and unlink race.
Lustre-commit: bd7a2f9938a7edf09afd133601ca4181e109a7d0
Lustre-change: https://review.whamcloud.com/43369
Fixes: 55d5235354 ("LU-14579 flr: GPF in lod_sub_declare_destroy")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ic54245c8755f660087fce46d1cad0ef7fa091245
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44257
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Mon, 26 Jul 2021 06:18:06 +0000 (09:18 +0300)]
LU-14098 obdclass: try to skip corrupted llog records
If an llog's header or record is found to be corrupted, then
ignore the remaining records and try with the next one.
Lustre-commit: 910eb97c1b43a44a9da2ae14c3b83e28ca6342fc
Lustre-change: https://review.whamcloud.com/40754
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I86a682a8874a2184e8891ff0ee8a68414d232a79
Reviewed-on: https://review.whamcloud.com/44397
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lai Siyao [Tue, 20 Jul 2021 01:24:36 +0000 (09:24 +0800)]
LU-13417 test: generate uneven MDTs early for sanity 413
Fill MDT early to generate uneven MDTs for sanity test_413, and
add test_413z to unlink these directories.
Lustre-change: https://review.whamcloud.com/44384
Lustre-commit: 233344d451e567c71726bcb071f45cf8f1c6ef3e
Test-Parameters: trivial
Test-Parameters: testgroup=review-dne-part-1
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I84e3670bb40c3666488139d6a272f29188b0dfae
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44506
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Patrick Farrell [Fri, 7 May 2021 15:35:28 +0000 (11:35 -0400)]
LU-13799 osc: Don't get time for each page
Getting the time when each batch of pages starts is
sufficiently accurate, and ktime_get() is several % of the
CPU time when doing AIO + DIO.
This relies on previous patches in this series.
Measuring this in milliseconds/gigabyte lets us measure the
improvement in absolute terms, rather than just relative
terms.
This patch reduces i/o time in ms/GiB by:
Write: 17 ms/GiB
Read: 6 ms/GiB
Totals:
Write: 237 ms/GiB
Read: 223 ms/GiB
IOR:
mpirun -np 1 $IOR -w -r -t 64M -b 64G -o ./iorfile --posix.odirect
Without the patch:
write 4030 MiB/s
read 4468 MiB/s
With patch:
write 4326 MiB/s
read 4587 MiB/s
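The ms/GiB figures quoted above can be recovered from the IOR bandwidths. The following is a minimal sketch of that conversion, assuming 1 GiB = 1024 MiB; the function name ms_per_gib is hypothetical and only used for illustration:

```python
MIB_PER_GIB = 1024


def ms_per_gib(bandwidth_mib_s: float) -> float:
    """Milliseconds needed to move one GiB at the given bandwidth (MiB/s)."""
    return MIB_PER_GIB / bandwidth_mib_s * 1000.0


# IOR bandwidths quoted in this commit message, in MiB/s.
before = {"write": 4030, "read": 4468}
after = {"write": 4326, "read": 4587}

for op in ("write", "read"):
    old, new = ms_per_gib(before[op]), ms_per_gib(after[op])
    print(f"{op}: {old:.0f} -> {new:.0f} ms/GiB (saves {old - new:.0f} ms/GiB)")
```

Expressing the result as time per GiB makes the saving additive across patches in the series, which a percentage change in MiB/s is not.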
Lustre-change: https://review.whamcloud.com/39437
Lustre-commit: 485976ab451dd6708d4d46bce3bbed9991f5d356
Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: I02897bf810683bc77a7d09156cdb83ba1d25ebf1
Reviewed-on: https://review.whamcloud.com/39437
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44439
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Sebastien Buisson [Fri, 9 Jul 2021 12:52:40 +0000 (14:52 +0200)]
LU-14833 sec: quiet spurious gss_init_svc_upcall() message
Switch from CWARN to CDEBUG(D_SEC) for message printed by
gss_init_svc_upcall():
Init channel is not opened by lsvcgssd, following request might be
dropped until lsvcgssd is active
Indeed, this message is printed whether or not GSS is enabled, and
we do not have any way to check this by the time the kernel module
is loaded.
Lustre-change: https://review.whamcloud.com/44197
Lustre-commit: 6a4be282bbbd5c6d92787abe9ae316e3c702192c
Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I66c8c2a16e58ca75973226c80e0f4a92c90b4025
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44399
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mikhail Pershin [Sun, 28 Feb 2021 09:24:12 +0000 (12:24 +0300)]
LU-14430 mdt: fix maximum ACL handling
A maximum-size ACL causes a big reply buffer, and in that case
the server could return -ERANGE in mdt_pack_acl2body(), expecting
the client to resend the RPC with a bigger buffer. The problem is
that even in that case the server can still return -ERANGE, so the
userspace tool gets this error after all.
Instead of estimating reply sizes in mdt_pack_acl2body(),
let's just rely on the mdt_fix_reply() code, which grows the buffer
when it is needed.
- add more credits for osd_create in ldiskfs because it
copies also default ACLs during create
- remove code returning -ERANGE in mdt_pack_acl2body() and
rely on mdt_fix_reply() reply buffers grow
- test is added to create as many ACLs as possible
Lustre-change: https://review.whamcloud.com/42013
Lustre-commit: aa92caa21fa2a4473dce5889de7fcd17e171c1a0
Test-Parameters: env=ONLY=103e testlist=sanity
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: If7af5c61f89ee1220d7982d4c61a7357051a811c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44424
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Qian Yingjin [Fri, 30 Jul 2021 08:47:55 +0000 (16:47 +0800)]
EX-3571 pcc: disable PCC for encrypted files
When files are encrypted in Lustre using fscrypt, they should
normally not be accessible to users without the proper encryption
key. However, if a user has the encryption key loaded when they
read a file, it may be decrypted in memory and saved to the PCC
backend in unencrypted form.
For this reason, we just disable PCC caching for encrypted
files.
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I6c363dcad7a6bc8520350c0295f6e221bec3abb0
Reviewed-on: https://review.whamcloud.com/44433
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Thu, 17 Dec 2020 09:15:50 +0000 (12:15 +0300)]
LU-14262 utils: lfs to set component flags by pool name
This makes it easy to set flags (like prefer) on the components
residing on specific OSTs identified by pool.
Lustre-commit: 0354fa98966eef9874b3fe6818c2c6f1a2433297
Lustre-change: https://review.whamcloud.com/41024
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I733f92fe186682dc8d34512edf75b49e565c457f
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43458
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Patrick Farrell [Tue, 6 Jul 2021 15:20:56 +0000 (11:20 -0400)]
LU-14814 osc: Do not flush on lockless cancel
The cancellation of an OSC lock without an LDLM lock
(a 'lockless' OSC lock) should not flush pages. Only
direct i/o is allowed to use a lockless OSC lock, and
direct i/o does not create flushable pages.
DIO pages are not flushable because:
A) they are all synced ASAP, and
B) the OSC extents created for them are not added to the
extent tree which is used to track these pages.
Instead, this has the effect of trying to flush pages from
ongoing buffered i/o. This can lead to crashes like the
following:
osc_cache_writeback_range()) ASSERTION(hp == 0 && discard == 0) failed
This assert essentially says the lock cancellation
(hp == 1) found an active i/o (an extent in the OES_ACTIVE
state).
This is not allowed because the flushing code assumes an
LDLM lock is being cancelled, which will only start once
there is no active i/o. Because the OSC lock being
cancelled is not associated with an LDLM lock, this is not
true, and nothing prevents active i/o under a different
lock, leading to this assert.
The solution is simply to not flush pages when cancelling a
no-LDLM-lock OSC lock.
Additional note:
New lockless OSC locks cannot be created if they are
blocked by a regular OSC lock, but a new regular lock can
be created if there is a lockless lock present.
Thus, the sequence is something like this:
Direct i/o creates lockless OSC lock
Buffered i/o creates OSC and LDLM lock on the same range
Direct i/o finishes, starts cancelling its OSC lock
Buffered i/o is still ongoing, with extents in OES_ACTIVE
This results in the above crash during the OSC lock
cancellation.
Note it would be possible to resolve this issue by not
allowing lockless OSC locks to match regular OSC locks, but
this is not necessary, since there's no reason for lockless
locks to flush pages on cancellation.
Lustre-change: https://review.whamcloud.com/44152
Lustre-commit: 6717c573ed90da9175e3c93c19759ea2dcd38bec
Test-Parameters: env=ONLY=398b,ONLY_REPEAT=200 testlist=sanity
Test-Parameters: env=ONLY=77,ONLY_REPEAT=100 testlist=sanityn
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iceb1747b66232cad3f7e90ec271310a13a687a33
Reviewed-on: https://review.whamcloud.com/44438
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Fri, 9 Jul 2021 20:13:36 +0000 (16:13 -0400)]
LU-14838 osc: Remove client contention support
Lockless buffered i/o and contention detection don't work:
lockless buffered i/o is unfixable and contention detection
is broken enough that it will have to be rewritten.
Let's remove both. This patch starts the removal by
pulling the client side support.
Lustre-change: https://review.whamcloud.com/44205
Lustre-commit: 5ad00e36eca11a1469588bd7b7b4d8df1c32eb27
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If8583eff176bddb33e197befb967d229f8ca5688
Reviewed-on: https://review.whamcloud.com/44437
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Fri, 9 Jul 2021 20:13:09 +0000 (16:13 -0400)]
LU-14838 osc: Remove lockless truncate
Lockless truncate does not work and cannot be made to work.
Fundamentally, it has no means of ensuring consistency
across clients because it can't force them all to drop
cached data without locking.
It's been off for years - let's just get rid of it.
Lustre-change: https://review.whamcloud.com/44204
Lustre-commit: 6335dba83995765c1ffcd7993eb8958c162913e1
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ia2979fb6b31a61da6d4833e9f463fcd5b6dbd718
Reviewed-on: https://review.whamcloud.com/44436
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Patrick Farrell [Thu, 15 Jul 2021 21:47:54 +0000 (17:47 -0400)]
LU-14687 llite: Return errors for aio
The aio code incorrectly discards errors from
ll_direct_rw_pages. Fix this and add a test for this.
Lustre-change: https://review.whamcloud.com/43722
Lustre-commit: 3e1f8d30cb0209b35410e85e502e2cae40f1b58c
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I49dadd0b3692820687fa6a1339e00516edf7a5d5
Reviewed-on: https://review.whamcloud.com/43722
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44323
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
James Nunez [Wed, 4 Aug 2021 17:31:42 +0000 (11:31 -0600)]
LU-12982 tests: skip conf-sanity 5i for old servers
conf-sanity test 5i was added to lustre-master in version
2.12.54. In interop testing with Lustre servers older than
2.12.54 and newer clients, conf-sanity test 5i
will fail and should be skipped.
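The skip condition amounts to a numeric, field-by-field version comparison. The following is a minimal sketch of that idea, not the test-framework code; the helper names version_tuple and should_skip_5i are hypothetical, and the sketch assumes any vendor suffix (such as "-ddn42") can be ignored:

```python
def version_tuple(version: str) -> tuple:
    """Turn '2.12.6-ddn42' into (2, 12, 6); a suffix after '-' or '_' is dropped."""
    numeric = version.replace("_", "-").split("-", 1)[0]
    return tuple(int(part) for part in numeric.split("."))


def should_skip_5i(server_version: str) -> bool:
    """True when the server predates 2.12.54, where conf-sanity 5i was added."""
    return version_tuple(server_version) < version_tuple("2.12.54")


print(should_skip_5i("2.12.6-ddn42"))
print(should_skip_5i("2.14.0-ddn7"))
```

Comparing integer tuples rather than strings matters here: as plain strings, "2.12.6" sorts after "2.12.54", which would give the wrong answer for exactly the server version named in the Test-Parameters.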
Lustre-change: https://review.whamcloud.com/36811
Lustre-commit: fee87077d83436005d6bef1c5c9673877ac4d7c1
Fixes: d1b5146eda4f (LU-12206 mdt: mdt_init0 failure handling)
Test-Parameters: trivial
Test-Parameters: serverversion=2.12.6-ddn42 serverdistro=el7.9 env=ONLY=5 testlist=conf-sanity
Test-Parameters: env=ONLY=5 testlist=conf-sanity
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ia493b6f80b42fbd92254150e8d40a6fbb1039635
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Vikentsi Lapa <vlapa@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44497
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lai Siyao [Fri, 5 Mar 2021 09:07:34 +0000 (17:07 +0800)]
LU-14490 lmv: striped directory as subdirectory mount
lmv_intent_lookup() will replace fid1 with stripe FID, but if striped
directory is mounted as subdirectory mount, it should be handled
differently. Because fid2 is the directory master object, if the stripe is
located on a different MDT from the master object, it will be treated as a
remote object by the server, so the server won't reply with a LOOKUP lock,
and therefore each file access needs to look up "/".
And remote directory (either plain or striped) shouldn't be used for
subdirectory mount, because remote object can't get LOOKUP lock.
Add an option "mdt_enable_remote_subdir_mount" (1 by default for
backward compatibility), mdt_get_root() will return -EREMOTE if
user specified subdir is a remote directory and this option is
disabled.
Add sanity 247g, updated 247f.
Lustre-change: https://review.whamcloud.com/41893
Lustre-commit: 775f88ed6c8b6235031268e258e15da405a5b955
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I5e8f95ee95c4155336098e55b7569ed7a43865c1
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44456
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Wed, 28 Apr 2021 21:30:00 +0000 (05:30 +0800)]
LU-14459 mdt: support fixed directory layout
User may not want directories split automatically in some cases:
* directory migrated.
* directory restriped.
To support this, an LMV flag LMV_HASH_FLAG_FIXED is added, and it will
be set on migrated/restriped directories. NB, if directory is migrated
or restriped to a one-stripe directory, it won't be transformed into a
plain directory, because this flag needs to be kept.
Update sanity 230q.
Lustre-change: https://review.whamcloud.com/43291
Lustre-commit: 4c2514f4832801374092f3a48c755248af345566
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Icd12b2aa34d391e32c3323a8b9c24449ea3e3d0e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44459
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Mon, 12 Apr 2021 03:30:13 +0000 (11:30 +0800)]
LU-14459 mdt: restripe parent may be a stripe
mdt_restripe() checks parent LMV sanity with lmv_is_sane(), but the parent
may be a stripe, so use lmv_is_sane2() instead.
Clear lmv_migrate_hash/offset in layout shrink/update; although leaving them
set won't cause any issue, it's strange to see values set in debug logs.
Add more race checks between directory restripe, auto-split and
migration.
Lustre-change: https://review.whamcloud.com/43290
Lustre-commit: a84efc8607ae8057499a8800699f336e821b03d8
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I38950a07a8c9a8b4b20a2fd7aff229d27dbb403c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44458
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Thu, 29 Apr 2021 03:51:33 +0000 (11:51 +0800)]
LU-13417 test: use mkdir_on_mdt0() in replay-dual
Replace mkdir with mkdir_on_mdt0() in replay-dual.sh if directory
needs to be created on MDT0.
Lustre-change: https://review.whamcloud.com/43492
Lustre-commit: ce179e97767936ff76282fd06df063b386851fe7
Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=replay-dual
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I9093e633412991571e18cb0ea264af013672bd8b
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44462
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Mon, 12 Apr 2021 03:17:37 +0000 (11:17 +0800)]
LU-14459 llite: reset pfid after dir migration
A plain directory will be turned into a stripe upon
migration/restripe, and conversely, if the target is a plain directory, the
target stripe will be turned into a directory afterwards.
In the first case, set pfid, and in the latter case, clear pfid,
otherwise ll_lock_cancel_bits() will use the wrong master inode.
Lustre-change: https://review.whamcloud.com/43289
Lustre-commit: abbe545a63b304e803ee62443dd65f1feeed15cd
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I01cac0103dc79d493166e6b090508d24f9678a57
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44457
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Fri, 31 Jul 2020 18:09:06 +0000 (02:09 +0800)]
LU-13852 pcc: don't alloc FID in LLITE for pcc open
ll_lookup_it(IT_OPEN) always allocates a FID on MDT0 for pcc open, but
the open request is sent to the MDT that the name hash points to,
which may be different from the MDT where the FID is; this will
trigger an osp_md_create() assertion because the file is created remotely.
This FID allocation is not necessary, and it can be left to be done
in lmv_intent_open() by LMV layer, because the MDT is chosen in
LMV. Then when it's done, the FID allocated can be used to initialize
PCC inode.
Change assertion in osp_md_create() to error message and return
error.
Update sanity-sec 2a for this.
Lustre-change: https://review.whamcloud.com/39568
Lustre-commit: 223728a97c397e6e6c91808dd36a2539705f00b8
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I3ccea3f9e7cca5083695c71135b9a5805f833b14
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44455
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Sun, 27 Sep 2020 06:36:57 +0000 (14:36 +0800)]
LU-14004 llite: default lsm update may memory leak
ll_update_default_lsm_md() should check whether lli_default_lsm_md
is set before setting it to the data from lustre_md, and if it's set,
release the old data to avoid memory leak.
Lustre-change: https://review.whamcloud.com/40103
Lustre-commit: cd2ad336177f8f3130de264709dc326349a22b23
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I9c8434c5d62f9fb751788031d6769fd49427c371
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44454
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Wed, 28 Apr 2021 14:36:24 +0000 (22:36 +0800)]
LU-13417 test: add mkdir_on_mdt0()
Once default LMV is set on ROOT, and default stripe offset is "-1",
mkdir may not create directory on MDT0, but it's a premise for many
tests. Add a function mkdir_on_mdt0() to create directory on MDT0
by "lfs mkdir -i 0".
Replace mkdir with mkdir_on_mdt0() for such tests in sanity.sh and
sanityn.sh.
Lustre-change: https://review.whamcloud.com/43489
Lustre-commit: 54fb8458db0bff4fdfe42ba7476de3129d7606cd
Test-Parameters: trivial testlist=sanityn
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I6155d036e6b28153d0bdbdbc01088bd68ee9e0af
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44460
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Thu, 22 Apr 2021 12:33:24 +0000 (15:33 +0300)]
EX-2797 lpurge: initial support for DoM
lpurge should be able to scan MDT device, recognize objects with DoM
component and remove a replica with DoM component if another in-sync
replica exists.
Test-Parameters: testlist=hot-pools
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: If12e0448ab07527d86832d942a63b4a0189ad7a0
Reviewed-on: https://review.whamcloud.com/43405
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: John L. Hammond <jhammond@whamcloud.com>
Alex Zhuravlev [Thu, 22 Apr 2021 13:43:59 +0000 (16:43 +0300)]
EX-2853 lamigo: initial support for DoM
If the src-dom option is specified, then lamigo will recognize
files with a DoM component as files stored on "fast" storage
and consider them for replication.
Test-Parameters: testlist=hot-pools
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I471c3eb82d133c1d967772cf959bed84cab63ff4
Reviewed-on: https://review.whamcloud.com/43407
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
John L. Hammond [Wed, 4 Aug 2021 12:07:43 +0000 (07:07 -0500)]
EX-2818 lipe: add src/lpcc_purge to .gitignore
Change-Id: Ic5ef7dcc4b561ec00122a61876a617b6716c334f
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Minh Diep [Tue, 3 Aug 2021 22:50:43 +0000 (15:50 -0700)]
RM-620 build: New tag 2.14.0-ddn7
Change-Id: I3e65e7c5568ea1996d63e56d5be2989ee8e93210
Nathaniel Clark [Wed, 14 Jul 2021 12:22:46 +0000 (08:22 -0400)]
EX-3476 lipe-scripts: Add --now to hp stop
The --now option will, after stopping lamigo/lpurge, kill all lfs mirror commands
and unmount the clients. This is to aid in a full filesystem shutdown.
Change-Id: I852383b9523068a75a047e5f640f2d28057e0c64
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44304
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Gaurang Tapase <gtapase@ddn.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Jian Yu [Mon, 2 Aug 2021 20:03:09 +0000 (13:03 -0700)]
LU-14870 kernel: kernel update RHEL8.4 [4.18.0-305.10.2.el8_4]
Update RHEL8.4 kernel to 4.18.0-305.10.2.el8_4 for Lustre client.
Test-Parameters: trivial clientdistro=el8.4
Change-Id: I02096acb2f6acb77d1ea29f4328598172d0ae258
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44473
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Sat, 31 Jul 2021 07:45:56 +0000 (15:45 +0800)]
LU-14709 pcc: VM_WRITE should not trigger layout write
VM area marked with VM_WRITE means that pages may be written, but
mmap page write may never happen.
It should delay the layout write until the actual modification of the
file happens in ->page_mkwrite().
Otherwise, it will trigger a panic in PCC-RO sanity-pcc test_21f().
Lustre-change: https://review.whamcloud.com/44483
Lustre-commit: TBD (from de1fe260d8c88112b46857d69cf5fe9e5d06cfbd)
Fixes: f2d1c4ee4 ("LU-14647 flr: mmap write/punch does not stale other mirrors")
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I1cbfef8a4ed7e2c718324fd8a21bafd6157b5f0c
Reviewed-on: https://review.whamcloud.com/44452
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lei Feng [Mon, 26 Jul 2021 00:51:59 +0000 (08:51 +0800)]
EX-3301 lipe: use larger candidate number in lpcc_purge
Increase the default candidate number to 128K. Calculate the
number of n_discard and n_detach dynamically based on the
candidate number. Set a timeout for continuous scanning.
Change-Id: I8adacb722fdec820a914250c54d05e0abd740140
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/44040
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lei Feng [Mon, 26 Jul 2021 00:48:01 +0000 (08:48 +0800)]
EX-3323 lipe: add --roid option for lpcc_purge
Command line option --roid is an alias of --rwid of lpcc_purge.
Change-Id: I1ff9c828ce92cadf9d97db4b91be9825824a7fc2
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/44005
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Lei Feng [Wed, 21 Apr 2021 01:30:59 +0000 (09:30 +0800)]
EX-3051 lipe: posix scan in DFS way
In lipe posix scan function, there is a queue to save
directories to be scanned. If we put the new entries into the
tail and get entries from the head, it's an approximate
Breadth-First-Search. But if we put the new entries into the
head and get entries from the head too, it's an approximate
Depth-First-Search. DFS keeps the queue smaller, saving memory,
and reaches files sooner.
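The head/tail distinction can be shown with a double-ended queue. The following is a minimal sketch, not the lipe code: the scan function and the dictionary-based tree are hypothetical stand-ins for the real directory scan, and only directories are queued while files are reported as soon as they are seen:

```python
from collections import deque


def scan(tree: dict, root: str, depth_first: bool):
    """Walk a tree given as {dir: [children]}; names absent from the dict are files.

    Returns the visit order and the largest number of queued directories.
    """
    queue = deque([root])
    order, peak = [], 1
    while queue:
        entry = queue.popleft()              # always take from the head
        order.append(entry)
        for child in tree.get(entry, []):
            if child in tree:                # only directories are queued
                if depth_first:
                    queue.appendleft(child)  # head insert: approximate DFS
                else:
                    queue.append(child)      # tail insert: approximate BFS
            else:
                order.append(child)          # files are reported immediately
        peak = max(peak, len(queue))
    return order, peak


tree = {
    "/": ["a", "b"],
    "a": ["a1", "a2"], "b": ["b1", "b2"],
    "a1": ["f1"], "a2": ["f2"], "b1": ["f3"], "b2": ["f4"],
}
for mode in (False, True):
    order, peak = scan(tree, "/", depth_first=mode)
    print("DFS" if mode else "BFS", "peak queue length:", peak, "order:", order)
```

With tail insertion the queue has to hold a whole level of the tree at once, while head insertion only keeps the unvisited siblings along the current path, which is where the memory saving comes from.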
Change-Id: Ief32cf9cea42ba35c90084ee500a0801cc8396aa
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/43381
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Cyril Bordage [Wed, 7 Jul 2021 13:27:54 +0000 (15:27 +0200)]
LU-14114 lnet: print device status in net show command
A device can be in fatal state, if the cable was disconnected, or the
port brought down on the switch side. In these cases, the LND (o2iblnd
for now), will flag the device in fatal state. That device will not be
used any further. However, its health will not be decremented. This
causes some confusion when examining the state of the node.
It is better to print the device status in the output of the lnetctl
net show command.
Lustre-change: https://review.whamcloud.com/44169
Lustre-commit: f75ff33d9fbefd6995a26693032a32a0ba211b51
Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I7c635ab1062f6153449fcec1bc07585065818a72
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44383
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Nathaniel Clark [Thu, 15 Jul 2021 15:38:02 +0000 (11:38 -0400)]
EX-3462 hotpools: Use -S for clush more
Use -S in clush where appropriate. Some instances are okay to fail.
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I87b37ae8a0bcefce5d724d9fe71f830175dffce5
Reviewed-on: https://review.whamcloud.com/44318
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Gaurang Tapase <gtapase@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Bobi Jam [Wed, 28 Apr 2021 04:43:11 +0000 (12:43 +0800)]
LU-14646 flr: write a FLR file downgrade SoM
Seeking beyond the file size and writing to an FLR file does not change its SoM,
which makes the file size incorrect.
This patch also fixes the connect flag name, renaming "pccro" to "pcc_ro";
the wrong name caused all PCC-RO related tests in sanity-pcc.sh to be
skipped.
Lustre-change: https://review.whamcloud.com/43471
Lustre-commit: f437134e80a1b320e575d774061e693042f3eb4c
Fixes: 25836ff90 ("LU-10948 mdt: New connect flag for non-open-by-fid lock")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I3075389721bdd40be60e9206c37f6c1bea514cce
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44273
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Li Xi [Thu, 29 Jul 2021 00:59:11 +0000 (08:59 +0800)]
EX-3562 build: failed to find lipe dir
A dollar sign is missing in the contrib/lbuild/lbuild script,
so the lipe dir cannot be found.
Test-Parameters: trivial
Change-Id: I3de918d7e8d11123e5ee8672b089220587ed6756
Signed-off-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/44418
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Sebastien Buisson [Thu, 1 Jul 2021 15:20:39 +0000 (00:20 +0900)]
LU-14804 nodemap: do not return error for improper ACL
In nodemap_map_acl(), in case the ACL is incorrect, do nothing
and just return initial size to caller.
Lustre-change: https://review.whamcloud.com/44127
Lustre-commit:
601c48f3ecaefcb644f236344e139088f76a2a07
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I26aba9ce43e4a8878bfa47e145b1b44cfff89403
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44398
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Fri, 11 Jun 2021 14:49:47 +0000 (16:49 +0200)]
LU-14739 quota: nodemap squashed root cannot bypass quota
When root on client is squashed via a nodemap's squash_uid/squash_gid,
its IOs must not bypass quota enforcement as it normally does without
squashing.
So on client side, do not set OBD_BRW_FROM_GRANT for every page being
used by root. And on server side, check if root is squashed via a
nodemap and remove OBD_BRW_NOQUOTA.
Lustre-change: https://review.whamcloud.com/43988
Lustre-commit:
a4fbe7341baf12c00c6048bb290f8aa26c05cbac
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I95b31277273589e363193cba8b84870f008bb079
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44292
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Bobi Jam [Wed, 28 Apr 2021 05:07:36 +0000 (13:07 +0800)]
LU-14647 flr: mmap write/punch does not stale other mirrors
mmap write and punch/fallocate do not mark other mirrors stale, which
leaves an FLR file with different content in different mirrors.
Lustre-change: https://review.whamcloud.com/43470/
Lustre-commit:
03511484c668355c77e54e4b01600183236d8673
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I93a3eb5ba898e3bf0ce108718506b742ed485da5
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44258
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Jian Yu [Thu, 22 Jul 2021 07:39:45 +0000 (00:39 -0700)]
LU-14871 kernel: kernel update RHEL7.9 [3.10.0-1160.36.2.el7]
Update RHEL7.9 kernel to 3.10.0-1160.36.2.el7.
Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9
Change-Id: Ie2898b1df28c8b99ea4099e94baafe388c6aa626
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44379
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lei Feng [Thu, 22 Apr 2021 09:51:26 +0000 (17:51 +0800)]
EX-3054 lipe: unlink pcc cache file if it's needed
Cache file is not guaranteed to be deleted after detach
operation returns success. So we unlink the cache file
if it's needed.
Change --force_clear option to --clear_hashdir option.
It tries to remove empty hash dir recursively up to the cache
dir root. The option is off by default.
Change-Id: I49911c688faaf6c7baa814b260a4ff492de077fc
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/43403
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lei Feng [Tue, 20 Apr 2021 00:08:59 +0000 (08:08 +0800)]
EX-3041 lipe: include reserved space into disk usage
Typically an ext4 fs reserves 5% of disk space for root. This space
is counted as used when the df command computes Use%.
So we calculate the usage in lpcc_purge in the same way
to be consistent with df.
Change-Id: I1cdee6ea66ad27bf7501cecb4a4a9495e09647a6
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/43376
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
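As a rough illustration of the EX-3041 change above, the sketch below
computes a df-style usage percentage from statvfs counters. It is not the
lpcc_purge code; the function names are hypothetical, and it assumes the
usual df convention of used / (used + space available to non-root), without
df's rounding.

```python
import os


def usage_percent(total_blocks, free_blocks, avail_blocks):
    """Usage percentage that never counts root-reserved blocks as available.

    used     = total_blocks - free_blocks
    reserved = free_blocks - avail_blocks (free, but not for non-root users)
    The denominator is used + avail_blocks, so reserved space is left out
    of the space a normal user could still fill.
    """
    used = total_blocks - free_blocks
    visible = used + avail_blocks
    if visible <= 0:
        return 0.0
    return 100.0 * used / visible


def path_usage_percent(path):
    """Apply usage_percent() to the filesystem that holds `path`."""
    st = os.statvfs(path)
    return usage_percent(st.f_blocks, st.f_bfree, st.f_bavail)
```

With 1000 total blocks, 100 free and 50 available to non-root users, this
gives about 94.7%, where a naive used/total calculation would give 90%.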
Lei Feng [Wed, 14 Apr 2021 09:30:29 +0000 (17:30 +0800)]
EX-2988 lipe: double confirm atime before detach it
For lpcc_purge, re-check the atime of a file before detaching it.
If the atime has changed, don't detach the file.
Change-Id: Ib08fc684f1c815cc8477bd8041e007541c18b6c4
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/43312
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
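The re-check described in EX-2988 above could look roughly like the sketch
below. The function name and the `detach` callback are hypothetical
stand-ins for whatever lpcc_purge actually calls, and a small window
between the stat and the detach remains.

```python
import os


def detach_if_unchanged(path, scanned_atime_ns, detach):
    """Call detach(path) only if the atime still matches the scanned value.

    Returns True if the file was detached, False if it was skipped because
    it was accessed since the scan or no longer exists.
    """
    try:
        current = os.stat(path).st_atime_ns
    except FileNotFoundError:
        return False
    if current != scanned_atime_ns:
        # The file was accessed after the scan, so it is no longer cold.
        return False
    detach(path)
    return True
```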
Lei Feng [Tue, 13 Apr 2021 00:19:52 +0000 (08:19 +0800)]
EX-2989 lipe: collect lpcc_purge stats
Collect stats data for lpcc_purge and dump them on SIGUSR1.
Change-Id: Ifbc5502e53efd3a40846e4ea8c551f05f6b0ce09
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/43287
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lei Feng [Wed, 7 Apr 2021 08:08:55 +0000 (16:08 +0800)]
EX-2975 lipe: implement approximate LRU in lpcc_purge
LRU is actually FIFO based on the atime of cache file. But we
can only use limited memory to sort the atime of all files.
So we implement an approximate LRU in lpcc_purge.
Change-Id: Ic636d97b08ee6424f54a1088dfe07359df1f8f79
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/43224
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
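The EX-2975 message above does not say how the approximation works. One
possible way to pick purge candidates with bounded memory is to keep only
the `limit` oldest entries seen during a scan, as in this sketch. The
function name and the (atime, path) tuple layout are assumptions, not the
lpcc_purge implementation.

```python
import heapq


def oldest_candidates(entries, limit):
    """Return up to `limit` (atime, path) pairs with the smallest atime.

    Memory use is bounded by `limit` regardless of how many entries are
    scanned. A max-heap (negated atime) holds the current candidates, so
    the newest candidate is the first one to be evicted.
    """
    if limit <= 0:
        return []
    heap = []
    for atime, path in entries:
        if len(heap) < limit:
            heapq.heappush(heap, (-atime, path))
        elif atime < -heap[0][0]:
            # Older than the newest candidate kept so far: swap it in.
            heapq.heapreplace(heap, (-atime, path))
    return sorted((-neg_atime, path) for neg_atime, path in heap)
```

Purging the returned batch and scanning again until enough space is freed
keeps memory bounded, at the cost of repeated scans.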
Lei Feng [Tue, 27 Apr 2021 10:50:10 +0000 (18:50 +0800)]
EX-2818 lipe: Parse arguments and config file
Add the code to parse arguments and config file, and some arguments.
Add the framework of wait-scan-free.
Change-Id: I792715badcb5a1fb1aa9062f04c2afab3229b747
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/43194
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lei Feng [Thu, 1 Apr 2021 07:02:31 +0000 (15:02 +0800)]
EX-2818 lipe: Create lpcc_purge daemon
Create an empty lpcc_purge daemon.
Change-Id: I393797cfc328e44fa89b6fc89363a4e079a26993
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/43192
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Minh Diep [Wed, 21 Jul 2021 23:17:48 +0000 (16:17 -0700)]
EX-3468 build: only include mlnx-tools for MOFED 5.4+
mlnx-tools is required starting with MOFED 5.4.
Test-Parameters: trivial
Change-Id: I8fdd4d0b4ac86335d4d7fe349c89f502d608940c
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44370
Reviewed-by: Gaurang Tapase <gtapase@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Jian Yu [Wed, 21 Jul 2021 18:48:01 +0000 (11:48 -0700)]
EX-3516 tests: skip hot-pools.sh during interop testing
Skip hot pools for interop testing. It is really a server-side
functionality, and using old/new test scripts doesn't make sense.
Test-Parameters: trivial \
serverjob=lustre-b_es5_2 serverbuildno=305 \
testlist=hot-pools
Change-Id: I26fee51cf284b4f4a75ddd4ef715dd9ff417d283
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44366
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Hongchao Zhang [Wed, 21 Jul 2021 08:44:07 +0000 (16:44 +0800)]
LU-13124 scrub: check for multiple linked file
The files on OSTs should have only one link, but a file could
have more than one link after a disk failure that produces
"multiply claimed block(s)" errors, which e2fsck fixes by cloning
the conflicting blocks. This patch adds a check for these
multiply linked files to Scrub on the OST.
The name of each object under "O" depends on the object's FID;
the directory pattern is O/[FID_SEQ]/[SUB_DIR]/[FID_OID].
The inodes of these multiply linked files are normal, but
only one directory entry is compatible with the object, so
this patch scans all files under "O" to check whether each name
matches its FID.
Lustre-change: https://review.whamcloud.com/37194
Lustre-commit:
0c1ae1cb9c19f8a4f6c5a7ff6a1fd54807430795
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I280a725939b037006935d47e9ef426a4a6a7b317
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44299
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
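The name check described in LU-13124 above can be sketched as follows. The
O/[FID_SEQ]/[SUB_DIR]/[FID_OID] pattern is from the commit message; the
"d<oid % 32>" form of SUB_DIR and the decimal formatting are assumptions
made for illustration, and the function names are hypothetical.

```python
def expected_object_path(seq, oid, subdir_count=32):
    """Build the path an object with this FID is expected to have under "O".

    Assumption: SUB_DIR is "d<oid % subdir_count>" and all components are
    written in decimal. The real on-disk formatting may differ.
    """
    return f"O/{seq}/d{oid % subdir_count}/{oid}"


def name_matches_fid(path, seq, oid, subdir_count=32):
    """True if a directory entry's path is the one derived from the FID."""
    return path == expected_object_path(seq, oid, subdir_count)
```

An extra link created by e2fsck would sit at a path that does not match the
FID stored with the inode, so `name_matches_fid()` would return False for it.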
Hongchao Zhang [Wed, 30 Jun 2021 11:15:03 +0000 (19:15 +0800)]
LU-14831 osd-ldiskfs: uninited osd_inode_id is used
In osd_fid_lookup, the "osd_inode_id" could be used uninitialized
if the FID doesn't exist in the OI, which causes a bogus FID/inode
pair to be inserted into the OI file and can trigger the OI scrub
thread repeatedly.
Lustre-change: https://review.whamcloud.com/44349
Lustre-commit: TBD (from
fd0330f95dc2b1d4f10dd137dc0bfaa7ebbf0dfb)
Change-Id: I9100dece9e94d3e590f17fb4498601876aa1edaa
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44350
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Wang Shilong [Mon, 14 Jun 2021 01:28:51 +0000 (09:28 +0800)]
LU-14729 osd-ldiskfs: fix to declare write commits
Fallocate might introduce unwritten extents, and writing
data to them will trigger extent splits, so we should reserve
credits for this case. To avoid a complicated calculation,
we just use the normal credits calculation if the extent is mapped
as unwritten.
See comments in ext4:
If we add a single extent, then in the worse case, each tree
level index/leaf need to be changed in case of the tree split.
If more extents are inserted, they could cause the whole tree
split more than once, but this is really rare.
Lustre always reserves for the 1 extent case; this is wrong.
Also fix indirect blocks calculation.
Lustre-change: https://review.whamcloud.com/43994
Lustre-commit:
9810341a839c27b7a53cdc047e0395f8f906c4bf
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I9b67ec7b002711f040f46d0c77a645bb6f57a7de
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44280
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Wang Shilong [Wed, 2 Jun 2021 01:52:39 +0000 (09:52 +0800)]
LU-14729 osd-ldiskfs: declare dirty block groups correctly
The dirty block group calculation only includes the estimated extents;
indirect blocks and extent node/leaf blocks are missed, which
could leave us short of credits.
Lustre-change: https://review.whamcloud.com/43890
Lustre-commit:
42cda8781f94ad1138afac2d23180ea48f3c3450
Fixes:
0271b17b80a82 ("LU-14134 osd-ldiskfs: reduce credits for new writing")
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: Iec8525823b04e909c030f94bf75b8eca60d31c50
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44279
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Oleg Drokin [Thu, 3 Jun 2021 00:10:47 +0000 (20:10 -0400)]
LU-10948 mdt: New connect flag for non-open-by-fid lock
We removed the 2.1 check for open by fid when an open
lock is requested, but clients talking to old servers that don't
have that patch get an open error, so introduce a compat
flag.
Lustre-commit:
72c9a6e5fb6e11fca1b1438ac18f58ff7849ed7d
Lustre-change: https://review.whamcloud.com/43907/
Change-Id: I94d50ad98a2828519853a35fa90c5063adf2feab
Fixes:
41d99c4902 ("LU-10948 llite: Introduce inode open heat counter")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44260
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Oleg Drokin [Tue, 13 Apr 2021 07:46:41 +0000 (03:46 -0400)]
LU-10948 llite: Introduce inode open heat counter
Initial framework to support detection of naive apps that
assume open-closes are "free" and proceed to open/close
same files between minute operations.
We will track number of file opens per inode and last time inode
was closed.
Initially we'll expose these controls:
* llite/opencache_threshold_count - enables functionality and controls
  after how many opens open lock is requested
* llite/opencache_threshold_ms - if any reopen happens within this time
  (in ms), open would trigger open lock request
* llite/opencache_max_ms - if last close was longer than this many ms
  ago, start counting opens from zero again
Once enough useful data is collected we can look into adding a heatmap
or another similar mechanism to better manage it and enable it
by default with sensible settings.
Currently it's disabled by default.
Lustre-change: https://review.whamcloud.com/32158/
Lustre-commit:
41d99c4902836b7265db946dfa49cf99381f0db4
Change-Id: I1aa5455b458840acad651f651c883a7a7a67ab4c
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/44301
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
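To make the interaction of the three LU-10948 tunables above easier to
follow, here is a small model of the decision. It is a sketch built only
from the descriptions in the commit message; the class, its methods, and
the exact comparisons (for example ">" versus ">=") are assumptions, not
the llite code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OpenHeat:
    threshold_count: int = 0      # 0 models "disabled by default"
    threshold_ms: int = 0         # reopen sooner than this asks for a lock
    max_ms: int = 0               # idle longer than this resets the count
    open_count: int = 0
    last_close_ms: Optional[int] = None

    def on_open(self, now_ms):
        """Return True if this open should request an open lock."""
        if self.threshold_count <= 0:
            return False
        since_close = None
        if self.last_close_ms is not None:
            since_close = now_ms - self.last_close_ms
        if since_close is not None and self.max_ms and since_close > self.max_ms:
            self.open_count = 0   # too long since last close: start over
        self.open_count += 1
        if self.open_count > self.threshold_count:
            return True
        if since_close is not None and self.threshold_ms:
            return since_close < self.threshold_ms
        return False

    def on_close(self, now_ms):
        """Record the time of the last close for this inode."""
        self.last_close_ms = now_ms
```

In this model an inode that is reopened many times in quick succession
starts requesting open locks, while one that sits idle for longer than
max_ms starts counting from zero again.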
Wang Shilong [Mon, 26 Apr 2021 03:23:26 +0000 (11:23 +0800)]
LU-14641 osd-ldiskfs: write commit declaring improvement
This patch tries to:
1) Extent bytes could fail to increase when less than
1M; fix it to compare against the current value, and decay
it for every allocation.
2) As filesystem space usage grows, the mballoc code won't
try its best to scan block groups for the best-aligned free
extent. So extent bytes per extent could decay to a
very small value, which could make us reserve too many credits.
We could be more optimistic in the credit reservations; even
in a case where the filesystem is nearly full, it is extremely
unlikely that the worst case would ever be hit.
3) Add extent bytes stats and debug ability to analyze the
over-reservation problem.
Lustre-commit:
0f81c5ae973bf7fba45b6ba7f9c5f4fb1f6eadcb
Lustre-change: https://review.whamcloud.com/43446
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I357c4a855147ba26a9e9bbe9ab1269bcfd44e5f3
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44244
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Tue, 20 Jul 2021 23:23:49 +0000 (17:23 -0600)]
RM-620 build: New tag 2.14.0-ddn6
New tag 2.14.0-ddn6
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1e04c0d5f8e79eec8d48cbec37fe576108dded45
Sebastien Buisson [Tue, 23 Mar 2021 14:19:01 +0000 (14:19 +0000)]
LU-13717 sec: rework includes for client encryption
Simplify includes for crypto, by not repeating stubs in case
HAVE_LUSTRE_CRYPTO is not defined.
Expose encoding routines that are going to be used in the Lustre
code (both client and server sides) with filename encryption.
Lustre-change: https://review.whamcloud.com/43386
Lustre-commit:
028281ae195927e97518c573e7be3e326d9e6bd3
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I6c5853d6da7120edd2bec3a12494251d873151a8
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44187
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Thu, 22 Apr 2021 09:26:51 +0000 (11:26 +0200)]
LU-14629 sec: forbid file rename from enc to unencrypted dir
fscrypt allows renaming an encrypted file from an encrypted directory
into an unencrypted directory. But it leaves the file encrypted,
sitting in an unencrypted directory, which can lead to unexpected
issues.
So just prevent this kind of rename, and adapt sanity-sec test_47
accordingly.
Lustre-change: https://review.whamcloud.com/43404
Lustre-commit:
1158386ac9c6a638f791f62e47a7513b2322772c
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I38e17caa4786c1c8d80a363a826a5aa298eb0980
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43908
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Serguei Smirnov [Tue, 8 Jun 2021 21:11:41 +0000 (14:11 -0700)]
LU-14742 socklnd: detect link state to set fatal error on ni
To help avoid selecting lnet ni which corresponds to a downed
ethernet link for sending, add a mechanism for detecting link
events in socklnd. On link up/down events, find corresponding
ni and toggle ni_fatal_error_on flag, similar to o2iblnd way.
Lustre-change: https://review.whamcloud.com/43952
Lustre-commit:
fc2df80e96dc5db9f3fb710893ccf6f442664471
Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ie9f4f02fcb8b988c77bf63f751d5a621e79e9f58
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44329
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Serguei Smirnov [Tue, 30 Mar 2021 16:58:57 +0000 (12:58 -0400)]
LU-12815 socklnd: add conns_per_peer parameter
Introduce conns_per_peer ksocklnd module parameter.
In typed mode, this parameter shall control
the number of BULK_IN and BULK_OUT tcp connections,
while the number of CONTROL connections shall stay
at 1. In untyped mode, this parameter shall control
the number of untyped connections.
The default conns_per_peer is 1. Max is 127.
Lustre-change: https://review.whamcloud.com/41056
Lustre-commit:
71b2476e4ddb95aa42f4a0ea3f23b1826017bfa5
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I70bbaf7899ae1fbc41de34553c8c4ad1c7d55f7e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44328
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
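The effect of conns_per_peer described in LU-12815 above can be summarised
with a small helper. The function name and the "UNTYPED" key are
hypothetical, and the lower bound of 1 is an assumption; the commit message
only states the default (1) and the maximum (127).

```python
def planned_connections(conns_per_peer=1, typed=True):
    """Return the number of TCP connections per peer, by connection type.

    Typed mode: one CONTROL connection plus conns_per_peer each of
    BULK_IN and BULK_OUT. Untyped mode: conns_per_peer untyped connections.
    """
    if not 1 <= conns_per_peer <= 127:
        raise ValueError("conns_per_peer must be between 1 and 127")
    if typed:
        return {
            "CONTROL": 1,
            "BULK_IN": conns_per_peer,
            "BULK_OUT": conns_per_peer,
        }
    return {"UNTYPED": conns_per_peer}
```

For example, conns_per_peer=4 in typed mode gives 9 connections per peer
(1 + 4 + 4), and 4 in untyped mode.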
Serguei Smirnov [Tue, 30 Mar 2021 16:48:04 +0000 (12:48 -0400)]
LU-13641 socklnd: replace route construct
With TCP bonding removed, it's no longer necessary to
maintain multiple route constructs per peer_ni in socklnd.
Replace the route construct with connection control block,
conn_cb, and make sure there's a single conn_cb per peer_ni.
Lustre-change: https://review.whamcloud.com/40774
Lustre-commit:
7766f01e891c378d1bf099e475f128ea612488f0
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I1de683429af5f93b3197b6d536e80b5ac1e67a22
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44327
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Serguei Smirnov [Tue, 16 Mar 2021 21:34:26 +0000 (17:34 -0400)]
LU-13641 socklnd: remove tcp bonding
TCP bonding in the socklnd has become obsolete with LNet
Multi-Rail and there's no evidence it's being used anywhere.
Remove it to keep the code simple.
Lustre-change: https://review.whamcloud.com/40000
Lustre-commit:
d123c47a18adbf5665ed63d99c53117b84db9ec8
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ib456f951b8ccd59112c460085632a2cb3c982004
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44326
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Qian Yingjin [Thu, 1 Jul 2021 07:49:58 +0000 (15:49 +0800)]
EX-3409 pcc: add owner capacity check for open attach
This patch adds owner and capacity checks when trying to auto attach
at open() time.
Add sanity-pcc test_43.
For the command "lfs pcc attach_fid", make it more convenient:
the "-m" parameter is no longer mandatory when specifying
the mount point.
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Icb8cbddb64c5712e2db970120b06fc1a0216c332
Reviewed-on: https://review.whamcloud.com/44123
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng, Lei <flei@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Chris Horn [Fri, 23 Apr 2021 19:05:02 +0000 (14:05 -0500)]
LU-14627 tests: Create unload_modules_local
t-f allows for loading modules on single node via load_modules_local.
However, there is no corresponding unload_modules_local that can be
called to cleanup after call to load_modules_local, so we create it.
unload_modules() refactored to use unload_modules_local.
Also address a potential issue that can prevent LND modules from
unloading. Some LNet setup (particularly those in sanity-lnet) may
require that we call lnetctl lnet unconfigure (or lctl net down)
to drop a ref on the module before it can be unloaded.
Lustre-change: https://review.whamcloud.com/43425
Lustre-commit:
32304d863ae98c641f541362f54e7b1f24b350a6
HPE-bug-id: LUS-9031
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I6458a7728f5f559f8641c5a9e29dd775c8445c38
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44255
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Fri, 15 Jan 2021 01:21:11 +0000 (18:21 -0700)]
LU-12125 tests: allow racer to specify extra tasks
Add the RACER_EXTRA environment variable to allow racer.sh to run
extra tasks.
Lustre-change: https://review.whamcloud.com/41231
Lustre-commit:
fe2663f18e50023ad5cfe5e07b695378dd27a68e
Test-Parameters: trivial testlist=racer
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic810a248a2dd665a163e0efea8c9af0e4461e09b
Reviewed-on: https://review.whamcloud.com/41961
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Bobi Jam [Tue, 30 Mar 2021 11:20:26 +0000 (19:20 +0800)]
LU-14526 flr: mirror split downgrade SOM
After a mirror split, the file's block count in SoM is not accurate.
This patch downgrades the SoM from STRICT so that size glimpse does not
trust the SoM from the MDS.
Lustre-change: https://review.whamcloud.com/43168
Lustre-commit:
a30750ad2cc5f10d9d1cc0e30199073091c06f2b
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I02350c24190d96af93fed8c1b8a0bc6beb2c4bc2
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44253
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mikhail Pershin [Tue, 22 Jun 2021 18:16:26 +0000 (21:16 +0300)]
LU-13055 mdd: per-user changelog names and mask
Allow specifying a name for newly-registered changelog users,
rather than the default "clNNN" that is otherwise used. This
allows services to register a "well-known" changelog user,
rather than having to store the changelog username in HA storage
outside of the filesystem.
Each changelog user still has a unique ID appended to it, to allow
the changelog_clear and changelog_deregister commands to be run
using only the ID if necessary/desired. User name can be used to
deregister. User name is also unique per server.
If no name is given, then default "cl" format is used.
With this new functionality, it is possible to specify the name like:
# lctl --device testfs-MDT0000 changelog_register --user watcher
testfs-MDT0000: Registered changelog userid 'cl13-watcher'
Per-user mask is also added to allow specific operation logging on
per-user basis. Mask can be set only during registration. Resulting
mask from per-server mask and all user masks is used for current
changelog operations.
Lustre-change: https://review.whamcloud.com/43380
Lustre-commit:
a15eb4f13224e148810015896101b2950c85adff
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I56028f54cc97bbc9af03fd6559c19ef854f759d8
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/44283
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
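A service that registers a named changelog user, as in LU-13055 above, may
want to split the returned userid back into its numeric ID and name. The
sketch below assumes the "cl<ID>-<name>" layout inferred from the single
'cl13-watcher' example in the commit message, with a bare "cl<ID>" when no
name was given; the function name is hypothetical.

```python
import re

# Assumed layout: "cl" + decimal ID, optionally followed by "-" + name.
_USER_RE = re.compile(r"^cl(\d+)(?:-(.+))?$")


def parse_changelog_user(userid):
    """Split a changelog userid into (numeric_id, name or None)."""
    match = _USER_RE.match(userid)
    if match is None:
        raise ValueError(f"not a changelog userid: {userid!r}")
    return int(match.group(1)), match.group(2)
```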
John L. Hammond [Wed, 2 Jun 2021 17:05:01 +0000 (12:05 -0500)]
LU-14731 mdd: clear orphans changelog entries
In mdd_changelog_llog_init(), adjust the orphan changelog index logic
to account for the case when no users are registered. Add sanity
test_160n() to verify this.
Lustre-change: https://review.whamcloud.com/43901
Lustre-commit:
c7d8fe31064990d7053436ccd720c531bc78a2dc
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I03b0c1002a0e16f26af8ec23bf06c9a07dec858a
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44282
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alexander Boyko [Mon, 17 May 2021 13:29:01 +0000 (09:29 -0400)]
LU-14688 mdt: changelog purge deletes plain llog
When cancelling a large number of records, changelog can delete a
whole plain llog file and skip cancelling records one by one.
The patch also fixes the race between llog_destroy and llog_next_block.
Lustre-change: https://review.whamcloud.com/43719
Lustre-commit:
d813c75df6798efbf3228347628c0d671ca7269c
HPE-bug-id: LUS-9950
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I47c2ed97945e979745255381f83b6a417d7ba8b1
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44262
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Wang Shilong [Wed, 17 Mar 2021 09:58:00 +0000 (17:58 +0800)]
LU-12142 readahead: limit over reservation
For performance reasons, exceeding @ra_max_pages is allowed in order to
cover the current read window, but this should be limited by the RPC
size in case a large block size read is issued. Trim to the RPC boundary.
Otherwise, too many readahead pages might be issued, leaving the
client short of LRU pages.
Lustre-commit:
1058867c004bf19774218945631a691e8210b502
Lustre-change: https://review.whamcloud.com/42060
Fixes:
777b04a093 ("LU-13386 llite: allow current readahead to exceed reservation")
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: Icf74b5fbc75cf836fedcad5184fcdf45c7b037b4
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/43455
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Wed, 10 Mar 2021 10:13:18 +0000 (18:13 +0800)]
LU-14504 lod: lod_xattr_del() check obj existence
lod_declare_xattr_del() skips object if it doesn't exist, but
lod_xattr_del() doesn't, which may trigger assertion in
osp_xattr_del() if a stripe doesn't exist.
Lustre-change: https://review.whamcloud.com/41976
Lustre-commit:
c8d81a1d1d82fede40ae95924aca12bc5e55426d
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I00723d3b0243efd1357107c59dd86967e076e2af
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/42047
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alexander Boyko [Mon, 12 Apr 2021 12:19:47 +0000 (08:19 -0400)]
LU-14606 llog: hide ENOENT for cancelling record
Llog allows parallel record processing. A record could be cancelled
in a callback. If two threads process and cancel the same record,
one thread would get ENOENT.
The error was observed while purging changelog records. The patch
adds reproducer test sanity 160m.
This is a valid case, so hide the ENOENT error from the caller.
Lustre-change: https://review.whamcloud.com/43264
Lustre-commit:
0b60647c0382426e3b4105d82d04862d2e4831cb
HPE-bug-id: LUS-9826
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Id00b959e6f329c2ad34966f8a17a52f71680f24c
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44333
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
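The idea behind LU-14606 above, treating "already cancelled" as success
when two threads race on the same record, can be modelled as below. The
class and method names are hypothetical and the in-memory set only stands
in for an llog; this is not the llog code.

```python
import errno
import threading


class RecordLog:
    """Toy model of a log whose records may be cancelled concurrently."""

    def __init__(self, record_ids):
        self._live = set(record_ids)
        self._lock = threading.Lock()

    def _cancel_raw(self, rec_id):
        """Low-level cancel: returns 0, or -ENOENT if already cancelled."""
        with self._lock:
            if rec_id not in self._live:
                return -errno.ENOENT
            self._live.remove(rec_id)
            return 0

    def cancel(self, rec_id):
        """Cancel a record, hiding -ENOENT from the caller.

        Losing the race to another thread is a valid outcome: the record
        is gone either way, so report success.
        """
        rc = self._cancel_raw(rec_id)
        return 0 if rc == -errno.ENOENT else rc
```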
Wang Shilong [Tue, 22 Jun 2021 01:26:40 +0000 (09:26 +0800)]
LU-14778 readahead: fix to reserve min pages
@pages_min might be larger than @pages, which indicates
that more pages should be read, and this will cause a warning
later.
Lustre-change: https://review.whamcloud.com/44050
Lustre-commit:
4fc127428f00d6a3b179a143a61ddc78e5d8ca7c
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: Ifd82f709c3877172f08b87ab0551da735a0613e0
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/44287
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Bobi Jam [Mon, 17 May 2021 09:14:33 +0000 (17:14 +0800)]
LU-14549 llite: refresh layout after mirror merge/split
mirror merge/split updates file's LOVEA and revokes client's layout
lock, but the client issuing the layout change needs to refresh its
layout (lov->lsm) as well.
Lustre-change: https://review.whamcloud.com/43716
Lustre-commit:
bd7a20f8be4644ebce6a7225560ae933204f543d
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I7671efe2fe5354ba0e1503b146045165608e042c
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44252
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Thu, 14 Jan 2021 09:14:01 +0000 (17:14 +0800)]
LU-14119 lfsck: check linkea if it's newly added
In LFSCK phase one, if new linkea entry is added, and final linkea
entry count is more than one, add file in trace file, so that the
linkea sanity will be checked in phase two.
And in phase two check, if link parent FID can't be mapped to valid
inode, remove it from linkea.
Add sanity-lfsck 1d, which changes the parent directory FID in LMA
so that the FID in LMA mismatches the parent FID in the child linkea,
and verifies that LFSCK can fix such an inconsistency.
Lustre-change: https://review.whamcloud.com/41261
Lustre-commit:
afd00cacd0b6ef87282887b4e965350a9c1a6821
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I315983d262110c1e36c3893fa2e51925d96c51a7
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44237
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Sat, 5 Jun 2021 08:34:15 +0000 (02:34 -0600)]
LU-14734 osd-ldiskfs: enable large_dir automatically
Enable the large_dir feature automatically at mount time for
filesystems that do not have it enabled already. Otherwise,
the REMOTE_PARENT_DIR may overflow if there are many remote
entries created, or for object directories on very large OSTs.
It isn't really needed on a dedicated MGS filesystem.
Lustre-change: https://review.whamcloud.com/43931
Lustre-commit:
0f6ace6e8edef1c08c8ef3785e9c08d21a72b34a
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1c4ead26b09d60567ad12945d7b366b53475cebb
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44264
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Bobi Jam [Wed, 12 May 2021 08:18:00 +0000 (16:18 +0800)]
LU-14648 lod: protect lod_object layout info
Need to protect lod_object's layout access with ldo_layout_mutex.
Lustre-commit:
25aa8527374f8120c113dc12adb1366a1ab98152
Lustre-change: https://review.whamcloud.com/43671
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I2c4a2078bdce64d15485d3ff18f6670d42ca90ba
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44256
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Sun, 2 May 2021 09:16:01 +0000 (12:16 +0300)]
LU-14663 mdc: start changelog thread upon first access
thus leaving the caller a chance to set CHANGELOG_FLAG_FOLLOW,
otherwise the thread (started from open()) can reach the end
of the changelog and exit early.
Lustre-change: https://review.whamcloud.com/43513
Lustre-commit:
72a08ea547dceb542d554e9057e0ed257138bd48
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ic14b6c991010bbe5197b5a8b0fedf0f4007e98c1
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44263
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Mon, 14 Jun 2021 07:26:47 +0000 (15:26 +0800)]
LU-14762 lmv: compare space to mkdir on parent MDT
In QOS subdirectory creation, subdirectories are kept on the parent MDT
if it is less full than average. However, the check compares weight
rather than free space, and since "weight = free space - penalty", the
result is not accurate when MDTs have different penalties, so the check
may not work.
Check free space instead, and loosen the criterion to allow the
free space within the range of QOS threshold.
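A minimal sketch of the two comparisons described above, written in Python for brevity (the real code is C in lmv). The function names, the dictionary layout, and the way the threshold is applied as a percentage below the average are assumptions for illustration; the message only states that weight = free space - penalty and that free space within the QOS threshold range is accepted.

```python
def stay_on_parent_by_weight(parent, mdts):
    """Old check (as described): compare parent weight with average weight."""
    avg_weight = sum(m["free"] - m["penalty"] for m in mdts) / len(mdts)
    return parent["free"] - parent["penalty"] >= avg_weight


def stay_on_parent_by_free(parent, mdts, threshold_pct=0):
    """New check (sketch): compare free space, tolerating a threshold.

    threshold_pct is a hypothetical parameter standing in for the
    QOS threshold; how the real code applies it is not shown here.
    """
    avg_free = sum(m["free"] for m in mdts) / len(mdts)
    return parent["free"] >= avg_free * (100 - threshold_pct) / 100


# The parent has the most free space but also a large penalty.
parent = {"free": 100, "penalty": 40}
other = {"free": 90, "penalty": 0}
mdts = [parent, other]
```

With these made-up numbers the weight-based check rejects the parent even though it has more free space than the other MDT, while the free-space check keeps the subdirectory on it.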
Lustre-change: https://review.whamcloud.com/43997
Lustre-commit:
002c2a80266b23c1df02d554fbdc7e5817c42d13
Fixes:
3f6fc483013d ("LU-13439 lmv: qos stay on current MDT if less full")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Id34cf8f3f58fee9d329f0d05c2f7a6463b67dfe1
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44314
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Rahul Deshmkuh [Thu, 14 Jul 2016 06:02:45 +0000 (23:02 -0700)]
LU-7853 lod: fixes bitfield in lod qos code
Updates to bitfields in struct lod_qos are protected
by lq_rw_sem in most places, but an update can be lost
due to unprotected bitfield access from
lod_qos_thresholdrr_seq_write() and qos_prio_free_store().
This patch fixes it by replacing bitfields with named bits
and atomic bitops.
Lustre-change: https://review.whamcloud.com/18812
Lustre-commit:
3bae39f0a5b98a279fb5f7b8d00211ac0d09366f
Cray-bug-id: LUS-4651
Signed-off-by: Rahul Deshmukh <rahul.deshmukh@seagate.com>
Signed-off-by: Alexander Zarochentsev <c17826@cray.com>
Change-Id: I28299ce4960e91be551d7f6e43a3b598daf4d7a2
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-on: https://review.whamcloud.com/44313
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alexander Boyko [Wed, 14 Oct 2020 08:20:58 +0000 (04:20 -0400)]
LU-14031 ptlrpc: decrease time between reconnection
When a connection times out or gets an error reply from a server,
the next attempt happens after PING_INTERVAL, which is equal to
obd_timeout/4. When the first reconnection fails, the second goes to
the failover pair, and the third goes back to the original server.
That leaves only 3 reconnections before the server evicts the client
based on the blocking AST timeout. Sometimes the first fails and the
last is a bit late, so the client is evicted. It is better to retry
with a timeout equal to the connection request deadline, which would
increase the number of attempts by 5 times for a large obd_timeout.
For example,
obd_timeout=200
- [ 1597902357, CONNECTING ]
- [ 1597902357, FULL ]
- [ 1597902422, DISCONN ]
- [ 1597902422, CONNECTING ]
- [ 1597902433, DISCONN ]
- [ 1597902473, CONNECTING ]
- [ 1597902473, DISCONN ] <- ENODEV from a failover pair
- [ 1597902523, CONNECTING ]
- [ 1597902539, DISCONN ]
The patch adds logic to wake up the pinger for a connection request
that failed with ETIMEDOUT or ENODEV. It adds imp_next_ping processing
to the ptlrpc_pinger_main() time_to_next_wake calculation, and fixes
the setting of the imp_next_ping value.
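The timing in the example can be checked with a short sketch (Python, for illustration only). The event list is copied from the import state log in this message; the helper names are hypothetical, and PING_INTERVAL is taken to be obd_timeout/4 as stated above.

```python
# Import state log from this commit message (obd_timeout=200).
EVENTS = [
    (1597902357, "CONNECTING"),
    (1597902357, "FULL"),
    (1597902422, "DISCONN"),
    (1597902422, "CONNECTING"),
    (1597902433, "DISCONN"),
    (1597902473, "CONNECTING"),
    (1597902473, "DISCONN"),
    (1597902523, "CONNECTING"),
    (1597902539, "DISCONN"),
]


def ping_interval(obd_timeout):
    """PING_INTERVAL as described in the message: obd_timeout / 4."""
    return obd_timeout // 4


def reconnect_gaps(events):
    """Seconds between consecutive reconnect attempts.

    The first CONNECTING entry is the initial connect, so it is skipped.
    """
    times = [t for t, state in events if state == "CONNECTING"]
    return [b - a for a, b in zip(times[1:], times[2:])]
```

The gaps between reconnect attempts in the log are close to ping_interval(200), which is the spacing the patch aims to shorten.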
Lustre-commit:
de8ed5f19f04136a4addcb3f91496f26478d03e7
Lustre-change: https://review.whamcloud.com/40244
HPE-bug-id: LUS-8520
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Ia0891a8ead1922810037f7d71092cd57c061dab9
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-on: https://review.whamcloud.com/44251
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Sun, 11 Apr 2021 02:04:30 +0000 (20:04 -0600)]
LU-14603 ptlrpc: quiet messages for unsupported opcodes
Reduce message spew for unhandled RPC opcodes.
Lustre-change: https://review.whamcloud.com/43257
Lustre-commit:
a23767580aebfab7f093df562ac7598e85b71b3e
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I35496168e3aa29ecb06076654ef0aa97ba2540e5
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Reviewed-on: https://review.whamcloud.com/44249
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Wang Shilong [Wed, 28 Apr 2021 14:26:10 +0000 (22:26 +0800)]
LU-14541 llite: avoid stale data reading
remove_mapping() can fail to remove a page from the page cache when
the page refcount != 2. In vvp_page_delete(), clear the uptodate flag
to avoid reading stale data later.
Lustre-change: https://review.whamcloud.com/43476
Lustre-commit: TBD (from
e6033b193e8d35e689b7c2860374c8b2d2b7a5ee)
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I322debec951b1a342246475456c0f40e10b0e578
Reviewed-on: https://review.whamcloud.com/44291
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Wang Shilong [Fri, 16 Apr 2021 02:04:17 +0000 (10:04 +0800)]
LU-14616 readahead: export pages directly without RA
With readahead disabled, @vpg_defer_uptodate should not
be set, as we don't reserve credits for such a read.
Otherwise vvp_page_completion_read() will call ll_ra_count_put(),
which makes @ra_cur_pages negative.
Lustre-change: https://review.whamcloud.com/43338
Lustre-commit:
9f1c0bfd10d619a3755c3b22b1dd95a593720ce9
Fixes:
7e8efb339b ("LU-12043 llite: fix to submit complete read block with ra disabled")
Change-Id: I1c9134f5972aa0d0e7aac998f02c690cc55b433b
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/44246
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Bobi Jam [Fri, 16 Apr 2021 15:56:01 +0000 (23:56 +0800)]
LU-14618 lov: correctly handling sub-lock init failure
In lov_lock_sub_init(), if a sublock initialization fails, it needs to
bail out of the outer loop as well as the inner one.
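A generic illustration of the control-flow issue, in Python. This is not the structure of lov_lock_sub_init(), which is not shown here; the function and parameter names are hypothetical. The point is only that a failure inside the inner loop has to stop the outer loop too.

```python
def init_sublocks(entries, init_one):
    """Initialize every stripe of every entry, stopping at the first error.

    entries  -- list of lists of stripes (hypothetical layout)
    init_one -- callable returning 0 on success or a negative errno
    """
    rc = 0
    initialized = []
    for entry in entries:
        for stripe in entry:
            rc = init_one(stripe)
            if rc != 0:
                break          # leaves the inner loop only
            initialized.append(stripe)
        if rc != 0:
            break              # without this, the next entry would still run
    return rc, initialized


calls = []


def fail_on_two(stripe):
    """Test helper: record the call and fail with -5 on stripe 2."""
    calls.append(stripe)
    return -5 if stripe == 2 else 0


result = init_sublocks([[1, 2], [3, 4]], fail_on_two)
```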
Lustre-change: https://review.whamcloud.com/43345
Lustre-commit:
1a5169f9962e254ed4225fe35e8ee6cb6ff7a7f6
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ic4e16f484a0a64c670eea5d47054bac19bc95144
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44245
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alexander Boyko [Wed, 14 Oct 2020 07:45:21 +0000 (03:45 -0400)]
LU-14031 ptlrpc: remove unused code at pinger
The timeout_list was previously used for grant shrinking,
but right now is dead code.
Lustre-change: https://review.whamcloud.com/40243
Lustre-commit:
f02266305941423a10e8e6ec33a5865e24c18432
HPE-bug-id: LUS-8520
Fixes:
fc915a43786e ("LU-8708 osc: depart grant shrinking from pinger")
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Ia7a77b4ac19da768ebe1b0879d7123941f4490b5
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44250
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Chris Horn [Wed, 21 Apr 2021 19:22:46 +0000 (14:22 -0500)]
LU-14627 lnet: Allow delayed sends
The net_delay_add has some code related to delaying sends, but it
isn't fully implemented. Modify lnet_post_send_locked() to check
whether the message being sent matches a rule and should be delayed.
Fix some bugs with how the delay timers were set and checked.
Lustre-change: https://review.whamcloud.com/43416
Lustre-commit:
ab14f3bc852e708100d21770c00235f95841708a
HPE-bug-id: LUS-7651
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Icbd9ee81d2ff0162a01a4187807ea2114a42276d
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44254
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Chris Horn [Fri, 19 Mar 2021 18:22:26 +0000 (13:22 -0500)]
LU-14540 o2iblnd: Use REMOTE_DROPPED for ECONNREFUSED
ECONNREFUSED means that we received a response from the remote end,
so setting the LNet health status to REMOTE_DROPPED is more
appropriate than setting LOCAL_DROPPED. Using REMOTE_DROPPED will
decrement the peer NI health and allow us to try other peer NIs for
future sends.
Decrementing the peer NI health will also result in routes being
marked down, as appropriate, for cases where a router has refused the
connection request.
Lustre-commit:
f9d837b479232bfc4f271f23cd3729ca67cb6c1d
Lustre-change: https://review.whamcloud.com/42114
Test-Parameters: trivial
HPE-bug-id: LUS-9853
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I8190f5d78a76ec25553908c4f215362c0c2051fc
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/44248
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lai Siyao [Sat, 13 Mar 2021 15:48:54 +0000 (23:48 +0800)]
LU-14537 mdd: directory migrate skips project ID check
mdd_migrate_sanity_check() used to call mdd_rename_sanity_check(),
while the latter checks parent and sub file project ID, which is
redundant for migration because it's an internal layout change.
Add sanity 230t.
Lustre-change: https://review.whamcloud.com/42110
Lustre-commit:
d32beef5ddde49362a849efabf514206a1adf6c4
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: If5ac2131acb1dfb30a312dc34052287776f581c7
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44247
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
John L. Hammond [Thu, 15 Jul 2021 14:34:59 +0000 (09:34 -0500)]
EX-2921 lipe: merge lipe changes from b_es5_2
Merge commit 'b6fee2b803e7ae817f52911c9475e21227260d1d' into b_es6_0
$ git checkout b_es5_2
$ git subtree split --prefix=lipe
39ab2a09b51500ad850472110b31fa42576e1551
$ git checkout b_es6_0
$ git subtree merge --prefix=lipe --squash
39ab2a09b51500ad850472110b31fa42576e1551
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ie1cde4bb85d6c80882e0479e85990b9e71c2da35
John L. Hammond [Thu, 15 Jul 2021 14:25:52 +0000 (09:25 -0500)]
Squashed 'lipe/' changes from 8251fae..39ab2a0
39ab2a0 EX-3379 lipe: update EXT2_ET_EA_NAME_NOT_FOUND kluge
7c8d939 EX-3388 lipe: use correct timestamps
e1ec215 Update lipe version to 1.18.
38a55f7 EX-3058 lamigo: drop NO_ACCT tag upon replication completion
79c9922 EX-3138 lipe: lpurge slot mutex handling
d0ddcc9 EX-3100 lamigo: check rj_agent pointer
587b1ef EX-3030 lamigo: more stats
87fd5d6 EX-3046 lipe: remove IML sockets
954971d EX-2659 tests: add sanity-lipe.sh to test LiPE utilities
git-subtree-dir: lipe
git-subtree-split:
39ab2a09b51500ad850472110b31fa42576e1551
Patrick Farrell [Tue, 15 Jun 2021 14:23:04 +0000 (10:23 -0400)]
LU-13799 osc: Improve osc_queue_sync_pages
This patch was split and partially done in:
https://review.whamcloud.com/38214
So the text below refers to the combination of this patch
and that one. This patch now just improves a looped atomic
add by replacing it with a single one. The rest of the grant
calculation change is in
https://review.whamcloud.com/38214
(I am retaining the text below to show the performance
improvement)
----------
osc_queue_sync_pages now has a grant calculation component,
this has a pretty painful impact on the new faster DIO
performance. Specifically, per page ktime_get() and the
per-page atomic_add cost close to 10% of total CPU time in
the DIO path.
We can make this per batch of pages rather than for each
page, which reduces this cost from 10% of CPU to almost
nothing.
This improves write performance by about 10% (but has no
effect on reads, since they don't use grant).
This patch reduces i/o time in ms/GiB by:
Write: 10 ms/GiB
Read: 0 ms/GiB
Totals:
Write: 158 ms/GiB
Read: 161 ms/GiB
mpirun -np 1 $IOR -w -t 1G -b 64G -o $FILE --posix.odirect
Before patch:
write 6071
After patch:
write 6470
(Read is similar.)
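The ms/GiB figures can be reproduced from the IOR numbers with a small sketch (Python). It assumes the IOR write results are in MiB/s, which this message does not state explicitly; the helper name is hypothetical.

```python
def ms_per_gib(mib_per_s):
    """Time in milliseconds to transfer 1 GiB at the given MiB/s rate."""
    return 1024 / mib_per_s * 1000


before = ms_per_gib(6071)   # write rate before the patch
after = ms_per_gib(6470)    # write rate after the patch
saved = before - after      # reduction in ms/GiB
```

Under that assumption, `after` rounds to the 158 ms/GiB write total and `saved` rounds to the 10 ms/GiB reduction quoted above.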
This also fixes a mistake in
c24c25dc1b / LU-13419 where it
removed the shrink interval update entirely from the direct
i/o path.
Lustre-change: https://review.whamcloud.com/39482
Lustre-commit: TBD (from
ad6f5a41a14d6017e657d5c337fa24e96252bbeb)
Fixes:
c24c25dc1b ("LU-13419 osc: Move shrink update to per-write")
Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: Ic606e03be58239c291ec0382fa89eba64560da53
Reviewed-on: https://review.whamcloud.com/44270
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Qian Yingjin [Wed, 14 Jul 2021 08:41:02 +0000 (16:41 +0800)]
EX-3480 pcc: set return code with errno
In liblustreapi_pcc.c, it should set errno on error return.
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ibc80ea7a593f153744f10483720db9cb79ef060a
Reviewed-on: https://review.whamcloud.com/44302
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Patrick Farrell [Mon, 13 Apr 2020 16:23:42 +0000 (11:23 -0500)]
LU-13419 osc: Move shrink update to per-write
Updating the grant shrink interval is currently done for
each page submitted, rather than once per write. Since
the grant shrink interval is in seconds, this is
unnecessary.
This came up because this function showed up in the perf
traces for https://review.whamcloud.com/#/c/38151/, and
it is called with the cl_loi_list_lock held.
Note that this change makes this access to the grant shrink
interval a 'dirty' access, without locking, but the grant
shrink interval is:
A) Already accessed like this in various places, and
B) can safely be out of date or suffer a lost update
without affecting correctness or performance.
IOR performance testing with this test:
mpirun -np 36 $IOR -o $LUSTRE -w -t 1M -b 2G -i 1 -F
No patches:
5942 MiB/s
With 38151:
14950 MiB/s
With 38151+this:
15320 MiB/s
Lustre-change: https://review.whamcloud.com/38214
Lustre-commit:
c24c25dc1b84912063f79e44602526c482ca0479
Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: I8110b3c2570c183d58be2bccdbf76813ea3e373a
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44266
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Bobi Jam [Fri, 28 May 2021 08:25:52 +0000 (16:25 +0800)]
LU-10350 lod: adjust stripe count to available ost count
* When the user specifies a stripe count of -1, or a stripe count
larger than the OST count of a pool, we need to adjust the stripe
count; otherwise we cannot allocate enough stripe objects, and LOD
reports:
lod_alloc_specific() can't lstripe objid [obj_fid]: have %d want %u
where %d is the OST count of the pool, and %u is the total OST count
if the user specifies a stripe count of -1, or the user-specified
stripe count if it is larger than %d.
* In ost-pool.sh, reset $MOUNT's stripe offset, so that the created
directory will not inherit it from the root directory.
* Preserve the root directory layout in replay-single (run before
ost-pools) to avoid leaving a bad layout on the root dir.
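The stripe count adjustment in the first item can be sketched as follows (Python, for illustration; the real change is in the C LOD code, and the function name here is hypothetical).

```python
def adjust_stripe_count(requested, pool_ost_count):
    """Clamp the requested stripe count to the OSTs available in the pool.

    A requested count of -1 means "stripe over all OSTs"; inside a pool
    that can only mean all OSTs of the pool.
    """
    if requested == -1 or requested > pool_ost_count:
        return pool_ost_count
    return requested
```

For example, a pool with 4 OSTs yields 4 stripes for a request of -1 or 8, and leaves a request of 2 unchanged.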
Lustre-change: https://review.whamcloud.com/43872
Lustre-change: https://review.whamcloud.com/43882
Lustre-commit:
f430ec079bf882744729d7aabc2021dfd26aba0c
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Idf6884faf1271a3864710aeab0ba0eca154bf492
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Reviewed-on: https://review.whamcloud.com/44259
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Lai Siyao [Wed, 13 Jan 2021 09:29:50 +0000 (17:29 +0800)]
LU-14119 mdc: set fid2path RPC interruptible
Sometimes OI scrub can't fix the inconsistency in FID and name, and
server will return -EINPROGRESS for fid2path request. Upon such
failure, client will keep resending the request. Set such request
to be interruptible to avoid deadlock.
Lustre-change: https://review.whamcloud.com/41219
Lustre-commit:
bf475262610671534b1b1a33cebb49d8380b74f7
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I82192cb8a8256064ca632cabfe5581b12e86423b
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44229
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Oleg Drokin [Mon, 7 Jun 2021 19:17:27 +0000 (15:17 -0400)]
LU-14741 obdclass: Wake up queue of reqs on close completion
Origin title:
LU-14741 obdclass: Wake up entire queue of requests on close
completion
Since close requests can be stuck behind normal requests, and close
requests get more slots, we need to either wake up the entire
accumulated queue waiting for the next modrpc slot or have an
additional waitqueue just for close requests.
This patch goes with the former approach.
Lustre-change: https://review.whamcloud.com/43941
Lustre-commit:
a4e1567d67559b797a5c24ee0bfbca4a52649c47
Fixes:
1fc013f901 ("LU-5319 mdc: manage number of modify RPCs in flight")
Change-Id: Ib4333c7f6731dd435364d5e5f529577a1600a235
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/44288
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Oleg Drokin [Sat, 29 May 2021 02:42:49 +0000 (22:42 -0400)]
LU-14711 tests: Ensure no eviction with long cache discard
Origin title:
LU-14711 tests: Ensure there's no eviction with long cache discard
Just pause execution while doing page processing
for discard if appropriate failloc is set.
Lustre-change: https://review.whamcloud.com/43869
Lustre-commit: TBD (from
3323b40668cddaa1ac6f6644436bd305c189c5ac)
Change-Id: If0d04f3cad267cbeeab63040d63e048dcf03cd6b
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Test-Parameters: trivial testlist=sanity env=ONLY=903
Reviewed-on: https://review.whamcloud.com/44286
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>