Whamcloud - gitweb
Mr NeilBrown [Fri, 14 Aug 2020 04:02:20 +0000 (14:02 +1000)]
LU-13359 quota: call rhashtable_lookup near params decl
rhashtable_lookup() is an inline function which depends - for
performance - on the 'rhashtable_params' being visible and
constant. So it should only be called in the same file that
declared the params.
A recent patch made pools_hash_params an external variable and calls
rhashtable_lookup from a separate file, which will break the
optimisation.
So add lov_pool_find() and use it to maintain the optimization.
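The pattern can be sketched in plain userspace C. This is not the Lustre or kernel rhashtable code: `table_params`, `table_lookup()`, `pool_add()` and `pool_find()` are hypothetical stand-ins. The point is only that the inline lookup and the constant parameters live in one file, and other files would call the out-of-line wrapper.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for 'struct rhashtable_params'. */
struct table_params {
	size_t key_len;		/* length of the fixed-size key */
	size_t nbuckets;	/* number of buckets, must be non-zero */
};

struct pool {
	char name[16];		/* NUL-padded key */
	struct pool *next;	/* bucket chain */
};

/* File-local and const, so the compiler can fold the values into lookups. */
static const struct table_params pools_params = {
	.key_len = 16,
	.nbuckets = 8,
};

static struct pool *buckets[8];

static inline size_t hash_key(const char *key, const struct table_params p)
{
	size_t h = 5381;

	for (size_t i = 0; i < p.key_len; i++)
		h = h * 33 + (unsigned char)key[i];
	return h % p.nbuckets;
}

/* Inline lookup: only efficient when 'p' is a visible compile-time constant. */
static inline struct pool *table_lookup(const char *key,
					const struct table_params p)
{
	struct pool *pool;

	for (pool = buckets[hash_key(key, p)]; pool; pool = pool->next)
		if (memcmp(pool->name, key, p.key_len) == 0)
			return pool;
	return NULL;
}

void pool_add(struct pool *pool)
{
	size_t b = hash_key(pool->name, pools_params);

	pool->next = buckets[b];
	buckets[b] = pool;
}

/* Out-of-line wrapper: the only lookup entry point other files would use. */
struct pool *pool_find(const char *name)
{
	char key[16] = { 0 };

	strncpy(key, name, sizeof(key) - 1);
	return table_lookup(key, pools_params);
}
```

A caller in another file would only see a declaration of `pool_find()`, so the parameters never need to be exported.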
Fixes: 6b9f849fd5f4 ("LU-13359 quota: make used for pool correct")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ieeb3b080491f5b2c9c825885fe7a42f4a8599a2a
Reviewed-on: https://review.whamcloud.com/39676
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sebastien Buisson [Wed, 8 Jul 2020 16:19:08 +0000 (16:19 +0000)]
LU-12275 sec: encryption with different client PAGE_SIZE
In order to properly handle encryption/decryption on clients that have
a PAGE_SIZE != LUSTRE_ENCRYPTION_UNIT_SIZE (typically aarch64/ppc64),
a few adjustments are necessary:
- when encrypting, do not proceed with PAGE_SIZE as encryption length.
Instead, round up to a multiple of LUSTRE_ENCRYPTION_UNIT_SIZE.
On aarch64/ppc64, it avoids encrypting way beyond
LUSTRE_ENCRYPTION_UNIT_SIZE when the page is not full.
- when decrypting, do not proceed with PAGE_SIZE as decryption length.
Instead, do LUSTRE_ENCRYPTION_UNIT_SIZE length at a time. It enables
proper detection of 'all 0s' sent by servers for content beyond file
size.
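The two adjustments can be sketched in userspace C. This is an illustration and not the Lustre code: `ENC_UNIT_SIZE` is an assumed 4096-byte stand-in for LUSTRE_ENCRYPTION_UNIT_SIZE, and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed value for illustration; the real constant lives in Lustre headers. */
#define ENC_UNIT_SIZE 4096u

/* Round len up to the next multiple of ENC_UNIT_SIZE (a power of two). */
size_t enc_round_up(size_t len)
{
	return (len + ENC_UNIT_SIZE - 1) & ~((size_t)ENC_UNIT_SIZE - 1);
}

/* True if the whole unit is zero, i.e. content beyond file size. */
bool unit_is_zero(const unsigned char *unit)
{
	for (size_t i = 0; i < ENC_UNIT_SIZE; i++)
		if (unit[i] != 0)
			return false;
	return true;
}

/*
 * Walk a page one unit at a time and count the units that would need
 * decryption; all-zero units are skipped.
 */
size_t units_to_decrypt(const unsigned char *page, size_t page_size)
{
	size_t count = 0;

	for (size_t off = 0; off + ENC_UNIT_SIZE <= page_size;
	     off += ENC_UNIT_SIZE)
		if (!unit_is_zero(page + off))
			count++;
	return count;
}
```

With a 64KB page holding only a few KB of data, the rounding keeps the encrypted length close to the data length, and the per-unit walk can tell real ciphertext from zero fill.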
Regarding tests, add sanity-sec test_53 to exercise encryption from
clients with different PAGE_SIZE.
The trick to achieve this with AT is to expect the client to have 64KB
PAGE_SIZE, and the servers to have 4KB PAGE_SIZE, and then mount a
client from the MDS node.
This also means code running on server side needs to have client
encryption support enabled, so CentOS/RHEL 8 at least.
Test-Parameters: trivial
Test-Parameters: clientarch=aarch64 clientcount=1 clientdistro=el8.1 serverdistro=el8.1 testlist=sanity-sec env=ONLY="36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 52 53" fstype=ldiskfs mdscount=2 mdtcount=4
Test-Parameters: clientarch=aarch64 clientcount=1 clientdistro=el8.1 serverdistro=el8.1 testlist=sanity-sec env=ONLY="36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 52 53" fstype=zfs mdscount=2 mdtcount=4
Test-Parameters: clientarch=x86_64 clientcount=1 clientdistro=el8.1 serverdistro=el8.1 testlist=sanity-sec env=ONLY="36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 52 53" fstype=ldiskfs mdscount=2 mdtcount=4
Test-Parameters: clientarch=x86_64 clientcount=1 clientdistro=el8.1 serverdistro=el8.1 testlist=sanity-sec env=ONLY="36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 52 53" fstype=zfs mdscount=2 mdtcount=4
Test-Parameters: clientdistro=el8.1 testlist=sanity-sec env=ONLY="36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52" mdscount=2 mdtcount=4
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iee4b4d9e70c2e8c8e12061c39400cf6a8c03bac3
Reviewed-on: https://review.whamcloud.com/39315
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Lai Siyao [Wed, 26 Aug 2020 14:47:14 +0000 (22:47 +0800)]
LU-12295 mdd: don't LBUG() if dir nlink is wrong
Sometimes dir nlink may not be correctly decreased: if a subdir is
remote, when it's unlinked its dirent is removed, but the parent nlink
decrease may fail.
Don't assert this in osd_destroy(), but print an error message and
continue since we've checked the directory is empty.
Add OBD_FAIL_OSD_REF_DEL to simulate the error above.
Add sanity 48f.
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I483aaf7a62b7761868b5e2af8dbfa92929fda78c
Reviewed-on: https://review.whamcloud.com/39734
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Chris Horn [Wed, 27 May 2020 17:29:10 +0000 (12:29 -0500)]
LU-13605 lnet: Do not overwrite destination when routing
MR path selection in a routed environment is supposed to allow the
originator of a message to set the final destination NID. On a
multi-hop route, intermediate routers execute the same code path as
the message originator (i.e. the remote send cases). This causes
them to overwrite the destination NID when forwarding the message.
Check the msg_routing flag to determine whether we should set the
final destination NID (i.e. LNet peer NI).
A somewhat related issue is that because intermediate routers are not
selecting a destination lpni, they need to pick the next-hop lpni
based on the destination NID's remote net.
Test-Parameters: trivial
Fixes: 9dfdc2238be ("LU-13035 lnet: fix remote peer ni selection")
HPE-bug-id: LUS-8919
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Id2fbbc5d8da347e971bbb8ad2779e80f75e29dd7
Reviewed-on: https://review.whamcloud.com/38731
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Chris Horn [Tue, 5 May 2020 21:13:18 +0000 (16:13 -0500)]
LU-13502 lnet: Add response tracking param to lnetctl
Add support for the lnet_response_tracking parameter to lnetctl.
Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-8827
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I952a415e27582a7a6d920bfeb16618766c0235da
Reviewed-on: https://review.whamcloud.com/38514
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Li Dongyang [Thu, 3 Sep 2020 23:34:34 +0000 (09:34 +1000)]
LU-13187 osd-ldiskfs: don't enforce max dir size limit on IAM objects
Add ext4-no-max-dir-size-limit-for-iam-objects.patch to introduce new
inode state EXT4_STATE_IAM and use it to mark IAM objects.
Test-Parameter: testlist=sanity env=ONLY=129,ONLY_REPEAT=100
Change-Id: I3bcc5435ea07edb9fa265dcd8e3261d849495f00
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/39823
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Mon, 15 Jun 2020 08:47:44 +0000 (11:47 +0300)]
LU-13827 utils: ofd_access_batch to print top hot files
With the -F option one can specify the fraction (in %) of hot
files to be printed.
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I131c47bb2252da88c9afce757c5264b8d2011d1a
Reviewed-on: https://review.whamcloud.com/39529
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
John L. Hammond [Mon, 23 Mar 2020 15:11:46 +0000 (10:11 -0500)]
LU-13376 utils: add batching to ofd_access_log_reader
Add interval based batching to ofd_access_log_reader. Add option to
control the batch interval, offset within the interval, and batch
output file.
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I057e9616f4ec198dbf0c7c82a93a1e45907e7a42
Reviewed-on: https://review.whamcloud.com/38035
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
John L. Hammond [Thu, 3 Sep 2020 16:48:42 +0000 (11:48 -0500)]
LU-13945 utils: add includes for copy_file_range()
In lstddef.h include the needed headers to support the compat
definition of copy_file_range().
Test-Parameters: trivial
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I83544cbbeb6407f4c3bb9fa3bd2a1297f2b2a2dc
Reviewed-on: https://review.whamcloud.com/39821
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Wed, 9 Sep 2020 00:29:06 +0000 (18:29 -0600)]
LU-12661 tests: skip sanity 817 for kernel 4.12+
Skip the NFS exec mode bug for kernels 4.12 and later, since
this is also being hit on SLES kernel 4.12.14+ and not just 4.14.
Test-Parameters: trivial
Fixes: 4fed33473ca2 ("LU-12661 tests: skip sanity 817 if kernel >= 4.14")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ibc4ffda72bd7827e250c4583c760505b8f3ebbe5
Reviewed-on: https://review.whamcloud.com/39838
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mikhail Pershin [Fri, 26 Jun 2020 17:19:05 +0000 (20:19 +0300)]
LU-13599 mdt: add test for rs_lock limit exceeding
The check conditions in mdt_link_parents_lock() assume there can be
a total of 6 local locks, but in fact an object with hard links is a
regular file and may have at most 5 local locks.
The patch updates those check conditions for rs_locks as a small
improvement and ports the test for the rs_locks limit from 2.12.
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I98cca84825ce5789094fbceb5d1f7975410d134b
Reviewed-on: https://review.whamcloud.com/39194
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Yang Sheng [Thu, 20 Aug 2020 07:39:40 +0000 (15:39 +0800)]
LU-13915 ldiskfs: Avoid atomic operation while bitmap prefetch
test_and_set_bit() is an atomic operation, so it is expensive. Use
test_bit() during bitmap prefetch to avoid calling it frequently.
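The idea of testing before setting can be sketched with C11 atomics. This is a userspace illustration and not the ldiskfs patch; `mark_prefetched()` is a hypothetical helper.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Returns true if this caller set the bit, false if it was already set. */
bool mark_prefetched(atomic_ulong *word, unsigned int bit)
{
	unsigned long mask = 1UL << bit;

	/* Cheap read first: no read-modify-write when the bit is already set. */
	if (atomic_load_explicit(word, memory_order_relaxed) & mask)
		return false;

	/* Only now pay for the atomic read-modify-write. */
	return !(atomic_fetch_or(word, mask) & mask);
}
```

The plain load does not need exclusive ownership of the cache line, so repeated calls on an already-set bit stay cheap.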
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I2ff2c39f1dd3b351462ed66cbd3ebb36e6af4bea
Reviewed-on: https://review.whamcloud.com/39697
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Lai Siyao [Mon, 3 Aug 2020 18:21:58 +0000 (02:21 +0800)]
LU-13909 llite: prune invalid dentries
When a file LOOKUP lock is canceled on the client, mark its dentries
invalid and also prune them to avoid OOM. To achieve this,
ll_invalidate_aliases() is renamed to ll_prune_aliases(), and the latter
calls d_prune_aliases() to prune unused invalid dentries.
The same for negative dentries when parent UPDATE lock is canceled,
rename ll_invalidate_negative_children() to
ll_prune_negative_children().
Since now unused invalid dentries will always be pruned, it's not
necessary to call __d_drop() in d_lustre_invalidate().
It's redundant to take i_lock before d_lustre_invalidate() in
ll_inode_revalidate() because d_lustre_invalidate() takes d_lock,
remove it.
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ib0ae57537e31ba9269e042b94bc5fbe7cb263a50
Reviewed-on: https://review.whamcloud.com/39685
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Mr NeilBrown [Thu, 13 Aug 2020 00:26:40 +0000 (10:26 +1000)]
LU-4671 tests: give multiop a chance to exit.
If 'multiop' is still running after a test completes, test-framework.sh
reports a failure.
test_43A signals multiop asking it to exit as the last thing it does.
If there is any delay in multiop being scheduled, test-framework will
see it and report an error - this is a false negative.
So use 'wait' to wait for multiop to respond to the signal.
Test-Parameters: trivial testlist=sanity envdefinitions=ONLY="43A"
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ic94d3c89bd98c9f6ef2d5bee22aac1e39116bf11
Reviewed-on: https://review.whamcloud.com/39665
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Chris Horn [Wed, 5 Aug 2020 15:17:14 +0000 (10:17 -0500)]
LU-13896 lnet: Fix reference leak in lnet_select_pathway
We call lnet_nid2peerni_locked() to lookup the peer NI for the message
originator. lnet_nid2peerni_locked() takes a reference on the peer NI
that is never dropped.
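The get/put pairing behind this kind of fix can be sketched as follows. The types and helpers here are hypothetical and are not the LNet API; the sketch only shows a lookup that returns a referenced object and a caller that drops the reference on every exit path.

```c
#include <stddef.h>

struct peer_ni {
	int refcount;	/* hypothetical reference counter */
};

/* Lookup that hands back a referenced object. */
struct peer_ni *peer_ni_get(struct peer_ni *lpni)
{
	lpni->refcount++;
	return lpni;
}

void peer_ni_put(struct peer_ni *lpni)
{
	lpni->refcount--;
}

/* Uses the peer NI briefly and drops the reference before returning. */
int select_pathway(struct peer_ni *table_entry, int routing)
{
	struct peer_ni *lpni = peer_ni_get(table_entry);
	int rc = 0;

	if (!routing)
		rc = -1;	/* error path still falls through to the put */

	peer_ni_put(lpni);
	return rc;
}
```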
Test-Parameters: trivial
Fixes: b0e8ab1a5f ("LU-13606 lnet: Allow router to forward to healthier NID")
HPE-bug-id: LUS-9185
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ie0e8f215d7becfbf33f905a1806da8513798ee8d
Reviewed-on: https://review.whamcloud.com/39603
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sergey Cheremencev [Tue, 21 Jul 2020 12:48:40 +0000 (15:48 +0300)]
LU-13810 tests: OST Pool Quotas with wide striping
All previous tests in sanity-quota, except the DOM, SEL
and PFL specific tests, always use stripe count 1.
Check that a block hard limit set on OST Pool Quotas
works properly with wide-striped files.
HPE-bug-id: LUS-8600
Test-Parameters: env=ONLY=1g testlist=sanity-quota
Change-Id: Ia6ebb21adb0fff18c6a9e0a36b1e50cb0f1d212a
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/39469
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Chris Horn [Mon, 22 Jun 2020 15:21:42 +0000 (10:21 -0500)]
LU-13735 lnet: Loosen restrictions on LNet Health params
The functions that set various LNet Health related parameters require
that the parameters be set in a specific order depending on whether
health is enabled or disabled. This is not user-friendly.
- Don't overwrite lnet_transaction_timeout when health is being
enabled or disabled.
- Don't overwrite lnet_retry_count when health is being enabled
(still set it to zero when health is disabled).
- Allow lnet_retry_count to be set to 0 when health is disabled
- Correct off-by-one error in transaction_to_set() to ensure
lnet_transaction_timeout is greater than lnet_retry_count
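The strict "greater than" rule from the last item can be sketched as a small validation helper. The struct, the function name and the -EINVAL return value are assumptions for illustration and do not reproduce transaction_to_set().

```c
#include <errno.h>

struct health_params {
	unsigned int transaction_timeout;
	unsigned int retry_count;
};

/* Reject a timeout that is zero or not strictly above the retry count. */
int set_transaction_timeout(struct health_params *p, unsigned int value)
{
	if (value == 0 || value <= p->retry_count)
		return -EINVAL;
	p->transaction_timeout = value;
	return 0;
}
```

Using `<` instead of `<=` in the comparison would accept a timeout equal to the retry count, which is the kind of off-by-one the list describes.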
HPE-bug-id: LUS-8995
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ic8ca7862543fc667fdf85844e05146c78bf48cd1
Reviewed-on: https://review.whamcloud.com/39228
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Nikitas Angelinas [Mon, 22 Jun 2020 20:40:38 +0000 (13:40 -0700)]
LU-13151 mdt: add parent FID to Changelog records
Use the link EA to add the parent FID to ChangeLog records, including
MTIME, TRUNC, and SATTR.
Some tools that maintain copies of filesystem metadata in an external
database monitor changelogs for changes to the filesystem, in order to
determine files that need to be rescanned. This can result in a large
number of small updates to the external database that can reduce the
tool's ingest performance. It might be beneficial to instead track and
scan complete directories that contain modified files and update the
external database using bulk operations. Adding the parent FID to
MTIME changelogs makes it possible to determine the parent
directories more efficiently for some types of file data
modifications, by issuing OBD_IOC_FID2PATH once for each parent FID,
instead of once for each file FID.
Signed-off-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
Cray-bug-id: LUS-7986
Change-Id: I0c88271e706ef8202910e9461e5ae9f6dcbe0bdd
Reviewed-on: https://review.whamcloud.com/37264
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nathan Rutman <nrutman@gmail.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
James Simmons [Sun, 30 Aug 2020 16:45:42 +0000 (12:45 -0400)]
LU-11986 lnet: don't read debugfs lnet stats when shutting down
A race exists on shutdown with an external application reading
the debugfs file containing lnet stats, which causes a kernel
crash.
[ 257.192117] BUG: unable to handle kernel paging request at fffffffffffffff0
[ 257.194859] IP: [<ffffffffc0bb95c6>] cfs_percpt_number+0x6/0x10 [libcfs]
[ 257.196863] PGD 7c14067 PUD 7c16067 PMD 0
[ 257.198665] Oops: 0000 [#1] SMP
[ 257.200431] Modules linked in: ksocklnd(OE) lnet(OE) libcfs(OE) dm_service_time iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sunrpc zfs(POE) zunicode(POE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) ppdev iosf_mbi crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr sg e1000 video parport_pc parport i2c_piix4 dm_multipath dm_mod ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi crct10dif_pclmul crct10dif_common ata_piix serio_raw libata [last unloaded: obdclass]
[ 257.222895] CPU: 0 PID: 7331 Comm: lctl Tainted: P OE ------------ 3.10.0-957.el7_lustre.x86_64 #1
[ 257.229312] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 257.233659] task: ffff9c9fbaf15140 ti: ffff9c9fbabcc000 task.ti: ffff9c9fbabcc000
[ 257.238388] RIP: 0010:[<ffffffffc0bb95c6>] [<ffffffffc0bb95c6>] cfs_percpt_number+0x6/0x10 [libcfs]
[ 257.243851] RSP: 0018:ffff9c9fbabcfdb0 EFLAGS: 00010296
[ 257.246400] RAX: 0000000000000000 RBX: ffff9c9fba2a5200 RCX: 0000000000000000
[ 257.250304] RDX: 0000000000000001 RSI: 00000000ffffffff RDI: 0000000000000000
[ 257.253677] RBP: ffff9c9fbabcfdd0 R08: 000000000001f120 R09: ffff9c9fbe001700
[ 257.257073] R10: ffffffffc0c376db R11: 0000000000000246 R12: 0000000000000000
[ 257.260339] R13: 0000000000000000 R14: 0000000000001000 R15: ffff9c9fba2a5200
[ 257.263204] FS: 00007fbdc89c6740(0000) GS:ffff9c9fbfc00000(0000) knlGS:0000000000000000
[ 257.266409] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 257.269105] CR2: fffffffffffffff0 CR3: 0000000022e36000 CR4: 00000000000606f0
[ 257.272529] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 257.275209] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 257.277936] Call Trace:
[ 257.279245] [<ffffffffc0c0a88b>] ? lnet_counters_get_common+0xeb/0x150 [lnet]
[ 257.283071] [<ffffffffc0c0a95c>] lnet_counters_get+0x6c/0x150 [lnet]
[ 257.286224] [<ffffffffc0c3771b>] __proc_lnet_stats+0xfb/0x810 [lnet]
[ 257.288975] [<ffffffffc0ba6602>] lprocfs_call_handler+0x22/0x50 [libcfs]
[ 257.292387] [<ffffffffc0c36bf5>] proc_lnet_stats+0x25/0x30 [lnet]
[ 257.295184] [<ffffffffc0ba665d>] lnet_debugfs_read+0x2d/0x40 [libcfs]
The solution is to only allow reading of the lnet stats when the
lnet state is LNET_STATE_RUNNING.
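A state guard of this kind can be sketched in userspace C. The enum, the struct and the -ENODEV return value are hypothetical stand-ins and not the LNet definitions. The sketch also leaves out locking; a real check would have to be serialized against shutdown.

```c
#include <errno.h>
#include <stddef.h>

enum net_state_sketch { STATE_SHUTDOWN, STATE_RUNNING, STATE_STOPPING };

struct net_sketch {
	enum net_state_sketch state;
	const long *counters;	/* may be NULL once shutdown begins */
};

/* Refuse to touch the counters unless the network is fully running. */
int read_stats(const struct net_sketch *net, long *out)
{
	if (net->state != STATE_RUNNING)
		return -ENODEV;
	*out = net->counters[0];
	return 0;
}
```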
Test-Parameters: trivial testlist=sanity-lnet
Change-Id: I8720a51ec358e4f6ae121acb34cc23020054ab84
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/39404
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sebastien Buisson [Fri, 31 Jul 2020 17:51:01 +0000 (19:51 +0200)]
LU-12275 sec: ldiskfs not aware of client-side encryption
In osd-ldiskfs, always remove S_ENCRYPTED from inode flags,
because ldiskfs must not be aware of client-side encryption status.
This info is just stored into LMA so that it can be forwarded to client
side.
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ief08c059b04b8c7349d725b50b2094183eabc4d3
Reviewed-on: https://review.whamcloud.com/39558
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sebastien Buisson [Tue, 30 Jun 2020 13:42:58 +0000 (13:42 +0000)]
LU-12275 sec: restrict fallocate on encrypted files
For now, ll_fallocate only supports standard preallocation.
Anyway, encrypted inodes can't handle collapse range or zero range or
insert range since we would need to re-encrypt blocks with a different
IV or XTS tweak (which are based on the logical block number).
So make sure we return -EOPNOTSUPP in this case, like what ext4 does.
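The mode check can be sketched with the FALLOC_FL_* flags from `<linux/falloc.h>`. `check_fallocate_mode()` is a hypothetical helper and not ll_fallocate(); it only covers the encrypted-file restriction described above.

```c
#include <errno.h>
#include <stdbool.h>
#include <linux/falloc.h>

/* Modes that shift or rewrite logical blocks, which breaks per-block IVs. */
#define ENC_UNSUPPORTED_MODES \
	(FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE | FALLOC_FL_INSERT_RANGE)

/* Returns 0 if the mode is acceptable, -EOPNOTSUPP otherwise. */
int check_fallocate_mode(bool encrypted, int mode)
{
	if (encrypted && (mode & ENC_UNSUPPORTED_MODES))
		return -EOPNOTSUPP;
	return 0;
}
```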
Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ia34cd04df9f297ac54109ed385b037fe282954d7
Reviewed-on: https://review.whamcloud.com/39220
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sebastien Buisson [Wed, 17 Jun 2020 16:03:04 +0000 (16:03 +0000)]
LU-12275 sec: O_DIRECT for encrypted file
Add O_DIRECT support for encrypted files.
By default, fscrypt does not support O_DIRECT because it needs
pagecache pages to proceed.
With Lustre, we can make use of pages being used for sending RPCs.
They can be twisted so that they have a proper mapping and index,
suitable for encryption/decryption.
One of the benefits of O_DIRECT support for encrypted files is that
we get support for mirroring at the same time.
Test-Parameters: testlist=sanity-sec envdefinitions=ONLY="36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 52" clientdistro=el8.1 fstype=ldiskfs mdscount=2 mdtcount=4
Test-Parameters: testlist=sanity-sec envdefinitions=ONLY="36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 52" clientdistro=el8.1 fstype=zfs mdscount=2 mdtcount=4
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I12f61c44b55f3a454f38016200f81eb735ab8f18
Reviewed-on: https://review.whamcloud.com/38967
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Emoly Liu [Thu, 20 Aug 2020 04:09:19 +0000 (12:09 +0800)]
LU-13910 mdt: 0 for success in mdt_path_current()
0 should be returned if no error is found in mdt_path_current(),
otherwise, the non-zero value will be treated as an error in
mdc_ioc_fid2path() and null will be returned by "lfs fid2path".
sanity.sh test_226c is added to verify this patch.
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: I1c10b023da9bbbb908dfb691fcea6e84ced67a8d
Reviewed-on: https://review.whamcloud.com/39688
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mr NeilBrown [Mon, 10 Aug 2020 22:52:51 +0000 (08:52 +1000)]
LU-13903 tests: skip test_410 if modules weren't built
test_410 requires a special module, which is only built if
all modules are being built. So if only the user-space code
was built, this test will fail.
So make the test conditional on the module existing.
Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I235894c72b51f627c01ee08e850a59933b49e033
Reviewed-on: https://review.whamcloud.com/39652
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alexander Boyko [Thu, 30 Jul 2020 12:04:27 +0000 (08:04 -0400)]
LU-13608 out: don't return einprogress error
When out_handle processes an update request, it can happen
that the file doesn't exist; osd_fid_lookup then triggers scrub and
returns EINPROGRESS. The remote MDT would process EINPROGRESS at
the ptlrpc layer and resend the request in a loop, and MDT recovery
would be blocked.
The fix adds the fid to OI for ENOENT, as was done before LU-7782.
So the second attempt with the same fid will return ENOENT.
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
HPE-bug-id: LUS-9062
Change-Id: Ib9a1753234ccc773e9b9529195ebfa6e5a8c101c
Reviewed-on: https://review.whamcloud.com/39538
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Fri, 10 Apr 2020 10:48:51 +0000 (04:48 -0600)]
LU-13314 utils: fix lfs find time calculation margin
Allow a larger margin when checking files that are years old.
Re-enable sanity test_56ob.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id74e4b737ebb3a2b721d3e3b6d79dffe703ebbe5
Reviewed-on: https://review.whamcloud.com/39433
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
James Nunez [Fri, 24 Jul 2020 20:13:39 +0000 (14:13 -0600)]
LU-13773 tests: subscript failure propagation
When a test script calls another test script and there is a
failure in the called test script, the failure is not
propagated up to the main/calling test suite and Maloo is
not registering the failed test. An example of this is
sanity-dom calls sanity and sanityn, but, if a sanity test
fails, Maloo does not recognize the sanity failure.
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I1914ecbece469cc1faffdffa4c980241ba3020b2
Reviewed-on: https://review.whamcloud.com/39409
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Wang Shilong [Wed, 1 Jul 2020 07:56:47 +0000 (15:56 +0800)]
LU-13733 llite: report client stats sumsq
Commit cd8fb tries to account sumsq for every client operation, but
lprocfs_counter_init() did not initialize them properly. Also add a test
case to verify the new format of client stats.
Fixes: cd8fb1e8d300 ("LU-13597 ofd: add more information to job_stats")
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I4c7fc4c3958fdab757a9755cc851836971a4b700
Reviewed-on: https://review.whamcloud.com/39223
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Tue, 30 Jun 2020 12:19:32 +0000 (20:19 +0800)]
LU-13481 test: run sanity 33h with more files
Run sanity 33h with more files to avoid failure with small samples.
Test-parameters: trivial testlist=sanity env=ONLY=33h,ONLY_REPEAT=500
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: If14a852289ffc5907637a9dac6ed48890ced5586
Reviewed-on: https://review.whamcloud.com/39219
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Mr NeilBrown [Mon, 18 May 2020 22:48:58 +0000 (08:48 +1000)]
LU-12780 osd: use native kthreads for scrub.
The 'scrub' thread has nothing to do with ptlrpc, so
using ptlrpc_thread is strange - and not necessary. This patch
switches to native kthread interfaces.
A particular advantage of the switch is seen in the conversion of
thread_is_running().
This is used in two completely different ways.
1/ The thread itself needs to know if it has been asked to stop.
It currently uses thread_is_running(). After the patch it
uses kthread_should_stop().
2/ Other code needs to know if the thread is still active or not. It
previously used thread_is_running(), so it looked just like the
first case. Now it checks a new flag ->os_running, which is
set precisely when SVC_RUNNING was set, and cleared when any other
status was set.
As the thread can stop itself (e.g if osd_scrub_prep fails) or can be
asked to stop (scrub_stop()), we need to avoid confusion between the
two. This is achieved by calling 'xchg(&scrub->os_task,NULL)'.
If the thread finds that to be non-NULL, it has stopped itself
and can just exit. If scrub_stop() finds it to be non-NULL,
it calls kthread_stop() to stop the thread.
Instead of using the waitqueue in the 'struct ptlrpc_thread', we use
wake_up_var() and wait_var_event() on the 'scrub' pointer. This is
used both to tell the thread there is work to do, and for callers to
wait for the task to make progress, or to exit.
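The xchg() handoff can be sketched with C11 atomics. This is a userspace illustration with hypothetical types and not the osd scrub code: whichever side swaps out a non-NULL pointer owns the shutdown, and the other side sees NULL and backs off.

```c
#include <stdatomic.h>
#include <stddef.h>

struct task { int id; };	/* hypothetical stand-in for task_struct */

struct scrub_sketch {
	_Atomic(struct task *) os_task;
};

/*
 * Claim responsibility for ending the thread. Exactly one caller gets
 * the non-NULL pointer; every later caller gets NULL.
 */
struct task *claim_task(struct scrub_sketch *scrub)
{
	return atomic_exchange(&scrub->os_task, (struct task *)NULL);
}
```

Because the exchange is atomic, the thread and the stopper cannot both believe they are the one ending the thread.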
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ib2f1845151734687e89a1b0d6a5135a5a4ba6e5c
Reviewed-on: https://review.whamcloud.com/38824
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Wang Shilong [Sat, 1 Aug 2020 12:29:47 +0000 (20:29 +0800)]
LU-13846 llite: move iov iter forward by ourself
Newer kernels will rewind the iov iter back to its original
position for direct IO; see the following code:
iov_iter_revert(from, write_len -
iov_iter_count(from));--------->here
out:
return written;
}
EXPORT_SYMBOL(generic_file_direct_write);
This breaks Lustre's assumptions and causes problems
when Lustre needs to split one IO into several IO loops. Consider a
4 MiB block IO on a file with 1 MiB stripes: it will submit the first
1 MiB IO 4 times, which finally causes data corruption.
Since the generic kernel code varies between kernel versions,
we'd better fix this problem by moving the iov iter forward in
Lustre itself.
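The effect of advancing the iterator explicitly can be sketched in userspace C. `iter_sketch`, `submit_chunk()` and `write_striped()` are hypothetical and do not reproduce the llite code or the kernel iov_iter API; the submit step deliberately leaves the iterator untouched, as a reverting kernel would.

```c
#include <stddef.h>

struct iter_sketch {
	size_t pos;	/* bytes already consumed */
	size_t count;	/* bytes remaining */
};

/* Stand-in for a direct write that leaves the iterator where it started. */
static size_t submit_chunk(const struct iter_sketch *it, size_t len,
			   size_t *offsets, size_t n)
{
	offsets[n] = it->pos;	/* record which offset was submitted */
	return len;
}

/* Split one IO into stripe-sized chunks, advancing the iterator ourselves. */
size_t write_striped(struct iter_sketch *it, size_t stripe,
		     size_t *offsets, size_t max_chunks)
{
	size_t n = 0;

	while (it->count > 0 && n < max_chunks) {
		size_t len = it->count < stripe ? it->count : stripe;
		size_t done = submit_chunk(it, len, offsets, n);

		/* Without this explicit advance every chunk would start at 0. */
		it->pos += done;
		it->count -= done;
		n++;
	}
	return n;
}
```

For a 4 MiB IO with 1 MiB stripes the recorded offsets are 0, 1 MiB, 2 MiB and 3 MiB, rather than offset 0 four times.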
Add a new test case; aiocp.c is copied from xfstests,
with code style cleanups to make checkpatch.pl happy.
Change-Id: Iab5d8f1bb0e74ed49c821c81b734c68770edf4a8
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/39565
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Wang Shilong [Thu, 30 Jul 2020 15:09:44 +0000 (23:09 +0800)]
LU-13835 llite: reuse same cl_dio_aio for one IO
An IO might be restarted if the layout changed, which might
cause ki_complete() to be called several times for one IO.
Fixes: d1dded6 ("LU-4198 clio: AIO support for direct IO")
Fixes: 84c3e85 ("LU-13697 llite: fix short io for AIO")
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I791b8f9d82fae6822d38293ba22adf74560c9dce
Reviewed-on: https://review.whamcloud.com/39542
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Artem Blagodarenko [Thu, 7 May 2020 17:36:05 +0000 (20:36 +0300)]
LU-13533 utils: ext4lazyinit should be disabled
lazyinit takes more than 24 hours on a typical OST installation
and produces writes during reads. This affects benchmark
tests, which are usually executed just after cluster installation
is complete. Testing shows that disabling the feature adds ~30 sec
to formatting an OST drive, which is not a noticeable time during
install.
Explicitly send lazy_itable_init and lazy_journal_init in
conf_sanity 116 to avoid out of disk space.
HPE-bug-id: LUS-3358
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: I95bfa43d11cd67c890b036aa0b71fae4c1eea37d
Reviewed-on: https://review.whamcloud.com/38534
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Chris Horn [Fri, 1 May 2020 20:57:22 +0000 (15:57 -0500)]
LU-13502 lnet: Conditionally attach rspt in LNetPut & LNetGet
Create a function to interpret the message type and md options to
determine whether response tracking should be enabled for a particular
PUT or GET.
Use that function in LNetPut and LNetGet to determine whether we
attach the response tracker.
HPE-bug-id: LUS-8827
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ic5f2d3dedc3e773b0ec7866cccf6db9d15dc752a
Reviewed-on: https://review.whamcloud.com/38452
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mr NeilBrown [Fri, 20 Dec 2019 00:51:43 +0000 (11:51 +1100)]
LU-10428 lnet: call event handlers without res_lock
Currently event handlers are called with the lnet_res_lock()
(a spinlock) held. This is a problem if the handler wants
to take a mutex, allocate memory, or sleep for some other
reason.
The lock is needed because handlers for a given md need to
be serialized. At the very least, the final event which
reports that the md is "unlinked" needs to be called last,
after any other events have been handled.
Instead of using a spinlock to ensure this ordering, we can
use a flag bit in the md.
- Before considering whether to send an event we wait for the flag bit
to be cleared. This ensures serialization.
- Also wait for the flag to be cleared before final freeing of the md.
- If this is not an unlink event and we need to call the handler, we
set the flag bit before dropping lnet_res_lock(). This
ensures that no further events will happen, and that the md
won't be freed - so we can still clear the flag.
- Use wait_var_event() to wait for the flag bit to be cleared,
and wake_up_var() to signal a wakeup. After wait_var_event()
returns, we need to take the spinlock and check again.
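The flag-bit scheme in the list above can be illustrated outside the kernel. The following is a rough userspace sketch in Python, under the assumption that a `threading.Condition` is an acceptable stand-in for wait_var_event()/wake_up_var() and a plain lock for lnet_res_lock(); `MemDesc`, `deliver`, `slow_handler` and `run_demo` are hypothetical names, and unlink ordering is not modeled.

```python
import threading
import time


class MemDesc:
    """Toy memory descriptor; 'busy' plays the role of the flag bit."""

    def __init__(self):
        self.lock = threading.Lock()                # stand-in for lnet_res_lock()
        self.cond = threading.Condition(self.lock)  # stand-in for the var waitqueue
        self.busy = False
        self.active = 0
        self.max_active = 0
        self.handled = []


def deliver(md, event, handler):
    with md.cond:
        # Wait for the flag to clear; the predicate is rechecked with the
        # lock held after every wakeup.
        while md.busy:
            md.cond.wait()
        md.busy = True
    try:
        # Lock dropped here: the handler may sleep or allocate memory.
        handler(md, event)
    finally:
        with md.cond:
            md.busy = False
            md.cond.notify_all()


def slow_handler(md, event):
    """Handler that sleeps, and records how many handlers ran at once."""
    md.active += 1
    md.max_active = max(md.max_active, md.active)
    time.sleep(0.001)
    md.handled.append(event)
    md.active -= 1


def run_demo(n_events):
    md = MemDesc()
    threads = [threading.Thread(target=deliver, args=(md, i, slow_handler))
               for i in range(n_events)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return md.max_active, len(md.handled)
```

In this model every event is handled, yet no two handlers for the same descriptor ever overlap, even though the handler sleeps without holding the lock.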
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I4dada92c4c06547bdc567838d129a8851d7de3bd
Reviewed-on: https://review.whamcloud.com/37068
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Tue, 11 Aug 2020 00:32:18 +0000 (18:32 -0600)]
LU-13127 ptlrpc: prefer crc32_le() over CryptoAPI
Prefer to call the crc32_le() library function directly if available,
instead of cfs_crypto_hash(CFS_HASH_ALG_CRC32). It is about 10x faster
for the 156-byte struct ptlrpc_body being checked in this function.
A test that compares the two implementations on small buffers, run
on a 2.9GHz Core i7-7820, shows the difference is significant here:
buffer size| 156 bytes  | 1536 bytes | 4096 bytes| 1 MiB
-----------+------------+------------+-----------+-----------
cfs_crypto |  182 MiB/s | 1794 MiB/s | 4163 MB/s | 9631 MiB/s
crc32_le   | 1947 MiB/s | 1871 MiB/s | 1867 MB/s | 1823 MiB/s
This corresponds to 10x faster or 1/10 as many cycles for ptlrpc_body.
The CryptoAPI speed crosses over around 1536 bytes, which is still 10x
larger than the ptlrpc_body size, so it is unlikely to be faster here.
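The ratios quoted above follow directly from the table. A small Python sketch of the arithmetic, using the figures exactly as reported (the dictionary layout and function name are illustrative only):

```python
# Throughput figures copied from the table, in the units reported there.
throughput = {
    156: {"cfs_crypto": 182, "crc32_le": 1947},
    1536: {"cfs_crypto": 1794, "crc32_le": 1871},
    4096: {"cfs_crypto": 4163, "crc32_le": 1867},
    1 << 20: {"cfs_crypto": 9631, "crc32_le": 1823},
}


def crc32_le_speedup(size):
    """Return how many times faster crc32_le is than cfs_crypto at this size."""
    row = throughput[size]
    return row["crc32_le"] / row["cfs_crypto"]


for size in sorted(throughput):
    print(f"{size:>8} bytes: crc32_le / cfs_crypto = {crc32_le_speedup(size):.2f}")
```

The ratio is above 10 at 156 bytes, close to 1 at 1536 bytes, and below 1 from 4096 bytes upward, which matches the crossover described in the message.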
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I116fd6c148f15660dd7b7faefb86f9dd603ebbe5
Reviewed-on: https://review.whamcloud.com/39614
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alexander Zarochentsev [Fri, 31 Jul 2020 16:01:32 +0000 (19:01 +0300)]
LU-13899 tgt: drop old epoch request
Do not send -ESTALE back to the client
if process_req_last_xid() detects
an old epoch request. Just drop the reply
instead; otherwise -ESTALE may confuse
the client and make it accept the error
without attempting to resend the request.
Fixes:
c1d465de13 ("LU-6655 ptlrpc: skip delayed replay requests")
HPE-bug-id: LUS-9097
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I1a9ebc637f1357ca0027adbb6bb706287f4b5f4f
Reviewed-on: https://review.whamcloud.com/39612
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Oleg Drokin [Thu, 27 Aug 2020 17:08:37 +0000 (17:08 +0000)]
LU-13931 Revert "LU-13688 hsm: handle in-tree executed copytools correctly"
This seems to break in-tree static build testing
This reverts commit
29bb063654c9a74d495e5d4cea17694a2b70f6a0.
Change-Id: I1bb04fe2a48b4d7392a1306bd16a5bab296af1f5
Reviewed-on: https://review.whamcloud.com/39746
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Amir Shehata [Tue, 14 Jul 2020 17:51:51 +0000 (10:51 -0700)]
LU-13750 lnet: Fix peer add command
The peer add command is supposed to add one peer per command.
The primary NID can be specified followed by a set of constituent
NIDs. This patch restores this behavior and ensures that for
peer add the primary NID must be specified, to make the command
syntax more consistent with the peer del command and to behave
similarly to the net add/del commands.
The APIs have been changed as well to make them more easily testable
from the LUTF.
There are also a few cleanups to avoid unnecessary parsing.
Test-Parameters: trivial testlist=sanity-lnet
Fixes:
892f675e660 (LU-12410 lnet: Convert lnetctl peer add and del)
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I32ed53742f2c379abb47f7acd3f7f336062e0458
Reviewed-on: https://review.whamcloud.com/39392
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Chris Horn [Sun, 19 Jul 2020 15:34:44 +0000 (10:34 -0500)]
LU-13836 lnet: Display correct route aliveness
jt_ptl_print_routes() needs to be updated to interpret
data.cfg_config_u.cfg_route.rtr_flags correctly.
Test-Parameters: trivial
Fixes:
2832478194 ("LU-13029 lnet: fix asym routing with multi-hop")
HPE-bug-id: LUS-9121
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I796759293c6b25794f2092c06fee2981db6a7d48
Reviewed-on: https://review.whamcloud.com/39543
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Chris Horn [Sun, 28 Jun 2020 17:35:43 +0000 (12:35 -0500)]
LU-13736 lnet: Do not set preferred NI for MR peer
The preferred NI exists to ensure that a consistent source address is
used when communicating with a non-multi-rail peer. We needn't ever
set a preferred NI for a MR peer.
HPE-bug-id: LUS-9058
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I836a314dbf02d35199c3da2ccea6fb0acbb94b54
Reviewed-on: https://review.whamcloud.com/39229
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Tue, 16 Jun 2020 06:48:01 +0000 (00:48 -0600)]
LU-10934 tests: increase timeout for sanityn test_51b
Increase the timeout for sanityn test_51b, since this is causing
intermittent test failures when stat() is run before dd finishes.
Test-Parameters: trivial testlist=sanityn env=ONLY=51b,ONLY_REPEAT=100
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ieb3d50e6b534b535e8255cbbc566f053f33ebbe5
Reviewed-on: https://review.whamcloud.com/38947
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Vitaly Fertman [Fri, 31 Jul 2020 18:16:43 +0000 (21:16 +0300)]
LU-11518 ldlm: cancel LRU improvement
Add a @batch parameter to cancel LRU, which means that if at least 1
lock is cancelled, try to cancel at least a batch of locks. This
functionality will be used in later patches.
Limit LRU cancellation to 1 thread only, except for callers which
have the @max limit given (ELC), as the LRU may otherwise be left
not fully cleaned up.
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: Ide21c4a2b2209b8a721249466ea1e651c8532c8a
HPE-bug-id: LUS-8678
Reviewed-on: https://es-gerrit.dev.cray.com/157067
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-on: https://review.whamcloud.com/39561
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Vitaly Fertman [Fri, 31 Jul 2020 18:07:04 +0000 (21:07 +0300)]
LU-11518 ldlm: lru code cleanup
cleanup includes:
- no need for the unused locks parameter in the LRU policy; it is
better to take the current value right in the policy if needed;
- no need for a special SHRINKER policy, it is the same as the PASSED one;
- no need for a special DEFAULT policy, it is the same as the PASSED one;
- no need for a special PASSED policy, the LRU is to be cleaned anyway
according to the LRU resize or AGED policy;
bug fixes:
- if the @min amount is given, it should not be increased by the
number of locks exceeding the limit; the max of the two is to
be taken instead;
- do not do ELC on enqueue if no LRU limits are reached;
- do not keep a lock in the LRUR policy once we have cancelled @min
locks; try to cancel instead until we reach the @max limit, if given;
- cancel locks from the LRU with the new policy if it is changed in sysfs;
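The first bug fix in the list is easiest to see with numbers. The sketch below is only an illustration in Python: the function names are hypothetical, and "limit" is assumed to mean the LRU size limit against which the number of unused locks is compared.

```python
def locks_to_cancel_old(min_count, unused, limit):
    """Pre-fix behavior as described: @min is increased by the excess."""
    over = max(unused - limit, 0)
    return min_count + over


def locks_to_cancel_new(min_count, unused, limit):
    """Post-fix behavior as described: take the larger of @min and the excess."""
    over = max(unused - limit, 0)
    return max(min_count, over)
```

With @min of 10 and 130 unused locks against a limit of 100, the old formula asks for 40 cancellations while the new one asks for 30, since cancelling the 30 excess locks already satisfies the @min of 10.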
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I84369da54f680e5fbddd28089c40d1b90722d42d
HPE-bug-id: LUS-8678
Reviewed-on: https://es-gerrit.dev.cray.com/157066
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-on: https://review.whamcloud.com/39560
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Vitaly Fertman [Thu, 16 Jul 2020 11:28:03 +0000 (14:28 +0300)]
LU-13645 ldlm: re-process ldlm lock cleanup
For extent locks:
- rescan logic is not needed for group locks, it works well without it
- @err is not needed for ldlm_extent_compat_queue(), it is always set
to @compat, remove it and set outside
- LDLM_FL_NO_TIMEOUT flag could be set once outside of
ldlm_extent_compat_queue()
- add ldlm_resource_insert_lock_before();
For inodebits:
- glimpse expects ELDLM_LOCK_ABORTED to fill data properly on client
side, do not return ELDLM_LOCK_WOULDBLOCK from
ldlm_process_inodebits_lock()
- regular enqueue also does not have logic for ELDLM_LOCK_WOULDBLOCK,
restore the original ELDLM_LOCK_ABORTED here as well for simplicity
- check for DOM lock in mdt_dom_client_has_lock() according to open
flags, not for LCK_PW always;
Also, move sanity 82 to sanityn as after LU-9964 it is to be run on
two mount points.
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I6d0b230f04aaa497db5b036b4ed9afe5d7f418b0
HPE-bug-id: LUS-8987
Reviewed-on: https://review.whamcloud.com/39405
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mr NeilBrown [Thu, 6 Aug 2020 04:20:27 +0000 (14:20 +1000)]
LU-8522 tests: improve slabinfo accuracy when slub is used.
The "active_objs" count in slabinfo is never very accurate, but when
CONFIG_SLUB is being used it is even less accurate than with
CONFIG_SLAB.
If CONFIG_SLUB_DEBUG is also enabled, it is possible to shrink the
cache and remove this inaccuracy by writing '1' to
/sys/kernel/slab/$CACHENAME/shrink
So add appropriate code to sanity.sh so that when the 'shrink' file is
available, it is used.
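The "use it when available" check could look roughly like the following. This is a Python sketch rather than the sanity.sh change itself; `shrink_slab_cache` and its `slab_root` parameter are hypothetical, and writing to the real sysfs file normally requires root.

```python
import os


def shrink_slab_cache(cache_name, slab_root="/sys/kernel/slab"):
    """Write '1' to the cache's shrink file if that file exists.

    Returns True if the file was present and written, False otherwise
    (for example when the kernel does not provide the shrink file).
    """
    path = os.path.join(slab_root, cache_name, "shrink")
    if not os.path.isfile(path):
        return False
    with open(path, "w") as f:
        f.write("1")
    return True
```

A caller would shrink the cache first and only then read the "active_objs" count from slabinfo, falling back to the less accurate count when the function returns False.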
Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I36179d4609b5e4bcd1de00f0b5921c9c6bed72b0
Reviewed-on: https://review.whamcloud.com/39579
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Mr NeilBrown [Wed, 5 Aug 2020 01:34:21 +0000 (11:34 +1000)]
LU-11310 ldiskfs: Fix suse15/ext4-max-dir-size.patch
The ext4-max-dir-size patch for suse15 added a 'max_dir_size' sysfs
attribute with an incorrect implementation. The implementation is
identical to that for 'max_dir_size_kb', so setting or reading
'max_dir_size' will result in incorrect values. This causes
sanity test 129 to fail.
So add a suitable implementation for max_dir_size
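To see why sharing one implementation between the two attributes gives wrong values, consider this toy model in Python. It assumes, from the attribute names only, that 'max_dir_size' is expressed in bytes and 'max_dir_size_kb' in KiB, and that both describe the same limit; the class and method names are hypothetical and the rounding choice is arbitrary.

```python
class DirSizeLimit:
    """Toy model: one stored limit exposed through two attributes."""

    def __init__(self):
        self.limit_kb = 0

    def show_max_dir_size_kb(self):
        return self.limit_kb

    def store_max_dir_size_kb(self, kb):
        self.limit_kb = kb

    def show_max_dir_size(self):
        # Byte view: must scale, not return the KiB value unchanged.
        return self.limit_kb * 1024

    def store_max_dir_size(self, nbytes):
        # Byte view: must scale down before storing (rounds down here).
        self.limit_kb = nbytes // 1024
```

If the byte attribute reused the KiB handlers unchanged, storing 2097152 through it would be read back as 2097152 KiB instead of 2048 KiB, an error of a factor of 1024.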
Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I591259ed668bc828c3a7caa6a55d0de2b0d72797
Reviewed-on: https://review.whamcloud.com/39571
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
James Nunez [Thu, 30 Jul 2020 22:33:32 +0000 (16:33 -0600)]
LU-13773 tests: use TESTLOG_PREFIX in run_one_logged
TESTLOG_PREFIX is defined and exported in init_test_env()
in test-framework.sh. This environment variable is defined
as $LOGDIR/$TESTSUITE. TESTLOG_PREFIX should be used in the
definitions of test_log and zfs_debug_log. Since the logs
created in run_one_logged() don't use the defined prefix,
this could lead to differences in the naming of the dmesg,
debug_log and test_log logs.
Let's use TESTLOG_PREFIX in the definitions of test_log and
zfs_debug_log so that changes to the prefix are reflected in
all logs.
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I14dbe0469b7a6627d63679103c235b97f0e42b67
Reviewed-on: https://review.whamcloud.com/39552
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Nathaniel Clark [Fri, 24 Jul 2020 19:01:47 +0000 (15:01 -0400)]
LU-13819 build: Update ZFS version to 0.8.4
Changes from 0.8.3
* Add missing zfs_refcount_destroy() in key_mapping_rele() #10246
* Linux 5.7 compat: blk_alloc_queue() #10181 #10187
* Prefix struct rangelock #9534
* Fix icp include directories for in-tree build #10021
* ICP: gcm-avx: Support architectures lacking the MOVBE
instruction #10029
* ICP: Improve AES-GCM performance #9749
* Bugfix/fix uio partial copies #8673 #10148
* Prevent deadlock in arc_read in Linux memory reclaim
callback #9987
* Fix infinite scan on a pool with only special allocations
#10106 #8694
* Static symbols exported by ICP #9791
* Linux 5.6 compat: struct proc_ops #9961
* Linux 5.6 compat: timestamp_truncate() #9956 #9961
* Linux 5.6 compat: ktime_get_raw_ts64() #10052 #10064
* Linux 5.6 compat: time_t #10052 #10064
* Fix static data to link with -fno-common #9943
* zfs_get: change time format string from %k to %H #10090 #10153
* Deprecate deduplicated send streams #7887 #10117
* Fix zfs-functions packaging bug
* initramfs: Eliminate substitutions
* Delete built init scripts in make clean
* Restore :: in Makefile.am #9210
* Make init scripts depend on Makefile
* Systemd mount generator: don't fail keyload from file if
already loaded #10103
* Systemd mount generator: Generate noauto units; add control
properties #9649
* Systemd mount generator: Silence shellcheck warnings #9649
* Fix CONFIG_MODULES=no Linux kernel config #9887 #10063
* Linux 5.5 compat: blkg_tryget() #9745 #10072
* zfs-mount-generator: Fix escaping for / #9970
* Missed wakeup when growing kmem cache #9989
* Order zfs-import-*.service after multipathd #9863
* Avoid here-documents in systemd mount generator #9802
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I3a270376e05d466eeb8e8ba93d7c4aa0d2546ae6
Reviewed-on: https://review.whamcloud.com/39507
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Minh Diep [Fri, 24 Jul 2020 17:38:04 +0000 (10:38 -0700)]
LU-13818 build: use libsnmp-dev instead of libsnmp30
Installing libsnmp-dev will pull in the correct libsnmpXX.
By depending on libsnmp-dev we can install on
Ubuntu 20.04, which ships libsnmp35.
Change-Id: Ib921ac35e06149ba88fa8e39b9a0980deb94acf2
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39506
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
James Simmons [Sat, 25 Jul 2020 21:27:22 +0000 (17:27 -0400)]
LU-9325 fld: replace simple_strto* with kstr* functions
The fldb debugfs files use simple_strto* to parse input from the
user. simple_strto* is considered obsolete so replace it with
the equivalent kstrto* functions.
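One practical difference between the two families is strictness: simple_strto*-style parsing stops at the first non-digit and reports no error, while kstrto* rejects trailing garbage (apart from a single trailing newline) and reports overflow. The Python sketch below imitates that contrast for base 10 only; it is a simplified model with hypothetical function names, not the kernel implementation.

```python
import errno

U64_MAX = (1 << 64) - 1


def _is_digit(ch):
    return "0" <= ch <= "9"


def lenient_parse(text):
    """simple_strto*-like: consume leading digits, silently ignore the rest."""
    digits = ""
    for ch in text:
        if not _is_digit(ch):
            break
        digits += ch
    return int(digits) if digits else 0


def strict_parse_u64(text):
    """kstrto*-like: return (rc, value), rc being 0 or a negative errno."""
    if text.endswith("\n"):
        text = text[:-1]          # one trailing newline is tolerated
    if not text or not all(_is_digit(ch) for ch in text):
        return -errno.EINVAL, 0   # empty input or trailing garbage
    value = int(text)
    if value > U64_MAX:
        return -errno.ERANGE, 0   # does not fit in 64 bits
    return 0, value
```

For input such as "12abc" the lenient parser returns 12, whereas the strict one returns an error that a debugfs write handler can pass back to the user.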
Change-Id: I6d32939152ee0d65df4ec45937d7d0be03b8274e
Test-Parameters: trivial env=ONLY=68 testlist=conf-sanity
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/39498
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Lai Siyao [Sun, 12 Jul 2020 01:15:16 +0000 (09:15 +0800)]
LU-13791 sec: enable FS capabilities
FS capabilities are not effective because they are dropped for
non-root users for historical reasons: they used to be enforced
before operations, but now they are checked in the MDD layer only
(see mdd_fix_attr()).
Add sanity-sec 51.
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I3e355f5df5eab5509b5e6774dbc8b82281a34039
Reviewed-on: https://review.whamcloud.com/39399
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mr NeilBrown [Mon, 6 Jul 2020 12:34:44 +0000 (08:34 -0400)]
LU-9859 libcfs: don't save journal_info in dumplog thread.
As this thread is started by kthread, it must have
a clean environment and cannot possibly be in a
filesystem transaction. So current->journal_info
must be NULL, and preserving it serves no purpose.
Also change libcfs_debug_dumplog_internal() to 'static'
to make it clear that it shouldn't be called from
anywhere but this thread.
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ie863f792b36792600bef4fe778c46e97ebf046c3
Reviewed-on: https://review.whamcloud.com/39294
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mr NeilBrown [Tue, 30 Jun 2020 14:37:14 +0000 (08:37 -0600)]
LU-9859 libcfs: rename CFS_TCD_TYPE_MAX to CFS_TCD_TYPE_CNT
The possible TCD types are 0, 1, 2.
So the MAX is 2.
The count of the number of types is 3.
CFS_TCD_TYPE_MAX is 3 - obviously wrong.
So rename it to CFS_TCD_TYPE_CNT.
Also there are 2 places where "3" is used rather
than the macro - fix them.
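The MAX versus CNT distinction is the usual off-by-one between the largest value and the number of values. A tiny Python illustration follows; the member names are illustrative assumptions, only the values 0, 1, 2 come from the message.

```python
from enum import IntEnum


class TcdType(IntEnum):
    # Member names are placeholders; the message only gives the values.
    PROC = 0
    SOFTIRQ = 1
    IRQ = 2


TCD_TYPE_MAX = int(max(TcdType))   # largest valid type value
TCD_TYPE_CNT = len(TcdType)        # number of types, i.e. array size

# Arrays indexed by type need CNT entries, not MAX.
per_type_data = [[] for _ in range(TCD_TYPE_CNT)]
```

Here `TCD_TYPE_MAX` is 2 and `TCD_TYPE_CNT` is 3, so a constant with the value 3 is a count and should be named as one.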
Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ia4ce5fdb3225494f93d1eebd9fddfc15eb2b8316
Reviewed-on: https://review.whamcloud.com/39276
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Sat, 27 Jun 2020 11:14:02 +0000 (05:14 -0600)]
LU-13687 llite: return -ENODATA if no default layout
Don't return -ENOENT if fetching the default layout from the root
directory fails. Otherwise, "lfs find" will print an error message
for every directory scanned in the filesystem:
lfs find: /myth/tmp does not exist: No such file or directory
Fixes:
3e8fa8a7396c ("LU-11656 llite: fetch default layout for a directory")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I5e082c5d425c44ca7770d3b24cbb13bb7d2540e5
Reviewed-on: https://review.whamcloud.com/39200
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Lai Siyao [Mon, 22 Jun 2020 02:17:06 +0000 (10:17 +0800)]
LU-13700 test: increase sanity 230o/230p wait time
ZFS may be too slow to finish dir split/merge in time, so triple the
wait time to avoid failures.
Test-parameters: trivial fstype=zfs testlist=sanity mdscount=2 \
mdtcount=4 env=ONLY=230,ONLY_REPEAT=30
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I3d28c942ac925ea201936b53d0487d9a6bf9376c
Reviewed-on: https://review.whamcloud.com/39119
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Nikitas Angelinas [Wed, 17 Jun 2020 11:17:07 +0000 (04:17 -0700)]
LU-13688 hsm: handle in-tree executed copytools correctly
The Lustre test suite and HSM copytools can be invoked from either
within /usr/lib{,64}/ if they have been installed from source or from
packages, or from within the Lustre source tree, usually for
development purposes; in the latter case, the copytool process name is
prepended with an "lt-", due to being invoked via a libtool wrapper
script. The Lustre test framework relies on "libtool execute" to
distinguish between these two cases, parse the command parameters and
pass the correct process name as a parameter to utilities such as
pgrep(1), pkill(1), ps(1) and killall(1). Unfortunately, this doesn't
seem to work unless the libtool script for the copytool and the test
framework test file are in the same directory; e.g. this doesn't work
for lhsmtool_posix as its libtool script is in lustre/utils/, but the
Lustre test suite is in lustre/tests/, which doesn't allow the
"libtool execute" parsing and parameter replacing to succeed.
Fix this by determining the process name of the executed copytool
based on whether it was invoked from within the source tree or not and
using it in commands that either search for copytool processes or send
them signals by process name.
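The idea of choosing the process name from the invocation location can be sketched as follows. This is a hypothetical Python helper, not the test-framework.sh change: how the actual patch detects an in-tree run is not stated in the message, so the path comparison here is an assumption.

```python
import os


def copytool_process_name(copytool_path, lustre_src_root=None):
    """Return the name to pass to pgrep/pkill-style tools.

    Assumption: a copytool located under the source tree is run through
    its libtool wrapper, so its process name carries an "lt-" prefix.
    """
    name = os.path.basename(copytool_path)
    if lustre_src_root is None:
        return name
    path = os.path.abspath(copytool_path)
    root = os.path.abspath(lustre_src_root)
    in_tree = os.path.commonpath([path, root]) == root
    return "lt-" + name if in_tree else name
```

The returned name would then be used consistently wherever copytool processes are searched for or signalled by name.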
Signed-off-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
Cray-bug-id: LUS-8931
Change-Id: Ief7b224b793401b1a24bf9780d1df6e029f5c0d7
Reviewed-on: https://review.whamcloud.com/38962
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: nathan r <nrutman@gmail.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Nikitas Angelinas [Wed, 17 Jun 2020 11:04:45 +0000 (04:04 -0700)]
LU-13688 tests: remove duplicate HSM functions
Some HSM test framework functions exist in both test-framework.sh and
sanity-hsm.sh. Some of these are also used in PCC tests, so the
sanity-hsm.sh copy can be removed and some are used only in HSM tests,
so the test-framework.sh copy can be removed. The test-framework.sh
copies were introduced by LU-10092 which seems to have used versions
of the functions before they were updated by LU-11742, so update
kill_copytools() to the latest version.
Signed-off-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
Cray-bug-id: LUS-8913
Change-Id: I8101f748bfcfffb81598f7a5d2d82f2a16696e5c
Reviewed-on: https://review.whamcloud.com/38961
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sergey Cheremencev [Wed, 17 Jun 2020 11:01:13 +0000 (14:01 +0300)]
LU-13686 utils: pool_add/remove error code fix
jt_pool_cmd should always return an error code, even
if it failed to add/remove just one of the OSTs in the list.
Before this patch it returned the result of the last command,
ignoring previous failures.
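The difference between the two behaviors can be shown with a short Python sketch. The function names are hypothetical, and returning the first failure (rather than, say, the last one) is an assumption, since the message only says that an error must be returned.

```python
def run_all_last_result(ops):
    """Old behavior as described: the last result hides earlier failures."""
    rc = 0
    for op in ops:
        rc = op()
    return rc


def run_all_keep_error(ops):
    """New behavior: run every operation, but remember a failure."""
    rc = 0
    for op in ops:
        cur = op()
        if cur != 0 and rc == 0:
            rc = cur   # keep the first error seen (an assumption)
    return rc
```

With three operations returning 0, -22 and 0, the first function reports success while the second reports -22, and both still attempt every operation in the list.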
Change-Id: Ife6cefc006f061b47a1b00daf826d0d1d34fd66c
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/38960
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alex Zhuravlev [Mon, 15 Jun 2020 06:09:53 +0000 (09:09 +0300)]
LU-13676 tools: awk script to find unique backtraces
When looking at backtraces from the crash utility, it is not trivial
to find the interesting ones. This simple awk script can help:
1) dump backtraces in crash using "foreach bt" command
2) cat <file-with-backtraces> | crash-find-unique-traces.awk
the output will be like:
schedule,schedule_timeout,osc_extent_wait,osc_cache_wait_range,
osc_io_fsync_end,cl_io_end,lov_io_end_wrapper,lov_io_fsync_end,
cl_io_end,cl_io_loop,cl_sync_file_range,ll_writepages,do_writepages,
__writeback_single_inode,writeback_sb_inodes,wb_writeback,wb_workfn,
process_one_work,worker_thread,kthread PIDs: 7
schedule,schedule_hrtimeout_range_clock,poll_schedule_timeout,
do_sys_poll,__se_sys_poll PIDs: 2130
schedule,schedule_hrtimeout_range_clock,do_sigtimedwait,
__se_sys_rt_sigtimedwait PIDs: 2251
schedule,osd_trans_stop,ofd_commitrw,tgt_brw_write,tgt_request_handle,
ptlrpc_main,kthread PIDs: 11720
schedule,mdt_restriper_main PIDs: 12859
schedule,wb_wait_for_completion,sync_inodes_sb,sync_filesystem,
generic_shutdown_super,kill_anon_super,deactivate_locked_super,
cleanup_mnt,task_work_run,exit_to_usermode_loop PIDs: 15097
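The grouping idea behind the script can be sketched in Python as well. This is not the awk script from the patch: the regular expressions assume the usual crash "bt" layout of a "PID: ..." header followed by "#N [address] function at address" frames, and the way several PIDs are joined on one output line is a guess.

```python
import re

PID_RE = re.compile(r"^PID:\s*(\d+)")
FRAME_RE = re.compile(r"^\s*#\d+\s+\[[0-9a-fA-F]+\]\s+(\S+)\s+at\s")


def _flush(traces, pid, frames):
    """Record the finished backtrace under its call chain."""
    if pid is not None and frames:
        traces.setdefault(tuple(frames), []).append(pid)


def unique_traces(lines):
    """Map each distinct call chain (tuple of function names) to its PIDs."""
    traces = {}
    pid, frames = None, []
    for line in lines:
        m = PID_RE.match(line)
        if m:
            _flush(traces, pid, frames)
            pid, frames = int(m.group(1)), []
            continue
        m = FRAME_RE.match(line)
        if m:
            frames.append(m.group(1))
    _flush(traces, pid, frames)
    return traces


def format_trace(chain, pids):
    """Render one entry in the style of the sample output."""
    return ",".join(chain) + " PIDs: " + " ".join(str(p) for p in pids)
```

Threads that share an identical chain collapse into one entry, so a dump with thousands of idle service threads shrinks to a handful of lines and the unusual stacks stand out.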
Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I94514189f15cf559336217fddf7b665dde0c8f77
Reviewed-on: https://review.whamcloud.com/38936
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Shaun Tancheff [Wed, 5 Aug 2020 14:17:03 +0000 (09:17 -0500)]
LU-13742 llite: do not bypass selinux xattr handling
Without the hint from selinux_is_enabled() to determine whether selinux
is running at boot, the performance fix from LU-549 that skips handling
of selinux xattrs cannot be applied correctly.
The correct path is to act as if selinux is enabled.
This fixes a bug introduced by LU-12355 that now exists in
RHEL 8.2 kernels where clients have enabled selinux.
Fixes:
39e5bfa734 ("LU-12355 llite: include file linux/selinux.h removed")
Test-Parameters: clientdistro=el8.2 serverdistro=el8.2 clientselinux testlist=sanity-selinux
Test-Parameters: clientdistro=el8.1 serverdistro=el8.1 clientselinux testlist=sanity-selinux
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I6fb5ed9ecdb79545225b5586b90509eb157a355b
Reviewed-on: https://review.whamcloud.com/39569
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mr NeilBrown [Thu, 16 Jul 2020 06:30:47 +0000 (16:30 +1000)]
LU-6142 lmv: make various functions static.
Multiple functions in lmv_obd.c are only used in the same file, so they
can be made static.
Also make a few style cleanups and fix a spelling error.
Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I84bf4e8d485dc6a7f8811035b8a689e5e0d91455
Reviewed-on: https://review.whamcloud.com/39402
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mr NeilBrown [Wed, 15 Jul 2020 07:14:08 +0000 (17:14 +1000)]
LU-6142 lov: make various lov_object.c function static.
These functions in lov_object.c and lovsub_object.c are only
used in the file in which they are defined.
So mark them as static.
Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I958f4e850d13d2ced32772e0e66627eb40a1bf36
Reviewed-on: https://review.whamcloud.com/39385
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mr NeilBrown [Thu, 16 Jul 2020 03:53:39 +0000 (13:53 +1000)]
LU-6142 obdclass: make obd_psdev static
obd_psdev is only used in one file, so it can be local to that file.
Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ib74d78e0e72e054d5f998d54f1476216926b293b
Reviewed-on: https://review.whamcloud.com/39395
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mr NeilBrown [Tue, 7 Jul 2020 22:14:21 +0000 (08:14 +1000)]
LU-6142 lustre: use init_wait(), not init_waitqueue_entry()
init_waitqueue_entry(foo, current)
is equivalent to
init_wait(foo)
So use the shorter version - in lustre and libcfs.
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I621364d8f6b155df3f2159dfca39f252abc81c76
Reviewed-on: https://review.whamcloud.com/39300
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
James Simmons [Mon, 27 Jul 2020 17:11:20 +0000 (13:11 -0400)]
LU-13740 build: update changelog for ubuntu kernel
With support for all the latest Linux kernels added to Lustre,
Ubuntu 20.04 LTS support for clients is already there.
Test-Parameters: trivial
Change-Id: I35916f3205bff62e2c6ef01f03725aa890c5c8e7
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/39231
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Alexander Boyko [Mon, 1 Jun 2020 12:38:07 +0000 (08:38 -0400)]
LU-13617 tests: check client deadlock selinux
The patch adds test_20e to sanity-selinux. It checks client deadlock
and MDS eviction for it.
Test-Parameters: trivial testlist=sanity-selinux env=ONLY=20e
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Cray-bug-id: LUS-8924
Change-Id: If7707fa14f7307fb3a3fb2228fbd1983b55cbe6b
Reviewed-on: https://review.whamcloud.com/38793
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mikhail Pershin [Thu, 30 Jul 2020 11:56:46 +0000 (14:56 +0300)]
LU-13759 test: make sanityn test_20 repeatable
- make sanityn.sh test_20 able to work with the ONLY_REPEAT
parameter by using the $tdir and $tfile variables, which are
cleaned up by the test framework, so the test can be repeated
- change the way sanity-dom.sh defines sanity.sh and sanityn.sh
parameters to allow selected tests to be run by using SANITY_ONLY and
SANITYN_ONLY to choose specific tests. It also supports
SANITY_REPEAT/SANITYN_REPEAT to repeat those tests.
Test-Parameters: trivial testlist=sanity-dom env=SANITY_ONLY=36,SANITYN_ONLY=20,SANITYN_REPEAT=50
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I314703006a8f53092daf1359f4c4694c704354d2
Reviewed-on: https://review.whamcloud.com/39540
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Alexander Zarochentsev [Thu, 18 Jun 2020 06:18:05 +0000 (09:18 +0300)]
LU-13809 mdc: fix lovea for replay
lmm->lmm_stripe_offset gets overwritten by
layout generation at server reply,
so MDT does not recognize such LOVEA as
a valid striping at open request replay.
This patch extends the LU-7008 fix by adding support
for PFL layouts.
HPE-bug-id: LUS-8820
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: If28836c2fcb08620dd3dc869ddfe35147c69e711
Reviewed-on: https://review.whamcloud.com/39468
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mr NeilBrown [Tue, 21 Jul 2020 01:09:37 +0000 (11:09 +1000)]
LU-12275 sec: use memchr_inv() to check if page is zero.
memchr_inv() is the preferred way to check if a memory region is all
zeros. It is likely faster than memcmp() as it doesn't need to read the
ZERO_PAGE into cache, or into the CPU. It was introduced in Linux
3.2.
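As an illustration of the check described above, here is a minimal userspace sketch. memchr_inv_sketch() and region_is_zero() are hypothetical names that only mimic the semantics of the kernel's memchr_inv(); they are not the kernel implementation, which is optimized to compare more than one byte at a time.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for the kernel's memchr_inv():
 * return a pointer to the first byte in [start, start + bytes) that
 * differs from c, or NULL if every byte equals c.
 */
static const void *memchr_inv_sketch(const void *start, int c, size_t bytes)
{
	const unsigned char *p = start;
	size_t i;

	assert(start != NULL || bytes == 0);
	for (i = 0; i < bytes; i++) {
		if (p[i] != (unsigned char)c)
			return p + i;
	}
	return NULL;
}

/* A region is all zeros exactly when no non-zero byte is found. */
static int region_is_zero(const void *buf, size_t len)
{
	return memchr_inv_sketch(buf, 0, len) == NULL;
}
```

Unlike a memcmp() against a reference page, this form only reads the buffer being checked, which is the advantage the message points out.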
Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I0a5c3d30d5db43a3f5ebb270ea66b9db2b200a9a
Reviewed-on: https://review.whamcloud.com/39459
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Mikhail Pershin [Wed, 15 Jul 2020 05:12:55 +0000 (08:12 +0300)]
LU-13759 dom: lock cancel to drop pages
Prevent stale pages after lock cancel by creating
cl_page connection for read-on-open pages.
This reverts 02e766f5ed to fix the problem.
Since VM pages are connected to cl_object they can be
found and discarded by CLIO properly.
Fixes: 02e766f5ed ("LU-11427 llite: optimize read on open pages")
Test-Parameters: mdssizegb=20 testlist=dom-performance
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Iba8c87c934c442b4c0b45d7d3821ceede1a6e68f
Reviewed-on: https://review.whamcloud.com/39401
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sergey Cheremencev [Tue, 7 Jul 2020 20:19:48 +0000 (23:19 +0300)]
LU-13359 quota: make used for pool correct
Before this patch, the used space for a quota pool
was the sum of the space used by the user on all OSTs
in the system. Now lfs quota --pool
takes into account only OSTs from the pool.
With option -v it also shows only OSTs from the pool.
Change-Id: Idf1c8ed66fca7caec70246ea4182df883bcef23c
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/39298
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Andreas Dilger [Sat, 27 Jun 2020 11:32:37 +0000 (05:32 -0600)]
LU-13127 ptlrpc: don't require CONFIG_CRYPTO_CRC32
Don't require CONFIG_CRYPTO_CRC32 to build if not configured,
as it may not be available for all kernels and is easily fixed.
Consolidate the early reply code in sec_plain.c to also call
lustre_msg_calc_cksum() to reduce code duplication.
Fixes: e1a0f602a608 ("LU-13127 libcfs: make noise to console if CRC32 is missing")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I00511df418ddfbd8522936cf2bc0f3193d2540e5
Reviewed-on: https://review.whamcloud.com/39201
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Vikentsi Lapa [Mon, 29 Jun 2020 08:06:14 +0000 (08:06 +0000)]
LU-13718 tests: add LU numbers to skipped tests
Some tests in the ALWAYS_EXCEPT variable do not contain LU- numbers in
their description. Test formatting is also inconsistent (mixed tabs and
spaces in one line).
This patch adds the missing numbers and corrects the formatting. This
improvement reduces the time needed to find out why a test was skipped
and lets parsing code build a table of skipped tests.
Test-Parameters: trivial testlist=sanity-lfsck mdscount=2 mdtcount=8 osscount=1 ostcount=4
Signed-off-by: Vikentsi Lapa <vlapa@whamcloud.com>
Change-Id: Iada59d1e01b8ecd07af91157abef483df7715178
Reviewed-on: https://review.whamcloud.com/39217
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Serguei Smirnov [Thu, 16 Jul 2020 18:16:48 +0000 (14:16 -0400)]
LU-13790 socklnd: NID to interface mapping issues
Fix the NID to interface mapping in ksocknal_startup to make sure
the messages go out the interface assigned by LNet on a system
with multiple interfaces configured.
Test-Parameters: trivial
Fixes: b770d7117f35 ("LU-11893 lnet: consoldate secondary IP address handling")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I22a47fcf17dc0b8b2bf2abebb6b295f4b0550c00
Reviewed-on: https://review.whamcloud.com/39408
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
James Simmons [Wed, 15 Jul 2020 18:33:24 +0000 (14:33 -0400)]
LU-13787 build: fix snmp / libcfs build order
The Lustre snmp code is dependent on libcfs so make
sure libcfs is built first. This only shows up in
parallel builds.
Fixes: 742897a967cf ("LU-13274 uapi: make lnet UAPI headers C99 compliant")
Test-Parameters: trivial
Change-Id: Ibe1669e8586eb54129d3b9dd74b0287766af0bf3
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/39388
Reviewed-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Alexander Zarochentsev [Mon, 15 Jun 2020 15:07:20 +0000 (18:07 +0300)]
LU-13784 tests: allow QUOTA_TYPE to be set
QUOTA_TYPE is unconditionally set to "ug3" in
lustre/test/cfg/local.sh, which makes enabling
project quota support a non-trivial task.
Fix conf-sanity test 86 to accept more than
one -O option.
HPE-bug-id: LUS-8983
Test-Parameters: envdefinitions="ENABLE_QUOTA=yes QUOTA_TYPE=p"
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: Ie9bfa536b5ea704e0637afb10a8bb82c64b2bdf6
Reviewed-on: https://review.whamcloud.com/39359
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Chris Horn [Thu, 9 Jul 2020 18:33:49 +0000 (13:33 -0500)]
LU-13782 lnet: Have LNet routers monitor the ni_fatal flag
Have the LNet monitor thread on LNet routers check the
ni_fatal_error_on flag to set local NI status appropriately. When
this results in a status change, perform a discovery push to all
peers. This allows peers to update their route status appropriately.
Test-Parameters: trivial
HPE-bug-id: LUS-9068
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ic4f8f33c6377f4b95f6ab95f9714414c6b9ab5e6
Reviewed-on: https://review.whamcloud.com/39353
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Chris Horn [Wed, 8 Jul 2020 21:03:48 +0000 (16:03 -0500)]
LU-13764 lnet: Clear lp_dc_error when discovery completes
If discovery completes successfully then we can clear the
lp_dc_error.
Test-Parameters: trivial
HPE-bug-id: LUS-9081
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: If709022c5c4ba0ab8f01b3f4b508ed464fd0b6ff
Reviewed-on: https://review.whamcloud.com/39348
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sergey Gorenko [Tue, 7 Jul 2020 11:31:31 +0000 (14:31 +0300)]
LU-13761 o2ib: Fix compilation with MOFED 5.1
A new argument was added to rdma_reject() in MOFED 5.1 and
Linux 5.8.
Add a configure check and support both versions of rdma_reject().
Test-Parameters: trivial
Signed-off-by: Sergey Gorenko <sergeygo@mellanox.com>
Change-Id: I2b28991f335658b651b21a09899b7b17ab2a9d57
Reviewed-on: https://review.whamcloud.com/39323
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Wang Shilong [Fri, 10 Jul 2020 07:29:39 +0000 (15:29 +0800)]
LU-13775 target: fix memory copy in tgt_pages2shortio()
tgt_pages2shortio() tries to copy local page memory to the ptlrpc
inline buf.
The right logic should move the page @ptr to offset + count. However,
the code does this wrongly. This hasn't caused any problems so
far, because @lnb_page_offset is normally 0, but when I tried to
play with unaligned DIO, we could hit the problem.
Fix it to use the right logic to handle the memory.
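A rough userspace sketch of this kind of copy follows, under the assumption that each page contributes the bytes starting at its own page offset. struct page_frag and frags_to_inline_buf() are hypothetical names for illustration only; this is not the tgt_pages2shortio() code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, simplified stand-in for a local niobuf: a data page plus
 * the offset and length of the valid bytes inside it.
 */
struct page_frag {
	const unsigned char *data;	/* start of the page */
	size_t page_offset;		/* where valid data begins in the page */
	size_t len;			/* number of valid bytes */
};

/*
 * Copy up to 'size' bytes from the fragments into 'buf'.
 * Each copy reads from data + page_offset, not from the start of the
 * page, and the destination advances by the bytes already copied.
 * Returns the number of bytes copied.
 */
static size_t frags_to_inline_buf(const struct page_frag *frags, int nfrags,
				  unsigned char *buf, size_t size)
{
	size_t copied = 0;
	int i;

	assert(frags != NULL && buf != NULL);
	for (i = 0; i < nfrags && copied < size; i++) {
		size_t n = frags[i].len;

		if (n > size - copied)
			n = size - copied;
		memcpy(buf + copied, frags[i].data + frags[i].page_offset, n);
		copied += n;
	}
	return copied;
}
```

With a page offset of 0 the source pointer is simply the start of the page, so a mistake in applying the offset stays hidden; it only shows up with unaligned I/O, which matches the observation in the message.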
Fixes: 70f092a ("LU-1757 brw: add short io osc/ost transfer.")
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I0a2e05732c0f425043af8393eb41f6bec178da6f
Reviewed-on: https://review.whamcloud.com/39333
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Mon, 6 Jul 2020 13:52:45 +0000 (21:52 +0800)]
LU-13437 llite: pack parent FID in getattr
Pack parent FID in getattr request if OBD_CONNECT2_GETATTR_PFID is
enabled, otherwise fill it with target FID for backward compatibility.
Fixes: f9a2da63 ("LU-13437 mdt: don't fetch LOOKUP lock for remot...")
Test-Parameters: clientversion=2.12 testlist=sanity env=SANITY_EXCEPT="27M 151 156"
Test-Parameters: serverversion=2.12 testlist=sanity env=SANITY_EXCEPT="56 165 205b"
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Idcf8388b65dee1f0a09a53b240ce8303f3c6ff75
Reviewed-on: https://review.whamcloud.com/39290
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Chris Horn [Mon, 29 Jun 2020 18:44:07 +0000 (13:44 -0500)]
LU-13734 lnet: Allow duplicate nets in ip2nets syntax
Before the MR feature was implemented, it was not possible to
configure multiple interfaces on the same LNet, so the ip2nets
syntax did not allow for this. Now that we have the MR feature, we should
allow it to be configured via ip2nets syntax. e.g.
o2ib(ib0) 10.10.10.1
o2ib(ib1) 10.10.10.2
A test is added for configuring LNet with kernel ip2nets parameter.
setup_netns() refactored to facilitate the new test.
cleanup_lnet() is modified to check whether the lnet module is loaded
before attempting lnetctl lnet unconfigure; otherwise sanity-lnet.sh
could exit with rc 234 on cleanup.
Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-9046
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Iafc3882035269073fd7e4abb53d138d9267f6e21
Reviewed-on: https://review.whamcloud.com/39227
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mr NeilBrown [Mon, 22 Jun 2020 03:57:02 +0000 (13:57 +1000)]
LU-12678 lnet: clarify initialization of lpni_refcount
This refcount is not explicitly initialized, so is implicitly
initialized to zero. This prohibits the use of
lnet_peer_ni_addref_locked() for taking the first reference,
so a couple of places open-code the atomic_inc just in case.
There is code in lnet_peer_add_nid() which drops a reference before
accessing the structure. This isn't actually wrong, but it looks
weird.
lnet_destroy_peer_ni_locked() makes assumptions about the content of
the structure, so it cannot be used on a partially initialized
structure.
All these special cases make the code harder to understand. This
patch cleans this up:
- lpni_refcount is now initialized to one, so the caller of
lnet_peer_ni_alloc() now owns a reference and must be sure
to release it.
- lnet_peer_attach_peer_ni() now consumes a reference to
the lpni. A pointer returned by lnet_peer_ni_alloc()
is most often passed to lnet_peer_attach_peer_ni() so
these two changes largely cancel each other out, though not completely.
- The two 'atomic_inc' calls are changed to
'lnet_peer_ni_addref_locked()'.
- A LIBCFS_FREE() is replaced by lnet_peer_ni_decref_locked(),
and that function is improved to cope with lpni_hashlist
being empty, or ->lpni_net being NULL.
- lnet_peer_add_nid() now holds a reference on the lpni until
it doesn't need it any more, then explicitly drops it.
This should make no functional change, but should make the code a
little less confusing.
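The ownership convention described above can be sketched in userspace C. Every name here (struct peer_ni, peer_ni_alloc(), peer_attach(), and so on) is a hypothetical stand-in chosen for illustration, with a plain int instead of an atomic counter and no locking; it is not the LNet code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a peer NI with a plain int refcount. */
struct peer_ni {
	int refcount;
};

static int live_objects;		/* objects allocated and not yet freed */
static struct peer_ni *attached_slot;	/* stands in for the owning peer */

/* The refcount starts at one: the caller owns the first reference. */
static struct peer_ni *peer_ni_alloc(void)
{
	struct peer_ni *p = calloc(1, sizeof(*p));

	if (p) {
		p->refcount = 1;
		live_objects++;
	}
	return p;
}

/* Taking a further reference is only valid on a live object. */
static void peer_ni_addref(struct peer_ni *p)
{
	assert(p->refcount > 0);
	p->refcount++;
}

/* Dropping the last reference frees the object. */
static void peer_ni_decref(struct peer_ni *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0) {
		live_objects--;
		free(p);
	}
}

/* Consumes one of the caller's references: the slot now owns it. */
static void peer_attach(struct peer_ni *p)
{
	attached_slot = p;
}

/* Releases the reference that peer_attach() took over. */
static void peer_detach(void)
{
	struct peer_ni *p = attached_slot;

	attached_slot = NULL;
	peer_ni_decref(p);
}
```

A caller that wants to keep using the object after handing it to peer_attach() takes an extra reference first and drops it when done, which mirrors the last bullet in the list.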
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Iec312e637d1e7b6eb14f2c363843403dd5cf8e8f
Reviewed-on: https://review.whamcloud.com/39120
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mr NeilBrown [Tue, 14 Jul 2020 04:12:54 +0000 (14:12 +1000)]
LU-6142 lnet: discard unused lnet_print_hdr()
lnet_print_hdr() is unused, and has not been used in git history since
at least 2004.
So remove it.
Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I0d72726b69f205a8c62ae4dbf1423f6e745db5fe
Reviewed-on: https://review.whamcloud.com/39358
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Mr NeilBrown [Mon, 6 Jul 2020 01:53:39 +0000 (11:53 +1000)]
LU-6142 socklnd: remove declarations of missing functions.
None of ksocknal_query(), ksocknal_notify(), and
ksocknal_lib_bind_thread_to_cpu() exist, so don't declare them.
Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ice87c317f116bda9c04dcaa285bc7ba47be219ca
Reviewed-on: https://review.whamcloud.com/39326
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Chris Horn [Wed, 24 Jun 2020 16:17:45 +0000 (11:17 -0500)]
LU-13712 lnet: Preferred NI logic breaks MR routing
Edge (final-hop) routers typically use the non-multi-rail destination
(NMR_DST) send case. i.e. they treat the destination as
non-multi-rail. The reason for this is that we do not want routers to
modify the destination peer interface selected by the message
originator. As a result of using the NMR_DST send case, edge routers
set a preferred NI, and then continue to use that NI, because it's
preferred, even if the NI goes down and the router has other healthy
interfaces available to it. Routers do not need to use the preferred
NI selection logic when they are forwarding a message, so modify the
NMR_DST algorithm to allow routers to select any suitable local NI.
Test-Parameters: trivial
HPE-bug-id: LUS-9045
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Iae0fb47d58a70f640d316a8c85cf3058ca2f82eb
Reviewed-on: https://review.whamcloud.com/39168
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Chris Horn [Fri, 1 May 2020 20:50:57 +0000 (15:50 -0500)]
LU-13502 lnet: Ensure LNet pings and pushes are always tracked
Add the appropriate option to the MD used for LNet pings and pushes
to ensure that these are always tracked via LNet's response tracking
mechanism, regardless of the value of lnet_response_tracking
variable.
Test-Parameters: trivial
HPE-bug-id: LUS-8827
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I13d8ee42ccbb00c85843f64314b1f953d679a0dc
Reviewed-on: https://review.whamcloud.com/38451
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Chris Horn [Fri, 1 May 2020 20:47:06 +0000 (15:47 -0500)]
LU-13502 lnet: Add param to control response tracking
Add lnet_response_tracking parameter which will be used to control
the behavior of LNet response tracking.
Test-Parameters: trivial
HPE-bug-id: LUS-8827
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I9c5be488673bbaa3c3cb983fe099d2203c1d9fa7
Reviewed-on: https://review.whamcloud.com/38449
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Oleg Drokin [Wed, 22 Jul 2020 16:55:23 +0000 (12:55 -0400)]
New tag 2.13.55
Change-Id: Iefb108c0dc97b0c69407e07ca08bcfc14f7dbfe2
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Lai Siyao [Mon, 6 Jul 2020 13:03:59 +0000 (21:03 +0800)]
LU-13437 uapi: add OBD_CONNECT2_GETATTR_PFID
Add OBD_CONNECT2_GETATTR_PFID connect flag to pack parent FID in
getattr request, which will be used to check whether target is
remote object, if so, don't take LOOKUP lock, otherwise client
may see stale directory entries.
Test-Parameters: trivial
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ibdf880934456f255f83cd4bac9d61ab5e1ed7330
Reviewed-on: https://review.whamcloud.com/39289
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Jian Yu [Sun, 5 Jul 2020 08:16:06 +0000 (01:16 -0700)]
LU-13731 autoconf: check if VM_FAULT_RETRY is defined
In RHEL 8.2 kernel 4.18.0-193.el8, VM_FAULT_RETRY is
defined as an enumeration constant in linux/mm_types.h
instead of a macro in linux/mm.h. This patch adds
autoconf macros to check if VM_FAULT_RETRY is defined
at configure time.
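The reason a preprocessor test alone is not enough can be shown with a small sketch. FLAG_AS_MACRO and FLAG_AS_ENUM are hypothetical stand-ins for the two ways a header may define such a name; the value 0x400 is arbitrary and is not taken from the kernel.

```c
#include <assert.h>

/*
 * Hypothetical stand-ins: one flag defined as a macro, one as an
 * enumeration constant.
 */
#define FLAG_AS_MACRO 0x400
enum sketch_fault_flags { FLAG_AS_ENUM = 0x400 };

/* The preprocessor sees macros... */
#ifdef FLAG_AS_MACRO
#define MACRO_SEEN_BY_IFDEF 1
#else
#define MACRO_SEEN_BY_IFDEF 0
#endif

/* ...but knows nothing about enumeration constants. */
#ifdef FLAG_AS_ENUM
#define ENUM_SEEN_BY_IFDEF 1
#else
#define ENUM_SEEN_BY_IFDEF 0
#endif

/*
 * Using the name as a value compiles in both cases, which is what a
 * configure-time compile test can detect.
 */
static int flag_value(void)
{
	assert(FLAG_AS_ENUM == FLAG_AS_MACRO);
	return FLAG_AS_ENUM;
}
```

An #ifdef guard would therefore treat the enum form as missing, while a compile test that references the name succeeds for either definition.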
Test-Parameters: clientdistro=el8.2 serverdistro=el8.2 \
testlist=sanity
Test-Parameters: clientdistro=el8.1 serverdistro=el8.1 \
testlist=sanity
Fixes: 2e813f3e2d ("LU-13731 llite: include linux/mm_types.h for VM_FAULT_RETRY")
Change-Id: I2fdae7b62a53e447a7eb979787bdbd79423b787d
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39281
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Menadue <ben.menadue@anu.edu.au>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Emoly Liu [Wed, 1 Jul 2020 10:07:00 +0000 (18:07 +0800)]
LU-13732 lfs: fid2path should match the root path correctly
This patch is to match the root path in function get_root_path()
correctly. For example, if the mount point is /mnt/lustre, the
following root path formats are acceptable:
- /mnt/lustre
- /mnt/lustre/*
sanity.sh test_154A/247d are modified to verify this patch.
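The matching rule for the two accepted formats can be sketched as a prefix comparison that also checks the character following the prefix. path_under_mount() is a hypothetical helper written for illustration; it is not the get_root_path() implementation.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if 'path' is the mount point itself or
 * lies beneath it, 0 otherwise.
 */
static int path_under_mount(const char *mnt, const char *path)
{
	size_t len;

	assert(mnt != NULL && path != NULL);
	len = strlen(mnt);

	/* Ignore trailing slashes on the mount point, but keep "/" itself. */
	while (len > 1 && mnt[len - 1] == '/')
		len--;

	/* Every absolute path lies under a mount point of "/". */
	if (len == 1 && mnt[0] == '/')
		return path[0] == '/';

	if (strncmp(path, mnt, len) != 0)
		return 0;

	/*
	 * The prefix must end at a component boundary, so that
	 * /mnt/lustre2 does not match /mnt/lustre.
	 */
	return path[len] == '\0' || path[len] == '/';
}
```

The boundary check is the important part: a bare prefix comparison would accept /mnt/lustre2/file for a mount point of /mnt/lustre.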
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: If705dd341b273d462aeba280fa27d5608b5f3b7c
Reviewed-on: https://review.whamcloud.com/39225
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sebastien Buisson [Fri, 12 Jun 2020 10:52:28 +0000 (10:52 +0000)]
LU-12275 sec: check if page is empty with ZERO_PAGE
In osc_brw_fini_request(), page needs decryption only if it
is not empty. To check this, use ZERO_PAGE macro available
for all architectures, and compare with memcmp.
It will likely be faster/more efficient than comparing the
words by hand, as it may use optimized CPU instructions or ASM code.
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I5e04b72790e8acbceb1989ba3659e170c0b11192
Reviewed-on: https://review.whamcloud.com/38918
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sebastien Buisson [Fri, 22 May 2020 07:27:48 +0000 (07:27 +0000)]
LU-12275 sec: encryption support for DoM files
On client side, data read from DoM files do not go through the OSC
layer. So implement file decryption in ll_dom_finish_open() right
after file data has been put in cache pages.
On server side, DoM file size needs to be properly set on MDT when
content is encrypted. Pages are full of encrypted data, but inode size
must be apparent, clear text object size.
For reads of DoM encrypted files to work properly, we also need to
make sure we send whole encryption units to client side.
Also add sanity-sec test_50 to exercise encryption of DoM files.
Test-Parameters: testlist=sanity-sec envdefinitions=ONLY="36 37 38 39 40 41 42 43 44 45 46 47 48 49 50" clientdistro=el8.1 fstype=ldiskfs mdscount=2 mdtcount=4
Test-Parameters: testlist=sanity-sec envdefinitions=ONLY="36 37 38 39 40 41 42 43 44 45 46 47 48 49 50" clientdistro=el8.1 fstype=zfs mdscount=2 mdtcount=4
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I7721ca4085373a7a01b2062c37458a7136e646e0
Reviewed-on: https://review.whamcloud.com/38702
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sebastien Buisson [Wed, 20 May 2020 09:17:57 +0000 (18:17 +0900)]
LU-13593 ptlrpc: fix growing message buffer
In case some buffers need to be moved because of segment growth
from req_capsule_server_grow(), just set buflen to old length
before actually calling lustre_grow_msg().
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I6707927a0f24c0637dbc79aa91788122a84ab8c4
Reviewed-on: https://review.whamcloud.com/38701
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sergey Cheremencev [Tue, 19 May 2020 11:41:15 +0000 (14:41 +0300)]
LU-13586 tests: Quota Pools with PFL and SEL
Add sanity-quota_71a that writes to a file
consisting of 2 components on different OSTs (each OST
belongs to a unique pool). Check that limits in quota
pools work properly. sanity-quota_71b does the same
but for a file with SEL.
Test-Parameters: envdefinitions=ONLY=71 testlist=sanity-quota
Change-Id: I835bec4c9b21c287142e1df9b4cfe797ec68fbef
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/38661
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sebastien Buisson [Thu, 30 Apr 2020 15:23:00 +0000 (15:23 +0000)]
LU-12275 sec: atomicity of encryption context getting/setting
Encryption layer needs to set an encryption context on files and dirs
that are encrypted. This context is stored as an extended attribute,
that then needs to be fetched upon metadata ops like lookup, getattr,
open, truncate, and layout.
With this patch we send encryption context to the MDT along with
create RPCs. This closes the insecure window between creation and
setting of the encryption context, and saves a setxattr request.
This patch also introduces a way to have the MDT return encryption
context upon granted lock reply, making the encryption context
retrieval atomic, and sparing the client an additional getxattr
request.
Test-Parameters: testlist=sanity-sec envdefinitions=ONLY="36 37 38 39 40 41 42 43 44 45 46 47 48 49" clientdistro=el8.1 fstype=ldiskfs mdscount=2 mdtcount=4
Test-Parameters: testlist=sanity-sec envdefinitions=ONLY="36 37 38 39 40 41 42 43 44 45 46 47 48 49" clientdistro=el8.1 fstype=zfs mdscount=2 mdtcount=4
Test-Parameters: clientversion=2.12 env=SANITY_EXCEPT="27M 56ra 151 156 802"
Test-Parameters: serverversion=2.12 env=SANITY_EXCEPT="56oc 56od 165a 165b 165d 205b"
Test-Parameters: serverversion=2.12 clientdistro=el8.1 env=SANITYN_EXCEPT=106,SANITY_EXCEPT="56oc 56od 165a 165b 165d 205b"
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I45599cdff13d5587103aff6edd699abcda6cb8f4
Reviewed-on: https://review.whamcloud.com/38430
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Sebastien Buisson [Tue, 9 Jun 2020 15:27:53 +0000 (15:27 +0000)]
LU-12275 sec: force file name encryption policy to null
Force file/directory name encryption policy to null on newly created
inodes. This is required because first implementation step of client
side encryption only supports content encryption, and not names.
This requires forcing use of the embedded llcrypt lib instead of
the in-kernel fscrypt lib, even if the kernel provides it.
This patch will have to be reverted when name encryption is
implemented.
Test-Parameters: testlist=sanity-sec envdefinitions=ONLY="36 37 38 39 40 41 42 43 44 45 46 47 48" clientdistro=el8.1 fstype=ldiskfs mdscount=2 mdtcount=4
Test-Parameters: testlist=sanity-sec envdefinitions=ONLY="36 37 38 39 40 41 42 43 44 45 46 47 48" clientdistro=el8.1 fstype=zfs mdscount=2 mdtcount=4
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ia697a29006507278c218088d7c3a5e5ade620a15
Reviewed-on: https://review.whamcloud.com/38882
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Oleg Drokin [Sat, 11 Jul 2020 04:51:27 +0000 (00:51 -0400)]
LU-13776 tests: make sure pjdfstest.sh writes to tmp
Do not write to random Lustre source locations, as they could be read-only.
Test-Parameters: trivial
Test-Parameters: fstype=ldiskfs testlist=pjdfstest
Test-Parameters: fstype=zfs testlist=pjdfstest
Change-Id: Icd262a698390eadf4b53cd5d311bc6c2a561a79e
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39338
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>