git://git.whamcloud.com - fs/lustre-release.git/log

LU-15293 build: Add build support for arm64 centos8

This patch adds lbuid support for latest Arm64 CentOS 8.4, 8.5.

Also fix build doesn't use Lustre provided kernel config file
on CentOS8.

Test-Parameters: trivial

Test-Parameters: env=SANITY_EXCEPT="101j" \
clientdistro=el8.5 serverdistro=el8.5 testlist=sanity

Change-Id: I95c7aa7e77ea1cc7a99fdaacc2220e14d2db6185
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Reviewed-on: https://review.whamcloud.com/45691
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12678 o2iblnd: convert ibp_refcount to a kref

This refcount is used exactly like a kref. So change it to one.
kref uses refcount_t which will warn on increment-from-zero and
similar problems (which enabled with CONFIG option), so we don't
need the LASSERT calls.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: James Simmons <jsimmons@infradead.org>
Change-Id: I23ade8c2f768c70a1fd330e8c173e0d18f5ff976
Reviewed-on: https://review.whamcloud.com/45685
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15234 lnet: Race on discovery queue

If the discovery thread clears the LNET_PEER_DISCOVERING bit then a
race window opens when the discovery thread drops the
lnet_peer.lp_lock spinlock and closes when the discovery thread
acquires the lnet_net_lock. If another thread queues the peer for
discovery during this window then the LNET_PEER_DISCOVERING bit is
added back to the peer state, but since the peer is already on the
lnet.ln_dc_working queue, it does not get added to the
lnet.ln_dc_request queue.

When the discovery thread acquires the lnet_net_lock/EX, it sees that
the LNET_PEER_DISCOVERING bit has not been cleared, so it does not
call lnet_peer_discovery_complete() which is responsible for sending
messages on the peer's discovery pending queue.

At this point, the peer is stuck on the lnet.ln_dc_working queue, and
messages may continue to accumulate on the peer's
lnet_peer.lp_dc_pendq.

Fix the issue by re-working the main discovery thread loop so that we
do not release the lnet_peer.lp_lock until after we've determined
whether we need to call lnet_peer_discovery_complete().
This ensures that the lnet_peer is correctly removed from the
discovery work queue and any messages on the peer's
lnet_peer.lp_dc_pendq are sent or finalized.

It is also possible for the lnet_peer.lp_dc_error to be cleared
during the aforementioned window, as well as during the time when
lnet_peer_discovery_complete() is processing the contents of the
lnet_peer.lp_dc_pendq. This could prevent messages on the
lnet_peer.lp_dc_pendq from being correctly finalized. To fix this
issue, the responsibilities of lnet_peer_discovery_error() were
incorporated into lnet_peer_discovery_complete().

HPE-bug-id: LUS-10615
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I3779a342de7108105c2fd2bc41373560e8e5ef14
Reviewed-on: https://review.whamcloud.com/45670
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15279 ptlrpc: use a cached value

Don't calculate a early reply size - use a cached,
as it don't changed after start

Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: I3a6bd5013d0646b6165db52d6a7fb38b263756e6
Reviewed-on: https://review.whamcloud.com/45661
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15175 tests: fix ldev tests

generate_ldev_conf() and tests use this fn do not
work on setup with ost targets located not on 1 oss
and do not work on failover setup where
<facet>_HOST != <facet>failover_HOST.

Fixes: 0f17fc82a89a ("LU-7060 ldev: Added MGS NID substitution to ldev")
Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-2495
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Ief38df0b1a0ffa37a8e7a4545a69a453d6dba7bd
Reviewed-on: https://review.whamcloud.com/45613
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15095 target: lbug_on_grant_miscount module parameter

Some tests have hit "lctl: error invoking upcall" when setting the
lbug_on_grant_miscount tunable parameter. Instead, define a module
parameter lbug_on_grant_miscount flag as ptlrpc module parameter,
similar to how it is done for ldiskfs_track_declares_assert.

Change-Id: I9cd0f9fa75b37539b23443bbcbb3445c87318ab1
Fixes: bb5d81ea95 ("LU-14543 target: prevent overflowing of tgd->tgd_tot_granted")
Test-Parameters: trivial
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45521
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15195 ofd: missing OST object

as the OST-MDT resync may be not finished by the end of the recovery
it may happen new enqueue for a write op may fail due to an absent
object. Return EINPROGRESS so that the enqueue was resent until get
resynced.

to not get stuck forever in case of disappeared MDT or a double
failure, return EINPROGRESS during hard failover timeout only.

also, cleanup replay-ost-single test 12:
- eliminate a need in the hard failover
- no need in a special obd_fail_loc, just use replay_barrier
- createmany is able to create files with unique names,
no need in special steps

HPE-bug-id: LUS-10267
Signed-off-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Change-Id: I5f16b63454c51ad8d112770c15c7e6e7f41f3c40
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-on: https://review.whamcloud.com/45459
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15137 socklnd: expect two control connections maximum

As a result of connecting to ourselves, e.g. pinging own nid,
two control type connections are established vs. just one
in case of connecting externally.
Fix the control connection counter to be able to handle that.

Test-Parameters: trivial testlist=sanity-lnet
Fixes: 71b2476e ("LU-12815 socklnd: add conns_per_peer parameter")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Idce01d81e3924226b5b163d2472cbcd4f6eb5819
Reviewed-on: https://review.whamcloud.com/45461
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>

LU-15109 tests: different quota and usage relations

Add sanity-quota_1i that following cases:
- User is above PQ limit and the quota limit is cleared.
  User should now be able to write.
- User is below PQ limit and the quota limit is lowered
  below current usage. User should not be able to write.
- User is above PQ limit and the quota limit is raised
  above current usage. Should now be able to write.

Change-Id: Iad81c706aaf838cacfdf2971ee100950c47d1585
HPE-bug-id: LUS-9935
Test-Parameters: testlist=sanity-quota env=ONLY=1i
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45257
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Elena Gryaznova <elena.gryaznova@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-9516 tests: fix sanity test_24v

The "lfs getdirstripe -c" command will return stripes=0 for
unstriped directories. Handle this when calculating free_inodes
to avoid creating zero files for this test.

Speed cleanup of test_24v and other users of simple_cleanup_common()
by using unlinkmany to delete files if the file count is provided.

Use stack_trap consistently and don't do both manual and exit cleanup.

Test-Parameters: trivial testlist=sanity env=ONLY=24v
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I25105a5d0ab719d41bf41cff0aaea6d00a9c4059
Reviewed-on: https://review.whamcloud.com/45172
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14965 ldiskfs: rhel7.6 inode mutex for ldiskfs_orphan_add

See following warning:

ldiskfs/namei.c:3331 ldiskfs_orphan_add+0x11e/0x290 [ldiskfs]
Call Trace:
dump_stack+0x19/0x1b
__warn+0xd8/0x100
warn_slowpath_null+0x1d/0x20
ldiskfs_orphan_add+0x11e/0x290 [ldiskfs]
ldiskfs_xattr_inode_orphan_add+0xbb/0x110 [ldiskfs]
ldiskfs_xattr_delete_inode+0x5c/0x350 [ldiskfs]
ldiskfs_evict_inode+0x1a8/0x630 [ldiskfs]
evict+0xb4/0x180
iput+0xfc/0x190
osd_object_delete+0x1f8/0x370 [osd_ldiskfs]
lu_object_free.isra.27+0xb8/0x1c0 [obdclass]
lu_object_put+0xa5/0x460 [obdclass]
mdt_object_put+0x30/0x110 [mdt]
mdt_reint_unlink+0x8e0/0x1890 [mdt]
mdt_reint_rec+0x83/0x210 [mdt]
mdt_reint_internal+0x720/0xaf0 [mdt]
mdt_reint+0x67/0x140 [mdt]
tgt_request_handle+0x7ea/0x1750 [ptlrpc]
ptlrpc_server_handle_request+0x256/0xb10 [ptlrpc]
ptlrpc_main+0xb3c/0x14e0 [ptlrpc]
kthread+0xd1/0xe0
ret_from_fork_nospec_begin+0x21/0x21

Need to hold inode mutex on the external EA for ldiskfs_orphan_add()
to soothe the warning.

This is a port of:

Lustre-commit: 7d3b5d9fdc766411eacaed27fb2fd9250800f096
Lustre-change: https://review.whamcloud.com/44754

Test-Parameters: trivial
Fixes: f64e9f19f68e ("LU-12977 ldiskfs: properly take inode_lock() for truncates")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I47a01862793afaac1d7c311f1b6d65d2cf4bb93f
Reviewed-on: https://review.whamcloud.com/45165
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13717 sec: fix handling of encrypted file with long name

The ciphertext representation of the name of an encrypted file or
directory can be up to 256 bytes of binary data, if the cleartext
name is up to NAME_MAX. But then this ciphertext is encoded via
critical_encode() before being sent to servers. Once encoded, the
length can exceed NAME_MAX because of the escaped critical
characters.
So make sure ll_prep_md_op_data() accepts those too long encoded names
if it is called for lookup or create of an encrypted file or
directory. In the other cases, the 'name' taken as input is the plain
text version, so it must conform to the NAME_MAX limit.

When carrying out operations on an encrypted file with long name, we
manipulate a digested form whose hash needs to be matched against the
content of the LinkEA. The name found in the LinkEA is not NUL
terminated, so this aspect must be taken care of.

Fixes: 4d38566a00 ("LU-13717 sec: filename encryption")
Fixes: ed4a625d88 ("LU-13717 sec: filename encryption - digest support")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I4b0e51eee5e549ab56292fe0fec3c1be1b487fc7
Reviewed-on: https://review.whamcloud.com/45163
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15009 ofd: continue precreate if LAST_ID is less on MDT

It's possible that precreate succeeded on OST, but MDT didn't get the
reply, and assumed failure. In this case, the LAST_ID on MDT is
smaller than that on OST, instead of report error and stop precreate,
it's better to move precreate window forward.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ia6ca418ec0ea6797b7eccc1610879331307fad07
Reviewed-on: https://review.whamcloud.com/44984
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14677 sec: remove MIGRATION_ compatibility defines

Remove the MIGRATION_* compatibility flags and use
LLAPI_MIGRATION_* everywhere.

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iab2a2f6dfc435377e9db0d4963547841b2cbc403
Reviewed-on: https://review.whamcloud.com/44957
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14677 sec: no encryption key migrate/extend/resync/split

Allow some layout operations on encrypted files, even when the
encryption key is not available:
- lfs migrate
- lfs mirror extend
- lfs mirror resync
- lfs mirror verify
- lfs mirror split
We allow these access patterns to applications that know what they are
doing, by using the specific flag O_FILE_ENC and O_DIRECT.

Also add sanity-sec test_59a,b,c to exercise these access patterns.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ieaeee0e5bf7643f18d775fe6daa5e31c2f349f8c
Reviewed-on: https://review.whamcloud.com/44024
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13783 osd-ldiskfs: use alloc_file_pseudo to create fake files

With kallsyms_lookup_name() no longer exported with 5.8+ kernels
this means the work around to setup the security handling broke.
Currently osd-ldiskfs will crash due to security_alloc() never
being called. The solution is to use alloc_file_pseudo() instead
to create our fake file.

Change-Id: Ib417ebdda7d9829a231c568022618154c273f3e6
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43876
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14704 tests: disable opencache for sanity/29

otherwise lock counting is not quite correct

Fixes: 41d99c4902 ("LU-10948 llite: Introduce inode open heat counter")

Test-Parameters: trivial

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ia73e8aa4a16b7ced29490d41c8eac4ee839a3406
Reviewed-on: https://review.whamcloud.com/43777
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>

LU-11388 test: enable replay-single test_131b

Issue is fixed, so this commit verifies fix.

Test-Parameters: trivial env=ONLY=131 testlist=replay-single fstype=zfs
Signed-off-by: Vikentsi Lapa <vlapa@whamcloud.com>
Change-Id: I609146172c1fee2a955d5c41f623c8b8c2ffaeaa
Reviewed-on: https://review.whamcloud.com/40421
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>

LU-15357 mdd: fix changelog context leak

The mdd_changelog_clear() shouldn't skip llog_ctxt_put()
in case of error.

Fixes: 6b183927e1 (LU-14553 changelog: eliminate mdd_changelog_clear warning)
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I9c9aa3ce0d11e8f67470b450d007f2a1081644c6
Reviewed-on: https://review.whamcloud.com/45831
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15252 mdt: reduce contention at mdt_lsom_update

mot_som_mutex serialize all close requests with lsom updates for
a same mdt_object. For a massive open/read/close single shared
file load, it leads to high load avarage cause many threads sleep
on mutex.
This patch introduces a cached lsom size, and uses a mutex at update
part only. Close requests with lsom size less or equal to cached size
would not take a mutex at all.

Test results MPI open/flock/funlock/close SSF
10 iterations 10 node 100 thread each, 1000 file ops per thread
close time secs master patch MDT load avarage master patch
avg             0.142  0.086                  47.05  38.89
max             0.164  0.129                  49.39  44.77
min             0.097  0.041                  44.44  34.7

Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I807b468b128295df9391b0467e74d4f10240662e
Reviewed-on: https://review.whamcloud.com/45709
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-7372 tests: re-enable replay-dual test_26

Re-enable test_26 since it was just the unfortunate victim of
either test_24 or test_25 causing MDS unmount to hang.

Test-Parameters: trivial testgroup=review-dne-part-2
Test-Parameters: testgroup=review-dne-part-2
Test-Parameters: testgroup=review-dne-part-2
Test-Parameters: testgroup=review-dne-part-2
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib944028e798488c425501f0c48bf812fc13ebbe5
Reviewed-on: https://review.whamcloud.com/43982
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Elena Gryaznova <elena.gryaznova@hpe.com>
Reviewed-by: Vikentsi Lapa <vlapa@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15262 osd: bio_integrity_prep_fn return value processing

There is osd_bio_integrity_handle() fn in lustre/osd-ldiskfs/osd_io.c
It checks the returned code of bio_integrity_prep_fn() but between
mainstream Linux 4.12 and 4.13 kernel integrity API has changed and
in 4.13+ (as well as for any RHEL8 including first beta)

bio_integrity_prep() returns boolean true on success.

HPe-bug-id: LUS-10443
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: I973aa8ccae024157ad863d26afc7b1264a5c7149
Reviewed-on: https://review.whamcloud.com/45646
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

New tag 2.14.56

Change-Id: I2491f69b4d4e4a7ae8ed39bef8c9806127c93d79
Signed-off-by: Oleg Drokin <green@whamcloud.com>

LU-15292 kernel: kernel update RHEL7.9 [3.10.0-1160.49.1.el7]

Update RHEL7.9 kernel to 3.10.0-1160.49.1.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: I356b8a8345a4a91d6d1c1a4a9b4eab4bb5afe75b
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45687
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15260 tests: numfailovers() fix

Patch fixes numfailovers() to use comma
separated MDTS list correctly. Without this fix
in newer bash version we see the following error:
line 69: mds1,mds2,mds3,mds4_nums: bad substitution

Fixes: a7a2133bfa ("b=18696 new RECOVERY_RANDOM_SCALE test")
Fixes: b594948509 ("TT-59 remove . and - from the node name")
Test-Parameters: trivial testlist=recovery-random-scale
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-10619
Change-Id: I4c28e3c62cada60dc1241948dc4e969e0e10ce9a
Reviewed-on: https://review.whamcloud.com/45633
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15263 quota: fix bug in qmt_pool_recalc

env should be freed at the end of qmt_pool_recalc,
as it is needed in qpi_putref. It causes a panic,
if pool has the last reference:
BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0
IP: [<ffffffffc08de2d7>] lu_context_key_get+0x17/0x30 [obdclass]
...
Call Trace:
[<ffffffffc08de358>] lu_object_free.isra.30+0x68/0x170 [obdclass]
[<ffffffffc08e1a35>] lu_object_put+0xc5/0x3e0 [obdclass]
[<ffffffffc100e56c>] qmt_pool_free+0x30c/0x590 [lquota]
[<ffffffffc10100b5>] qmt_pool_recalc+0x365/0x1260 [lquota]
[<ffffffff8bac1c31>] kthread+0xd1/0xe0
[<ffffffff8c176c37>] ret_from_fork_nospec_begin+0x21/0x21

HPE-bug-id: LUS-10426
Change-Id: Ic23dcb858ff811757f38948aa572c936c076e21e
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/45632
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15208 ldiskfs: add support for Ubuntu20 kernel 5.4.0.90

Also fix the lustre-build-ldiskfs.m4 to select correct series file.
We use -ge to check the kernel release version, so greater version
should come on top.

Change-Id: Id6b599ef5b2ea823e203aaa6a40917e49f98f4d9
Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/45547
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-930 doc: update lustre.7 man page

Update the lustre.7 man page to better describe current functionality.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I979841e597fcfa8448c708dd66d4d89d3018b1cc
Reviewed-on: https://review.whamcloud.com/45493
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Rick Mohr <mohrrf@ornl.gov>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15196 kernel: kernel update RHEL8.4 [4.18.0-305.25.1.el8_4]

Update RHEL8.4 kernel to 4.18.0-305.25.1.el8_4.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.4 serverdistro=el8.4 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.4 serverdistro=el8.4 testlist=sanity

Change-Id: Ic70f7330f90a36646bb36e0c6015ea22882b20b9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45460
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15190 ptlrpc: fix duplication check

ptlrpc_server_check_for_resend() skips duplication check if
current exp_rpc_count == 0 which is wrong as exp_rpc_count
is incremented for RPCs in progress.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I4ba1600341d916871f66aceb4d6a1043dd015e55
Reviewed-on: https://review.whamcloud.com/45445
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12784 tests: fix large_xattr_enabled() for ZFS

Fix large_xattr_enabled() check for ZFS filesystems, since bash
functions return "0" for true. Otherwise, all ZFS tests that
check large_xattr_enabled() will be skipped.

Fixes: 84097792f56c ("LU-12784 llite: limit max xattr size by kernel value")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie566244c6b1f46b947a96331e7623b9b863ebbe5
Reviewed-on: https://review.whamcloud.com/45264
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Elena Gryaznova <elena.gryaznova@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15057 utils: pool quota man

Adding pool quota man for setquota and
quota commands.
Remove [-o <obd_uuid>|-i <mdt_idx>|-I <ost_idx>]
from the case "lfs quota -t". Grace period
is stored only at quota master. Furthermore,
command lfs quota -t -I 0 /mnt/testfs fails
with EOPNOTSUPP.

Test-Parameters: trivial
HPE-bug-id: LUS-9869
Change-Id: I368e22b782bd3626f64907059ea329e94986535b
Reviewed-on: https://es-gerrit.dev.cray.com/158556
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45121
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13756 quota: up_read leak in qmt_pool_lookup

qmt_pool_lock is not released if qti_pools_add fails in
qmt_pool_lookup.

Change-Id: Ic2adb44468d51af7aefcbb91279260ae6f85d67a
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45106
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14975 dne: dir migration in non-recursive mode

Add an option "-d|--directory" option for "lfs migrate -m" to
migrate specified directory only, which is similar to "ls -d".

Add sanity 230w.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ib97949e3840a3b49f7074b16e259582a9bf16e3b
Reviewed-on: https://review.whamcloud.com/44802
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14956 fld: repeat failed FLDB lookup

it's possible that LWP reconnection is in progress after remote
MDS restart. if FLDB misses an entry, then FLDB lookup can fail
with EAGAIN and whole RPC processing (like MDS_REINT) can fail
as well. try to lookup few times in cases of EAGAIN.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ib6aeaf7706a6465b0c8bee696d985bb440ed192e
Reviewed-on: https://review.whamcloud.com/44723
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-9859 mdd: unwind md_capable()

The inline function md_capable() is just a wrapper
around cap_raised() which adds little benefit. Lets
just remove the use of this wrapper.

Change-Id: I1a5f4b2e34b4cf358b52b3fc4bdeff17fdab50c9
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44580
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14754 tests: add Overstripe support to racer

The files are created with a overstripe layout if
RACER_ENABLE_OVERSTRIPE=true is set.

We would like to have the ability to use the "real"
layouts, i.e. to limit the number of stripes per OST
instead of allowing racer to achieve the max LOV_MAX_STRIPE_COUNT
value. Patch adds RACER_LOV_MAX_STRIPECOUNT equal to
LOV_MAX_STRIPE_COUNT by default.

Test-Parameters: trivial testlist=racer
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-9466, LUS-9608
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@hpe.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Change-Id: I550922938438afa121af275fd1d6f60082db9b54
Reviewed-on: https://review.whamcloud.com/43964
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14392 gnilnd: re-enable large I/o buffers

DVS on gni breaks the LNet 1M handshake of LNET_MAX_IOV.

Introduce GNILND_MAX_IOV with a 4M i/o maximum and a hint
LNET_MD_GNILND so LNet can accept the large buffer w/o complaint.

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I4e78c0022fdece0d6945bbcc47e2e64d4d181dca
Reviewed-on: https://review.whamcloud.com/41373
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-11915 tests: fix conf-sanity 115 test

Not enough xattrs added to move outside inode.
Add one additional xattr. The test works only with FLAKEY=false.

Signed-off-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Test-Parameters: testlist=conf-sanity env="ONLY=115"
HPE-bug-id: LUS-6966
Change-Id: Iab13ed3434effb03e1209755ac51eba2debe7387
Reviewed-on: https://review.whamcloud.com/38849
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13542 osd: brw stats are initialized too late

Lustre crashes with the following stack trace:

[<ffffffffc113cbac>] lprocfs_oh_tally+0x2c/0x40 [obdclass]
[<ffffffffc169719b>] record_start_io.part.14+0x2b/0x40 [osd_zfs]
[<ffffffffc1698322>] osd_read+0xa2/0x180 [osd_zfs]
[<ffffffffc1167dee>] dt_record_read+0x1e/0x70 [obdclass]
[<ffffffffc1190997>] lustre_index_restore+0x527/0x1720 [obdclass]
[<ffffffffc16b2564>] osd_initial_OI_scrub+0xa34/0xd50 [osd_zfs]
[<ffffffffc16b34fd>] osd_scrub_setup+0x9ed/0xb90 [osd_zfs]
[<ffffffffc168a97b>] osd_mount+0xf4b/0x1380 [osd_zfs]

osd_procfs_init()/osd_stats_init() are called *after*
osd_initial_OI_scrub(), so osd stats are not yet initialized
when osd_read() first tries to update them.

This patch separates osd stats initialization from procfs
initialization so that osd stats should become initialized
by the time scrub starts its own initialization.

Change-Id: I15ab03e77eaab76e3dea8067b849c891e89aa9a8
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
HPE-bug-id: LUS-8173
Reviewed-on: https://review.whamcloud.com/38554
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-10640 tests: ha.sh script improvements

In each load iteration check for all created directories
that ls' long format output does not contain question
marks ('?').
'?'s may be reported if
stat(2)->getattr()->ll_glimpse_size()
fails, which is expected in case of failover. Then the test
is to wait until recovery is completed, repeat the check
and exit with error if '?' still exists after second check.

Test-Parameters: trivial
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-4894
Reviewed-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Change-Id: I88495511797aaad53c923c90f88f92f1412380ce
Reviewed-on: https://review.whamcloud.com/31229
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15252 mdc: add client tunable to disable LSOM update

It seems that mdt_lsom_update() has a serious issue with a single
shared file because of its mdt-level mutex for every close request.
The patch adds mdc_lsom parameter to mdc, base on it state client
sends or not LSOM updates to MDT. By default LSOM is on.

lctl set_param mdc.*.mdc_lsom=[on|off]

For a configuration when LSOM is not used the patch helps
MDT with load avarage with a specific load when many threads
open/read/close for a single file.

HPE-bug-id: LUS-10604
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Iba0e745a94825641da6b0a1c09488b1e2f54658b
Reviewed-on: https://review.whamcloud.com/45619
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15167 quota: fallocate send UID/GID for quota

Calling fallocate() on a newly created file did not account quota
usage properly because the OST object did not have a UID/GID
assigned yet. Update the fallocate code in the OSC to always send
the file UID/GID/PROJID to the OST so that the object ownership
can be updated before space is allocated.

Test-case: sanity-quota/78 added

Fixes: 48457868a02a ("LU-3606 fallocate: Implement fallocate preallocate operation")
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I86d80a7f415a80100f7d2fb5f417cf47bf5b2900
Reviewed-on: https://review.whamcloud.com/45475
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14514 flr: mirror split should not make stale file

Mirror split could leave an all stale mirrors file, this patch
prevent removing the last non-stale mirror from the file.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I63007784929a2cd18d2823e2250f7307ca7d8d45
Reviewed-on: https://review.whamcloud.com/42024
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-11952 mdt: fix reconstruct open

We shouldn't start a new transaction on resend.

Store fid of an opened object and use it during
reconstruction of the resend.

Change-Id: I8c21e9661903d3d4090ad29e43480e2ba7e35c39
Cray-bug-id: LUS-6957, LUS-7286
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-on: https://review.whamcloud.com/35112
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15169 Revert "LU-14668 lnet: Lock primary NID logic"

This patch breaks client mounts under certain LNet configurations.

This reverts commit 024f9303bc6f32a3113357c864765c4f9c93ed03.

Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ic1f1d07694fe49df14c803a9434d673e61c7dd67
Reviewed-on: https://review.whamcloud.com/45386
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-13601 llite: avoid needless large stats alloc

Allocate the ll_rw_extents_info (5896 bytes), ll_rw_offset_info
(6400 bytes), and ll_rw_process_info (640 bytes) structs only
when these stats are enabled, which is very rarely.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I59bbfce8d7f2422d810617d5fa712a67333ebbe5
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-on: https://review.whamcloud.com/40901
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15181 osd-ldiskfs: a typo in osd_declare_write_commit()

A typo in osd_declare_write_commit() makes ASAN emit warnings like:
UBSAN: Undefined behaviour in /lustre/osd-ldiskfs/osd_io.c:1404:2
shift exponent 4096 is too large for 32-bit type 'int'

Change-Id: Ie612d9c6655211445c00bd17fa1bf7a836af3542
Fixes: f0f92773e ("LU-14187 osd-ldiskfs: fix locking in write commit")
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-on: https://review.whamcloud.com/45423
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>

LU-15182 git: Add .gitreview file

Add .gitreview file, so that we can use "git review -s" to setup
remote push url and use "git review" to send patch for review.

Git review cmd is a very convenient tool to push, review for
Gerrit review system. See more details here:
https://docs.opendev.org/opendev/git-review/latest/usage.html

Test-Parameters: trivial
Change-Id: Ic8223bdfcb7a696328f921159a63d625359e45a6
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Reviewed-on: https://review.whamcloud.com/44907
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-9859 gss: replace cfs_size_roundXX macros.

Many of the cfs_size_roundX() macros are not even used so delete
them. Replace cfs_size_round4() uses in the GSS layer with
round_up(var, 4);

Change-Id: Id35f0f7b60f8d00f425d9b15d2a76aa4d0fa5f2f
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/45584
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>

LU-15217 pcc: disable PCC for encrypted files

When files are encrypted in Lustre using fscrypt, they should
normally not be accessible to users without the proper encyrption
key. However, if a user has then encryption key loaded when they
read a file, it may be decrypted in memory and saved to the PCC
backend in unencrypted form.

Due to the above reason, we just disable PCC caching for encrypted
files.

DDN-bug-id: EX-3571
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I6c363dcad7a6bc8520350c0295f6e221bec3abb0
Reviewed-on: https://review.whamcloud.com/45545
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14325 tests: skip replay-single 134 for older servers

The fix for a PFL file lost during recovery was landed to
Lustre 2.13.53. Servers prior to 2.13.53 will fail the
replay-single test, test_134, added to the original patch
to check that PFL files are not lost. Thus, we need to
skip this test for Lustre servers less than 2.13.53.

Fixes: 72d45e1d344c ("LU-13809 mdc: fix lovea for replay")
Test-Parameters: trivial env=ONLY=134 testlist=replay-single
Test-Parameters: serverdistro=el7.9 serverversion=2.12.7 env=ONLY=134 testlist=replay-single
Test-Parameters: serverdistro=el7.7 serverversion=2.13.0 env=ONLY=134 testlist=replay-single
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Id70f9e06f6221f88a54d696afce9de70cbcf1efa
Reviewed-on: https://review.whamcloud.com/45450
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alena Nikitenko <anikitenko@ddn.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>

LU-15186 o2iblnd: Default map_on_demand to 1

On kernels that provide global MR we default to using that exclusively
even if FMR/FastReg is available. This causes an interop issue if the
active side of a connection request has a higher fragment count than
the passive side because FMR/FastReg may be needed to map the higher
fragment count. We should change the default map_on_demand to 1 so
that FMR/FastReg is used by default. map_on)demand can still be set
to 0 if needed.

Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I76010a905f151efbb0b109ae6f5fba6fb7ce1956
Reviewed-on: https://review.whamcloud.com/45431
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15166 tests: restore osp-syn threads after test_818

test_818() is supposed to leave osp-syn threads up after the test end,
otherwise, following tests get "logging isn't available, run LFSCK".

Use fail $SINGLEMDS for that.

Test-Parameters: trivial testlist=sanity
HPE-bug-id: LUS-10495
Change-Id: Ib4876f4c4d39fc87f86788d8611838b8078e4aac
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45375
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12857 tests: allow clients to be IDLE after recovery

If clients are not connected to an OST when it fails (connection
is IDLE), they do not need to be involved in recovery, so this
should not be considered an error when checking the client state.

Test-Parameters: trivial testlist=recovery-mds-scale env=SLOW=no
Test-Parameters: testlist=conf-sanity
Test-Parameters: testlist=replay-dual,replay-single
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6cfeb718acd233378ed1608f22061bc15c3ebbe5
Reviewed-on: https://review.whamcloud.com/45318
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15126 kernel: RHEL 8.5 server support

This patch makes changes to support RHEL 8.5 release
with kernel 4.18.0-348.2.1.el8_5 for Lustre server.

Test-Parameters: trivial fstype=ldiskfs \
env=SANITY_EXCEPT="101j" \
clientdistro=el8.5 serverdistro=el8.5 testlist=sanity

Test-Parameters: trivial fstype=zfs \
env=SANITY_EXCEPT="101j" \
clientdistro=el8.5 serverdistro=el8.5 testlist=sanity

Change-Id: Ie976d8fd3e6fcf8a564eff8a41ad0fd51b2c858c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45306
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15222 build: Update ZFS version to 2.0.6

Update ZFS version to 2.0.6. The changes are listed in:
https://github.com/openzfs/zfs/releases/tag/zfs-2.0.6

Change-Id: I2a7df45b79f402c3d3bce8b137edd11b5224b576
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45567
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15126 kernel: new kernel [RHEL 8.5 4.18.0-348.2.1.el8_5]

This patch makes changes to support new RHEL 8.5 release
for Lustre client.

Test-Parameters: trivial env=SANITY_EXCEPT="101j" \
clientdistro=el8.5

Change-Id: I068f091817126fffc14402254f45dcd75ba7f3fc
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45285
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>

LU-15184 llite: properly detect SELinux disabled case

Usually, security_dentry_init_security() returns -EOPNOTSUPP when
SELinux is disabled. But on some kernels (e.g. rhel 8.5) it returns
0 when SELinux is disabled, and in this case the security context is
empty.
So in both cases make sure the security context name is not set, which
means "SELinux is disabled" for the rest of the code.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I3b9608f9768288de89570c158e8429560fa0213f
Reviewed-on: https://review.whamcloud.com/45501
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15119 tgt: BUG at tgt_brw_read+0x16bf/0x1d80

struct tgt_thread_big_cache {
  local = {{
      lnb_file_offset = 0,
      lnb_page_offset = 0,
      lnb_len = 0,
      lnb_rc = 0,
      lnb_page = 0xffffddee74fae100,
so npages_read becomes 0

Change-Id: Ie2201c9fc6f0350b1c6dcb480cff52f44d5413db
HPE-bug-id: LUS-10510
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-on: https://review.whamcloud.com/45273
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15110 quota: cosmetic changes in PQ

cosmetic changes in PQ:
- make tgt_pool_free and qmt_sarr_pool_free void
- remove outdated comment from qmt_pool_lqes_lookup
- replace tabs with spaces

HPE-bug-id: LUS-9547
Change-Id: If4918b647eed1d971d00c521d010d0c72d349207
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45258
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15079 quota: include qsd_thread_info into mgs thread context

mgs service thread envs do not get supplied with qsd_thread_info, which
may lead to the failure shown below:
(lu_object.h:1274:lu_env_info()) ASSERTION( info ) failed:
(lu_object.h:1274:lu_env_info()) LBUG
Pid: 146951, comm: ll_mgs_0003 3.10.0-957.1.3957.1.3.x4.3.25.x86_64 #1 SMP
Call Trace:
libcfs_call_trace+0x8e/0xf0 [libcfs]
lbug_with_loc+0x4c/0xa0 [libcfs]
qsd_refresh_usage+0x25e/0x2f0 [lquota]
qsd_op_adjust+0x2f1/0x730 [lquota]
osd_object_delete+0x2b2/0x360 [osd_ldiskfs]
lu_object_free.isra.32+0x68/0x170 [obdclass]
lu_site_purge_objects+0x2fe/0x530 [obdclass]
lu_object_find_at+0x371/0xa60 [obdclass]
dt_locate_at+0x1d/0xb0 [obdclass]
llog_osd_open+0x50e/0xf30 [obdclass]
llog_open+0x15a/0x3e0 [obdclass]
llog_origin_handle_open+0x334/0x720 [ptlrpc]
tgt_llog_open+0x33/0xe0 [ptlrpc]
mgs_llog_open+0x46/0x460 [mgs]
tgt_request_handle+0x96a/0x1680 [ptlrpc]

Supply msg service context with qsd_thread_info.

Change-Id: If8664b81e1f64df015dad46ba26c9c1d1e3f54bf
HPE-bug-id: LUS-10334
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45181
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15059 nrs: do not overwrite "cmd" in nrs_tbf_rule

"cmd" pointer inside ptlrpc_lprocfs_nrs_tbf_rule_seq_write() and
nrs_tbf_parse_cmd are static. This could cause a double kfree call
because "cmd" could be overwriten by another "nrs_tbf_rule" write
instance.

Let's try to remove the "static" definition.

Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I8cd7d9dd0483778c82bbf8711c07e49255983f4b
Reviewed-on: https://review.whamcloud.com/45142
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>

LU-14699 mdd: proactive changelog garbage collection

Currently changelog starts garbage collection when user
exceeds maximum idle timeout, there is also limit by amount
of idle records but it is used only for old changelog users
which have no cur_time field, therefore it is not used at
all nowadays. Another problem is that garbage collection is
started only when changelog is almost full. That causes
often situations when changelog might have very old users
staying much longer than idle timeout and having idle
records above maximum limit consuming space for nothing.

Patch reworks changelog GC in the following way:
- GC starts when changelog is almost full (old way) or
  either idle time or idle records limits are exceeded or
  when (idle_time * idle_records) exceeds its limit as well.
  The latest limit is calculated as:
  (idle_time * idle_records) / 84600 > (1 << 32) which is a
  reasonable heuristic for deciding if a user is "too idle"
  in both cases when lots records being created quickly vs
  user is idle a very long time.
- to avoid the processing of changelog users each time GC is
  checking all conditions both least user record and time
  are tracked when changelog users are initialized or
  purged/canceled. Both values are stored as mdd_changelog
  fields mc_minrec and mc_mintime
- test 160g is changed to test the new approach when idle
  indexes are checked always along with idle time checks
- test 160s is added in sanity.sh to check heuristic approach
  with (idle_time * idle_records) value checking

Fixes: 3442db6faf68 ("LU-7340 mdd: changelogs garbage collection")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I6028f3164212a2377a4fc45b60a826c64f859099
Reviewed-on: https://review.whamcloud.com/45068
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14957 mdd: prepare xattrs before migration

In directory migration, the xattrs should be prepared before starting
transaction, otherwise if remote MDT is down, which will cause local
MDT stuck as well.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I79279e7b0c051a7542a71066fffd4ad70f559368
Reviewed-on: https://review.whamcloud.com/44741
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14941 lnet: Fix source specified to routed destination

If a source NI is specified for a send then we should not modify the
destination NID that was passed to lnet_send().

HPE-bug-id: LUS-10301
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ie47558d5bce97a0dca30ff7d072dcd39eb903324
Reviewed-on: https://review.whamcloud.com/44730
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14940 lnet: Fix source specified send to different net

The destination NI is fixed for all source-specified sends. Thus, in
order for a source-specified send to be considered "local", i.e. a
send that does not require a route, the destination NID must be on
the same net as the specified source.

HPE-bug-id: LUS-10303
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I4847db1d393bbc36def65123f260b928ebbf944e
Reviewed-on: https://review.whamcloud.com/44728
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14939 lnet: Allow specifying a source NID for lnetctl ping

Add a new --source option for lnetctl ping command. This allows the
user to specify a local NI from which to send the ping. This also
ensures that the specified destination NID is also used. Otherwise,
pings to multi-rail peers may end up going to a different peer NI
based on the multi-rail selection algorithm. The ability to specify
a source NI, and thus fix the destination NI, is a great help in
troubleshooting communication issues between multi-rail peers.

Add test to exercise lnetctl ping --source option.

HPE-bug-id: LUS-10296
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I454217b30a92414de537880f076a11a693b1f0b3
Reviewed-on: https://review.whamcloud.com/44727
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14867 doc: a few words about an asterisk in lfs quota

Clarify the difference between an asterisk printed per OST
in verbouse lfs quota output and an asterisk near the whole
filesystem usage.

Change-Id: I778fe1f7b1f6f8d55c311d81bd2b311d82463390
Test-Parameters: trivial
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/44357
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14437 gnilnd: use ktime_get_seconds() to get time

Use ktime_get_seconds() to directly get the time inatead of
getting a timespec and converting it.

Fixes: 4b0e495e3c ("LU-14080 gnilnd: updates for SUSE 15 SP2")
Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I256855ceb9e038a9960fa76fe6e3bfe63fb16580
Reviewed-on: https://review.whamcloud.com/41679
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14188 osd: remove not used iam_container lock

There is an rw_semaphore
struct iam_container {
    ...
        /*
         * read-write lock protecting index consistency.
         */
        struct rw_semaphore     ic_sem;   <<<<<<
        struct dynlock       ic_tree_lock;
        /* Protect ic_idle_bh */
        struct mutex         ic_idle_mutex;
     ...
};

There is initialization
2    234  lustre/osd-ldiskfs/osd_iam.c <<iam_container_init>>
             init_rwsem(&c->ic_sem);

There are wrappers
   3    622  lustre/osd-ldiskfs/osd_iam.c <<iam_container_write_lock>>
             down_write(&ic->ic_sem);
   4    627  lustre/osd-ldiskfs/osd_iam.c <<iam_container_write_unlock>>
             up_write(&ic->ic_sem);
   5    632  lustre/osd-ldiskfs/osd_iam.c <<iam_container_read_lock>>
             down_read(&ic->ic_sem);
   6    637  lustre/osd-ldiskfs/osd_iam.c <<iam_container_read_unlock>>
             up_read(&ic->ic_sem);

But this wrappers are not used. And, based on the git history, never been used.
Let's delete this useless code.

Change-Id: Ied1122f034e53fee08888e1091f700bda4507f00
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
HPE-bug-id: LUS-9545
Reviewed-on: https://review.whamcloud.com/40890
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14073 ldiskfs: update ldiskfs patches for Linux 5.9

Minor conflict changes only.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: If0b58fc278f6721e44e26c6052509c31068f8e78
Reviewed-on: https://review.whamcloud.com/40397
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-9325 obdclass: make niduuid for lustre_stop_mgc() static

The process to create a proper string for niduuid can be made
simpler and avoid a memory allocation.

Change-Id: I52cb01117e41cbcf2756477e91934a42d31fd157
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/33617
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12511 ldlm: free resource when ldlm_lock_create() fails.

ldlm_lock_create() gets a resource, but don't put it on
all failure paths. It should.

Change-Id: Ib49bcafdeac834c412adad9db135034d1ea06a04
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/45585
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12511 llite: fix misuse of current->parent.

current->parent is used by ptrace to redirect some signal delivery
to the ptracer. It should only be used by 'ptrace' or 'signal' code.
All other users should use current->real_parent, which is the real
parent.

Change-Id: I3212fd9f3db9935b8144dba8af82eb2f387a5c45
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/45580
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15083 ldiskfs: disable xattr credits check

Add ext4-xattr-disable-credits-check.patch to the
ldiskfs series from kernel 4.15+ except the
rhel8 ones. They already have this fix.

Change-Id: I1f9818c394377cdba6b11d2b24c26d2354f41ac0
Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/45546
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Shuichi Ihara <sihara@ddn.com>
Tested-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-9162 lod: option to set max stripe count per filesystem

Add an option to set max default stripe count when the stripe count
is set to -1.

Change-Id: I02634a02a6f6579750fe964662b7e644af1689d6
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/45532
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14781 osp: osp_object_free access NULL pointer

If a osp_object is created by multiple threads at the same time,
lu_object_find_at() could allocate an osp_object without object
initialization, before hash inserting of the object, it find another
object has been created and inserted by another thread, it will free
the unintialized osp_object, and osp_object_free() will access
an uninitialized list_head (opo_xattr_list).

This patch initialize osp_object::opo_xattr_list in its allocation
function.

Call trace:
            lu_object_free.isra.30+0xf2/0x170 [obdclass]
            lu_object_find_at+0x496/0x930 [obdclass]
            lod_initialize_objects+0x3e4/0xba0 [lod]
            lod_parse_striping+0x693/0xc20 [lod]
            lod_striping_load+0x2b2/0x660 [lod]
            lod_declare_destroy+0x12b/0x600 [lod]
            mdd_declare_finish_unlink+0x91/0x210 [mdd]
            mdd_unlink+0x48f/0xab0 [mdd]
            mdt_reint_unlink+0xc32/0x1550 [mdt]
            mdt_reint_rec+0x83/0x210 [mdt]
            mdt_reint_internal+0x6e1/0xb00 [mdt]
            mdt_reint+0x67/0x140 [mdt]
            tgt_request_handle+0xaee/0x15f0 [ptlrpc]
            ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
            ptlrpc_main+0xb34/0x1470 [ptlrpc]
            kthread+0xd1/0xe0

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ib86aca5b41e94a1758f177655ea3a0f680335e0f
Reviewed-on: https://review.whamcloud.com/45442
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>

LU-15114 osp: changes queuing throttle

Prevent queue of sync changes from growing too much
by adding resends when queue size reaches
some (tunable) limit.

HPE-bug-id: LUS-10345
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I5efabb91d3700c58d9451f81c5fed9a22ae404fb
Reviewed-on: https://review.whamcloud.com/45265
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14491 ldiskfs: do not corrupt journal with bh change rh7.6

Currently, ldiskfs_xattr_delete_inode() zeroes xattr inode
references in cached buffers that haven't been prepared by
get_write_access().

When using journal checksums, it is possible that these buffers
are modified after the checksum is calculated but before the
buffer has been written to journal. Journal replay will fail
with a journal checksum error message if this transaction needs
to be replayed.

This is a port of:

Lustre-commit: d563f5140ffa183d0854cf7cd493ad884e314e3d
Lustre-change: https://review.whamcloud.com/41896

Test-Parameters: trivial
HPE-bug-id: LUS-9483
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I5ce1016737a16ddf74811c43ae74296d4f3e03b0
Reviewed-on: https://review.whamcloud.com/45164
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15068 ptlrpc: Do not unlink difficult reply until sent

If a difficult reply is queued in LNet, or the PUT for it is
otherwise delayed, then it is possible for the commit callback
to unlink the reply MD which will abort the send. This results in
client hitting "slow reply" timeout for the associated RPC and
an unnecessary reconnect (and possibly resend).

This patch replaces the rs_on_net flag with rs_sent and rs_unlinked.
These flags indicate whether the send event for the reply MD has
been generated, and whether the MD has been unlinked, respectively.

If rs_sent is set, but rs_unlinked has not been set, then ptlrpc_hr
is free to unlink the reply MD as a result of the commit callback.
The reply-ack will simply be dropped by the server.

If ptlrpc_hr is processing the reply because of commit callback, and
rs_sent has not been set, then ptlrpc_hr will not unlink the reply
MD. This means that the reply_out_callback must also be modified to
check for this case when the send event occurs. Otherwise, if the ACK
never arrives from the client, then the MD would never be unlinked.
Thus when the send event occurs, and rs_handled is set, the
reply_out_callback will schedule the reply for handling by ptlrpc_hr.

HPE-bug-id: LUS-10505
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ib8f4853c7ab35d72624fce7ee3fba9e59a746e1f
Reviewed-on: https://review.whamcloud.com/45138
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15065 quota: fix BIO write performance drop

Before the patch qti_lqes_qunit_min used int to store qunit
value, while lqe_qunit type is _u64. lqe_qunit > 2G caused
an overflow in a local integer argument. For example, when
block hard limit was set to 500TB(i.e. lqe_qunit was about
64TB in a system with 2 OSTs), qti_lqes_qunit_min returned
0 instead of 64TB in a qmt_lvbo_fill. Thus new qunit was not
set on OSTs(qsd_set_qunit wasn't called). Without the qunit,
OST began to send release request after each acquire. For
example, to write 10MB at the OST were sent 2 acquire and
2 release reuests(as qunit was not set on OST). With the
fix, i.e. in a normal case, OST needs just one acquire
request. The issue caused performance drop in a bufferred
write up to 15%-20% if compare with a baseline without PQ
patches.

Note, the issue exists only when a hard limit is set to some
high value(>100GB). The exact hard limit value depends on OSTs
number in a system and on amount of used space, but let's think
that issue doesn't exist on a clean system with 2 OSTs and hard
block limit 100G(this case was checked).

Remove qmt_pool_hash - it is not used anywhere since
"LU-11023 quota: remove quota pool ID".

HPE-bug-id: LUS-10250
Change-Id: I2c4ce38f5b9395ed1f4868d4c8efc00751116b15
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45133
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15179 tests: add trap cleanup_quota_test

Add stack_trap cleanup_quota_test to the tests that
use setup_quota_test. If a test fails without calling
cleanup_quota_test, it may cause later tests to fail
due to used space > 0.

Remove ${tdir}_dom, if exists, in cleanup_quota_test.
sanity-quota_75 doesn't remove test_dom directory.

Test-Parameters: trivial testlist=sanity-quota
Fixes: a4fbe734("LU-14739 quota: nodemap squashed root cannot bypass quota")
Change-Id: Ife4fd499b427bee79f74a5e172d233fe6a83e240
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45418
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15025 quota: stale edquot after clearing limits

When hard and soft limit set to 0, lqe enforced flag is also
set to false. As qmt_adjust_qunit does not handle not enforced
lqes, edquot set to the pool continues to be true and a user
gets -EDQUOT even if all pool limits are cleared. This was ok
for global pool lqe as since it turned off, zero limits are
sent to OSTs causing OSTs to release all granted space and
avoid EDQUOT. Fix this for PQ - set edquot and qunit to zero,
since appropriate lqe becomes "not enforced".

HPE-bug-id: LUS-10146
Change-Id: I1e08929bae7e1b37b1e8cbbc44859a786b5fb090
Reviewed-on: https://es-gerrit.dev.cray.com/158915/
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Tested-by: Jenkins Build User <nssreleng@cray.com>
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45000
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15191 quota: set correct revoke_time

When we do qmt_adjust_qunit, there are several lqes
and lqe_revoke_time is set for some of them, it means
appropriate OSTs have been already notified with the
least qunit and there is no chance to free more space.
If a qunit of the current lqe becomes equal to the least
qunit, find an lqe with the minimum(earliest) revoke_time
and set this revoke_time to the current one.

This patch fixes the following case. For example, we have
8 OSTs and 4 MDTs(i.e. 12 slaves) and a pool with just one
OST. Global hard block limit for the user is 50M, and 10M
for this user in a pool. User's usage is 0. As global pool
has 12 slaves it's initial qunit value is 1M, i.e. equal to
the least qunit. At the same time initial qunit value for the
pool with one OST is 4M. When user begins to write, pool's
qunit is decreased to 1M, but lqe_revoke is not set - it
should be set only after sending new qunit to OSTs in
qmt_lvbo_update. However, it won't be send because appropriate
lge_qunit in lqe global array already has the same value.
This problem caused sanity-quota_72 to hang instead of fail
with EDQUOT in test_1_check_write.

HPE-bug-id: LUS-10516
Change-Id: I5878c1e719ae83a69ad5dbc3653717bb1b4de632
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45447
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-10391 uapi: move out kernel only code.

Userland doesn't use the new IPv6 function in the UAPI nidstr.h.
So move this to the kernel header lib-types.h. Normally this
wouldn't matter but the python wrappers break with the LUTF
project. With the move to Netlink in the future for the user land
API we shouldn't need these functions anyways.

Test-parameters: trivial
Change-Id: I2ac6d102d4575f639573253c21723d62ca08abc2
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44844
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-12347 llite: do not take mod rpc slot for getxattr

The following scenario may lead to client eviction:
clientA                clientB                  MDS
threadA1: write to file F1, get
and hold DoM MDC LDLM lock L1:
   ->cl_io_loop()
    ->cl_io_lock()
     :
     ->mdc_lock_granted()
      ->lock->l_writers++
      [hold ref until write done]

threadA2-A8: create files F2-F8:
   ->ll_file_open()
    ->mdc_enqueue_base()
     ->ldlm_cli_enqueue()
      ->ptlrpc_get_mod_rpc_slot()
      ->ptlrpc_queue_wait()
      [hold RPC slot until create done]

                                              OST(s) in recovery.
                                              MDS waiting on OST(s) to
                                              precreate new objects.

threadA1:
   -> cl_io_start()
    -> __generic_file_aio_write()
     -> file_remove_suid()
      -> ll_xattr_cache_refill()
       -> mdc_xattr_common()
        -> ptlrpc_get_mod_rpc_slot()
        [blocked waiting for RPC slot]

                       threadB1: write file F1,
       enqueue DoM MDC lock L1

                                              MDS sends blocking AST
                                              to clientA for lock L1

ldlm_threadA3: cannot cancel busy lock L1:
   -> ldlm_handle_bl_callback()
   ["Lock L1 referenced, will be cancelled later"]

                                              MDS evicts clientA for
                                              not cancelling lock L1

threadA1: never completes write:
  ->cl_io_end()
   ->cl_io_unlock()
    ->osc_lock_cancel()
     ->lock->l_writers--;

The fix is to add IT_GETXATTR to list of operations which do not
need mod rpc slot.

Tests to illustrate the issue is added.

wait_for_function(): total sleep time (wait) is to be equal to max
when 1 is returned.

Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
HPE-bug-id: LUS-7271
Change-Id: I1b80677df084bda141b9ac58a78b765bd0b14a41
Reviewed-on: https://review.whamcloud.com/44151
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14428 libcfs: separate daemon_list from cfs_trace_data

cfs_trace_data provides a fifo for trace messages.  To minimize
locking, there is a separate fifo for each CPU, and even for different
interrupt levels per-cpu.

When a page is remove from the fifo to br written to a file, the page
is added to a "daemon_list".  Trace message on the daemon_list have
already been logged to a file, but can be easily dumped to the console
when a bug occurs.

The daemon_list is always accessed from a single thread at a time, so
the per-CPU facilities for cfs_trace_data are not needed.  However
daemon_list is currently managed per-cpu as part of cfs_trace_data.

This patch moves the daemon_list of pages out to a separate structure
- a simple linked list, protected by cfs_tracefile_sem.

Rather than using a 'cfs_trace_page' to hold linkage information and
content size, we use page->lru for linkage and page->private for
the size of the content in each page.

This is a step towards replacing cfs_trace_data with the Linux
ring_buffer which provides similar functionality with even less
locking.

In the current code, if the daemon which writes trace data to a file
cannot keep up with load, excess pages are moved to the daemon_list
temporarily before being discarded.  With the patch, these page are
simply discarded immediately.
If the daemon thread cannot keep up, that is a configuration problem
and temporarily preserving a few pages is unlikely to really help.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ie894f7751cadacb515215f18182163ea5d26e969
Reviewed-on: https://review.whamcloud.com/41493
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15107 mdt: Exclusive create isn't replayed

mdt_finish_open() fails on exclusive create check as
there isn't an infrormation that the file was created.

Set DISP_OPEN_CREATE for exclusive open replay,
as we know that the original request has succeeded.

Change-Id: Idc71db76b757670b91b3bb1718870a018a805ed2
HPE-bug-id: LUS-9358
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-on: https://review.whamcloud.com/45254
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-9680 net: Netlink improvements

With the expansion of the use of Netlink several issues have been
encountered. This patch fixes many of the issues. The issues are:

1) Fix idx handling in lnet_genl_parse_list() function. It needs
   to always been incremented. Some renaming suggestion for
   enum lnet_nl_scalar_attrs from Neil. New LN_SCALAR_ATTR_INT_VALUE
   to allow pushing integers as well as strings from userspace.

2) Create struct genl_filter_list which will be used to create
   a list of items to pass back to userland. This will be a common
   setup.

3) A normal user can't read /sys/debug/kernel/lustre which breaks
   lctl ***_params XXX since the first function called is
   llapi_param_get_paths(). Without the ability to read the
   debugfs tree glob() will fail. The solution is to use the
   kernel's glob function and just pass the requested string to
   the kernel.

4) For the external coordinator work you create a YAML parser
   that listens for kernel generated Netlink packets. This is
   a continuous stream vs an one time reply which we don't
   handle correctly. We move the handling of the completion
   of a Netlink packet series icompletely into the function
   yaml_netlink_msg_complete. In yaml_netlink_msg_parse() for
   the async case add "---" and in yaml_netlink_msg_complete()
   add "..." to define the beginning and end of a YAML document.

5) We have 3 types of setups. For kernel generated events it is
   possible to use just a YAML parser to listen for events. For
   the normal request -> reply setup we need both a YAML emitter
   and YAML parser. The last case is just sending commands to
   the kernel which only needs a YAML emitter. It is possible for
   that action to fail so we need to add handling for errors to
   the YAML emitter. We keep error handling for the YAML parser
   as well to handle the case of a stand along YAML parser listener.

6) Reworked the code that translates YAML to Netlink packets to
   send to the kernel so the both key and value pairs are sent
   seperately for the mapping case. This avoids dealing with
   complex string parsing in the kernel.

7) Error message handling was incorrect. struct nlmsgerr msg field
   is also the start of the nlattrs for the ext ack handling.

Test-Parameters: trivial
Change-Id: Ic8eee8fd0020b7a63565de6ef69f2c74bf4bdcd8
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44358
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14713 llite: mend the trunc_sem_up_write()

The original lli_trunc_sem replace change (commit e5914a61ac) fixed a
lock scenario:

t1 (page fault)          t2 (dio read)              t3 (truncate)
|- vm_mmap_pgoff()       |- vvp_io_read_start()     |- vvp_io_setattr
|- down_write(mmap_sem)  |- down_read(trunc_sem)            _start()
|- do_map()              |- ll_direct_IO_impl()
|- vvp_io_fault_start    |- ll_get_user_pages()

|- down_write(
|- down_read(mmap_sem)        trunc_sem)
|- down_read(trunc_sem)

t1 waits for read semaphore of trunc_sem which is hindered by t3,
since t3 is waiting for the write semaphore while t2 take its read
semaphore,and t2 is waiting for mmap_sem which has been taken by t1,
and a deadlock ensues.

commit e5914a61ac changes the down_read(trunc_sem) to
trunc_sem_down_read_nowait() in page fault path, to make it ignore
that there is a down_write(trunc_sem) waiting, just takes the read
semaphore if no writer has taken the semaphore, and breaks the
deadlock.

But there is a delicacy in using wake_up_var(), wake_up_var()->
__wake_up_bit()->waitqueue_active() locklessly test for waiters on the
queue, and if it's called without explicit smp_mb() it's possible for
the waitqueue_active() to ge hoisted before the condition store such
that we'll observe an empty wait list and the waiter might not
observe the condition, and the waiter won't get woke up whereafter.

Fixes: e5914a61ac ("LU-12460 llite: replace lli_trunc_sem")
Change-Id: Ifdda2c1c8a4171466be1723923c136e84de8ce0e
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43844
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14971 test: align mirror_io resync implementation

Align the mirror_io resync implementation with
llapi_mirror_resync_many().

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Icf11c4c2302f36fc0f9682e0a310058081e1214f
Reviewed-on: https://review.whamcloud.com/45202
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15142 lctl: fixes for set_param -P and llog_print

- properly handle permanent param deletion
- don't print skipped parameters in llog_print output
- add --raw option to llog_print to output all entries
including markers

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Id93a206a255dc885343efa293e1ee2672493e5e5
Reviewed-on: https://review.whamcloud.com/45332
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15197 llite: Do not count tiny write twice

We accidentally count bytes written with tiny write twice
in stats. Remove the extra count.

This also has the positive effect of improving tiny write
performance by about 4% by removing an extra call to the
stats code (the main cost is ktime_get()).

Before, 8 byte dd:
13.9 MiB/s
After:
14.3 MiB/s

Test-parameters: trivial

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ia11e7f16e3e3d0c4012f87cde817ad7b21128fa8
Reviewed-on: https://review.whamcloud.com/45476
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15152 tests: auster reports wrong testsuite status

auster always reports testsuites returned 0 even when there
are failures.

Test-Parameters: trivial testlist=sanity-lnet env=ONLY=230,ONLY_REPEAT=20
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I7a8101d6cfd854d8419edf55c18a72e211f5e5c8
Reviewed-on: https://review.whamcloud.com/45343
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14662 lnet: set eth routes needed for multi rail

When ksocklnd is initialized or new ethernet interfaces
are added via lnetctl, set the routing rules using a common
shell script ksocklnd-config. This ensures control over
source interface when sending traffic.

For example, for eth0 with ip 192.168.122.142/24:
   the output of "ip route show table eth0" should be
192.168.122.0/24 dev eth0 proto kernel scope link src 192.168.122.142

This step can be omitted by specifying
   options ksocklnd skip_mr_route_setup=1
in the conf file, or by using switch
   --skip-mr-route-setup
when adding NI with lnetctl. Note that the module parameter
takes priority over the lnetctl switch: if skip-mr-route-setup
is not specified when adding NI with lnetctl, the route still
won't get created if the conf file has skip_mr_route_setup=1.

The route also won't be created if any route already exists
for the given interface, assuming advanced users who manage
routing on their own will want to continue doing so.

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ia14e637bd29d4bbce5dd93daad9992336b2e6b15
Reviewed-on: https://review.whamcloud.com/44065
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-14448 lod: verify LOV early in lod_get_default_striping

lod_get_default_striping() will get both default LOV and default LMV,
and parse them to struct lod_default_striping one by one, however the
LOV and LMV data are both stored in lod_thread_info.lti_ea_store, so
lod_verify_striping() should verify LOV upon getting LOV, otherwise
if both exists, it's LMV that's verified, which will return -EINVAL.

Fixes: 6a08df2d0effc7a ("LU-14448 lod: verify LOV before set/inherit")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I9763d35bdbc74101fa8515d5096ec457a4cb3524
Reviewed-on: https://review.whamcloud.com/45370
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-9704 grant: ignore grant info on read resend

The following scenario makes a message like "claims 28672 GRANT, real
grant 0" to appear:

1. client owns X grants and run rpcs to shrink part of those
2. server fails over so that the shrink rpc is to be resent.
3. on the clinet reconnect server and client sync on initial amount
of grants for the client.
4. shrink rpc is resend, if server disk space is enough, shrink does
not happen and the client adds amount of grants it was going to
shrink to its newly initial amount of grants. Now, client thinks that
it owns more grants than it does from server points of view.
5. the client consumes grants and sends rpcs to server. Server avoids
allocating new grants for the client if the current amount of grant
is big enough:
static long tgt_grant_alloc(struct obd_export *exp, u64 curgrant,
...
if (curgrant >= want || curgrant >= ted->ted_grant + chunk)
RETURN(0);
6. client continues grants consuming which eventually leads to
complains like "claims 28672 GRANT, real grant 0".

In case of resent of read and set_info:shrink RPCs grant info should
be ignored as it was reset on reconnect.

Tests to illustrate the issue is added.

HPE-bug-id: LUS-7666
Change-Id: I8af1db287dc61c713e5439f4cf6bd652ce02c12c
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45371
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-930 doc: update lfs migrate usage and man page

Update the usage and man page for "lfs migrate -m", noting that
this command will recursively migrate an entire directory tree.

It is not currently possible to migrate files with DoM components
between MDTs, so provide an example of how to work around this.

Only print the command-line options for commands in the usage
message instead of the full usage, since it is otherwise much
too verbose to see the actual error message being printed. The
user should read the lfs-migrate.1 man page for full usage.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9e34a33fcc3f0e2b90bc499fb7b946c53e6111d1
Reviewed-on: https://review.whamcloud.com/43378
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

LU-15121 llite: skip request slot for lmv_revalidate_slaves()

Some syscalls need lmv_revalidate_slaves(). It requires
second lock enqueue and the it can be blocked by
lack of RPC slots.

Don't acquire rpc slot for second lock enqueue.

Change-Id: Ida23c648c2bd169c4d238543731796232aa490dc
HPE-bug-id: LUS-8416
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-on: https://review.whamcloud.com/45275
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>