Whamcloud - gitweb
fs/lustre-release.git
2 years ago LU-10350 lod: adjust stripe count to available ost count 82/43882/4
Bobi Jam [Fri, 28 May 2021 08:25:52 +0000 (16:25 +0800)]
LU-10350 lod: adjust stripe count to available ost count

* When the user specifies a stripe count of -1, or a stripe count
  larger than the OST count of a pool, adjust the stripe count to the
  number of available OSTs. Otherwise we cannot allocate enough stripe
  objects, and LOD reports:

  lod_alloc_specific() can't lstripe objid [obj_fid]: have %d want %u

  where %d is the OST count of the pool, and %u is either the total
  OST count (if the user specified a stripe count of -1) or the
  larger stripe count value that the user specified.

* In ost-pools.sh, reset $MOUNT's stripe offset, so that the created
  directory will not inherit it from the root directory.

* Preserve the root directory layout in replay-single (run before
  ost-pools) to avoid leaving a bad layout on the root dir.
  Lustre-change: https://review.whamcloud.com/43872
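The stripe-count adjustment described in the first item can be sketched in a few lines of C. This is an illustration under stated assumptions, not the LOD code: adjust_stripe_count() and its parameters are hypothetical names, and -1 stands for "stripe over all OSTs" as in the message above.

```c
/*
 * Illustrative sketch only: clamp a requested stripe count to the number
 * of OSTs available in a pool. The function name and signature are
 * hypothetical and are not taken from the Lustre source.
 */
int adjust_stripe_count(int requested, int pool_ost_count)
{
    /* -1 means "use every available OST" */
    if (requested == -1)
        return pool_ost_count;

    /* asking for more stripes than the pool has OSTs cannot succeed */
    if (requested > pool_ost_count)
        return pool_ost_count;

    return requested;
}
```

With a pool of 4 OSTs, both a request of -1 and a request of 8 would be reduced to 4, while a request of 2 is left unchanged.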

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Idf6884faf1271a3864710aeab0ba0eca154bf492
Reviewed-on: https://review.whamcloud.com/43882
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14711 osc: Notify server if cache discard takes a long time 57/43857/7
Oleg Drokin [Fri, 28 May 2021 02:34:44 +0000 (22:34 -0400)]
LU-14711 osc: Notify server if cache discard takes a long time

Discarding a large number of pages from a mapping under a
single lock can take a really long time (750GB is over 170s).
Since there is no stream of RPCs sent to the server as with
read or write to prolong the DLM lock timeout, the server
may evict the client as it does not see progress is being made.

As such send periodic "empty" RPCs to the server to show the
client is still alive and working on the pages under the lock.

For compatibility reasons the RPC is formed as a one-byte
OST_READ request with a special flag set to avoid doing
actual IO, but older servers actually do the one-byte read.
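The idea of signalling progress during a long discard can be sketched as a simulation. This is not the osc code: the function, the callback type, and the abstract time units are all assumptions made for illustration, and the callback merely stands in for the "empty" RPC.

```c
#include <stddef.h>

/* Hypothetical callback standing in for the "empty" RPC to the server. */
typedef void (*keepalive_fn)(void *arg);

/*
 * Simulated discard loop. Each page is assumed to cost page_cost time
 * units; whenever at least `interval` units have passed since the last
 * notification, the keepalive callback is invoked (if one was given).
 * Returns how many times a notification became due. All names and the
 * time model are illustrative.
 */
unsigned long discard_with_keepalive(unsigned long nr_pages,
                                     unsigned long page_cost,
                                     unsigned long interval,
                                     keepalive_fn notify, void *arg)
{
    unsigned long elapsed = 0;
    unsigned long last_notify = 0;
    unsigned long due = 0;
    unsigned long i;

    for (i = 0; i < nr_pages; i++) {
        elapsed += page_cost;    /* stand-in for discarding one page */

        if (elapsed - last_notify >= interval) {
            if (notify != NULL)
                notify(arg);
            last_notify = elapsed;
            due++;
        }
    }
    return due;
}
```

The point of the sketch is only that the notification is driven by elapsed time inside the discard loop, so the server sees activity even though no read or write RPCs are in flight.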

Change-Id: I4603c83e92c328d93e29adce8cbfac3d561b25d5
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43857
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
2 years ago LU-14721 tests: wait_destroy_complete should check MDTs 70/43870/3
Oleg Drokin [Sat, 29 May 2021 03:45:20 +0000 (23:45 -0400)]
LU-14721 tests: wait_destroy_complete should check MDTs

Ever since destroys handling was moved to MDTs we need to
move waiting for destroys completion to MDTs as well.

Change-Id: I31440ec048b960206a903387d7050aa13e45008d
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43870
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago New tag 2.14.52 2.14.52 v2_14_52
Oleg Drokin [Fri, 11 Jun 2021 17:09:09 +0000 (13:09 -0400)]
New tag 2.14.52

Change-Id: I9882f84941588ab2a92f1d736559a6e903b32d49
Signed-off-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-13783 procfs: fix improper prop_ops fields 80/43880/3
James Simmons [Mon, 31 May 2021 17:05:50 +0000 (13:05 -0400)]
LU-13783 procfs: fix improper prop_ops fields

The lod pool and nodemap proc_ops missed renaming the fields to
start with .proc_*. On newer distros like Ubuntu 20.04 HWE you
get the following compile error:

lustre-release/lustre/ptlrpc/nodemap_lproc.c:686:3: error: ‘const struct proc_ops’ has no member named ‘open’
  686 |  .open   = nodemap_ranges_open,
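The rename that the compile error calls for can be illustrated outside the kernel. The struct below is a trimmed userspace stand-in for the kernel's struct proc_ops (the real one has more members), the function body is a placeholder, and the variable name nodemap_ranges_pops is hypothetical; only the field name change from .open to .proc_open reflects the message above.

```c
#include <stddef.h>

/* Userspace stand-ins so the sketch compiles outside the kernel. */
struct inode;
struct file;

/* Trimmed mock of the kernel's struct proc_ops: members carry a proc_ prefix. */
struct proc_ops {
    int (*proc_open)(struct inode *, struct file *);
    int (*proc_release)(struct inode *, struct file *);
};

/* Placeholder body; the real function lives in nodemap_lproc.c. */
int nodemap_ranges_open(struct inode *inode, struct file *file)
{
    (void)inode;
    (void)file;
    return 0;
}

/*
 * Writing ".open = nodemap_ranges_open" here triggers the error quoted
 * above, because struct proc_ops has no member named "open".
 */
const struct proc_ops nodemap_ranges_pops = {
    .proc_open = nodemap_ranges_open,
};
```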

Test-Parameters: trivial
Fixes: 13cd0f9f667 ("LU-13344 libcfs: Abstract proc_fs with proc_ops")
Signed-off-by: James Simmons <jsimmons@infradead.org>
Change-Id: I5fff7519a801f585690d468255f7ca6c73adcc90
Reviewed-on: https://review.whamcloud.com/43880
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14725 tests: sanity/27Q to remove own symlink in /tmp 75/43875/5
Alex Zhuravlev [Sun, 30 May 2021 15:36:39 +0000 (18:36 +0300)]
LU-14725 tests: sanity/27Q to remove own symlink in /tmp

otherwise any subsequent restart of MDS/MGS on a local setup
with ZFS backend gets stuck as zpool import scans /tmp and
stat's every found file.

Test-Parameters: trivial
Fixes: cd4caef54f ("LU-14583 llapi: handle symlinks in llapi_file_get_stripe()")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I2eb4cb8819670acef0302e1fe5ab767be7f46842
Reviewed-on: https://review.whamcloud.com/43875
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14607 osp: separate buffer for large XATTR 36/43736/6
Lai Siyao [Wed, 19 May 2021 02:58:19 +0000 (10:58 +0800)]
LU-14607 osp: separate buffer for large XATTR

Once XATTR is too large to fit into PAGE_SIZE, allocate value in a
separate buffer for osp_xattr_entry.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ied090ff73e2e5cdeaf2d91a3670067210f2ab1d7
Reviewed-on: https://review.whamcloud.com/43736
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14661 lnet: Check if discovery toggled off in ping reply 08/43508/4
Chris Horn [Wed, 27 Jan 2021 18:22:09 +0000 (12:22 -0600)]
LU-14661 lnet: Check if discovery toggled off in ping reply

If a peer is initially discovered and found to have discovery
enabled, but the peer later reloads LNet with discovery disabled,
then we can delete the peer and re-create it the next time the peer
is discovered.

It is safe to delete and re-create the peer as long as it wasn't
configured manually.

In lnet_peer_deletion(), we need to use list_del_init() when removing
the peer from the discovery queue because the lnet_peer_del() code
path can result in a call to lnet_peer_queue_for_discovery() where
we check if the lp_dc_list is empty.

Test-Parameters: trivial
HPE-bug-id: LUS-9178
Fixes: aa7de0af69 ("LU-13895 lnet: Prevent discovery on peer marked deletion")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I0b43d7541711a3b94c492082d4a29487ebe72b09
Reviewed-on: https://review.whamcloud.com/43508
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14660 lnet: Fix destination NID for discovery PUSH 07/43507/2
Chris Horn [Fri, 29 Jan 2021 14:08:08 +0000 (17:08 +0300)]
LU-14660 lnet: Fix destination NID for discovery PUSH

If we're sending a discovery PUSH after receiving a discovery
REPLY then we want to send via the same NID that the reply was
sent to. This introduces a challenge in selecting an appropriate
destination NID for the PUSH because lnet_select_pathway() will not
run the MR selection algorithm for choosing a peer NI if the source
NI has been specified.

It is reasonable to assume that the NID used by the message
originator in sending the REPLY is a suitable destination for the
discovery PUSH. Thus, we record this NID in the same location we
currently record the lp_disc_src_nid, and use it when sending the
PUSH. With this change, the only other user of lnet_peer_select_nid()
is lnet_peer_send_ping(). In the ping case we do not set a source NID,
so lnet_select_pathway() is free to choose any peer NI. So this change
allows us to get rid of lnet_peer_select_nid() altogether.

Alternatively, we would need to reproduce a lot of the path selection
algorithm inside lnet_peer_select_nid() in order to avoid sending to
unhealthy NIDs. It seems undesirable and unnecessary to duplicate that
logic.

Test-Parameters: trivial
HPE-bug-id: LUS-9333
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I47ef856075f049d71c395565974204b8f6fa9003
Reviewed-on: https://review.whamcloud.com/43507
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14432 misc: update e2fsprogs to 1.46.2.wc1 69/43469/2
Li Dongyang [Tue, 27 Apr 2021 23:31:59 +0000 (09:31 +1000)]
LU-14432 misc: update e2fsprogs to 1.46.2.wc1

Update Changelog for the new e2fsprogs release.

Change-Id: I173c43f1c777b7223a56841a06545c1741e1a903
Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/43469
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
2 years ago LU-11839 iokit: Fix help message 14/43114/2
Arshad Hussain [Thu, 25 Mar 2021 10:50:59 +0000 (16:20 +0530)]
LU-11839 iokit: Fix help message

This patch fixes the help message of iokit-gather-stats
to properly show "--help" instead of "-help". Two
hyphens/dashes are expected by getopts.

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I64270598fc19377571b68066d617b50fcb48cc12
Reviewed-on: https://review.whamcloud.com/43114
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14544 tests: duplicated export entries 30/42130/5
Elena Gryaznova [Mon, 22 Mar 2021 16:55:02 +0000 (19:55 +0300)]
LU-14544 tests: duplicated export entries

The file /etc/exports can still contain a $MNTPNT entry if
previous NFS tests were interrupted.
With duplicated entries in /etc/exports, the NFS server
service fails to start/restart on RHEL 8 and SLES15.
The patch cleans up the exports file before adding the $MNTPNT entry.

Test-Parameters: trivial     clientdistro=el8.3 serverdistro=el8.3     testlist=parallel-scale-nfsv3,parallel-scale-nfsv4
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-6291
Reviewed-by: Alexander Lezhoev <alexander.lezhoev@hpe.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Change-Id: I738bd0e8e79dc1ba84e6aa70e06fa47c49a935e0
Reviewed-on: https://review.whamcloud.com/42130
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14274 tests: enhance racer to set extra layout 77/41077/3
Elena Gryaznova [Mon, 24 May 2021 16:45:12 +0000 (19:45 +0300)]
LU-14274 tests: enhance racer to set extra layout

The patch adds the ability to set:
  - a generic "RACER_EXTRA_LAYOUT" containing any kind of
    layout in addition to the layouts defined for
    RACER_ENABLE_*;
  - an initial racer RACER_PROGS command list.
    The additional commands specified by RACER_EXTRA,
    RACER_ENABLE_REMOTE_DIRS, RACER_ENABLE_STRIPED_DIRS
    and RACER_ENABLE_MIGRATION are not ignored, i.e. the
    following parameters must be set to run file_create only:
        RACER_ENABLE_REMOTE_DIRS=false
        RACER_ENABLE_STRIPED_DIRS=false
        RACER_ENABLE_MIGRATION=false.
The patch also fixes NUM_THREADS and MAX_FILES so that they are passed correctly.

Test-Parameters: testlist=racer
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-9142
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@hpe.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I9994dfdb0555a3acd75daa4cfd27a0cb62074e36
Reviewed-on: https://review.whamcloud.com/41077
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14138 ptlrpc: move more members in PTLRPC request into pill 69/40669/11
Qian Yingjin [Tue, 17 Nov 2020 15:12:44 +0000 (23:12 +0800)]
LU-14138 ptlrpc: move more members in PTLRPC request into pill

Some data members in the data structure @ptlrpc_request can be
moved into the data structure @req_capsule:
/** Request message - what client sent */
struct lustre_msg *rq_reqmsg;
/** Reply message - server response */
struct lustre_msg *rq_repmsg;
/** Fields that help to see if request and reply were swabbed */
__u32 rq_req_swab_mask;
__u32 rq_rep_swab_mask;

After these data structures are reorganized, @req_capsule can be
used more generally, and it becomes easier to pack and unpack
sub-requests in a batched PtlRPC request for the upcoming batch
metadata processing.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ib6d942b79ebf1a444d63b55ad4bc94813cf947c7
Reviewed-on: https://review.whamcloud.com/40669
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
2 years ago LU-14459 lmv: change default hash type to crush 84/43684/5
Andreas Dilger [Thu, 13 May 2021 01:20:04 +0000 (19:20 -0600)]
LU-14459 lmv: change default hash type to crush

Change the default hash type to CRUSH to minimize the number
of directory entries that need to be migrated.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I75aff45898044be9d12ae1bfad31b4693b3ebbe5
Reviewed-on: https://review.whamcloud.com/43684
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14690 kernel: new kernel [RHEL 8.4 4.18.0-305.3.1.el8_4] 25/43725/6
Jian Yu [Fri, 4 Jun 2021 07:37:18 +0000 (00:37 -0700)]
LU-14690 kernel: new kernel [RHEL 8.4 4.18.0-305.3.1.el8_4]

This patch makes changes to support new RHEL 8.4 release
for Lustre client.

Test-Parameters: trivial clientdistro=el8.4

Change-Id: I47d4706f9175d489ef0e6226492af20f44f0677e
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43725
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14703 build: fixup lustre-libcfs.m4 comments 76/43776/2
Olaf Faaland [Tue, 25 May 2021 02:10:36 +0000 (19:10 -0700)]
LU-14703 build: fixup lustre-libcfs.m4 comments

Several macros whose ends are commented to identify the macro being
defined, like

.. # LIBCFS_FUBAR

have the wrong macro named in the comment.  Fix those
end comments so they match the opening comment or the
AC_DEFUN correctly.

Test-Parameters: trivial
Change-Id: Ia40ccb9e271e90306df37d0028734a84684e42ef
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-on: https://review.whamcloud.com/43776
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14682 tests: sanity-flr to remove temporary files 69/43669/4
Alex Zhuravlev [Wed, 12 May 2021 04:33:02 +0000 (07:33 +0300)]
LU-14682 tests: sanity-flr to remove temporary files

otherwise the test fails frequently on small local systems due
to lack of space.

Test-Parameters: trivial testlist=sanity-flr
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I7076bcf2346ae1ec7a4d1bead3d94b2c4bb57bbf
Reviewed-on: https://review.whamcloud.com/43669
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14665 lnet: simplify lnet_ni_add_interface 25/43525/12
Olaf Faaland [Tue, 4 May 2021 02:40:22 +0000 (19:40 -0700)]
LU-14665 lnet: simplify lnet_ni_add_interface

Remove an unnecessary counter and move the comment before
the relevant code.  Improve error messages.

Test-parameters: trivial

Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: Iffc7a128b16bc1b2be7a44413a5972c97b12a5fa
Reviewed-on: https://review.whamcloud.com/43525
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-13284 tests: few tests miss MDS_MOUNT_OPTS/OST_MOUNT_OPTS 69/37669/18
Alex Zhuravlev [Thu, 20 Feb 2020 11:57:08 +0000 (14:57 +0300)]
LU-13284 tests: few tests miss MDS_MOUNT_OPTS/OST_MOUNT_OPTS

Some tests mount servers without MDS_MOUNT_OPTS or OST_MOUNT_OPTS,
so the localrecov mount option is lost and subsequent tests may fail
in a local testing environment.

Fixes: 8bd04b4e57 ("LU-12722 target: disable recovery for local clients")

Change-Id: I4e5d3a8678d027809ea9a0d129fbfbc8c6beae09
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37669
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14702 osc: cleanup comment in osc_object_is_contended 75/43775/3
Li Xi [Tue, 25 May 2021 00:49:01 +0000 (08:49 +0800)]
LU-14702 osc: cleanup comment in osc_object_is_contended

ll_file_is_contended() does not exist any more, so the comment
is invalid.

Test-Parameters: trivial
Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: Ib68e8dc885e6812065c076d36dc61938a30d6980
Reviewed-on: https://review.whamcloud.com/43775
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14701 tests: wrong get[set]_osd_param() call 74/43774/2
Elena Gryaznova [Mon, 24 May 2021 15:14:26 +0000 (18:14 +0300)]
LU-14701 tests: wrong get[set]_osd_param() call

sanity-dom:sanityn:test_19 is always skipped because
get_osd_param() is called incorrectly for DOM=yes.
For osd-*, the MDT device must be used instead of the default
  device=${2:-$FSNAME-OST*}

Fixes: a7625cd2f37a ("LU-3285 test: add Data-on-MDT tests and fixes")
Test-Parameters: trivial testlist=sanity-dom env=ONLY=sanityn
Test-Parameters: trivial testlist=sanityn env=ONLY=19
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-9965
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I2bb9fc7fbaac966ea2254071e7ea82b963a93ad3
Reviewed-on: https://review.whamcloud.com/43774
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14693 mdt: skip DLM when opening volatile files 42/43742/2
John L. Hammond [Wed, 28 Apr 2021 18:43:51 +0000 (13:43 -0500)]
LU-14693 mdt: skip DLM when opening volatile files

In mdt_reint_open(), when opening a volatile file skip taking a
MDS_INODELOCK_UPDATE lock on the parent directory.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I8ee89710f52e8097e1412897de91159702560e4a
Reviewed-on: https://review.whamcloud.com/43742
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-13055 libcfs: allow comma-separated masks 41/43741/4
Andreas Dilger [Wed, 19 May 2021 07:44:32 +0000 (01:44 -0600)]
LU-13055 libcfs: allow comma-separated masks

For debug and changelog mask names, allow a comma-separated list
of names to be given, so that the space-separated list does not
need to be quoted for use.
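A minimal sketch of accepting either separator when parsing a list of mask names, assuming a small made-up name table. The function parse_mask_list(), the table contents, and the bit assignments are hypothetical and are not the libcfs implementation.

```c
#include <string.h>

/* Hypothetical table of mask names; bit i corresponds to mask_names[i]. */
static const char *const mask_names[] = { "trace", "inode", "super", "ioctl" };
#define NR_MASKS (sizeof(mask_names) / sizeof(mask_names[0]))

/*
 * Parse a list of names separated by spaces and/or commas into a bitmask.
 * Returns 0 on success and stores the result in *mask; returns -1 and
 * leaves *mask untouched if an unknown name is found.
 */
int parse_mask_list(const char *str, unsigned int *mask)
{
    static const char delims[] = " ,\t\n";
    unsigned int result = 0;

    while (*str != '\0') {
        size_t len;
        size_t i;

        str += strspn(str, delims);    /* skip any run of separators */
        len = strcspn(str, delims);    /* length of the next name */
        if (len == 0)
            break;                     /* only separators were left */

        for (i = 0; i < NR_MASKS; i++) {
            if (strlen(mask_names[i]) == len &&
                strncmp(str, mask_names[i], len) == 0) {
                result |= 1u << i;
                break;
            }
        }
        if (i == NR_MASKS)
            return -1;                 /* unknown name */
        str += len;
    }

    *mask = result;
    return 0;
}
```

Treating the comma as just another delimiter means "trace inode" and "trace,inode" parse to the same mask, so the unquoted comma form can be passed as a single shell word.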

Change sanity-quota to use a comma-separated list to verify it works.

Fix a couple of test cases where the debug parameter is set and
printed overly verbosely during tests.

Test-Parameters: trivial testlist=sanity-quota
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Icf1e3ebc74f0e48b38a65486b2275ec4c33ebbe5
Reviewed-on: https://review.whamcloud.com/43741
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14688 mdt: changelog purge deletes plain llog 19/43719/2
Alexander Boyko [Mon, 17 May 2021 13:29:01 +0000 (09:29 -0400)]
LU-14688 mdt: changelog purge deletes plain llog

When a massive number of records is cancelled, changelog purge can
delete a whole plain llog file instead of cancelling records one by one.
The patch also fixes the race between llog_destroy and llog_next_block.

HPE-bug-id: LUS-9950
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I47c2ed97945e979745255381f83b6a417d7ba8b1
Reviewed-on: https://review.whamcloud.com/43719
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-11188 lfs: add "--perm" option to "lfs find" 15/43715/4
Courrier Guillaume [Thu, 29 Apr 2021 09:35:01 +0000 (11:35 +0200)]
LU-11188 lfs: add "--perm" option to "lfs find"

Add support for "--perm" option to "lfs find".
The option supports both octal and symbolic representation and
follows the POSIX standard.
As for GNU find, it supports '-' and '/' modifiers before the
permission.
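The three matching modes that GNU find defines for -perm (exact, '-' for all bits, '/' for any bit) can be sketched as below. The enum and function names are hypothetical, and this is not the lfs implementation; it only illustrates the semantics the message refers to.

```c
#include <stdbool.h>

/* Hypothetical names for the three GNU find style matching modes. */
enum perm_match {
    PERM_EXACT,     /* MODE:  permission bits are exactly MODE */
    PERM_ALL_BITS,  /* -MODE: all bits in MODE are set */
    PERM_ANY_BIT    /* /MODE: at least one bit in MODE is set */
};

bool perm_matches(unsigned int file_mode, unsigned int wanted,
                  enum perm_match how)
{
    unsigned int bits = file_mode & 07777;  /* ignore the file type bits */

    switch (how) {
    case PERM_EXACT:
        return bits == wanted;
    case PERM_ALL_BITS:
        return (bits & wanted) == wanted;
    case PERM_ANY_BIT:
        /* GNU find treats an empty /MODE as matching every file */
        return wanted == 0 || (bits & wanted) != 0;
    }
    return false;
}
```

For a file with mode 0644, "-0600" matches because both owner bits are set, while "/0022" does not because neither group nor other has write permission.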

Signed-off-by: Guillaume Courrier <guillaume.courrier@cea.fr>
Change-Id: I8e1292421986c3a4bde686f3c7dc7bfcb679cabc
Reviewed-on: https://review.whamcloud.com/43715
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
2 years ago LU-9537 utils: implement "lfs getstripe --fid" for directories 14/43714/4
Yoann Valeri [Wed, 12 May 2021 09:48:04 +0000 (11:48 +0200)]
LU-9537 utils: implement "lfs getstripe --fid" for directories

Enhance the lfs command by displaying a directory fid when using "lfs
getstripe --fid" on one.

When displaying information through "lfs getstripe --fid", we would
check if the given path was associated to a directory or not. If so,
the fid display would just be skipped, showing a simple blank line.
However, a user could still find the fid of a directory by using "lfs
path2fid" on the same directory.  Therefore, this patch adds a hook to
the underlying "llapi_fd2fid()" (called internally by "lfs path2fid")
when trying to display a directory fid with "lfs getstripe --fid".

Signed-off-by: Yoann Valeri <yoann.valeri@cea.fr>
Change-Id: Ia153717e3feb1a359b8b54297995365fc34a1c29
Reviewed-on: https://review.whamcloud.com/43714
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-12678 lnet: use list_for_each_entry() 91/43591/4
James Simmons [Mon, 10 May 2021 20:10:29 +0000 (16:10 -0400)]
LU-12678 lnet: use list_for_each_entry()

Several loops use list_for_each(), then call list_entry()
each time in the loop. This complexity can be replaced with
the use of list_for_each_entry().
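The before and after shapes of such a loop can be shown with a small userspace re-implementation of the list macros. The macros below are simplified copies written for this sketch, not the kernel headers, and struct peer and the two sum functions are hypothetical.

```c
#include <stddef.h>

/* Simplified userspace versions of the kernel list helpers. */
struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
#define list_entry(ptr, type, member) container_of(ptr, type, member)
#define list_for_each(pos, head) \
    for (pos = (head)->next; pos != (head); pos = pos->next)
#define list_for_each_entry(pos, head, member) \
    for (pos = list_entry((head)->next, __typeof__(*pos), member); \
         &pos->member != (head); \
         pos = list_entry(pos->member.next, __typeof__(*pos), member))

void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

void list_add_tail(struct list_head *item, struct list_head *head)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* Hypothetical element type embedding a list link. */
struct peer {
    int id;
    struct list_head link;
};

/* Old style: iterate over list_head pointers, convert inside the loop. */
int sum_ids_old(struct list_head *head)
{
    struct list_head *tmp;
    int sum = 0;

    list_for_each(tmp, head) {
        struct peer *p = list_entry(tmp, struct peer, link);

        sum += p->id;
    }
    return sum;
}

/* New style: the macro hands back the containing structure directly. */
int sum_ids_new(struct list_head *head)
{
    struct peer *p;
    int sum = 0;

    list_for_each_entry(p, head, link)
        sum += p->id;
    return sum;
}
```

Both functions visit the same elements in the same order; the second form simply drops the temporary list_head pointer and the explicit list_entry() call.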

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: James Simmons <jsimmons@infradead.org>
Change-Id: Ib7968466c4fce5173b20cbaf6c878975ba522d43
Reviewed-on: https://review.whamcloud.com/43591
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-12836 osd-zfs: Catch all ZFS pool change events 52/43552/3
Tony Hutter [Fri, 12 Mar 2021 01:23:16 +0000 (17:23 -0800)]
LU-12836 osd-zfs: Catch all ZFS pool change events

This change adds the following symlinks:

  vdev_attach-lustre -> statechange-lustre.sh
  vdev_remove-lustre -> statechange-lustre.sh
  vdev_clear-lustre -> statechange-lustre.sh

This makes it so the statechange-lustre.sh script is also called on
all ZFS events that could change the pool state.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Change-Id: I18edc86749e8ab91bb45f21aafd3fd47e78cbaef
Reviewed-on: https://review.whamcloud.com/43552
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14663 mdc: start changelog thread upon first access 13/43513/5
Alex Zhuravlev [Sun, 2 May 2021 09:16:01 +0000 (12:16 +0300)]
LU-14663 mdc: start changelog thread upon first access

thus leaving the caller a chance to set CHANGELOG_FLAG_FOLLOW,
otherwise the thread (started from open()) can reach the end
of the changelog and exit early.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ic14b6c991010bbe5197b5a8b0fedf0f4007e98c1
Reviewed-on: https://review.whamcloud.com/43513
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-13717 sec: limit hard links to linkEA size for enc files 87/43387/4
Sebastien Buisson [Mon, 19 Oct 2020 14:23:05 +0000 (23:23 +0900)]
LU-13717 sec: limit hard links to linkEA size for enc files

Some operations on encrypted files require identifying all names for
files having the same FID. For instance, for lookup, getattr or unlink
on encrypted files without the encryption key, we need to perform an
operation by FID instead of the actual name.
In order to make operations by FID unambiguous on server side, we
decide to limit the number of possible hard links for encrypted files,
to what the linkEA can contain.
Currently linkEA stores 4KiB of links, that is 14 NAME_MAX links, or
119 16-byte names.
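The two figures above can be reproduced with simple integer arithmetic. The sizes used here are assumptions that the message does not state: a 24-byte linkEA header and 18 bytes of per-entry overhead (a 2-byte record length plus a 16-byte parent FID) in front of each name. The function name is hypothetical.

```c
#include <stddef.h>

/* Assumed sizes; they are not given in the commit message. */
#define LINKEA_SIZE            4096  /* 4KiB of link data */
#define LINKEA_HEADER_SIZE     24    /* assumed header size */
#define LINKEA_ENTRY_OVERHEAD  18    /* 2-byte length + 16-byte parent FID */

/* How many links with names of name_len bytes fit into the linkEA. */
size_t linkea_max_links(size_t name_len)
{
    return (LINKEA_SIZE - LINKEA_HEADER_SIZE) /
           (LINKEA_ENTRY_OVERHEAD + name_len);
}
```

Under these assumptions, 4072 usable bytes divided by 273 bytes per NAME_MAX (255-byte) entry gives 14 links, and divided by 34 bytes per 16-byte-name entry gives 119 links, which agrees with the numbers quoted in the message.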

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I20a01874899f95b2ff61e05b2aa6851d135633e8
Reviewed-on: https://review.whamcloud.com/43387
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14629 sec: forbid file rename from enc to unencrypted dir 04/43404/6
Sebastien Buisson [Thu, 22 Apr 2021 09:26:51 +0000 (11:26 +0200)]
LU-14629 sec: forbid file rename from enc to unencrypted dir

fscrypt allows renaming an encrypted file from an encrypted directory
into an unencrypted directory. But it leaves the file encrypted,
sitting in an unencrypted directory, which can lead to unexpected
issues.
So just prevent this kind of rename, and adapt sanity-sec test_47
accordingly.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I38e17caa4786c1c8d80a363a826a5aa298eb0980
Reviewed-on: https://review.whamcloud.com/43404
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14586 tests: set mpi np correctly 17/43217/2
Elena Gryaznova [Tue, 6 Apr 2021 08:05:02 +0000 (11:05 +0300)]
LU-14586 tests: set mpi np correctly

The number of mpi processes is to be calculated
based on the number of clients in clients subset.

Fixes: 9ecb000 ("LU-13281 tests: ha.sh improvements")
Test-Parameters: trivial
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-9716
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: If574743e2e29a309a8d7a021056fa726495fa959
Reviewed-on: https://review.whamcloud.com/43217
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14340 tests: remove stale test-framework functions 66/41266/3
Andreas Dilger [Tue, 19 Jan 2021 04:43:57 +0000 (21:43 -0700)]
LU-14340 tests: remove stale test-framework functions

Delete functions not referenced anywhere in test-framework.sh
(each one will still appear only once, where it is defined):

  $ grep "^[a-z].* *()" lustre/tests/test-framework.sh |
  sed -e 's/function //' -e 's/ *(.*//' |
  while read F; do (( $(grep $F lustre/tests/*.sh | wc -l) > 1 )) ||
      echo "$F"
  done

  mdsdevlabel ostdevlabel cleanup_check obd_name unmount_zfs
  is_empty_fs get_svr_devs at_min_get canonical_path agts_nodes
  mixed_ost_devs setstripe_nfsserver delayed_recovery_enabled
  mds_on_old_device remove_ost_objects remove_mdt_files
  duplicate_mdt_files get_block_size

The unmount_zfs() function returned by the above check *is*
used via unmount_fstype() calling it as "unmount_$fstype".

Fixes: 5a3dfc2b5d90 ("LU-7301 tests: delete old lfsck tests")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I71842152a87f15918147da860745ef8e981f6121
Reviewed-on: https://review.whamcloud.com/41266
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vikentsi Lapa <vlapa@whamcloud.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-13950 lnet: do not crash if lnet_sock_getaddr returns error 34/39834/6
Artem Blagodarenko [Tue, 25 Aug 2020 10:01:11 +0000 (06:01 -0400)]
LU-13950 lnet: do not crash if lnet_sock_getaddr returns error

Some network issues can lead to a panic in ksocknal_accept:

rc = lnet_sock_getaddr(sock, true, &peer_ip, &peer_port);
LASSERT(rc == 0); /* we succeeded before */

Let's pass this error to the caller.
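The change from asserting to propagating the error can be sketched in userspace as follows. The getaddr_fn type, the two stub helpers, and accept_sketch() are hypothetical stand-ins; they do not reproduce the signatures of lnet_sock_getaddr() or ksocknal_accept().

```c
#include <errno.h>

/* Hypothetical stand-in for a getaddr-style helper that may fail. */
typedef int (*getaddr_fn)(void *sock, unsigned int *ip, int *port);

/* Stub that succeeds and fills in a made-up peer address. */
int getaddr_ok(void *sock, unsigned int *ip, int *port)
{
    (void)sock;
    *ip = 0x7f000001;    /* 127.0.0.1, arbitrary for the example */
    *port = 1023;        /* arbitrary port */
    return 0;
}

/* Stub that fails the way a dropped connection might. */
int getaddr_fail(void *sock, unsigned int *ip, int *port)
{
    (void)sock;
    (void)ip;
    (void)port;
    return -ENOTCONN;
}

int accept_sketch(void *sock, getaddr_fn getaddr)
{
    unsigned int peer_ip;
    int peer_port;
    int rc;

    rc = getaddr(sock, &peer_ip, &peer_port);
    if (rc != 0)
        return rc;    /* hand the error to the caller instead of asserting */

    /* ... connection setup would continue here using the peer address ... */
    (void)peer_ip;
    (void)peer_port;
    return 0;
}
```

The earlier success of the same call is no guarantee that it succeeds again, so returning rc lets the caller drop the connection cleanly rather than crash the node.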

Change-Id: I34d43c19b4e75422db50e7abb02cac3510882b0d
hpe-bug-id: LUS-9256
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Reviewed-on: https://es-gerrit.dev.cray.com/157753
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-on: https://review.whamcloud.com/39834
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14687 llite: Return errors for aio 22/43722/7
Patrick Farrell [Wed, 19 May 2021 18:08:57 +0000 (14:08 -0400)]
LU-14687 llite: Return errors for aio

The aio code incorrectly discards errors from
ll_direct_rw_pages.  Fix this and add a test for this.

Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: I49dadd0b3692820687fa6a1339e00516edf7a5d5
Reviewed-on: https://review.whamcloud.com/43722
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-11290 osc: Batch gang_lookup cbs 89/33089/10
Patrick Farrell [Wed, 3 Mar 2021 06:50:04 +0000 (09:50 +0300)]
LU-11290 osc: Batch gang_lookup cbs

The osc_page_gang_lookup call backs can be trivially
converted to operate in batches rather than one page at a
time.  This improves cancellation time for locks protecting
large numbers of pages by about 10%.  (After landing another
optimization, "LU-11290 ldlm: page discard speedup", the gain is
about 6% when canceling a lock on a 30GB cached file.)

Truncate to zero time (with one lock protecting many pages)
was improved by about 5-10% as well.  Lock weighing
performance should be improved slightly as well, but is
tricky to benchmark.
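
The batching described above can be sketched in userspace as follows.
This is an illustration only: DEMO_BATCH, demo_gang_lookup() and
demo_count_cb() are hypothetical names, and the batch size of 16 is an
assumption, not the value used by the osc code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_BATCH 16	/* assumed batch size, for illustration only */

/* Callback that receives a whole batch of pages instead of one page. */
typedef bool (*demo_batch_cb_t)(void **pages, int count, void *data);

/* Walk 'total' pages and invoke 'cb' once per batch; return the call count. */
static int demo_gang_lookup(void **pages, int total, demo_batch_cb_t cb,
			    void *data)
{
	int calls = 0;
	int done = 0;

	assert(cb != NULL);
	while (done < total) {
		int n = total - done;

		if (n > DEMO_BATCH)
			n = DEMO_BATCH;
		calls++;
		if (!cb(pages + done, n, data))
			break;	/* callback asked to stop early */
		done += n;
	}
	return calls;
}

/* Example callback: adds the batch size to an int counter. */
static bool demo_count_cb(void **pages, int count, void *data)
{
	(void)pages;
	*(int *)data += count;
	return true;
}
```

With 40 pages this makes three callback invocations (16, 16 and 8
pages) where a per-page callback would make 40.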

HPE-bug-id: LUS-6432
Change-Id: Ib30594ae97182cbeb18051d6cee860c97ae7e119
Signed-off-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-on: https://review.whamcloud.com/33089
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14273 tests: enhance ha.sh to run custom cmd on bg 73/41073/2
Elena Gryaznova [Tue, 22 Dec 2020 15:19:15 +0000 (18:19 +0300)]
LU-14273 tests: enhance ha.sh to run custom cmd on bg

We need this ability to run lfs migrate in parallel
with mdtest and IOR.

Test-Parameters: trivial
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-9371
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@hpe.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Change-Id: Id52ed0731aba24d3f40813da5fd2bb9b94ae63e5
Reviewed-on: https://review.whamcloud.com/41073
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Andriy Skulysh <askulysh@gmail.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13942 obd: check if sbi->ll_md_exp is initialized 12/39812/6
Artem Blagodarenko [Fri, 21 Aug 2020 17:43:38 +0000 (13:43 -0400)]
LU-13942 obd: check if sbi->ll_md_exp is initialized

A NULL dereference at the start of obd_statfs() is possible because
of a race between ll_fill_super() and lctl.

ll_md_exp is initialized in ll_fill_super()->
client_common_fill_super(), but if the mount process gets stuck in
lustre_process_log() it never reaches client_common_fill_super().

Change-Id: Ife72a62ba42573e2a9c6d244e36cde738b70c15a
hpe-bug-id: LUS-9150
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Reviewed-on: https://es-gerrit.dev.cray.com/157732
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/39812
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-14642 flr: transfer layout version on layout change 72/43472/7
Bobi Jam [Wed, 28 Apr 2021 05:16:05 +0000 (13:16 +0800)]
LU-14642 flr: transfer layout version on layout change

After a layout change (mirror extend/split), the file's layout version
needs to be transferred to the OSTs ASAP so that subsequent IO is not
blocked: OFD checks the layout version stored in the XATTR_NAME_FID
xattr and would find that the layout version from the client IO is
bigger (ofd_verify_layout_version()).

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I353800e868eaf13e3c795926b0d76fb1eb45c535
Reviewed-on: https://review.whamcloud.com/43472
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14645 utils: setstripe cleanup 65/43465/3
Vitaly Fertman [Tue, 27 Apr 2021 19:15:30 +0000 (22:15 +0300)]
LU-14645 utils: setstripe cleanup

lfs setstripe checks stripe parameters differently for PFL and !PFL
layouts. Whereas the PFL layout is checked in comp_args_to_layout()
individually and in llapi_layout_sanity_cb() in pairs, !PFL layout
verification is done partially in several places. Create a common
llapi_stripe_param_verify() for this purpose. Make the checks for
both cases symmetric.

Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I456b1b2e876229ac1a354d4e3879624325856574
HPE-bug-id: LUS-9886
Reviewed-on: https://es-gerrit.dev.cray.com/158589
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-on: https://review.whamcloud.com/43465
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andriy Skulysh <askulysh@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14644 vvp: wait for nrpages to be updated 64/43464/2
Vitaly Fertman [Tue, 27 Apr 2021 18:43:06 +0000 (21:43 +0300)]
LU-14644 vvp: wait for nrpages to be updated

truncate_inode_pages() says a page may still be in the process of
deletion upon return. Wait for the other thread that is running
__delete_from_page_cache() to update nrpages.

Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I165b3d0866efaf2eb7e977520ebba4ee831874ab
HPE-bug-id: LUS-8842
Reviewed-on: https://es-gerrit.dev.cray.com/158557
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-on: https://review.whamcloud.com/43464
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <askulysh@gmail.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14594 ptlrpc: do not match reply with resent RPC 42/43242/4
Vitaly Fertman [Thu, 8 Apr 2021 12:00:11 +0000 (15:00 +0300)]
LU-14594 ptlrpc: do not match reply with resent RPC

The server is able to filter by the connection ID, and drop late
coming RPCs of previous connections, however it does not happen for
replies. At the same time, this is a problem in some cases.

Allocate new matchbits for resends and check replies by them, instead
of xid. Connect RPCs are exceptions due to interop with old server -
at the time of connect we do not know yet if the server supports it.

Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I2aad037002b488b0c3371544ede0c47940f87efe
HPE-bug-id: LUS-9596
Reviewed-on: https://es-gerrit.dev.cray.com/158446
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Reviewed-on: https://review.whamcloud.com/43242
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14673 sec: annotate algorithms taking optional key 56/43656/3
Sebastien Buisson [Tue, 11 May 2021 08:59:03 +0000 (10:59 +0200)]
LU-14673 sec: annotate algorithms taking optional key

Crypto algorithms implementing a ->setkey() method but that can also
be used without a key must set the CRYPTO_ALG_OPTIONAL_KEY flag if
defined in the kernel.
In Lustre, adler32 implementation defines a ->setkey() method, but
its "key" is not actually a cryptographic key.

Linux-commit: a208fa8f33031b9e0aba44c7d1b7e68eb0cbd29e

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I362211d1b1aa3763fe1481cebb3629b255f29e41
Reviewed-on: https://review.whamcloud.com/43656
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-14689 hsm: starting running HSM coordinator should success 20/43720/2
Li Xi [Mon, 17 May 2021 14:49:55 +0000 (22:49 +0800)]
LU-14689 hsm: starting running HSM coordinator should success

When starting a running coordinator, the command should succeed
no matter how many times the command runs.

And this should be the same for stopping a stopped coordinator.
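
A minimal sketch of the intended idempotent behaviour, assuming a
hypothetical demo_coordinator state rather than the real HSM
coordinator structures:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical coordinator state; only the running flag matters here. */
struct demo_coordinator {
	bool running;
};

/* Starting an already running coordinator succeeds and changes nothing. */
static int demo_cdt_start(struct demo_coordinator *cdt)
{
	assert(cdt != NULL);
	if (cdt->running)
		return 0;
	cdt->running = true;
	return 0;
}

/* Stopping an already stopped coordinator succeeds and changes nothing. */
static int demo_cdt_stop(struct demo_coordinator *cdt)
{
	assert(cdt != NULL);
	if (!cdt->running)
		return 0;
	cdt->running = false;
	return 0;
}
```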

Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: I99169de35d6fcc11e03604ac63cdc4358e25b3d2
Reviewed-on: https://review.whamcloud.com/43720
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14549 llite: refresh layout after mirror merge/split 16/43716/3
Bobi Jam [Mon, 17 May 2021 09:14:33 +0000 (17:14 +0800)]
LU-14549 llite: refresh layout after mirror merge/split

mirror merge/split updates file's LOVEA and revokes client's layout
lock, but the client issuing the layout change needs to refresh its
layout (lov->lsm) as well.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I7671efe2fe5354ba0e1503b146045165608e042c
Reviewed-on: https://review.whamcloud.com/43716
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14430 mdd: use own rec_hdr for changelog declare 83/43683/3
Andreas Dilger [Thu, 13 May 2021 00:41:47 +0000 (18:41 -0600)]
LU-14430 mdd: use own rec_hdr for changelog declare

Do not use an lu_buf just to declare the changelog record.  This
only needs llog_rec_hdr to pass in lrh_len, so declaring rec_hdr
on the stack avoids the overhead of using the lu_buf.

Fixes: f3d03bc38a ("LU-14430 mdd: fix inheritance of big default ACLs")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I7b6f1d761aa98aa6ecb023894bde03dce23ebbe5
Reviewed-on: https://review.whamcloud.com/43683
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14681 obclass: fix typo of comment on job ID 67/43667/3
Li Xi [Wed, 12 May 2021 02:21:16 +0000 (10:21 +0800)]
LU-14681 obclass: fix typo of comment on job ID

Comments on how to get job ID are confusing because of the typo.

Test-Parameters: trivial
Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: I9d714323f106dfb76eafc8d70346409b38a9b66b
Reviewed-on: https://review.whamcloud.com/43667
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-14291 lustre: rename tgt_pool_* functions. 24/43624/3
James Simmons [Mon, 10 May 2021 13:43:56 +0000 (09:43 -0400)]
LU-14291 lustre: rename tgt_pool_* functions.

Functions starting with tgt_* represent code for target handling
used by Lustre servers. Now that the pool functions are used by
both clients and servers, rename them to lu_tgt_* to mirror how
lu_tgt_desc_* is used, since both represent remote server targets.

Test-Parameters: trivial
Change-Id: I2a9084d5bf9cea3b373c96e15cba1a41631d1172
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43624
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-14672 kernel: kernel update SLES12 SP5 [4.12.14-122.66.2] 50/43550/2
Jian Yu [Wed, 5 May 2021 18:19:39 +0000 (11:19 -0700)]
LU-14672 kernel: kernel update SLES12 SP5 [4.12.14-122.66.2]

Update SLES12 SP5 kernel to 4.12.14-122.66.2 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 430c 817" testlist=sanity

Change-Id: Ib2bf4795ccb21dbd0bb9202228ff32d73a203eee
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43550
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14671 kernel: kernel update SLES15 SP2 [5.3.18-24.61.1] 49/43549/2
Jian Yu [Wed, 5 May 2021 18:12:24 +0000 (11:12 -0700)]
LU-14671 kernel: kernel update SLES15 SP2 [5.3.18-24.61.1]

Update SLES15 SP2 kernel to 5.3.18-24.61.1 for Lustre client.

Test-Parameters: trivial \
env=SANITY_EXCEPT="100 130 136 817" \
clientdistro=sles15sp2 serverdistro=el7.9 \
testlist=sanity

Change-Id: Ie0aab7cc7200796ed8e4d75862ceaef020943c08
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43549
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14670 kernel: kernel update RHEL7.9 [3.10.0-1160.25.1.el7] 48/43548/3
Jian Yu [Wed, 5 May 2021 17:58:18 +0000 (10:58 -0700)]
LU-14670 kernel: kernel update RHEL7.9 [3.10.0-1160.25.1.el7]

Update RHEL7.9 kernel to 3.10.0-1160.25.1.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: Ic846d648c45476cc4886ce86577605bf3e66d935
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43548
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14622 osd: mark pages accessed on reads 67/43367/6
Alex Zhuravlev [Mon, 19 Apr 2021 06:10:17 +0000 (09:10 +0300)]
LU-14622 osd: mark pages accessed on reads

to improve cache hit ratio

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: If4850465d118ed62e9da105dc0cf144ff5041fd3
Reviewed-on: https://review.whamcloud.com/43367
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13783 ldiskfs: Add support for mainline 5.8 kernel 73/40373/4
Mr NeilBrown [Wed, 21 Oct 2020 02:34:09 +0000 (13:34 +1100)]
LU-13783 ldiskfs: Add support for mainline 5.8 kernel

Various changes needed for 5.8 over 5.4:
 - ext4_mark_inode_dirty is now a macro, so export
     __ext4_mark_inode_dirty instead
 - procfs additions need to use 'struct proc_ops'
 - inode-test.c is a new C file that we MUST NOT build
 - various ordinary conflicts

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I681ab26c60fb35a1ef5f518ee7cac8766e6fde47
Reviewed-on: https://review.whamcloud.com/40373
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13952 quota: default OST Pool Quotas 73/39873/18
Sergey Cheremencev [Thu, 10 Sep 2020 12:48:01 +0000 (15:48 +0300)]
LU-13952 quota: default OST Pool Quotas

This patch adds the ability to set default quota
limits per OST pool.
It also adds sanity-quota_73.

Test-Parameters: testlist=sanity-quota
HPE-bug-id: LUS-9133
Change-Id: I9e49def231aeeed4588e5e3fbcd29fdd62a35855
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/39873
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13852 pcc: don't alloc FID in LLITE for pcc open 68/39568/11
Lai Siyao [Fri, 31 Jul 2020 18:09:06 +0000 (02:09 +0800)]
LU-13852 pcc: don't alloc FID in LLITE for pcc open

ll_lookup_it(IT_OPEN) always allocates a FID on MDT0 for pcc open, but
the open request is sent to the MDT that the name hash points to,
which may differ from the MDT where the FID is located. This
triggers an osp_md_create() assertion because the file is created remotely.

This FID allocation is not necessary, and it can be left to be done
in lmv_intent_open() by LMV layer, because the MDT is chosen in
LMV. Then when it's done, the FID allocated can be used to initialize
PCC inode.

Change assertion in osp_md_create() to error message and return
error.

Update sanity-sec 2a for this.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I3ccea3f9e7cca5083695c71135b9a5805f833b14
Reviewed-on: https://review.whamcloud.com/39568
Reviewed-by: Yingjin Qian <qian@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14004 llite: default lsm update may memory leak 03/40103/5
Lai Siyao [Sun, 27 Sep 2020 06:36:57 +0000 (14:36 +0800)]
LU-14004 llite: default lsm update may memory leak

ll_update_default_lsm_md() should check whether lli_default_lsm_md
is set before setting it to the data from lustre_md, and if it's set,
release the old data to avoid memory leak.
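
The pattern can be sketched in userspace as below. The names are
hypothetical and the data is a plain string copy, which is an
assumption made to keep the example self-contained; the real code
handles lustre_md data, not strings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the inode info holding the default layout. */
struct demo_inode_info {
	char *default_lsm;
};

/* Install new data, releasing any previously set data to avoid a leak. */
static int demo_update_default_lsm(struct demo_inode_info *lli,
				   const char *new_data)
{
	char *copy;

	assert(lli != NULL && new_data != NULL);
	copy = malloc(strlen(new_data) + 1);
	if (copy == NULL)
		return -1;
	strcpy(copy, new_data);

	/* Without this check-and-free, the old buffer would be leaked. */
	if (lli->default_lsm != NULL)
		free(lli->default_lsm);
	lli->default_lsm = copy;
	return 0;
}
```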

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I9c8434c5d62f9fb751788031d6769fd49427c371
Reviewed-on: https://review.whamcloud.com/40103
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14646 flr: write a FLR file downgrade SoM 71/43471/5
Bobi Jam [Wed, 28 Apr 2021 04:43:11 +0000 (12:43 +0800)]
LU-14646 flr: write a FLR file downgrade SoM

Seeking beyond the file size and writing to an FLR file does not
change its SoM, which makes the file size incorrect.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I3075389721bdd40be60e9206c37f6c1bea514cce
Reviewed-on: https://review.whamcloud.com/43471
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14631 quota: fix qunit sort 10/43410/5
Sergey Cheremencev [Mon, 5 Apr 2021 12:27:34 +0000 (15:27 +0300)]
LU-14631 quota: fix qunit sort

Fix lqes_cmp, which is used to sort lqes by qunit. As lqes_cmp returns
an integer, it returns incorrect values if the difference between qunits
is greater than 4GB, causing a write to hang instead of failing with -EDQUOT:

[<ffffffffc0701945>] cl_sync_io_wait+0x295/0x3c0 [obdclass]
[<ffffffffc07026f8>] cl_io_submit_sync+0x1c8/0x360 [obdclass]
[<ffffffffc128dc0a>] vvp_io_commit_sync+0x12a/0x460 [lustre]
[<ffffffffc128f5ee>] vvp_io_write_commit+0x4de/0x620 [lustre]
[<ffffffffc128fa39>] vvp_io_write_start+0x309/0x990 [lustre]
[<ffffffffc0700a18>] cl_io_start+0x68/0x130 [obdclass]
[<ffffffffc0702e8c>] cl_io_loop+0xcc/0x1c0 [obdclass]
[<ffffffffc1243514>] ll_file_io_generic+0x5c4/0xdc0 [lustre]
[<ffffffffc12441b9>] ll_file_aio_write+0x289/0x730 [lustre]
[<ffffffffc1244760>] ll_file_write+0x100/0x1c0 [lustre]
[<ffffffffa0241320>] vfs_write+0xc0/0x1f0
[<ffffffffa024213f>] SyS_write+0x7f/0xf0

The issue occurs if a user hits the block hard limit in a pool (pool
limit 6GB) while the global limit is set to some huge value (53T in my case).

Change the global limit in sanity-quota_1e to check that the system
doesn't hang anymore.
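
The comparator problem described above can be illustrated in userspace.
This is a sketch with hypothetical names (demo_lqe, demo_cmp_*), not the
actual lqes_cmp code: the first comparator narrows a 64-bit difference
to int, the second compares the values directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a quota entry; only the qunit matters here. */
struct demo_lqe {
	uint64_t qunit;
};

/*
 * Problematic form: the 64-bit difference is narrowed to int, so the
 * high bits are lost (the narrowing itself is implementation-defined).
 */
static int demo_cmp_truncating(const struct demo_lqe *a,
			       const struct demo_lqe *b)
{
	assert(a != NULL && b != NULL);
	return (int)(a->qunit - b->qunit);
}

/* Safe form: compare the 64-bit values and return only -1, 0 or 1. */
static int demo_cmp_safe(const struct demo_lqe *a, const struct demo_lqe *b)
{
	assert(a != NULL && b != NULL);
	if (a->qunit > b->qunit)
		return 1;
	if (a->qunit < b->qunit)
		return -1;
	return 0;
}
```

With qunits that differ by exactly 4GiB, the truncating form typically
reports the two entries as equal, while the safe form orders them
correctly.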

HPE-bug-id: LUS-9891
Change-Id: I5a16fd3a40172187bbf35d9a9c9bfeef2ef3a108
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Jenkins Build User <nssreleng@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-on: https://review.whamcloud.com/43410
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14647 flr: mmap write/punch does not stale other mirrors 70/43470/5
Bobi Jam [Wed, 28 Apr 2021 05:07:36 +0000 (13:07 +0800)]
LU-14647 flr: mmap write/punch does not stale other mirrors

mmap write and punch/fallocate do not mark other mirrors stale, which
leaves an FLR file with different content in different mirrors.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I93a3eb5ba898e3bf0ce108718506b742ed485da5
Reviewed-on: https://review.whamcloud.com/43470
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14430 mdd: use own buffer for changelog 72/43672/3
Mikhail Pershin [Wed, 12 May 2021 06:55:29 +0000 (09:55 +0300)]
LU-14430 mdd: use own buffer for changelog

Use a dedicated persistent buffer for changelog needs so that it
does not interfere with the generic big_buf in the MDD thread info,
which may be in use.

Fixes: f3d03bc38a ("LU-14430 mdd: fix inheritance of big default ACLs")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I4f4692b5556eaa98e2e23d7b58c925e33401e4e5
Reviewed-on: https://review.whamcloud.com/43672
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14593 utils: specify libzpool 36/43236/4
Alex Zhuravlev [Thu, 8 Apr 2021 14:07:09 +0000 (17:07 +0300)]
LU-14593 utils: specify libzpool

otherwise libmount_utils_zfs doesn't build on some platforms

Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Id850ba519f7a898e73c099887f0c9eca5c9b5b6f
Reviewed-on: https://review.whamcloud.com/43236
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-12678 o2iblnd: fix bug in list_first_entry() change. 58/43558/2
Mr NeilBrown [Thu, 6 May 2021 00:19:30 +0000 (10:19 +1000)]
LU-12678 o2iblnd: fix bug in list_first_entry() change.

This comparison should be != NULL, else a NULL pointer could be
dereferenced.

Test-Parameters: trivial
Fixes: 34b57a6f8fcd ("LU-12678 lnet: use list_first_entry() in lnet/klnds subdirectory.")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I4510e2e0f2eb7b5bf86626e5ddb5ee537d3fae02
Reviewed-on: https://review.whamcloud.com/43558
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-13783 libcfs: use lsmcontext in security_release_secctx 84/43284/3
Jian Yu [Tue, 4 May 2021 23:55:34 +0000 (16:55 -0700)]
LU-13783 libcfs: use lsmcontext in security_release_secctx

Kernel linux-hwe-5.8 (5.8.0-22.23~20.04.1) introduces
struct lsmcontext and uses it in security_release_secctx(),
which reduces the arguments from 2 to 1.

Change-Id: I37e185493001d335b40ea0a6102db593cb18beb3
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43284
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14526 flr: mirror split downgrade SOM 68/43168/11
Bobi Jam [Tue, 30 Mar 2021 11:20:26 +0000 (19:20 +0800)]
LU-14526 flr: mirror split downgrade SOM

After a mirror split, the file's block count in SoM is not accurate.
This patch downgrades the SoM from STRICT so that size glimpse does
not trust the SoM from the MDS.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I02350c24190d96af93fed8c1b8a0bc6beb2c4bc2
Reviewed-on: https://review.whamcloud.com/43168
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14291 lustre: limit header scope for server only handling 96/43096/12
James Simmons [Sat, 8 May 2021 16:15:33 +0000 (12:15 -0400)]
LU-14291 lustre: limit header scope for server only handling

The lustre headers have server only function declarations and
inline functions. Only include them if HAVE_SERVER_SUPPORT is
set. This gets us closer to building OpenSFS modules against
a Linux kernel with a native Lustre client. Move a few things
from UAPI headers that are used only by kernel space to kernel
only headers where they belong.

In mdc_changelog.c, a debug message accesses a field that is only
available on the server, which is never the case here. Since it is
nonsense, remove the reporting of this field.

Change-Id: I6e2d3bebc121aef97fe69344d496e230f62b28ad
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43096
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14195 lnet: improve compat code for IPV6_V6ONLY sock opt 59/43559/3
Mr NeilBrown [Thu, 6 May 2021 01:27:28 +0000 (11:27 +1000)]
LU-14195 lnet: improve compat code for IPV6_V6ONLY sock opt

As get_fs() and set_fs() are deprecated, using them to call
sock->ops->setsockopt() is not a good solution.
Since linux 5.9 (v5.8-rc4-1952-ga7b75c5a8c41) it has been
possible to pass a "sockptr" to ->setsockopt() which can provide
a kernel address.

Prior to 5.8, kernel_setsockopt() is available and should still be
used.

For 5.8, when neither preferred option is available, we can pass
a NULL pointer which has the same effect as a pointer to zero.

Fixes: 10d99554631b ("LU-13783 lnet: remove kernel_setsockopt() from lnet_sock_listen()")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I78c1f735a73cc9c835371c139e946144c6df5108
Reviewed-on: https://review.whamcloud.com/43559
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14651 uapi: rename CONFIG_T_* to MGS_CFG_T_* 94/43494/5
James Simmons [Tue, 4 May 2021 14:12:05 +0000 (10:12 -0400)]
LU-14651 uapi: rename CONFIG_T_* to MGS_CFG_T_*

The Linux kernel uses CONFIG_* as a way to determine if a feature
is available. Using CONFIG_* in an UAPI is considered an error
and in the most recent kernels will break a build. While we don't
have any CONFIG_* in our UAPI headers we do have CONFIG_T_*
which is used for config logs. This naming confuses the Linux
kernel build system so just rename these variables to MGS_CFG_T_*
instead.

Test-Parameters: trivial
Change-Id: I574510c2da90e9ae608c9e2374d75220b7abb19d
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43494
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14583 llapi: handle symlinks in llapi_file_get_stripe() 29/43229/4
John L. Hammond [Wed, 7 Apr 2021 19:11:25 +0000 (14:11 -0500)]
LU-14583 llapi: handle symlinks in llapi_file_get_stripe()

In llapi_file_get_stripe(), if the IOC_MDC_GETFILESTRIPE ioctl handler
returns -ENOTTY or -ENODATA then try to resolve any symlinks in the
path and try again.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ic046d6ef77d8342d47336144e3066cab3a940a96
Reviewed-on: https://review.whamcloud.com/43229
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14565 ofd: Do not rely on tgd_blockbit 54/43154/23
Arshad Hussain [Mon, 29 Mar 2021 05:22:11 +0000 (10:52 +0530)]
LU-14565 ofd: Do not rely on tgd_blockbit

tgd_blockbit holds the recordsize bits set during mkfs.
Once set, it does not change. However, 'zfs set'
can be used to change the OST blocksize. Instead
of using the cached value of 'tgd_blockbit', always
calculate the blocksize bits, which may have
changed.
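
As a rough illustration, the blocksize bits can be derived from the
current blocksize on each use. demo_blocksize_bits() is a hypothetical
helper and assumes a power-of-two blocksize; it is not the function
used by the OFD code.

```c
#include <assert.h>
#include <stdint.h>

/* Return log2(blocksize) for a power-of-two blocksize, e.g. 4096 -> 12. */
static unsigned int demo_blocksize_bits(uint32_t blocksize)
{
	unsigned int bits = 0;

	/* Precondition (assumed): non-zero power of two. */
	assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);
	while (blocksize > 1) {
		blocksize >>= 1;
		bits++;
	}
	return bits;
}
```

If the recordsize is changed from 128KiB to 1MiB, the computed value
moves from 17 to 20, whereas a value cached at mkfs time would stay
at 17.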

Test-case: sanity/104c added.

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Icc100cca0d5ae492c41d60f0bf97512450f796bc
Reviewed-on: https://review.whamcloud.com/43154
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-10948 llite: Introduce inode open heat counter 58/32158/40
Oleg Drokin [Tue, 13 Apr 2021 07:46:41 +0000 (03:46 -0400)]
LU-10948 llite: Introduce inode open heat counter

Initial framework to support detection of naive apps that
assume open-closes are "free" and proceed to open/close
the same files between minute operations.

We will track the number of file opens per inode and the last time
the inode was closed.

Initially we'll expose these controls:
llite/opencache_threshold_count - enables functionality and controls after how
                                  many opens open lock is requested
llite/opencache_threshold_ms    - if any reopen happens within this time (in
                                  ms), the open would trigger open lock request
llite/opencache_max_ms          - If last close was longer than this many ms
                                  ago - start counting opens from zero again

Once enough useful data is collected we can look into adding a heatmap
or another similar mechanism to better manage it and enable it
by default with sensible settings.
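
One possible reading of the three controls, as a userspace sketch. The
struct and function names are hypothetical, and the way the conditions
are combined is an assumption based only on the descriptions above; the
llite implementation may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the three tunables named in the message. */
struct demo_oc_tunables {
	unsigned int threshold_count;	/* 0 disables the feature */
	uint64_t threshold_ms;
	uint64_t max_ms;
};

/* Hypothetical per-inode state: open count and last close time. */
struct demo_oc_state {
	unsigned int opens;
	uint64_t last_close_ms;
	bool closed_once;
};

static void demo_oc_close(struct demo_oc_state *st, uint64_t now_ms)
{
	assert(st != NULL);
	st->last_close_ms = now_ms;
	st->closed_once = true;
}

/* Returns true if this open should ask for an open lock (assumed policy). */
static bool demo_oc_open(struct demo_oc_state *st,
			 const struct demo_oc_tunables *t, uint64_t now_ms)
{
	uint64_t idle_ms;

	assert(st != NULL && t != NULL);
	if (t->threshold_count == 0)
		return false;

	idle_ms = st->closed_once ? now_ms - st->last_close_ms : 0;

	/* Last close too long ago: start counting opens from zero again. */
	if (st->closed_once && idle_ms > t->max_ms)
		st->opens = 0;
	st->opens++;

	/* Enough opens seen, or a quick reopen after the last close. */
	if (st->opens > t->threshold_count)
		return true;
	return st->closed_once && idle_ms < t->threshold_ms;
}
```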

Change-Id: I1aa5455b458840acad651f651c883a7a7a67ab4c
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/32158
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
2 years agoLU-14627 tests: Create unload_modules_local 25/43425/5
Chris Horn [Fri, 23 Apr 2021 19:05:02 +0000 (14:05 -0500)]
LU-14627 tests: Create unload_modules_local

t-f allows for loading modules on a single node via load_modules_local.
However, there is no corresponding unload_modules_local that can be
called to clean up after a call to load_modules_local, so we create it.
unload_modules() is refactored to use unload_modules_local.

Also address a potential issue that can prevent LND modules from
unloading. Some LNet setup (particularly those in sanity-lnet) may
require that we call lnetctl lnet unconfigure (or lctl net down)
to drop a ref on the module before it can be unloaded.

HPE-bug-id: LUS-9031
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I6458a7728f5f559f8641c5a9e29dd775c8445c38
Reviewed-on: https://review.whamcloud.com/43425
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13783 libcfs: support absence of account_page_dirtied 27/40827/7
Mr NeilBrown [Thu, 29 Apr 2021 13:04:04 +0000 (09:04 -0400)]
LU-13783 libcfs: support absence of account_page_dirtied

Some kernels export neither account_page_dirtied nor
kallsyms_lookup_name.
For these kernels we need to use __set_page_dirty() and suffer the
cost of dropping and reclaiming the page-tree lock.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I69d934480832f3909d3ec103f11e1d62489d70d7
Reviewed-on: https://review.whamcloud.com/40827
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13722 utils: lnetctl discrepancy in YAML output 69/40269/3
Cyril Bordage [Fri, 16 Oct 2020 14:10:28 +0000 (16:10 +0200)]
LU-13722 utils: lnetctl discrepancy in YAML output

Use "max_interfaces" instead of "max_intf" in YAML output/input.

Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: Id0d1d1a4220d817b238456946e28b985e1fdc80c
Reviewed-on: https://review.whamcloud.com/40269
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-13912 lnet: Correct the router ping interval calculation 94/39694/7
Chris Horn [Mon, 17 Aug 2020 21:02:10 +0000 (16:02 -0500)]
LU-13912 lnet: Correct the router ping interval calculation

The router ping interval is being divided by the number of local nets
which results in sending pings more frequently than defined by the
alive_router_check_interval. In addition, the current code is structured
such that we may not find a peer net in need of a ping until after
inspecting the router list multiple times. Re-work the code so that the
loop that inspects a router's peer nets will look at all of them until
it either loops back around the list or it finds one that actually
needs to be pinged.

We also move the check of LNET_PEER_RTR_DISCOVERY so that we avoid the
work of inspecting the router's peer nets if the router is already being
discovered.
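The reworked scan can be sketched as a single circular pass over a
router's peer nets. This is an illustration only: the array of
last-ping timestamps and the helper name are hypothetical, and the
real code walks LNet's own lists rather than an array:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: scan n peer nets circularly, beginning at 'start',
 * and return the index of the first one whose last ping is at least
 * 'interval' old. Return -1 after one full lap with nothing due.
 */
static int ex_next_net_to_ping(const uint64_t *last_ping, size_t n,
			       size_t start, uint64_t now, uint64_t interval)
{
	size_t i;

	if (n == 0)
		return -1;
	assert(start < n);

	for (i = 0; i < n; i++) {
		size_t idx = (start + i) % n;

		/* Timestamps are assumed never to be in the future. */
		assert(now >= last_ping[idx]);
		/* The full interval applies to each net; it is not divided. */
		if (now - last_ping[idx] >= interval)
			return (int)idx;
	}
	return -1;
}
```

One call inspects every peer net at most once, so a net that needs a
ping is found without revisiting the router list several times.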

Test-Parameters: trivial
HPE-bug-id: LUS-9237
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I5a4733002f29c0ade6aee62b4424313d5d245556
Reviewed-on: https://review.whamcloud.com/39694
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14641 osd-ldiskfs: write commit declaring improvement 46/43446/7
Wang Shilong [Mon, 26 Apr 2021 03:23:26 +0000 (11:23 +0800)]
LU-14641 osd-ldiskfs: write commit declaring improvement

This patch tries to:

1) Fix the extent bytes value, which could fail to increase when
it was less than 1M: compare it with the current value, and decay
it on every allocation.

2) As filesystem space usage grows, the mballoc code no longer
tries its best to scan block groups for the best-aligned free
extent. The extent bytes per extent could therefore decay to a
very small value, which makes us reserve too many credits.
We can be more optimistic in the credit reservations: even
when the filesystem is nearly full, it is extremely
unlikely that the worst case would ever be hit.

3) Add extent bytes stats and debugging support to analyze the
over-reservation problem.

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I357c4a855147ba26a9e9bbe9ab1269bcfd44e5f3
Reviewed-on: https://review.whamcloud.com/43446
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14603 ptlrpc: quiet messages for unsupported opcodes 57/43257/3
Andreas Dilger [Sun, 11 Apr 2021 02:04:30 +0000 (20:04 -0600)]
LU-14603 ptlrpc: quiet messages for unsupported opcodes

Reduce message spew for unhandled RPC opcodes.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I35496168e3aa29ecb06076654ef0aa97ba2540e5
Reviewed-on: https://review.whamcloud.com/43257
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14669 tests: reduce time spent in sanity-sec 43/43543/4
Sebastien Buisson [Wed, 5 May 2021 13:01:15 +0000 (15:01 +0200)]
LU-14669 tests: reduce time spent in sanity-sec

Several tests in sanity-sec loop over a number of nodemaps (16),
a number of NID ranges (3), a number of IP addresses (6) and a
number of mapped IDs (10).
While the benefits for test coverage of such a large number of
combinations are not clear, looping over all of them certainly takes
a lot of time.
So arbitrarily reduce that to smaller numbers in case SLOW!=yes:
- 3 nodemaps;
- 2 NID ranges;
- 2 IP addresses;
- 3 mapped IDs.
This reduces the test time for sanity-sec from ~12000s to ~7500s.

Similarly, test_fops function, used by tests 16,17,18,19,20,21,22,
loops over a number of users (4), a number of mapped IDs (6), a
number of permission bit masks (4), and for each client.
Arbitrarily reduce that to smaller numbers in case SLOW!=yes, to
save test time without degrading test coverage:
- 2 users;
- 4 mapped IDs;
- 2 permission bit masks;
- from only one client.
This further reduces the test time from ~7500s to ~5000s.

Test-Parameters: trivial
Test-Parameters: mdscount=2 mdtcount=4 osscount=1 ostcount=8 clientcount=2 clientdistro=el8.3 serverdistro=el8.3 testlist=sanity-sec
Test-Parameters: mdscount=2 mdtcount=4 osscount=1 ostcount=8 clientcount=2 clientdistro=el8.3 serverdistro=el8.3 testlist=sanity-sec env=SLOW=yes
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Idbe9c8daca43feafe7ca6481902cf6b54e3fa87c
Reviewed-on: https://review.whamcloud.com/43543
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13439 lmv: qos stay on current MDT if less full 45/43445/9
Andreas Dilger [Sun, 25 Apr 2021 11:02:19 +0000 (05:02 -0600)]
LU-13439 lmv: qos stay on current MDT if less full

Keep "space balanced" subdirectories on the parent MDT if it is less
full than average, since it doesn't make sense to select another MDT
which may occasionally be *more* full.  This also reduces random
"MDT jumping" and needless remote directories.

Reduce the QOS threshold for space balanced LMV layouts, so that the
MDTs don't become too imbalanced before trying to fix the problem.

Change the LUSTRE_OP_MKDIR opcode to be 1 instead of 0, so it can
be seen that a valid opcode has been stored into the structure.
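A rough sketch of the "stay on the parent MDT" test from the first
paragraph, assuming fullness is expressed as a used-space percentage
per MDT. The helper name and the input array are hypothetical, and the
actual QOS code works on weighted statistics rather than plain
percentages:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if the parent's MDT is strictly less full than the
 * average of all MDTs, in which case a new subdirectory stays local.
 */
static bool ex_stay_on_parent(const uint32_t *used_pct, size_t n,
			      size_t parent)
{
	uint64_t sum = 0;
	size_t i;

	assert(n > 0 && parent < n);
	for (i = 0; i < n; i++)
		sum += used_pct[i];

	/* used_pct[parent] < sum / n, written without integer division. */
	return (uint64_t)used_pct[parent] * n < sum;
}
```

For MDTs that are 10%, 50% and 90% full, only a parent on the first
MDT keeps its subdirectories local.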

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iab34c7eade03d761aa16b08f409f7e5d69cd70bd
Reviewed-on: https://review.whamcloud.com/43445
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13440 lmv: add default LMV inherit depth 31/43131/13
Lai Siyao [Mon, 15 Mar 2021 03:57:36 +0000 (11:57 +0800)]
LU-13440 lmv: add default LMV inherit depth

A new field "__u8 lum_max_inherit" is added into struct lmv_user_md,
which represents the inherit depth of default LMV. It will be
decreased by 1 for subdirectories.

The valid value of lum_max_inherit is [0, 255]:
* 0 means unlimited inherit.
* 1 means inherit end.
* 250 is the max inherit depth.
* [251, 254] are reserved.
* 255 means it's not set.

A new field "__u8 lum_max_inherit_rr" is added. If the default stripe
offset is -1, lum_max_inherit_rr is non-zero, and the system is
balanced, new directories are created in a round-robin manner;
otherwise they are created on the MDT where their parents are located,
to avoid creating remote directories. Similarly, this value will be
decreased by 1 for each level of subdirectories.

The valid value of lum_max_inherit_rr is different:
* 0 means not set.
* 1 means inherit end.
* 250 is the max inherit depth.
* [251, 254] are reserved.
* 255 means unlimited inherit.

However for the user interface of "lfs", the valid value is [-1, 250]:
* -1 means unlimited inherit.
* 0 means not set.
* others are the same.
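The value tables above can be condensed into a small sketch. The
EX_* names and helper functions are illustrative, not the ones in the
Lustre headers, and the handling of "inherit end" in
ex_max_inherit_next() (the child gets "not set") is an assumption:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names for the values listed above. */
enum {
	EX_INHERIT_UNLIMITED	= 0,
	EX_INHERIT_END		= 1,
	EX_INHERIT_MAX		= 250,
	EX_INHERIT_NONE		= 255,
	EX_INHERIT_RR_NONE	= 0,
	EX_INHERIT_RR_UNLIMITED	= 255,
};

/* Map an lfs-style value in [-1, 250] to lum_max_inherit; -1 on error. */
static int ex_user_to_max_inherit(int user)
{
	if (user < -1 || user > EX_INHERIT_MAX)
		return -1;
	if (user == -1)
		return EX_INHERIT_UNLIMITED;
	if (user == 0)
		return EX_INHERIT_NONE;
	return user;
}

/* Same mapping for lum_max_inherit_rr, where 0 already means "not set". */
static int ex_user_to_max_inherit_rr(int user)
{
	if (user < -1 || user > EX_INHERIT_MAX)
		return -1;
	if (user == -1)
		return EX_INHERIT_RR_UNLIMITED;
	return user;
}

/* Value a subdirectory would receive from a parent's lum_max_inherit. */
static uint8_t ex_max_inherit_next(uint8_t parent)
{
	/* Reserved values [251, 254] are not expected here. */
	assert(parent <= EX_INHERIT_MAX || parent == EX_INHERIT_NONE);

	if (parent == EX_INHERIT_UNLIMITED || parent == EX_INHERIT_NONE)
		return parent;
	/* Assumption: once the depth is used up, the child gets "not set". */
	if (parent == EX_INHERIT_END)
		return EX_INHERIT_NONE;
	return (uint8_t)(parent - 1);
}
```

Note that the two fields swap the meaning of 0 and 255, which is why
the lfs interface hides both behind -1 (unlimited) and 0 (not set).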

Add sanity 413c.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I98ccad8556a0469f83bd7d79f5086a2184d5b115
Reviewed-on: https://review.whamcloud.com/43131
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
2 years agoLU-13440 obdclass: server qos penalty miscaculated 85/43385/2
Lai Siyao [Wed, 21 Apr 2021 12:05:52 +0000 (20:05 +0800)]
LU-13440 obdclass: server qos penalty miscaculated

The server QoS penalty calculation uses the active target count, but
it should use the server count. Using the target count makes the
penalty larger than expected, so target weights are often 0, and as a
result MDT0 is often chosen in QoS allocation.

Fixes: 45222b2ef ("LU-12624 obdclass: lu_tgt_descs cleanup")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I1982363e4ff74c7344dd5e07d04e29214afa8a7f
Reviewed-on: https://review.whamcloud.com/43385
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
2 years agoLU-14628 ptlrpc: remove might_sleep() in sptlrpc_gc_del_sec() 97/43397/3
Nikitas Angelinas [Thu, 15 Apr 2021 19:09:16 +0000 (12:09 -0700)]
LU-14628 ptlrpc: remove might_sleep() in sptlrpc_gc_del_sec()

sptlrpc_gc_del_sec() calls mutex_lock() which calls might_sleep(), so
the explicit might_sleep() call can be removed as redundant.

Signed-off-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
Test-Parameters: trivial
Change-Id: I48714fae12e63ba5e37ec0f9aa3ab7f688b9475d
Reviewed-on: https://review.whamcloud.com/43397
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-12678 lnet: use list_first_entry() in lnet/klnds subdirectory. 19/43419/2
Mr. NeilBrown [Thu, 22 Apr 2021 18:27:38 +0000 (14:27 -0400)]
LU-12678 lnet: use list_first_entry() in lnet/klnds subdirectory.

Convert
  list_entry(foo->next .....)
to
  list_first_entry(foo, ....)

in lnet/klnds.

In several cases the call is combined with a list_empty() test, so
list_first_entry_or_null() is used instead.
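The conversion can be illustrated outside the kernel with simplified
userspace stand-ins for the <linux/list.h> macros. The definitions
below are reduced versions written for this example (the kernel's
list_first_entry_or_null() evaluates its argument only once), and
struct ex_conn is hypothetical:

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *head)
{
	head->next = head->prev = head;
}

static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	assert(item != NULL && head != NULL);
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Simplified stand-ins for the kernel macros of the same name. */
#define list_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))
#define list_first_entry(head, type, member) \
	list_entry((head)->next, type, member)
#define list_first_entry_or_null(head, type, member) \
	(list_empty(head) ? NULL : list_first_entry(head, type, member))

/* Hypothetical element type for the example. */
struct ex_conn {
	int			id;
	struct list_head	link;
};

/* Old style: explicit emptiness test, then list_entry() on ->next. */
static struct ex_conn *ex_peek_old(struct list_head *q)
{
	if (list_empty(q))
		return NULL;
	return list_entry(q->next, struct ex_conn, link);
}

/* New style: the same result in one call. */
static struct ex_conn *ex_peek_new(struct list_head *q)
{
	return list_first_entry_or_null(q, struct ex_conn, link);
}
```

Both helpers return the same element; the second form makes the intent
explicit and drops the open-coded ->next dereference.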

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Change-Id: I3b2b33c3c9284c02e44610614d64a1f84be300a4
Reviewed-on: https://review.whamcloud.com/43419
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14632 tests: fix sanity-hsm test_606() 09/43409/3
Elena Gryaznova [Thu, 22 Apr 2021 14:15:42 +0000 (17:15 +0300)]
LU-14632 tests: fix sanity-hsm test_606()

After check_and_setup_lustre(), mds1_dev is equal to
/dev/mapper/mds1_flakey, which stop() unexports back to the saved
real device (/dev/vdc):
  stop mds1
    elif dm_flakey_supported mds1; then
        dm_cleanup_dev mds1
          unexport_dm_dev mds1
As a result, stack_trap() is called with the nonexistent
/dev/mapper/mds1_flakey:
  stack_trap 'stop mds1; start mds1 /dev/mapper/mds1_flakey
              -o rw,user_xattr' EXIT
and fails with:
  losetup: /dev/mapper/mds1_flakey: failed to set up loop device:
              No such file or directory
Reproducer:
  run ONLY=606 sh sanity-hsm.sh on "failover" setup (mds1_HOST !=
  mds1failover_HOST), no llmount.sh before the test

Fixes: 54b9e3f78935 ("LU-684 tests: replace dev_read_only patch with dm-flakey")
Test-Parameters: trivial testlist=sanity-hsm env=ONLY=606
Test-Parameters: fstype=zfs testlist=sanity-hsm env=ONLY=606
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-9920
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Change-Id: I9ab3cbcf67c6fd046861810a2ceab262f211436b
Reviewed-on: https://review.whamcloud.com/43409
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13717 sec: rework includes for client encryption 86/43386/3
Sebastien Buisson [Tue, 23 Mar 2021 14:19:01 +0000 (14:19 +0000)]
LU-13717 sec: rework includes for client encryption

Simplify includes for crypto, by not repeating stubs in case
HAVE_LUSTRE_CRYPTO is not defined.
Expose encoding routines that are going to be used in the Lustre
code (both client and server sides) with filename encryption.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I6c5853d6da7120edd2bec3a12494251d873151a8
Reviewed-on: https://review.whamcloud.com/43386
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14616 readahead: fix reserving for unaliged read 77/43377/7
Wang Shilong [Tue, 20 Apr 2021 01:47:25 +0000 (09:47 +0800)]
LU-14616 readahead: fix reserving for unaliged read

If the read is [2K, 3K] on an x86 platform, we only need to
read one page, but it was calculated as 2 pages.

This is a problem because we reserve more page credits
than needed, while vvp_page_completion_read() only
frees the pages actually read, which causes @ra_cur_pages
to leak.
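The page arithmetic behind the [2K, 3K] example can be checked with a
short helper. This is a sketch, not the llite code: the function name
is made up, the range is treated as half-open [start, end), and the
page size is passed in as a shift (12 for 4 KiB pages):

```c
#include <assert.h>
#include <stdint.h>

/* Number of pages touched by the byte range [start, end). */
static uint64_t ex_pages_spanned(uint64_t start, uint64_t end,
				 unsigned int page_shift)
{
	assert(end > start);
	/* Index of the last page touched minus index of the first, plus 1. */
	return ((end - 1) >> page_shift) - (start >> page_shift) + 1;
}
```

With 4 KiB pages, a read of [2048, 3072) lies entirely inside page 0,
so one page is enough; a read of [4095, 4097) straddles a page boundary
and needs two.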

Fixes: d4a54de84c0 ("LU-12367 llite: Fix page count for unaligned reads")
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I3cf03965196c1af833184d9cfc16779f79f5722c
Reviewed-on: https://review.whamcloud.com/43377
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14621 mdd: fix lock-tx order in orphan cleanup 62/43362/7
Alex Zhuravlev [Sun, 18 Apr 2021 13:52:25 +0000 (16:52 +0300)]
LU-14621 mdd: fix lock-tx order in orphan cleanup

Start the transaction and then take locks.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Iba4be17df55459142baa7585f47231f1b72ebf0b
Reviewed-on: https://review.whamcloud.com/43362
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14616 readahead: export pages directly without RA 38/43338/3
Wang Shilong [Fri, 16 Apr 2021 02:04:17 +0000 (10:04 +0800)]
LU-14616 readahead: export pages directly without RA

With readahead disabled, @vpg_defer_uptodate should not
be set, as we don't reserve credits for such a read.
Otherwise, vvp_page_completion_read() will call ll_ra_count_put(),
which makes @ra_cur_pages negative.

Fixes: 7e8efb339b ("LU-12043 llite: fix to submit complete read block with ra disabled")
Change-Id: I1c9134f5972aa0d0e7aac998f02c690cc55b433b
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/43338
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14612 utils: Add functions llapi_group_lock64/unlock64() 10/43310/8
Emoly Liu [Sun, 25 Apr 2021 00:48:42 +0000 (08:48 +0800)]
LU-14612 utils: Add functions llapi_group_lock64/unlock64()

Add new functions llapi_group_lock64/unlock64() for 64-bit usage.

Test-Parameters: trivial
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: Id79ae634ba6a787974be481b51743604c4adb536
Reviewed-on: https://review.whamcloud.com/43310
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-14604 kernel: kernel update RHEL8.3 [4.18.0-240.22.1.el8_3] 58/43258/3
Jian Yu [Tue, 13 Apr 2021 23:41:31 +0000 (16:41 -0700)]
LU-14604 kernel: kernel update RHEL8.3 [4.18.0-240.22.1.el8_3]

Update RHEL8.3 kernel to 4.18.0-240.22.1.el8_3.

Test-Parameters: trivial \
clientdistro=el8.3 serverdistro=el8.3 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.3 serverdistro=el8.3 testlist=sanity

Change-Id: I1a3152d95822a74e05f9b44f590a6cdb1f8b02b6
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43258
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14599 osp: limit allocation at osp_sync_process_committed 50/43250/5
Alexander Boyko [Fri, 9 Apr 2021 12:16:55 +0000 (08:16 -0400)]
LU-14599 osp: limit allocation at osp_sync_process_committed

Sometimes osp cancels a very large cookie list with 64K elements.
In this case osp_sync_process_committed() tries to allocate 64 pages
and uses vmalloc.
The fix limits the memory allocation size to 4 pages with kmalloc, and
reuses it in a loop.
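The "small buffer reused in a loop" pattern can be sketched in
userspace as follows. All names are hypothetical, malloc() stands in
for kmalloc, and the 4-byte element size is inferred from "64K
elements" needing 64 pages of 4 KiB:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define EX_PAGE_SIZE	4096u
#define EX_MAX_PAGES	4u
/* Elements that fit in the bounded buffer: 4 pages of 4-byte cookies. */
#define EX_CHUNK	(EX_MAX_PAGES * EX_PAGE_SIZE / sizeof(uint32_t))

typedef void (*ex_batch_fn)(const uint32_t *batch, size_t count, void *arg);

/* Example callback: add up how many elements were handed over. */
static void ex_count_batch(const uint32_t *batch, size_t count, void *arg)
{
	(void)batch;
	*(size_t *)arg += count;
}

/*
 * Process 'total' cookies through one fixed-size buffer.
 * Returns the number of batches issued, or -1 if allocation fails.
 */
static int ex_cancel_in_chunks(const uint32_t *cookies, size_t total,
			       ex_batch_fn fn, void *arg)
{
	uint32_t *buf;
	size_t done = 0;
	int batches = 0;

	assert(fn != NULL);
	buf = malloc(EX_CHUNK * sizeof(*buf));
	if (buf == NULL)
		return -1;

	while (done < total) {
		size_t count = total - done;

		if (count > EX_CHUNK)
			count = EX_CHUNK;
		/* Refill the same buffer instead of allocating a larger one. */
		memcpy(buf, cookies + done, count * sizeof(*buf));
		fn(buf, count, arg);
		done += count;
		batches++;
	}
	free(buf);
	return batches;
}
```

Under these assumptions a 64K-element list is handled in 16 batches of
4096 elements, and the allocation never exceeds 4 pages.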

HPE-bug-id: LUS-9815
Fixes: 6d7332102 ("LU-11924 osp: combine llog cancel operations")
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Ic875335a28f78494fdb3cbc4b0145e5a43831ee8
Reviewed-on: https://review.whamcloud.com/43250
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14553 changelog: eliminate mdd_changelog_clear warning 25/43125/4
Olaf Faaland [Thu, 25 Mar 2021 01:35:10 +0000 (18:35 -0700)]
LU-14553 changelog: eliminate mdd_changelog_clear warning

When handling a changelog_clear request, the user may specify a
range of indices which do not exist.  Similarly, the user may
specify a changelog user which does not exist.  Neither indicates
a problem within Lustre that justifies a console warning.

Change those cases to CDEBUG.

Test-Parameters: trivial
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: I64bab12ef4978c4bf7139f5f36a39f9b109616fb
Reviewed-on: https://review.whamcloud.com/43125
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14537 mdd: directory migrate skips project ID check 10/42110/3
Lai Siyao [Sat, 13 Mar 2021 15:48:54 +0000 (23:48 +0800)]
LU-14537 mdd: directory migrate skips project ID check

mdd_migrate_sanity_check() used to call mdd_rename_sanity_check(),
while the latter checks parent and sub file project ID, which is
redundant for migration because it's an internal layout change.

Add sanity 230t.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: If5ac2131acb1dfb30a312dc34052287776f581c7
Reviewed-on: https://review.whamcloud.com/42110
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14359 hsm: support a flatter HSM archive format 12/41312/6
John L. Hammond [Fri, 22 Jan 2021 16:56:06 +0000 (10:56 -0600)]
LU-14359 hsm: support a flatter HSM archive format

Add versioning (v1 and v2) to the HSM archive format (directory
layout):
  v1: (oid & 0xffff)/-/-/-/-/-/FID
  v2: ((oid ^ seq) & 0xffff)/FID

v1 is the original layout and the default. v2 is the new layout which
should be selected for new installs.
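A sketch of how a v2 archive path could be derived from a FID. The
bucket expression follows the v2 formula above; the "%04x" directory
name and the "0xseq:0xoid:0xver" rendering of the FID are assumptions
made for this example, as are the struct and function names:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical FID layout for the example. */
struct ex_fid {
	uint64_t	seq;
	uint32_t	oid;
	uint32_t	ver;
};

/* v2 bucket: (oid ^ seq) & 0xffff */
static unsigned int ex_hsm_v2_bucket(const struct ex_fid *fid)
{
	return (unsigned int)((fid->oid ^ fid->seq) & 0xffff);
}

/*
 * Write "<bucket>/<fid>" into buf and return the length snprintf()
 * reports. The textual formats are assumptions, see the lead-in.
 */
static int ex_hsm_v2_path(char *buf, size_t len, const struct ex_fid *fid)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len,
			"%04x/0x%" PRIx64 ":0x%" PRIx32 ":0x%" PRIx32,
			ex_hsm_v2_bucket(fid), fid->seq, fid->oid, fid->ver);
}
```

Compared with v1, the path has a single directory level, and mixing
seq into the bucket spreads FIDs with the same oid across directories.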

Add an option --archive-format to select the archive format.

Add YAML configuration file support to lhsmtool_posix with properties
archive_format and archive_path. Add an option --config to set the
config file.

Adapt sanity-hsm and test-framework to allow testing of both archive
formats.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I6d6bd0c8817a491848b554fa76078d876549cc1f
Reviewed-on: https://review.whamcloud.com/41312
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14357 fid: simplify locking for fid updates 99/41299/2
Mr NeilBrown [Fri, 22 Jan 2021 02:42:33 +0000 (13:42 +1100)]
LU-14357 fid: simplify locking for fid updates

'struct lu_client_seq' contains a mutex (lcs_mutex) and a second
open-coded mutex (lcs_waitq, lcs_update).  Both of these are used in
getting a new fid, possibly from the server.

lcs_mutex is the main mutex which protects the local variables.  If an
RPC to the server is required, the extra mutex is held during that
RPC.

This was apparently intended to avoid some deadlock, presumably with
seq_client_flush().  However as seq_client_flush() now takes both
mutexes as well, it is still prone to any such deadlock, but does not
appear to actually suffer from one.
See:
  Commit 23e2a370c8a3 ("b=24255 move seq_client_alloc_seq out of
   lcs_sem")
  Commit d1feb5c774d4 ("LU-662 fix conflict between seq_client_flush
   and seq_client_alloc_fid")

for some of the history.

The extra open-coded mutex appears to provide no value, so let's
remove it.

As part of this, seq_fid_alloc_fini() is open-coded into the two call
sites, which adds further simplification.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ia39eca7d925c9d49fbf942923de8af79dba4f6bf
Reviewed-on: https://review.whamcloud.com/41299
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-12815 socklnd: add conns_per_peer parameter 56/41056/9
Serguei Smirnov [Tue, 30 Mar 2021 16:58:57 +0000 (12:58 -0400)]
LU-12815 socklnd: add conns_per_peer parameter

Introduce conns_per_peer ksocklnd module parameter.
In typed mode, this parameter shall control
the number of BULK_IN and BULK_OUT tcp connections,
while the number of CONTROL connections shall stay
at 1. In untyped mode, this parameter shall control
the number of untyped connections.
The default conns_per_peer is 1. Max is 127.
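The resulting connection count per peer can be worked out as below.
This is only arithmetic on the description above: the helper name is
made up, and the lower bound of 1 is an assumption (the message states
only the default and the maximum):

```c
#include <assert.h>
#include <stdbool.h>

#define EX_CONNS_PER_PEER_MAX	127

/* Total TCP connections to one peer, or -1 for an out-of-range value. */
static int ex_total_conns(bool typed, int conns_per_peer)
{
	int total;

	if (conns_per_peer < 1 || conns_per_peer > EX_CONNS_PER_PEER_MAX)
		return -1;

	if (typed)
		/* One CONTROL plus conns_per_peer each of BULK_IN/BULK_OUT. */
		total = 1 + 2 * conns_per_peer;
	else
		total = conns_per_peer;

	assert(total >= conns_per_peer);
	return total;
}
```

With the default of 1, typed mode keeps 3 connections per peer;
raising the parameter to 4 gives 9 in typed mode and 4 in untyped mode.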

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I70bbaf7899ae1fbc41de34553c8c4ad1c7d55f7e
Reviewed-on: https://review.whamcloud.com/41056
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13781 lnet: Local NI must be on same net as next-hop 52/39352/5
Chris Horn [Sun, 12 Jul 2020 15:47:55 +0000 (10:47 -0500)]
LU-13781 lnet: Local NI must be on same net as next-hop

When sending to a remote peer we need to restrict our selection of a
local NI to those on the same peer net as the next-hop.

The code currently selects a local NI on the peer net specified by the
lr_lnet field of the lnet_route returned by lnet_find_route_locked().
However, lnet_find_route_locked() may select a next-hop peer NI on any
local peer net - not just lr_lnet.

A redundant assignment to sd->sd_msg->msg_src_nid_param is also
removed. That variable is always set appropriately in
lnet_select_pathway().

Test-Parameters: trivial
HPE-bug-id: LUS-9095
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: If1bec26d6646b9e66b99656d7db2dc538d631a34
Reviewed-on: https://review.whamcloud.com/39352
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13107 utils: clean up lctl command usage 08/37108/4
Andreas Dilger [Sat, 28 Dec 2019 09:42:54 +0000 (02:42 -0700)]
LU-13107 utils: clean up lctl command usage

The lctl usage is confusing because it lists a number of valid
commands after "testing (DANGEROUS)", such as LFSCK and llog.

Move the useful commands before the "testing" section so it is
not mis-interpreted as all following commands are dangerous.
Group some other commands together with more related commands,
rather than whatever order they happened to be implemented in.

Remove function prototypes for commands that no longer exist.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I469f9c92953762cc46a68e44238c4b67ebacab07
Reviewed-on: https://review.whamcloud.com/37108
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-11085 llite: reimplement range_lock with Linux interval_tree 26/39726/8
Mr NeilBrown [Tue, 25 Aug 2020 01:48:34 +0000 (11:48 +1000)]
LU-11085 llite: reimplement range_lock with Linux interval_tree

As a step towards removing the lustre interval-tree implementation,
reimplement range_lock to use Linux interval trees.

As Linux interval trees allow the same interval to be stored twice,
this allows the removal of the rl_next_lock list and associated code.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I1d1669345fac0945e0e189b87a74ca8e7582e842
Reviewed-on: https://review.whamcloud.com/39726
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14618 lov: correctly handling sub-lock init failure 45/43345/2
Bobi Jam [Fri, 16 Apr 2021 15:56:01 +0000 (23:56 +0800)]
LU-14618 lov: correctly handling sub-lock init failure

In lov_lock_sub_init(), if a sublock initialization fails, it needs to
bail out of the outer loop as well as the inner one.
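The control-flow issue can be shown with a generic nested loop. The
function and callbacks below are hypothetical and do not reproduce
lov_lock_sub_init(); they only illustrate why a plain 'break' is not
enough when the failure happens in the inner loop:

```c
#include <assert.h>
#include <stddef.h>

typedef int (*ex_init_fn)(int entry, int stripe);

/* Example callbacks: one always succeeds, one fails at entry 1, stripe 2. */
static int ex_always_ok(int entry, int stripe)
{
	(void)entry;
	(void)stripe;
	return 0;
}

static int ex_fail_at_1_2(int entry, int stripe)
{
	return (entry == 1 && stripe == 2) ? -12 : 0;
}

/*
 * Initialize entries x stripes sub-objects. On the first failure, stop
 * both loops and return the error; *inited counts the successes.
 */
static int ex_init_sublocks(int entries, int stripes, ex_init_fn init,
			    int *inited)
{
	int rc = 0;
	int i, j;

	assert(init != NULL && inited != NULL);
	*inited = 0;

	for (i = 0; i < entries; i++) {
		for (j = 0; j < stripes; j++) {
			rc = init(i, j);
			/* 'break' here would only leave the inner loop. */
			if (rc != 0)
				goto out;
			(*inited)++;
		}
	}
out:
	return rc;
}
```

With 3 entries of 4 stripes and a failure at (1, 2), the function
stops after 6 successful initializations and returns the error instead
of carrying on with the next entry.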

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ic4e16f484a0a64c670eea5d47054bac19bc95144
Reviewed-on: https://review.whamcloud.com/43345
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>