Whamcloud - gitweb
fs/lustre-release.git
Andreas Dilger [Thu, 20 Jan 2022 03:02:00 +0000 (19:02 -0800)]
LU-14160 tests: fix fsx logdump to fit in 80 chars

Fix fsx logdump fallocate/truncate lines to fit within 80 columns.
Remove spurious leading 0 for every operation length.

Lustre-change: https://review.whamcloud.com/44510
Lustre-commit: e7897043c6eff00593123f2b43075bebeb50934f

Test-Parameters: trivial testlist=sanityn env=ONLY=16
Fixes: cb037f305c64 ("LU-14160 fallocate: Add punch mode to fallocate")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I93460b62be8611926e620241232d886dee3ebbe5
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/46222
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Arshad Hussain [Thu, 20 Jan 2022 02:58:30 +0000 (18:58 -0800)]
LU-14160 fallocate: Add punch mode to fallocate

This patch adds fallocate(2) punch operation
(FALLOC_FL_PUNCH_HOLE) mode support for the ldiskfs backend
OSD and for OSC/OST.

Test cases sanity/150{f,g} are added for verification.
The FSX test was modified to:
 - add the 'punch' operation to the output
 - fix a 'No space' problem when the fallocate length becomes negative
 - fix the wrong byte count in the output

Lustre-change: https://review.whamcloud.com/40877
Lustre-commit: cb037f305c64cd5121fa308afc1e6b7d3df3f61a

Test-Parameters: testlist=sanity ostsizegb=12 env=ONLY="150f 150g"
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I0c180d413efdf995823e25d5c340013bec0c8611
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/46221
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Mikhail Pershin [Thu, 20 Jan 2022 02:53:48 +0000 (18:53 -0800)]
LU-14433 llite: do fallocate() size checks under lock

Check the fallocate() range against the file size in
vvp_io_setattr_start() instead of ll_fallocate(), so the inode size
cannot be changed by a pending write or truncate. This implies that IO
is already initialized and requires changes in LOV to update sub-IOs
with the proper inode size and valid size attribute values.

Also fix vvp_io_setattr_lock() so that it does not include
fallocate_end in the lock range.

Lustre-change: https://review.whamcloud.com/41668
Lustre-commit: f23ac22c4c79750fed6b05ddbe460bfc9b0f0ea5

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I8c1d295464be24d6638005bc9d46cff50656cf11
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46220
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
James Simmons [Thu, 20 Jan 2022 01:47:11 +0000 (17:47 -0800)]
LU-13783 procfs: fix improper prop_ops fields

The lod pool and nodemap proc_ops missed renaming the fields to
start with .proc_*. On newer distros like Ubuntu 20.04 HWE you
get the following compile error:

lustre-release/lustre/ptlrpc/nodemap_lproc.c:686:3: error: ‘const struct proc_ops’ has no member named ‘open’
  686 |  .open   = nodemap_ranges_open,

Lustre-change: https://review.whamcloud.com/43880
Lustre-commit: d106dfc1458702865118e73bfcdfc2ec2676a7d6

Test-Parameters: trivial
Fixes: 13cd0f9f667 ("LU-13344 libcfs: Abstract proc_fs with proc_ops")
Signed-off-by: James Simmons <jsimmons@infradead.org>
Change-Id: I5fff7519a801f585690d468255f7ca6c73adcc90
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/46219
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
James Simmons [Thu, 20 Jan 2022 01:18:20 +0000 (17:18 -0800)]
LU-14093 tests: silence gcc10 error for badarea_io

With gcc10 badarea_io will fail to build with the following error.

badarea_io.c: In function 'main':
badarea_io.c:59:7: error: 'write' reading 2097152 bytes from a
                           region of size 4 [-Werror=stringop-overflow=]
   59 |  rc = write(fd, &fd, 2UL*1024*1024);
      |       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Talking to Oleg, he stated this is done this way on purpose.
So instead of 'fixing' the issue, in this case we silence the gcc
warning.

Lustre-change: https://review.whamcloud.com/44670
Lustre-commit: 084546f7d01b0eec8dafae9bc50edc778c3886ca

Test-Parameters: trivial
Test-Parameters: env=ONLY=133f,133g testlist=sanity
Change-Id: Iee79c7988cc209fd099c23c38a8bd7df96015b05
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46218
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Alex Zhuravlev [Thu, 20 Jan 2022 01:07:02 +0000 (17:07 -0800)]
LU-14093 mgc: rework mgc_apply_recover_logs() for gcc10

Rework mgc_apply_recover_logs() to use a separate buffer of
appropriate size so that gcc10 doesn't complain:
mgc_request.c:1506:24: error: argument 4 may overlap destination
        object [-Werror=restrict]
 1506 |        pos += sprintf(obdname + pos, "-%s-%s", cname, inst);
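
As a rough illustration of the "separate buffer" approach, the sketch below formats the whole name into a destination that is distinct from every input, so no source argument can alias the output. build_obdname() and its parameters are hypothetical names chosen for this example and are not taken from mgc_request.c.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "<fsname>-<cname>-<inst>" in a caller-supplied buffer that is
 * separate from all of the inputs, so -Wrestrict has nothing to flag.
 * Returns 0 on success, -1 if the result would not fit.
 * build_obdname() and its parameter names are hypothetical.
 */
static int build_obdname(char *out, size_t outlen, const char *fsname,
                         const char *cname, const char *inst)
{
    int len = snprintf(out, outlen, "%s-%s-%s", fsname, cname, inst);

    /* snprintf() returns the length it wanted to write, excluding NUL. */
    if (len < 0 || (size_t)len >= outlen)
        return -1;
    return 0;
}
```

A caller would pass a local array such as char name[64] together with sizeof(name), and treat a -1 return as "name too long" rather than using a truncated string.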

Lustre-change: https://review.whamcloud.com/40484
Lustre-commit: d13d8158e816b7ac4437ff6d6c6aec3926ba7531

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ice863b412475e53705dc6523ab30ba613244bd90
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46217
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
James Simmons [Thu, 20 Jan 2022 00:48:45 +0000 (16:48 -0800)]
LU-14093 gss: gcc10 fixes for GSS

Building with gcc10 reports the following issues when building
GSS:

gss_util.h:37: multiple definition of `this_realm';
gssd.h:73: multiple definition of `clnt_list';
svcgssd.h:38: multiple definition of `krb_enabled';

Properly scope these variables.
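
One common way to scope such variables, shown here only as a sketch, is to keep an extern declaration in the header and a single definition in one .c file. The message does not say which approach the patch takes, and the variable and function names below are hypothetical.

```c
/*
 * What would live in a shared header: a declaration only. Without
 * "extern", every .c file that includes the header gets its own tentative
 * definition, and gcc10's default -fno-common turns those into a
 * "multiple definition" link error.
 */
extern int krb_enabled_example;

/* What would live in exactly one .c file: the single definition. */
int krb_enabled_example = 1;

/* Any other translation unit reads the variable through the header. */
static int krb_is_enabled(void)
{
    return krb_enabled_example != 0;
}
```

Variables used by only one file can instead be marked static, which removes them from the link-time namespace entirely.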

Lustre-change: https://review.whamcloud.com/44363
Lustre-commit: 39e4c97530c4657192e7c0d6a22ca30c90cdb6e4

Test-Parameters: env=SHARED_KEY=true testlist=sanity,recovery-small,sanity-sec
Change-Id: I05fc298fb90d67314c6963273688c2577099188a
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/46216
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Dominique Martinet [Thu, 20 Jan 2022 00:44:55 +0000 (16:44 -0800)]
LU-14093 lnet: annotate LNET_WIRE_HANDLE_COOKIE_NONE as u64

Fix the following warning on new gcc with -Wextra when including
lustre_idl.h on external project:

.../include/linux/lnet/lnet-types.h: In function LNetMDHandleIsInvalid:
.../include/linux/lnet/lnet-types.h:355:46:
   error: comparison of integer expressions of different signedness:
   int and __u64 {aka long long unsigned int} [-Werror=sign-compare]
        return (LNET_WIRE_HANDLE_COOKIE_NONE == h.cookie);
                                             ^~
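
A minimal sketch of the idea, using hypothetical stand-ins rather than the real LNet definitions: once the "none" constant is an unsigned 64-bit value, both sides of the comparison have the same signedness.

```c
#include <stdint.h>

/*
 * Hypothetical stand-ins for the LNet definitions: the "none" cookie is
 * written as an unsigned 64-bit constant instead of a plain signed -1.
 */
#define EXAMPLE_COOKIE_NONE (~(uint64_t)0)

struct example_handle {
    uint64_t cookie;
};

static int example_handle_is_invalid(struct example_handle h)
{
    /* Both operands are uint64_t, so -Wsign-compare stays quiet. */
    return EXAMPLE_COOKIE_NONE == h.cookie;
}
```

The comparison result is unchanged, since a signed -1 converted to a 64-bit unsigned type already has all bits set; only the type annotation differs.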

Lustre-change: https://review.whamcloud.com/43713
Lustre-commit: 27214876fcdfbda016c920bce4ab1da800fcda4b

Change-Id: I05f21dcca5fe9dd15d1e0b6cb9a29c3999bcd807
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46215
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Dominique Martinet [Thu, 20 Jan 2022 00:28:45 +0000 (16:28 -0800)]
LU-14093 llapi: remove ignored qualifier

Fixes the following warning on newer gcc with -Wextra:
.../include/lustre/lustreapi.h:1000:1: error: type qualifiers ignored on function return type [-Werror=ignored-qualifiers]
 1000 | const __u16 llapi_layout_string_flags(char *string);
      | ^~~~~

As the qualifier is ignored, this should make no code difference.

Test-parameters: trivial

Lustre-change: https://review.whamcloud.com/43712
Lustre-commit: 90ee0457c9fb1da939558186961f346c917d678f

Change-Id: I049166bbc586007cdecc93225d508693607ef04e
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46213
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Dominique Martinet [Thu, 20 Jan 2022 00:25:07 +0000 (16:25 -0800)]
LU-14093 utils: fix format-overflow warning

Fix the following warning on gcc11 by making numbuf big enough to fit
format content.

lfs.c: In function ‘print_quota’:
lfs.c:7719:48: error: ‘sprintf’ may write a terminating nul past the end of the destination [-Werror=format-overflow=]
 7719 |                         sprintf(numbuf[0], "%s*", strbuf);
      |                                                ^
lfs.c:7719:25: note: ‘sprintf’ output between 2 and 33 bytes into a destination of size 32
 7719 |                         sprintf(numbuf[0], "%s*", strbuf);
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Test-parameters: trivial

Lustre-change: https://review.whamcloud.com/43711
Lustre-commit: a0fe9be254b944f5b005dd4b36c414827bcb40df

Change-Id: I021e6ffff2e1405eadbe689f718674af4d4d6376
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46212
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Thu, 20 Jan 2022 00:19:49 +0000 (16:19 -0800)]
LU-14093 utils: trivial changes to support gcc10

Just to fix gcc10 complaints.

Lustre-change: https://review.whamcloud.com/40485
Lustre-commit: 79acd674e3bc49ac630d84ef64df2291fc9ade01

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I8390d28fe78c9dad15a41301cc2b6d6184fdc330
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46211
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Louis Douriez [Wed, 15 Dec 2021 10:08:19 +0000 (11:08 +0100)]
EXA-4252 debian: remove mpi dependency for lustre debs

Remove mpi-default-bin and mpi-default-dev dependencies for all
debian packages except lustre-tests.

Change-Id: I8fef87aa9fec427c93115b3aad338f093cc5c65c
Signed-off-by: Louis Douriez <ldouriez@ddn.com>
Reviewed-on: https://review.whamcloud.com/45860
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Bobi Jam [Thu, 13 Jan 2022 09:08:47 +0000 (17:08 +0800)]
LU-15268 mdt: reveal the real intent close error code

mdt_mfd_close() clobbers the intent close error so that user space
tool only knows that the close intent hasn't finished and reports
-EBUSY instead of the real error code.

Lustre-change: https://review.whamcloud.com/45636
Lustre-commit: TBD (from dc6dee1f5683ac91d637533b4c220617f62e60d2)

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I72f474a73e8b73cdc35ca38eaaec5af182f63ca7
Reviewed-on: https://review.whamcloud.com/46092
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Mikhail Pershin [Fri, 12 Nov 2021 16:00:22 +0000 (19:00 +0300)]
LU-15219 lfs: migration to DoM layout fix

Migration to a DoM layout from an OST-striped file can skip
data sync beyond the DoM component if it is not initialized.
The patch forces a data copy prior to the layout merge, so the
new layout is initialized and contains the needed data.

Tests 272e/272f in sanity.sh were modified to migrate data
for both MDT and OST parts

Lustre-change: https://review.whamcloud.com/45549
Lustre-commit: cf872dcce3b33d309d31dc5cce8794729f14a924

Fixes: 44a721b8c1 ("LU-11421 dom: manual OST-to-DOM migration via mirroring")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I206358e762780ab7cfaa7587888174a31bc7b196
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46012
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Bobi Jam [Thu, 13 Jan 2022 09:35:24 +0000 (17:35 +0800)]
LU-14642 flr: abolish MDS transfer layout version to OST

Stop setting the layout version on the OST object from the MDS.
Instead, the client write request carries the new layout version,
and the OST object rejects writes with an old layout version and
updates its layout version accordingly.

Lustre-change: https://review.whamcloud.com/c/45443/
Lustre-commit: TBD (from aea66b7d3fdc37d61704bd5df2a25a4747a6cce5)

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I655044f69a4509a2b0cfe99f86de2ce4ee846979
Reviewed-on: https://review.whamcloud.com/45278
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Lai Siyao [Wed, 19 Jan 2022 05:28:23 +0000 (21:28 -0800)]
LU-14448 lod: verify LOV early in lod_get_default_striping

lod_get_default_striping() will get both default LOV and default LMV,
and parse them to struct lod_default_striping one by one, however the
LOV and LMV data are both stored in lod_thread_info.lti_ea_store, so
lod_verify_striping() should verify LOV upon getting LOV; otherwise,
if both exist, it is the LMV that is verified, which will return -EINVAL.

Lustre-change: https://review.whamcloud.com/45370
Lustre-commit: eba1b49172259ece89ab604a2ed2285e4770baa2

Fixes: 6a08df2d0effc7a ("LU-14448 lod: verify LOV before set/inherit")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I9763d35bdbc74101fa8515d5096ec457a4cb3524
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46182
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
John L. Hammond [Thu, 6 Jan 2022 15:09:30 +0000 (09:09 -0600)]
EX-4463 flr: ensure layout generation is incremented on merge

In lod_declare_layout_merge(), load the striping before we increment
the layout generation. Add sanity-flr:test_51a to check this.

Fixes: 6b5a29c0ac94 (Revert "LU-14642 flr: transfer layout version on layout change")
Test-Parameters: testlist=sanity-flr envdefinitions=FAIL_ON_ERROR=false,ONLY=51a,ONLY_REPEAT=20
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I357501edfd2bc0d710be902df9c40aab53c11824
Reviewed-on: https://review.whamcloud.com/46028
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Li Dongyang [Tue, 30 Nov 2021 01:13:03 +0000 (12:13 +1100)]
LU-12056 ldiskfs: add trusted.projid virtual xattr

Add trusted.projid virtual xattr in ldiskfs to export the
current project id, intended for ldiskfs level MDT backup.

When the project id is EXT4_DEF_PROJID/0,
the virtual xattr is hidden from listxattr(2).

It's also hidden on lustre client when parent has the
project inherit flag and the same project ID,
to stop mv from setting the virtual xattr on the dest with
the project id from src, which could be different from dest.

getxattr(2) on trusted.projid will report the current project id,
setxattr(2) will change the current project id, and
removexattr(2) will set the project id back to EXT4_DEF_PROJID/0.

Both get|setxattr(2) will work even when the virtual xattr is
hidden.

Invalidate client xattr cache for the inode when changing its
project id, so the virtual xattr can get the new value
for next getxattr(2)

Add test cases to verify the virtual projid xattr, and that MDT
backup and restore using tar now preserves the project id.

Change mds_backup_restore in the test framework to use
tar with the --xattrs --xattrs-include='trusted.*' options.

Lustre-change: https://review.whamcloud.com/45679
Lustre-commit: 665383d3a1f4d1dc7f404301039432271ad85eaf

Change-Id: I29b1aa922ef72d734cdc87125401fa08fb13d4af
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46083
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andriy Skulysh [Fri, 30 Aug 2019 11:43:29 +0000 (14:43 +0300)]
LU-15121 llite: skip request slot for lmv_revalidate_slaves()

Some syscalls need lmv_revalidate_slaves(). It requires a
second lock enqueue, and that enqueue can be blocked by a
lack of RPC slots.

Don't acquire an RPC slot for the second lock enqueue.

Lustre-change: https://review.whamcloud.com/45275
Lustre-commit: 7e781c605c4189ea1f4b0a343863280ebeb237d4

Change-Id: Ida23c648c2bd169c4d238543731796232aa490dc
HPE-bug-id: LUS-8416
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46106
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lai Siyao [Tue, 18 Jan 2022 21:56:26 +0000 (13:56 -0800)]
LU-15133 osp: only deactivate OSP on LAST_FID error

ofd_get_info_hdl() should return -EFAULT upon LAST_FID error, which
is the same as LAST_ID error.

osp_get_lastfid_from_ost() should deactivate OSP only upon -EFAULT,
which means reading LAST_FID on OST failed. This can avoid unnecessary
admin intervention.

Add sanity 27S.

Lustre-change: https://review.whamcloud.com/45309
Lustre-commit: f738156aa621c6c800d08af18ca52c39c40c3bd3

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ib78c8994c0398dd4b4db32005abd018933ef3a7c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46180
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Tue, 18 Jan 2022 04:29:19 +0000 (23:29 -0500)]
LU-15456 llite: deadlock in ll_new_node()

ll_new_node() will call ll_dir_getstripe() to fetch the parent default
LMV if md_create() returns -EREMOTE. It should call
ll_finish_md_op_data() before calling ll_dir_getstripe(), because
the latter will lock lli_lsm_sem again, which will deadlock.

Lustre-change: https://review.whamcloud.com/46157/
Lustre-commit: d46e8deb9c0e680e195ef4d2c8755f25ad27865f

Fixes: 55ca00c3d1cd863 ("LU-11213 ptlrpc: intent_getattr fetches default LMV")
Test-Parameters: mdscount=2 mdtcount=4 testlist=racer,racer,racer
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ib858bae19ff88533fe487583c27d544026aafa3f
Reviewed-on: https://review.whamcloud.com/46196
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Lai Siyao [Wed, 19 Jan 2022 05:26:18 +0000 (21:26 -0800)]
LU-14957 mdd: prepare xattrs before migration

In directory migration, the xattrs should be prepared before starting
the transaction; otherwise, if the remote MDT is down, the local MDT
will get stuck as well.

Lustre-change: https://review.whamcloud.com/44741
Lustre-commit: dc1aa272d24cff9f06fd9ea71e4ad468c16acc52

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I79279e7b0c051a7542a71066fffd4ad70f559368
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46188
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Chris Horn [Wed, 28 Apr 2021 01:10:16 +0000 (20:10 -0500)]
LU-14655 lnet: Protect lpni deref in lnet_health_check

The discovery thread can modify the peer NI/peer net/peer relationship,
so we need to be careful when dereferencing the peer NI pointer in
lnet_health_check(). The discovery thread operates under the net lock,
so move the peer NI dereference under the net lock, which is taken for
incrementing the health stats.

Move some of the other code that is only relevant for messages with a
health status != LNET_MSG_STATUS_OK under the appropriate condition.

Lustre-commit: d87af24452a2e883b0e7400661a5b768c35088b1
Lustre-change: https://review.whamcloud.com/43503

Test-Parameters: testlist=sanity-lnet
HPE-bug-id: LUS-9962
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I3e6763b71bcdc9281f46b79c59e40f939190d468
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46138
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Li Dongyang [Thu, 20 Jan 2022 00:03:49 +0000 (16:03 -0800)]
LU-14432 misc: update e2fsprogs to 1.46.2.wc1

Update Changelog for the new e2fsprogs release.

Lustre-change: https://review.whamcloud.com/43469
Lustre-commit: 80f352fe90cad09cbdf7b61f74cc6ce4cd999bbf

Change-Id: I173c43f1c777b7223a56841a06545c1741e1a903
Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46210
Tested-by: jenkins <devops@whamcloud.com>
Andreas Dilger [Wed, 19 Jan 2022 23:28:41 +0000 (16:28 -0700)]
RM-620 build: New tag 2.14.0-ddn29

New tag 2.14.0-ddn29

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iefb38a16d3744398fad1296f221d216a863d494d

Alena Nikitenko [Thu, 11 Nov 2021 21:11:18 +0000 (00:11 +0300)]
EX-3342 tests: correct Lustre version in test skip checks in sanityn

Many patches land to the EXAScaler branches as ports from
other branches.  Sometimes the tests that are included with
the ported patches check the version of Lustre to ensure
that the feature it tests exists in this version of Lustre.
These version values are not always changed when patches
are ported from one branch to another.

Change Lustre test suite version checks to be relative to
this branch. Sanityn tests 43j, 81c and 84 were modified.
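
The effect of a stale threshold can be sketched with a packed version code. The sanityn tests themselves are shell scripts, so the C helpers below are purely illustrative; the function names and the specific version numbers are assumptions, not values from the patch.

```c
#include <stdint.h>

/* Pack a four-part version into one comparable integer (hypothetical). */
static uint32_t version_code(unsigned major, unsigned minor,
                             unsigned patch, unsigned fix)
{
    return ((uint32_t)major << 24) | ((uint32_t)minor << 16) |
           ((uint32_t)patch << 8) | (uint32_t)fix;
}

/* A test is skipped when the server predates the version threshold. */
static int should_skip(uint32_t server, uint32_t feature_landed)
{
    return server < feature_landed;
}
```

For instance, a check copied unchanged from a branch where the feature landed in an assumed 2.14.51 would skip the test on a 2.14.0 server even though the port is present; making the threshold relative to this branch (here 2.14.0) lets the test run.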

Fixes: 1c01d0867da ("LU-10235 mdt: mdt_create: check EEXIST without lock")
Fixes: 23fa920b0ce ("LU-13437 mdt: rename misses remote LOOKUP lock revoke")
Fixes: c4a91e08b1e ("LU-12485 obdclass: 0-nlink race in lu_object_find_at()")

Test-Parameters: env=ONLY="43j 81c 84" serverversion=2.10.8 \
serverdistro=el7.6 testlist=sanityn
Test-Parameters: env=ONLY="43j 81c 84" clientversion=2.12.6-ddn42 \
testlist=sanityn
Test-Parameters: env=ONLY="43j 81c 84" serverversion=2.12.6-ddn42 \
testlist=sanityn
Test-Parameters: trivial env=ONLY="43j 81c 84" testlist=sanityn

Lustre-change: https://review.whamcloud.com/45537
Lustre-commit: c525d9367976fe38fd6fba6fdda8c3a9df6fd6e1

Signed-off-by: Alena Nikitenko <anikitenko@ddn.com>
Change-Id: I3116af827889ccc5a23434f9a253f3115b56dfb4
Reviewed-on: https://review.whamcloud.com/46186
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Vladimir Saveliev [Tue, 18 Jan 2022 21:48:29 +0000 (13:48 -0800)]
LU-13513 osp: make neterr not fatal for precreate_reserve

When OST_CREATE (a non-resendable RPC) sent by the precreate thread
fails with a network error, osp_pre_update_status() sets
d->opd_pre_status to EIO. osp_precreate_reserve() considers EIO fatal
and does not wait for another attempt from the precreate thread. That
may make mdt_intent_open() return ENOSPC, confusing the caller. The
ENOSPC comes from lod_alloc_rr().

osp_precreate_send(): in case of network error switch EIO to ENOTCONN.

A test to illustrate the issue is added.

Lustre-change: https://review.whamcloud.com/38472
Lustre-commit: 4bba67075aa3d8739d8ca99642ff2b2836774479

Cray-bug-id: LUS-8811
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Change-Id: Iffaad9bd16f216f758c784b708e21b525c999b14
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46179
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Sat, 4 Dec 2021 05:14:09 +0000 (22:14 -0700)]
EX-3637 osd-ldiskfs: revalidate nonrotational state

Until the nonrotational state of the device is correctly exported by
the underlying storage, periodically recheck the nonrotational state
in case it is changed by the udev/tune_devices.sh script after the
OST is mounted.  It would be possible to check less often (e.g. after
a time limit or some number of operations), but directly checking the
rotational state each time is not more expensive and is only checked
on statfs RPCs (sent about once per 5s from the MDS).

If the nonrotational state is changed and the flash-related cache
parameters have not been explicitly set, then tune them appropriately.

Rename the parameter functions and variables for read_cache_enable,
writethrough_cache_enable, and read_cache_max_filesize to match the
parameter names to make them easier to find in the code.

Lustre-change: https://review.whamcloud.com/45745
Lustre-commit: TBD (from cbc74314532632fbcab88531d43deed7555e125e)

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iec78a5d5c22c0474eda84a5a793fbb006f3ebbe5
Reviewed-on: https://review.whamcloud.com/45829
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Gaurang Tapase <gtapase@ddn.com>
Tested-by: Gaurang Tapase <gtapase@ddn.com>
Bobi Jam [Thu, 2 Sep 2021 16:27:34 +0000 (00:27 +0800)]
LU-14514 flr: mirror split should not make stale file

Mirror split could leave a file with only stale mirrors. This patch
prevents removing the last non-stale mirror from the file.
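
The rule can be sketched as a simple check over a mirror list. The structure and function below are hypothetical and do not reflect the actual Lustre layout data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one mirror of a file. */
struct example_mirror {
    unsigned id;
    bool stale;
};

/*
 * Return true if splitting off mirrors[victim] still leaves at least
 * one non-stale mirror in the file, false otherwise.
 */
static bool split_is_allowed(const struct example_mirror *mirrors,
                             size_t count, size_t victim)
{
    size_t i;

    if (victim >= count)
        return false;
    for (i = 0; i < count; i++) {
        if (i != victim && !mirrors[i].stale)
            return true;
    }
    return false;
}
```

With one in-sync mirror and one stale mirror, splitting the stale one is allowed, while splitting the in-sync one is refused.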

Lustre-change: https://review.whamcloud.com/42024
Lustre-commit: 83c790cbf2f8f7452e1382051564af6f155b47cf

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I63007784929a2cd18d2823e2250f7307ca7d8d45
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46089
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Wed, 19 Jan 2022 05:22:47 +0000 (21:22 -0800)]
LU-14430 mdd: rename mti_fid to mdi_fid and friends

Rename mdd_thread_info fields to avoid confusion with mdt_thread_info.
The final patch to rename mdd_thread_info fields to a unique prefix:

  mti_cattr->mdi_cattr
  mti_fid->mdi_fid
  mti_fid2->mdi_fid2
  MTI_KEEP_KEY->MDI_KEEP_KEY
  mti_la_for_fix->mdi_la_for_fix
  mti_la_for_start->mdi_la_for_start
  mti_pattr->mdi_pattr
  mti_tattr->mdi_tattr
  mti_tpattr->mdi_tpattr

Lustre-change: https://review.whamcloud.com/43740
Lustre-commit: b1ed8e57da67feddb9c5e67abaf6db1b70333fa0

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I17bcc3ddfae400a5ca76e4f654c696da6d3ebbe5
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46195
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Wed, 19 Jan 2022 05:18:09 +0000 (21:18 -0800)]
LU-14430 mdd: rename mti_oa to mdi_oa and friends

Rename fields in mdd_thread_info to avoid confusion with
mdt_thread_info. The second patch of several to rename all
mdd_thread_info fields to use a more unique field prefix:

  mti_dof->mdi_dof
  mti_dt_rec->mdi_dt_rec
  mti_ent->mdi_ent
  mti_flags->mdi_flags
  mti_hint->mdi_hint
  mti_key->mdi_key
  mti_link_data->mdi_link_data
  mti_name->mdi_name
  mti_oa->mdi_oa
  mti_range->mdi_range
  mti_spec->mdi_spec

The mti_lmv and mti_lrl fields are removed since they are unused.

Lustre-change: https://review.whamcloud.com/43739
Lustre-commit: 9a23a5de12164f9d50db9e602f085bb0c3cc9d8a

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6fd4b7f26b7e9561d8a8585eaa5438d6093ebbe5
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46194
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Wed, 19 Jan 2022 05:11:14 +0000 (21:11 -0800)]
LU-14430 mdd: rename mti_big_buf to mdi_big_buf

Avoid serious confusion with the MDT mti_big_buf, and other fields
in mdd_thread_info, since they are two separate buffers completely.

  mti_big_buf->mdi_big_buf
  mti_chlg_buf->mdi_chlg_buf
  mti_link_buf->mdi_link_buf
  mti_xattr_buf->mdi_xattr_buf

The first patch of several to rename all mdd_thread_info fields.

Lustre-change: https://review.whamcloud.com/43738
Lustre-commit: f9f38c33ab8484102cdb3736868f4e7bece594ae

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib0ec91c8481e747ed058afe5c08c3f60203ebbe5
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46187
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Wed, 19 Jan 2022 05:07:49 +0000 (21:07 -0800)]
LU-14430 mdd: use own rec_hdr for changelog declare

Do not use an lu_buf just to declare the changelog record.  This
only needs llog_rec_hdr to pass in lrh_len, so declaring rec_hdr
on the stack avoids the overhead of using the lu_buf.

Lustre-change: https://review.whamcloud.com/43683
Lustre-commit: ff52f8c1736ad7ef2621d23366a1ca6572aa7f22

Fixes: f3d03bc38a ("LU-14430 mdd: fix inheritance of big default ACLs")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I7b6f1d761aa98aa6ecb023894bde03dce23ebbe5
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/46193
Tested-by: jenkins <devops@whamcloud.com>
Serguei Smirnov [Wed, 23 Jun 2021 22:51:21 +0000 (15:51 -0700)]
LU-14662 lnet: set eth routes needed for multi rail

When ksocklnd is initialized or new ethernet interfaces
are added via lnetctl, set the routing rules using a common
shell script ksocklnd-config. This ensures control over
source interface when sending traffic.

For example, for eth0 with ip 192.168.122.142/24:
   the output of "ip route show table eth0" should be
192.168.122.0/24 dev eth0 proto kernel scope link src 192.168.122.142

This step can be omitted by specifying
   options ksocklnd skip_mr_route_setup=1
in the conf file, or by using switch
   --skip-mr-route-setup
when adding NI with lnetctl. Note that the module parameter
takes priority over the lnetctl switch: if skip-mr-route-setup
is not specified when adding NI with lnetctl, the route still
won't get created if the conf file has skip_mr_route_setup=1.

The route also won't be created if any route already exists
for the given interface, assuming advanced users who manage
routing on their own will want to continue doing so.
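
The precedence described above can be sketched as a small shell
predicate. This only illustrates the decision logic and is not the
ksocklnd-config script itself; the function name and its three
arguments are hypothetical.

```shell
#!/bin/bash
# Hypothetical sketch of the route-setup decision; not ksocklnd-config.
#   $1: value of the skip_mr_route_setup module parameter (0 or 1)
#   $2: 1 if --skip-mr-route-setup was passed to lnetctl, else 0
#   $3: number of routes that already exist for the interface
should_setup_mr_route() {
    local module_skip=$1 cli_skip=$2 existing_routes=$3

    # The module parameter wins even when the lnetctl switch is absent.
    [ "$module_skip" -eq 1 ] && return 1
    [ "$cli_skip" -eq 1 ] && return 1
    # Leave routing alone if the admin already manages this interface.
    [ "$existing_routes" -gt 0 ] && return 1
    return 0
}

if should_setup_mr_route 0 0 0; then
    echo "would create route"
fi
```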

Lustre-change: https://review.whamcloud.com/44065
Lustre-commit: c9bfe57bd2495671fa66eb7e52184f76e1f4a6eb

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ia14e637bd29d4bbce5dd93daad9992336b2e6b15
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45678
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-15409 kernel: kernel update RHEL8.5 [4.18.0-348.7.1.el8_5]
Jian Yu [Tue, 18 Jan 2022 08:35:18 +0000 (00:35 -0800)]
LU-15409 kernel: kernel update RHEL8.5 [4.18.0-348.7.1.el8_5]

Update RHEL8.5 kernel to 4.18.0-348.7.1.el8_5.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.5 serverdistro=el8.5 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.5 serverdistro=el8.5 testlist=sanity

Change-Id: I45e7b429654b6307ae233597f5a27aaccd8df0af
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46162
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-15126 kernel: RHEL 8.5 server support
Jian Yu [Tue, 18 Jan 2022 08:29:09 +0000 (00:29 -0800)]
LU-15126 kernel: RHEL 8.5 server support

This patch makes changes to support RHEL 8.5 release
with kernel 4.18.0-348.2.1.el8_5 for Lustre server.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.5 serverdistro=el8.5 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.5 serverdistro=el8.5 testlist=sanity

Lustre-change: https://review.whamcloud.com/45306
Lustre-commit: 605ac53b1a621afa5d94b9854a7a1783d7e24afe

Change-Id: Ie976d8fd3e6fcf8a564eff8a41ad0fd51b2c858c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46161
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-15222 build: Update ZFS version to 2.0.6
Jian Yu [Tue, 18 Jan 2022 08:21:15 +0000 (00:21 -0800)]
LU-15222 build: Update ZFS version to 2.0.6

Update ZFS version to 2.0.6. The changes are listed in:
https://github.com/openzfs/zfs/releases/tag/zfs-2.0.6

Lustre-change: https://review.whamcloud.com/45567
Lustre-commit: 8e6ef2d91b3acbec5ad1157404c263277a25cbb3

Change-Id: I2a7df45b79f402c3d3bce8b137edd11b5224b576
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46160
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-15071 utils: tunefs erease-params for zfs
Alexander Boyko [Tue, 18 Jan 2022 08:16:21 +0000 (00:16 -0800)]
LU-15071 utils: tunefs erease-params for zfs

This patch excludes special ZFS params from tunefs --erase-params
and skips modifying the nvlist. It also fixes conf-sanity test_89.

With the old code, tunefs --erase-params produced a segmentation fault.

Lustre-change: https://review.whamcloud.com/45145
Lustre-commit: d67bd765f67082204094a20373b4e2c476ecfd81

Test-Parameters: trivial fstype=zfs testlist=conf-sanity
HPE-bug-id: LUS-10314
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Ic8385a99ca896ce6d855692b3f77e198bf583d94
Reviewed-on: https://review.whamcloud.com/46159
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-15259 tests: skip acl tests if no bin/daemon users
Andreas Dilger [Wed, 15 Dec 2021 20:22:23 +0000 (13:22 -0700)]
LU-15259 tests: skip acl tests if no bin/daemon users

If the bin/daemon users are not configured on the test system, then
sanity test_103a, test_125, test_154a will fail with:

    $ setfacl -m u:bin:rw f -- failed
    -   ? setfacl: Option -m: Invalid argument near character 3

Skip these tests until they are fixed.
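
A guard of this kind could look roughly like the sketch below. The
helper name have_users is hypothetical, and the skip mechanism used by
the real test framework is not shown in this log.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if every named user exists locally.
have_users() {
    local user
    for user in "$@"; do
        id -u "$user" >/dev/null 2>&1 || return 1
    done
    return 0
}

if ! have_users bin daemon; then
    echo "SKIP: bin/daemon users are not configured"
fi
```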

Lustre-change: https://review.whamcloud.com/45868
Lustre-commit: da7f2848a2aae2ec4852b23e55d23d42a30205ce

Test-Parameters: trivial clientdistro=sles15sp3
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I526f9318862577a6b73c3b63cfc95a3d793ebbe5
Reviewed-on: https://review.whamcloud.com/46141
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoUpdate lipe version to 2.22.
John L. Hammond [Tue, 18 Jan 2022 15:40:26 +0000 (09:40 -0600)]
Update lipe version to 2.22.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I2e7d9b55b8aa8d6396a636a866470f2337028851

3 years agoEX-4375 lipe: set lpurge log prefix
John L. Hammond [Fri, 17 Dec 2021 13:49:43 +0000 (07:49 -0600)]
EX-4375 lipe: set lpurge log prefix

In lpurge, set lx_log_prefix to the ostname.

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I602d0b2d10717b5c7993cea815b12cf42ec31ade
Reviewed-on: https://review.whamcloud.com/45881
Reviewed-on: https://review.whamcloud.com/46130
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoEX-4306 lipe: allow any SoM to count as size
John L. Hammond [Thu, 9 Dec 2021 15:34:12 +0000 (09:34 -0600)]
EX-4306 lipe: allow any SoM to count as size

In lipe/policy.c:get_size(), allow any SoM attribute to be used as the
size since that is what existing users (Insight) expect.

Test-Parameters: trivial testlist=sanity-lipe
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I7f7be38f100840a327d8dc6f506ad73f2233a7b4
Reviewed-on: https://review.whamcloud.com/45809
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46129

3 years agoEX-4265 lipe: fix automated lpurge configuration
John L. Hammond [Thu, 9 Dec 2021 16:00:16 +0000 (10:00 -0600)]
EX-4265 lipe: fix automated lpurge configuration

In stratagem-hp-config.sh, delay setting LPURGE_FREEHI until the
automatic values have been determined.

Test-Parameters: trivial
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I3042061cf0f2da8e807a3ed2acc527cb9517385b
Reviewed-on: https://review.whamcloud.com/45810
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Gaurang Tapase <gtapase@ddn.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46128

3 years agoEX-4218 lipe: Fix assertion in lipe_ssh_session_create
Alexandre Ioffe [Tue, 23 Nov 2021 00:58:00 +0000 (16:58 -0800)]
EX-4218 lipe: Fix assertion in lipe_ssh_session_create

Return the SSH_ERROR code in all failure cases.
Report the error code returned by libssh.

Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Test-Parameters: trivial testlist=hot-pools
Change-Id: I84d5c3d1fe0e6d1c7818ec181a7b9ea07413eb8a
Reviewed-on: https://review.whamcloud.com/45638
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46127

3 years agoEX-4075 lipe: Modify log messages, use xstrdup()
Alexandre Ioffe [Thu, 28 Oct 2021 20:27:38 +0000 (13:27 -0700)]
EX-4075 lipe: Modify log messages, use xstrdup()

Preserve errno across log message output.
Modify log messages for style reasons, and update the
corresponding messages in the test script.
Remove trailing blanks.
Replace strdup() with xstrdup().
Add double quotes in the test script to prevent the script's
"ambiguous redirection" error.

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: Ie12fe9bbca60fe28ef260d820b55a8195492618f
Reviewed-on: https://review.whamcloud.com/45366
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46126

3 years agoEX-3002 lipe: lpurge units
John L. Hammond [Wed, 27 Oct 2021 18:35:46 +0000 (13:35 -0500)]
EX-3002 lipe: lpurge units

Use kb suffixes consistently in lpurge. Add loa_used_kb() to
encapsulate the poor naming of the loa_blocks member in struct
lipe_object_attrs. Remove some misleading comments. Rename structure
members lo_blocks to lo_used_kb, and ls_*_space to ls_*_used_kb. In
lpurge_purge_slot() rename target to target_kb, and total to
queued_kb. Rename lpurge_free_space() to lpurge_purge().

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Test-Parameters: trivial testlist=hot-pools
Change-Id: I0bbd4f0ea815e3a95eb36d1dc590f03f76eeccd3
Reviewed-on: https://review.whamcloud.com/45388
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46125

3 years agoEX-4103 lamigo: do not mark cold pool mirror "prefer"
Andreas Dilger [Fri, 22 Oct 2021 22:40:24 +0000 (16:40 -0600)]
EX-4103 lamigo: do not mark cold pool mirror "prefer"

When lamigo_check_hot_on_cold() calls lamigo_new_job_for_hot() it
passes "tgt_pools" as the "tgt" pool argument, for the case when an
active file in the hot pool needs to be mirrored to the cold pool,
unlike other callers, which pass "src_pools" as "tgt" to mirror to the
hot pool.  This should not result in the cold pool mirror being marked
"prefer", which can trigger a chain of later problems with the file.

Fix comment in lamigo_check_hot_on_cold() to make it clear what case
is being checked, since it described the opposite of what is done.

Test-Parameters: trivial testlist=hot-pools
Fixes: e582abc629e ("EX-978 lamigo: set prefer flag on fast replica")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iad48d6eb2d57817241b8ca3c22c03e38b93ebbe5
Reviewed-on: https://review.whamcloud.com/45345
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45351
Tested-by: jenkins <devops@whamcloud.com>
3 years agoEX-3002 lipe: swap 'used' and 'free' within lpurge
John L. Hammond [Wed, 27 Oct 2021 17:51:28 +0000 (12:51 -0500)]
EX-3002 lipe: swap 'used' and 'free' within lpurge

Within lpurge, convert:

  o_freelo to o_max_used
  o_freehi to o_min_used
  freelo to lpurge_max_used_kb
  freehi to lpurge_min_used_kb

And adjust logic according to the formulas:

  o_max_used = 100 - o_freelo
  o_min_used = 100 - o_freehi
  lpurge_max_used_kb = kbtotal - freelo
  lpurge_min_used_kb = kbtotal - freehi

This change does not add, remove, or rename any command line
options. The relevant old lpurge stats values (free_high, free_low,
kbfree, low, hi) are retained. New values (min_used, max_used,
min_used_kb, max_used_kb, used_kb, total_kb) are added.
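
The formulas above can be checked with a few lines of shell
arithmetic. The function names and sample values are hypothetical and
are not taken from lpurge.

```shell
#!/bin/bash
# Sketch of the conversions listed above; function names are made up.

# Percentages: used = 100 - free
free_pct_to_used_pct() {
    echo $((100 - $1))
}

# Absolute sizes in KB: used_kb = kbtotal - free_kb
free_kb_to_used_kb() {
    local kbtotal=$1 free_kb=$2
    echo $((kbtotal - free_kb))
}

# Example values only, not lpurge defaults.
o_freelo=20
o_freehi=40
echo "o_max_used=$(free_pct_to_used_pct "$o_freelo")"
echo "o_min_used=$(free_pct_to_used_pct "$o_freehi")"
echo "lpurge_max_used_kb=$(free_kb_to_used_kb 1000000 200000)"
```

A low free threshold corresponds to a high used threshold, which is
why freelo maps to max_used and freehi maps to min_used.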

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Test-Parameters: trivial testlist=hot-pools
Change-Id: I772ac32041a27904b8b6c50725b149c0b5fa4f45
Reviewed-on: https://review.whamcloud.com/45387
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46124

3 years agoEX-3002 lipe: use static in lpurge
John L. Hammond [Wed, 27 Oct 2021 14:48:58 +0000 (09:48 -0500)]
EX-3002 lipe: use static in lpurge

In lpurge, declare functions and variables static when
possible. Remove any unused functions or variables.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Test-Parameters: trivial testlist=hot-pools
Change-Id: I902894741c62136eda34327204f9299cd23c9e27
Reviewed-on: https://review.whamcloud.com/45385
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46123

3 years agoEX-3002 lipe: use static in lamigo
John L. Hammond [Wed, 27 Oct 2021 14:47:05 +0000 (09:47 -0500)]
EX-3002 lipe: use static in lamigo

In lamigo, declare functions and variables static when
possible. Remove any unused functions.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Test-Parameters: trivial testlist=hot-pools
Change-Id: Iad16f31304ed9d9d70aa4ecb54f2a9acbbfdead5
Reviewed-on: https://review.whamcloud.com/45384
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46122

3 years agoEX-3002 lipe: update lamigo command line options
John L. Hammond [Tue, 26 Oct 2021 18:49:56 +0000 (13:49 -0500)]
EX-3002 lipe: update lamigo command line options

Deprecate the '-s', '--src', '--src-dom', '--src-free', '-t', '--tgt',
and '--tgt-free' options to lamigo in favor of the new options:
  --fast-pool=POOL (default 'ddn_ssd')
  --fast-pool-max-used=MAX stop mirroring to POOL when % used reaches MAX (default 30)
  --slow-pool=POOL (default 'ddn_hdd')
  --slow-pool-max-used=MAX stop mirroring to POOL when % used reaches MAX (default 90)
  --include-dom treat DoM components as if they belong to fast pool

We continue to accept the old options with the same meanings but they
are removed from the help message and we print a warning when they are
used. Update hot-pools.sh to use the new options by default and add a
test that the old options still work.
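
As an illustration only, the old long options could be mapped onto the
new ones as sketched below. The mapping is an assumption inferred from
the related rename commits (source becomes fast, target becomes slow,
and a free percentage becomes 100 minus that value as a max-used
percentage); it is not lamigo's actual option parser, and the function
name is hypothetical. The short options '-s' and '-t' are left out for
brevity.

```shell
#!/bin/bash
# Hypothetical sketch: translate one deprecated "--opt=value" argument
# into its assumed replacement, warning on stderr.
translate_lamigo_opt() {
    local arg=$1 name value new
    name=${arg%%=*}
    value=
    if [ "$name" != "$arg" ]; then
        value=${arg#*=}
    fi

    case "$name" in
    --src)      new="--fast-pool=$value" ;;
    --tgt)      new="--slow-pool=$value" ;;
    --src-dom)  new="--include-dom" ;;
    --src-free) new="--fast-pool-max-used=$((100 - value))" ;;
    --tgt-free) new="--slow-pool-max-used=$((100 - value))" ;;
    *)          echo "$arg"; return 0 ;;   # not deprecated: pass through
    esac

    echo "warning: '$name' is deprecated, use '${new%%=*}'" >&2
    echo "$new"
}

translate_lamigo_opt --src-free=70
translate_lamigo_opt --tgt=ddn_hdd
```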

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ib940c188ccdd57710b68f5e97e725a26d0833356
Reviewed-on: https://review.whamcloud.com/45379
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46121
Reviewed-by: Jian Yu <yujian@whamcloud.com>
3 years agoEX-3002 lipe: rename {fast,slow}_pool_free to {fast,slow}_pool_max_used
John L. Hammond [Tue, 26 Oct 2021 18:24:41 +0000 (13:24 -0500)]
EX-3002 lipe: rename {fast,slow}_pool_free to {fast,slow}_pool_max_used

Rename o_{fast,slow}_pool_free to o_{fast,slow}_pool_max_used and
adjust logic.

In struct pool_list, rename
  pl_avail to pl_used_kb (and adjust logic)
  pl_total to pl_total_kb.

Replace defaults with equivalent values:
  DEF_FAST_POOL_FREE=70 becomes DEF_FAST_POOL_MAX_USED=30
  DEF_SLOW_POOL_FREE=10 becomes DEF_SLOW_POOL_MAX_USED=90

This change does not add, remove, or rename any command line options.
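
A "percent used has reached the maximum" check over KB counters might
look like the sketch below. The function is hypothetical; only the
names pl_used_kb, pl_total_kb and DEF_FAST_POOL_MAX_USED come from this
change, and treating an empty pool as "not reached" is an assumption.

```shell
#!/bin/bash
# Hypothetical sketch of a max-used check; not lamigo's actual code.
#   $1: used KB (cf. pl_used_kb)   $2: total KB (cf. pl_total_kb)
#   $3: max-used threshold in percent
pool_reached_max_used() {
    local used_kb=$1 total_kb=$2 max_used_pct=$3

    # Assumption: a pool with no capacity is reported as "not reached".
    [ "$total_kb" -gt 0 ] || return 1
    # Integer form of: used_kb / total_kb * 100 >= max_used_pct
    [ $((used_kb * 100)) -ge $((total_kb * max_used_pct)) ]
}

DEF_FAST_POOL_MAX_USED=30
if pool_reached_max_used 350 1000 "$DEF_FAST_POOL_MAX_USED"; then
    echo "fast pool at or above ${DEF_FAST_POOL_MAX_USED}% used"
fi
```

With the old naming the same pool would have been described as having
less than 70% free, which is the equivalence the new defaults encode.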

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Test-Parameters: trivial testlist=hot-pools
Change-Id: I5f5551230856eabdaa84972218b5dcb73959c029
Reviewed-on: https://review.whamcloud.com/45377
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46120

3 years agoEX-3002 lipe: update lamigo struct options and stats
John L. Hammond [Tue, 26 Oct 2021 17:12:34 +0000 (12:12 -0500)]
EX-3002 lipe: update lamigo struct options and stats

sed -i \
  -e 's/\bo_src_dom\b/o_include_dom/g' \
  -e 's/\bo_src_free\b/o_fast_pool_free/g' \
  -e 's/\bo_tgt_free\b/o_slow_pool_free/g' \
  -e 's/\bo_tgt_pool\b/o_slow_pool/g' \
  -e 's/\bsrc_dom\b/include_dom/g' \
  -e 's/\bsource_pool\b/fast_pool/g' \
  -e 's/\btarget_pool\b/slow_pool/g' \
 lipe/src/lamigo*.[ch]
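
The effect of the \b word-boundary anchors (a GNU sed extension) can
be seen on made-up input. The sample lines and the wrapper function
below are hypothetical; only the two substitution rules are taken from
the command above.

```shell
#!/bin/bash
# Demonstrates \b word boundaries on sample text, using two of the rules.
rename_fields() {
    sed -e 's/\bo_src_free\b/o_fast_pool_free/g' \
        -e 's/\bsrc_dom\b/include_dom/g'
}

printf '%s\n' \
    'opt.o_src_free = 70;' \
    'old_o_src_free_bak = 1;' \
    'opt.o_src_dom = src_dom;' | rename_fields
```

Because "_" counts as a word character, src_dom inside o_src_dom is
not matched by the \bsrc_dom\b rule, which is consistent with the
command carrying a separate rule for o_src_dom.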

Remove o_src_pool, o_src_pool_len, o_tgt_pool_len.

Adjust hot-pools.sh for the renaming of the source and target pools
stats.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Test-Parameters: trivial testlist=hot-pools
Change-Id: I024a2bf1e08023c6dd0884177b2511351bf88116
Reviewed-on: https://review.whamcloud.com/45376
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46119

3 years agoEX-3002 lipe: use ALR_READ and ALR_WRITE to index a{h,r}_heat
John L. Hammond [Wed, 27 Oct 2021 13:45:41 +0000 (08:45 -0500)]
EX-3002 lipe: use ALR_READ and ALR_WRITE to index a{h,r}_heat

sed -E -i \
  -e 's/\b(READ|WRITE)\b/ALR_\1/g' \
  -e 's/\b(ah_heat|ah_iosize|ar_heat|ar_iosize)\[0\]/\1[ALR_READ]/g' \
  -e 's/\b(ah_heat|ah_iosize|ar_heat|ar_iosize)\[1\]/\1[ALR_WRITE]/g' \
 lipe/src/lamigo*.[ch]
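
How the capture group and the escaped brackets behave can be tried on
a sample line. The input and the wrapper function are made up for
illustration, and only two of the four array names are used.

```shell
#!/bin/bash
# With -E, (a|b) captures the array name and \1 re-emits it;
# \[0\] and \[1\] match literal brackets around the numeric index.
index_by_name() {
    sed -E \
        -e 's/\b(ah_heat|ar_heat)\[0\]/\1[ALR_READ]/g' \
        -e 's/\b(ah_heat|ar_heat)\[1\]/\1[ALR_WRITE]/g'
}

echo 'sum = ah_heat[0] + ar_heat[1];' | index_by_name
# prints: sum = ah_heat[ALR_READ] + ar_heat[ALR_WRITE];
```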

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Test-Parameters: trivial testlist=hot-pools
Change-Id: I640d5a56b2b923f67f3aa3c57d34385d28dfb24d
Reviewed-on: https://review.whamcloud.com/45374
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46118

3 years agoEX-3002 lipe: rename lamigo 'target pool' to 'slow pool'
John L. Hammond [Tue, 26 Oct 2021 16:38:21 +0000 (11:38 -0500)]
EX-3002 lipe: rename lamigo 'target pool' to 'slow pool'

sed -i \
  -e 's/\bDEF_TARGET_POOL\b/DEF_SLOW_POOL/g' \
  -e 's/\bDEF_TGT_FREE\b/DEF_SLOW_POOL_FREE/g' \
  -e 's/\btarget pool/slow pool/g' \
  -e 's/\btgt_pools\b/slow_pools/g' \
  -e 's/\bALR_COLD\b/ALR_SLOW/g' \
  -e 's/\bah_pools[1]/ah_pools[ALR_SLOW]/g' \
  -e 's/\bar_pools[1]/ar_pools[ALR_SLOW]/g' \
 lipe/src/lamigo*.[ch]
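
A note on sed semantics, offered as a sketch rather than a statement
about what was actually run: in a sed pattern an unescaped [1] is a
bracket expression matching the single character "1", so it does not
match a literal "[1]" array index. The wrapper functions and the
sample line below are hypothetical.

```shell
#!/bin/bash
# Compare an unescaped and an escaped index in the pattern.
unescaped() { sed -e 's/\bah_pools[1]/ah_pools[ALR_SLOW]/g'; }
escaped()   { sed -e 's/\bah_pools\[1\]/ah_pools[ALR_SLOW]/g'; }

echo 'p = ah_pools[1];' | unescaped   # line is left unchanged
echo 'p = ah_pools[1];' | escaped     # index becomes ALR_SLOW
```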

Adjust hot-pools test_7 accordingly. This change does not add, remove,
or rename any command line options.

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I15d2b9b3233bd635a63dfe5f9381b3be3c7e20ed
Reviewed-on: https://review.whamcloud.com/45373
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46117

3 years agoEX-3002 lipe: rename lamigo 'source pool' to 'fast pool'
John L. Hammond [Tue, 26 Oct 2021 16:28:39 +0000 (11:28 -0500)]
EX-3002 lipe: rename lamigo 'source pool' to 'fast pool'

sed -i \
  -e 's/\bDEF_SOURCE_POOL\b/DEF_FAST_POOL/g' \
  -e 's/\bDEF_SRC_FREE\b/DEF_FAST_POOL_FREE/g' \
  -e 's/\bsource pool/fast pool/g' \
  -e 's/\bsrc_pools\b/fast_pools/g' \
  -e 's/\blamigo_lookup_pool\b/lamigo_lookup_fast_pool/g' \
  -e 's/\bALR_HOT\b/ALR_FAST/g' \
  -e 's/\bah_pools[0]/ah_pools[ALR_FAST]/g' \
  -e 's/\bar_pools[0]/ar_pools[ALR_FAST]/g' \
 lipe/src/lamigo*.[ch]

This change does not add, remove, or rename any command line options.

Test-Parameters: trivial testlist=hot-pools
Change-Id: I69599ead28b03bb88519bee11d179625258601f1
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45372
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46101

3 years agoEX-4075 lipe: replace llapi_error()
John L. Hammond [Mon, 25 Oct 2021 23:30:05 +0000 (18:30 -0500)]
EX-4075 lipe: replace llapi_error()

Replace remaining uses of llapi_error() and llapi_err_noerrno() with
LX_ERROR() etc.

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ia879782658aab8658b8e738649188ee102c18610
Reviewed-on: https://review.whamcloud.com/45361
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46100

3 years agoEX-3548 lipe: lamigo changelog handling
John L. Hammond [Fri, 8 Oct 2021 21:17:26 +0000 (16:17 -0500)]
EX-3548 lipe: lamigo changelog handling

In lamigo, register a named changelog user if no suitable user already
exists. Print error messages in cases where lamigo appears to have
registered multiple changelog users. Add a test that lamigo registers
a named user. Add stratagem-hp-deregister-changelogs.sh to find and
remove lamigo registered changelogs on the local node and adjust files
accordingly. Call stratagem-hp-deregister-changelogs.sh from
stratagem-hp-teardown.sh.

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I95330b4d6b7877f162d941af95c7332c927b5af2
Reviewed-on: https://review.whamcloud.com/45170
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-on: https://review.whamcloud.com/46099
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-11303 quota: enforce block quota for chgrp
Hongchao Zhang [Fri, 14 Jan 2022 08:39:24 +0000 (16:39 +0800)]
LU-11303 quota: enforce block quota for chgrp

In patch https://review.whamcloud.com/30146 "LU-5152 quota: enforce
block quota for chgrp", problems were introduced due to synchronous
requests from the MDS to the OSS to change the quota assignment of
files during chgrp operations. However, in some cases, the OSTs are
themselves out of grant and may send a quota request to the MDS,
which may result in a deadlock. Another issue is the slow performance
caused by the synchronous operation between MDT and OSTs.

This patch drops the synchronous RPC requirement of the original
patch #30146 to avoid this problem.

Previously, problems in quota tracking related to chgrp were introduced
due to synchronous RPCs from the MDS to the OSS when changing the group
ownership of objects for quota tracking.

Lustre-change: https://review.whamcloud.com/33996
Lustre-commit: 83f5544d8518ad12ea49e27829fff8f2739b86e2

Fixes: 8a71fd5061b ("LU-5152 quota: enforce block quota for chgrp")
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: I40556b9e8a0628eb18aa806d2f6b3dfb9b53e874
Reviewed-on: https://review.whamcloud.com/46111
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-10499 statahead: introducing OBD_CONNECT2_BATCH_RPC flag
Qian Yingjin [Mon, 30 Nov 2020 02:08:17 +0000 (10:08 +0800)]
LU-10499 statahead: introducing OBD_CONNECT2_BATCH_RPC flag

Add a new connection flag, OBD_CONNECT2_BATCH_RPC, for
multi-RPC aggregation.

By necessity, also include definitions for OBD_CONNECT2_REP_MBITS
and OBD_CONNECT2_MODE_CONVERT so obd_connect_names[] works.

Lustre-change: https://review.whamcloud.com/40791
Lustre-commit: 6007dc9382df7260841a4748158307ade25f22ef

Test-Parameters: trivial
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Change-Id: I05369c3ece75119eb3363d05065a0bb929839b4a
Reviewed-on: https://review.whamcloud.com/46107
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-15339 tests: Increase timeout in sanity 208
Patrick Farrell [Tue, 7 Dec 2021 21:54:20 +0000 (16:54 -0500)]
LU-15339 tests: Increase timeout in sanity 208

It's been observed that occasionally the initial request in
sanity 208 does not complete in 1 second, which invalidates
the test.  (And sometimes causes it to fail - but even if
it passes, the test is invalid.)

Increase the time to 2 seconds.

Using trivial testing because this just modifies sanity and
it's such a simple change.

Lustre-change: https://review.whamcloud.com/45779
Lustre-commit: dc015fc0b51b95151366b0355cfc90b068d98b01

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I70cf32813a9a2ced0cc388eb25eba29918ba7d03
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45931
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-15190 ptlrpc: fix duplication check
Alex Zhuravlev [Fri, 14 Jan 2022 05:02:44 +0000 (08:02 +0300)]
LU-15190 ptlrpc: fix duplication check

ptlrpc_server_check_for_resend() skips duplication check if
current exp_rpc_count == 0 which is wrong as exp_rpc_count
is incremented for RPCs in progress.

Lustre-change: https://review.whamcloud.com/45445
Lustre-commit: bb83a8af59d30b3f9e6de171eca962316ab7f6f4

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I4ba1600341d916871f66aceb4d6a1043dd015e55
Reviewed-on: https://review.whamcloud.com/46116
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-14975 dne: dir migration in non-recursive mode
Lai Siyao [Thu, 26 Aug 2021 11:37:09 +0000 (07:37 -0400)]
LU-14975 dne: dir migration in non-recursive mode

Add a "-d|--directory" option for "lfs migrate -m" to migrate
the specified directory only, similar to "ls -d".

Add sanity 230w.

Lustre-change: https://review.whamcloud.com/44802
Lustre-commit: 5604a6d270b8be13a8aacd72a105fc72b5e16976

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ib97949e3840a3b49f7074b16e259582a9bf16e3b
Reviewed-on: https://review.whamcloud.com/45479
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoEX-4058 lamigo: don't generate useless noise
Alex Zhuravlev [Thu, 14 Oct 2021 11:06:05 +0000 (14:06 +0300)]
EX-4058 lamigo: don't generate useless noise

Stop generating useless debugging messages when the statfs data
and hot/cold periods have not changed.

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I3e601e3395b1f49a9e6bd8ae94fc7da9987a02a3
Reviewed-on: https://review.whamcloud.com/45241
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46098

3 years agoEX-4061 lipe: reinitialize channel after alr disconnection
John L. Hammond [Thu, 14 Oct 2021 16:24:47 +0000 (11:24 -0500)]
EX-4061 lipe: reinitialize channel after alr disconnection

Move most of lamigo_alr_data_collection_thread() into
lamigo_alr_agent_run() which manages ssh channel resources
properly. Keep a while-try-sleep loop in
lamigo_alr_data_collection_thread().

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ibe866f5b4a6ddea560dcb57fb0ed3fbbf3d52710
Reviewed-on: https://review.whamcloud.com/45243
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46097

3 years agoEX-3889 lipe: lamigo/lpurge error reporting
Alexandre Ioffe [Wed, 6 Oct 2021 02:37:20 +0000 (19:37 -0700)]
EX-3889 lipe: lamigo/lpurge error reporting

Replace LAMIGO_{FATAL,ERROR,WARN,INFO,DEBUG}()
with the more generally named macros
LX_{FATAL,ERROR,WARN,INFO,DEBUG}()
and use them for both lamigo and lpurge.
From now on lipe will not use llapi_printf(),
only LX_{FATAL,ERROR,WARN,INFO,DEBUG}() and
llapi_error().

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: Alexandre Ioffe <aioffe@ddn.com>
Change-Id: I4516bb737ec8a308b6e39be2767fd5e03e8b3c61
Reviewed-on: https://review.whamcloud.com/45131
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46096

3 years agoEX-3889 lipe: lamigo error reporting and signal handling
John L. Hammond [Wed, 22 Sep 2021 19:15:52 +0000 (14:15 -0500)]
EX-3889 lipe: lamigo error reporting and signal handling

Add new macros LAMIGO_{FATAL,ERROR,WARN,INFO,DEBUG}() to replace the
existing calls to llapi_error() and llapi_printf(). Replace almost all
open coded calls to exit() with LAMIGO_FATAL(). Handle signals
(SIGTERM, SIGUSR1, SIGUSR2) from a dedicated thread. Add
x{malloc,calloc,strdup}() macros that call LAMIGO_FATAL() on OOM
conditions. In main() replace the while (!stop) loop with a
non-breaking while (1) loop.

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Idc31da6eca847305ca16b9992a7fb22aa4d0f112
Reviewed-on: https://review.whamcloud.com/45026
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-on: https://review.whamcloud.com/45210

3 years agoLU-14781 osp: osp_object_free access NULL pointer
Bobi Jam [Tue, 2 Nov 2021 07:14:52 +0000 (15:14 +0800)]
LU-14781 osp: osp_object_free access NULL pointer

If an osp_object is created by multiple threads at the same time,
lu_object_find_at() could allocate an osp_object without calling
osp_object_init(). If, before inserting the object into the hash, it
finds that another thread has already created and inserted the object,
it frees the uninitialized osp_object, and osp_object_free() then
accesses an uninitialized list_head (opo_xattr_list).

Initialize the osp_object fields in osp_object_alloc() to avoid this.

Call trace:
            lu_object_free.isra.30+0xf2/0x170 [obdclass]
            lu_object_find_at+0x496/0x930 [obdclass]
            lod_initialize_objects+0x3e4/0xba0 [lod]
            lod_parse_striping+0x693/0xc20 [lod]
            lod_striping_load+0x2b2/0x660 [lod]
            lod_declare_destroy+0x12b/0x600 [lod]
            mdd_declare_finish_unlink+0x91/0x210 [mdd]
            mdd_unlink+0x48f/0xab0 [mdd]
            mdt_reint_unlink+0xc32/0x1550 [mdt]
            mdt_reint_rec+0x83/0x210 [mdt]
            mdt_reint_internal+0x6e1/0xb00 [mdt]
            mdt_reint+0x67/0x140 [mdt]
            tgt_request_handle+0xaee/0x15f0 [ptlrpc]
            ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
            ptlrpc_main+0xb34/0x1470 [ptlrpc]
            kthread+0xd1/0xe0

Lustre-commit: TBD (from 20dde5a8d428b3f9bf2d0421b333a09545be1c65)
Lustre-change: https://review.whamcloud.com/45442

Fixes: 226fd401f9d ("LU-7660 dne: support fs default stripe")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ib86aca5b41e94a1758f177655ea3a0f680335e0f
Reviewed-on: https://review.whamcloud.com/46094
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
3 years agoEX-4334 tests: disable sanity test_428 temporarily
Andreas Dilger [Fri, 14 Jan 2022 06:38:33 +0000 (23:38 -0700)]
EX-4334 tests: disable sanity test_428 temporarily

sanity test_428 is crashing regularly (about 1/15 runs) on b_es6_0.
Disable it until it is fixed.

Test-Parameters: trivial testlist=sanity env=ONLY=428,ONLY_REPEAT=180
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id3c7722d5f4c4d084bf1dab83733aae8f9d8366f
Reviewed-on: https://review.whamcloud.com/46109
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
3 years agoEX-4189 lov: include FID in some lov asserts
John L. Hammond [Thu, 4 Nov 2021 16:12:57 +0000 (11:12 -0500)]
EX-4189 lov: include FID in some lov asserts

Include the file FID in the assertions in lov_entry() and
lov_mirror_entry(). Use these two functions more consistently in the
lov layer.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I65978fe409842289c158021fb1b8042916d90e23
Reviewed-on: https://review.whamcloud.com/46093
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoEX-3046 lipe: remove lamigo_init_vars()
John L. Hammond [Mon, 20 Sep 2021 14:48:42 +0000 (09:48 -0500)]
EX-3046 lipe: remove lamigo_init_vars()

Initialize static lists in their declarations. Remove the unused ssh
global variable. Remove the then unnecessary function
lamigo_init_vars().

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ia5c36e9d2b8f9b0d9467d55c7a2b9d3e7b9f2cf1
Reviewed-on: https://review.whamcloud.com/43398
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45209
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoEX-3843 lipe: lamigo signal handling
John L. Hammond [Mon, 20 Sep 2021 14:16:52 +0000 (09:16 -0500)]
EX-3843 lipe: lamigo signal handling

In lamigo, add a SIGTERM handler that calls psignal() and exits with
status 0. Remove lamigo_null_handle() and replace with SIG_IGN. Set
SA_RESTART on the handlers for SIGUSR1 and SIGUSR2.

Test-Parameters: trivial testlist=hot-pools
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ia329a17836cedb1e0d951a67619b828a63c12e67
Reviewed-on: https://review.whamcloud.com/44987
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45208
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoEX-4552 lipe: use version-gen.sh
John L. Hammond [Thu, 13 Jan 2022 14:57:29 +0000 (08:57 -0600)]
EX-4552 lipe: use version-gen.sh

Now that b_es6_0 has a lipe tag we can uncomment the code in version-gen.sh.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Iebf497282197add8893f68b19e3bed113f388208
Reviewed-on: https://review.whamcloud.com/46095
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
3 years agoEX-1873 iokit: fix the obsolete usage of cfg_device
Hongchao Zhang [Wed, 14 Oct 2020 01:46:00 +0000 (09:46 +0800)]
EX-1873 iokit: fix the obsolete usage of cfg_device

The LCTL command "cfg_device" is obsolete and some operations
(such as "cleanup", "detach") don't support it anymore.
In mds_survey and lfsck-performance it causes the echo client
device not to be destroyed and causes LBUG when umounting the
related Lustre device.

Lustre-change: https://review.whamcloud.com/40227
Lustre-commit: 2e6342a7365825091d9c7b25418033c02ecfbb12

Change-Id: If7f6eff080906e395023289652fcd2a78dfb6fb7
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40227
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45879
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14895 osd-ldiskfs: combine checksum functions
Andreas Dilger [Wed, 4 Aug 2021 09:42:37 +0000 (03:42 -0600)]
LU-14895 osd-ldiskfs: combine checksum functions

Reduce code duplication for nearly-identical checksum calculations.
The osd_dif_type1_generate() and osd_dif_type3_generate() were nearly
the same, as were osd_dif_type1_verify() and osd_dif_type3_verify().
Combine these functions to share the code, and handle the difference
between T10-PI type 1 and type 3 with an argument.
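
The idea of folding two near-identical routines into one with a type
argument can be sketched as follows. All names here (pi_generate,
pi_tuple, toy_guard) are hypothetical and this is not the osd-ldiskfs
code; byte order handling is omitted. The sketch assumes the usual
T10-PI distinction that type 1 stores the low 32 bits of the sector
number in the reference tag, while type 3 leaves the reference tag
unset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum pi_type { PI_TYPE1 = 1, PI_TYPE3 = 3 };

/* One protection tuple per sector. */
struct pi_tuple {
	uint16_t guard;	/* checksum of the sector data */
	uint16_t app;	/* application tag, unused in this sketch */
	uint32_t ref;	/* reference tag */
};

typedef uint16_t (*guard_fn)(const uint8_t *data, size_t len);

/* Toy guard function; a real implementation would use a CRC or
 * IP checksum here. */
static uint16_t toy_guard(const uint8_t *data, size_t len)
{
	uint16_t sum = 0;

	for (size_t i = 0; i < len; i++)
		sum = (uint16_t)(sum + data[i]);
	return sum;
}

/*
 * Single generate routine for both types. The only type-dependent
 * step is how the reference tag is filled in.
 */
static void pi_generate(const uint8_t *data, size_t sector_size,
			size_t nsect, uint64_t first_sector,
			guard_fn fn, enum pi_type type,
			struct pi_tuple *out)
{
	for (size_t i = 0; i < nsect; i++) {
		out[i].guard = fn(data + i * sector_size, sector_size);
		out[i].app = 0;
		out[i].ref = (type == PI_TYPE1) ?
			     (uint32_t)(first_sector + i) : 0;
	}
}
```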

Lustre-change: https://review.whamcloud.com/44656
Lustre-commit: 7fdd664b3518e5e8d8a243898d48d9c62c22e18a

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I40afb15fd80577ef6de918c90e4111e775ce7057
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45828
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-15069 llite: Add start_idx debug
Patrick Farrell [Wed, 15 Dec 2021 17:08:32 +0000 (12:08 -0500)]
LU-15069 llite: Add start_idx debug

When readahead is triggered, current readahead debug
prints the page the user requested which triggered
readahead and the number of pages read by readahead.

However, readahead does not necessarily start reading from
the user requested page, so it's important to also print
the page where readahead starts.

Test-Parameters: trivial

lustre-change: https://review.whamcloud.com/45674/
lustre-commit: ca2bea3659e43649c5f229d7db3f850964b035c6 (tbd)

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ie474811f3b0076f4f914fae7f74496e96ddb31da
Reviewed-on: https://review.whamcloud.com/45865
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-15317 llite: Add D_IOTRACE
Patrick Farrell [Wed, 15 Dec 2021 17:07:59 +0000 (12:07 -0500)]
LU-15317 llite: Add D_IOTRACE

When looking into performance problems, it's very important
to be able to trace the I/O patterns from userspace into
Lustre, and also understand the key basics of how Lustre
handles that I/O (readahead, RPC generation).

This is best done with a dedicated debug flag - No
userspace tool can provide all this information, and
existing debug flags collect a huge number of unrelated
pieces of, well, debug information.

The goal is for customers to be able to quickly gather log
files of a reasonable size which contain the necessary
information and which can easily be interpreted by
engineering.  This is not possible if the information is
spread out across a number of heavyweight debug flags.

This is a first pass at adding the flag and the debug
required to track basic data I/O.  One significant
omission in the first patch is RPC generation - I have not
decided how best to do that yet.  That will be added in a
future patch.

lustre-change: https://review.whamcloud.com/#/c/45752/
lustre-commit: e77ef62eb25195ddc4ef63c75dbe7342ddb2b3f5 (tbd)

test-parameters: trivial

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I0ed003ec1488e1c267b194c871f64b34f6dc6025
Reviewed-on: https://review.whamcloud.com/45864
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-15317 libcfs: Remove D_TTY
Patrick Farrell [Wed, 15 Dec 2021 17:06:42 +0000 (12:06 -0500)]
LU-15317 libcfs: Remove D_TTY

The D_TTY flag is almost entirely unused and certainly not
needed.  Remove it so we have a spare flag to use for
iotrace.

test-parameters: trivial

lustre-change: https://review.whamcloud.com/45751/
lustre-commit: 8317690ae36918109594208811c3c6358fe46e18 (tbd)

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I1127cbcf6ee51adc07d560a8827fa1e32d16c90c
Reviewed-on: https://review.whamcloud.com/45863
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-15137 socklnd: decrement connection counters on close
Serguei Smirnov [Sat, 30 Oct 2021 18:39:26 +0000 (11:39 -0700)]
LU-15137 socklnd: decrement connection counters on close

To gracefully handle potential race with delayed connection create,
decrement connection counters per type as connections are being
closed.

Lustre-change: https://review.whamcloud.com/45422
Lustre-commit: 7e26413aa85fdc931721cde36bae3bf2bb97e63f

Test-Parameters: trivial testlist=sanity-lnet
Fixes: cbf740d0 ("LU-12815 socklnd: add conns_per_peer parameter")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ieb3b44701e4999ea1fe63234162dd5878d65958a
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46051
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-15137 socklnd: expect two control connections maximum
Serguei Smirnov [Thu, 4 Nov 2021 18:35:43 +0000 (11:35 -0700)]
LU-15137 socklnd: expect two control connections maximum

As a result of connecting to ourselves, e.g. pinging own nid,
two control type connections are established vs. just one
in case of connecting externally.
Fix the control connection counter to be able to handle that.

Lustre-change: https://review.whamcloud.com/45461
Lustre-commit: ee9a03d8308c5918a17e2e45fd59ee5a4c38acaf

Test-Parameters: trivial testlist=sanity-lnet
Fixes: cbf740d0 ("LU-12815 socklnd: add conns_per_peer parameter")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Idce01d81e3924226b5b163d2472cbcd4f6eb5819
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46050
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
3 years agoLU-14138 ptlrpc: move more members in PTLRPC request into pill
Qian Yingjin [Tue, 17 Nov 2020 15:12:44 +0000 (23:12 +0800)]
LU-14138 ptlrpc: move more members in PTLRPC request into pill

Some data members in the data structure @ptlrpc_request can be
moved into the data structure @req_capsule:
    /** Request message - what client sent */
    struct lustre_msg *rq_reqmsg;
    /** Reply message - server response */
    struct lustre_msg *rq_repmsg;
    /** Fields that help to see if request and reply were swabbed */
    __u32 rq_req_swab_mask;
    __u32 rq_rep_swab_mask;

After these data structures are restructured, @req_capsule can
be used more generally, and it becomes easier to pack and unpack
sub requests in a batch PtlRPC request for the upcoming batch
metadata processing.

Lustre-change: https://review.whamcloud.com/40669
Lustre-commit: f75d2a1fc9b17b384bbcbc13bcb80ba10412cf29

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ib6d942b79ebf1a444d63b55ad4bc94813cf947c7
Reviewed-on: https://review.whamcloud.com/46029
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-13055 doc: update changelog manpages
Mikhail Pershin [Thu, 17 Jun 2021 14:11:51 +0000 (17:11 +0300)]
LU-13055 doc: update changelog manpages

Add lctl-changelog_register.8 and lctl-changelog_deregister.8
manpages and update lctl.8 manpage to refer to them.

Lustre-change: https://review.whamcloud.com/44022
Lustre-commit: 393885c027793d27ec948fd4fccb47aa530d2bf8

Fixes: 15305c3c3fe7 ("LU-12214 build: fix build without lustre_utils")
Test-Parameters: trivial
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ie41db630c72f61a884cd8000e0a4aeeb42ca60eb
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46007
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-5369 mdt: check lock handle instead assert
Yang Sheng [Mon, 13 Sep 2021 21:04:00 +0000 (05:04 +0800)]
LU-5369 mdt: check lock handle instead assert

The lock handle could be NULL in some corner cases.
Check it instead of triggering an LBUG.

Lustre-change: https://review.whamcloud.com/44905
Lustre-commit: 5e4411e99cd7d0ccf4e51fac1442673844626639

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I1afa7f8c129c104b012ae23141318365c388c503
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46019
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14474 llog: don't destroy next llog
Alex Zhuravlev [Tue, 21 Sep 2021 12:23:56 +0000 (15:23 +0300)]
LU-14474 llog: don't destroy next llog

Do not destroy an empty llog if it is referenced as
the next one in a catalog.

Lustre-change: https://review.whamcloud.com/44998
Lustre-commit: 4521f6af35d1dc20b531b87ff3633d89dbac86ec

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I78bfeb90435aaee2b8536b647aa3acec56642ea0
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45892
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-15168 osd: use large allocation for idc cache
Alex Zhuravlev [Wed, 27 Oct 2021 05:48:03 +0000 (08:48 +0300)]
LU-15168 osd: use large allocation for idc cache

Use a large allocation because in some cases (e.g. ofd precreate) the
cache can grow to dozens of kilobytes
(sizeof(struct idc_map_cache)=40 * 1024).

Lustre-commit: a3aa2eefd3d4708ce7094ed644c30b784c39eb2c
Lustre-change: https://review.whamcloud.com/45382

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Id9e0996a7a1d07065f4a50c1d5be5051e756559a
Reviewed-on: https://review.whamcloud.com/46040
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14959 ldlm: Check return value of ldlm_resource_get()
Oleg Drokin [Tue, 24 Aug 2021 03:44:45 +0000 (23:44 -0400)]
LU-14959 ldlm: Check return value of ldlm_resource_get()

Fix the comment to properly indicate it returns ERR_PTR on
error and fix osc_req_attr_set() and mdc_get_lock_handle()
to actually check the return value before passing it on and
causing an unintended crash.
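
The ERR_PTR convention can be illustrated in userspace. The three
helpers below re-implement the kernel's ERR_PTR()/PTR_ERR()/IS_ERR()
idea for the sketch; resource_get() and use_resource() are hypothetical
and do not reproduce ldlm_resource_get() or its callers. Note that a
plain NULL check would not catch an ERR_PTR value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in a pointer value. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if ptr falls in the top MAX_ERRNO values of the address space. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct resource {
	int refcount;
};

/* Hypothetical lookup: returns ERR_PTR(-ENOMEM) on failure, never NULL. */
static struct resource *resource_get(struct resource *table, size_t n,
				     size_t idx)
{
	if (idx >= n)
		return ERR_PTR(-ENOMEM);
	table[idx].refcount++;
	return &table[idx];
}

/* Caller checks IS_ERR() before dereferencing the result. */
static int use_resource(struct resource *table, size_t n, size_t idx)
{
	struct resource *res = resource_get(table, n, idx);

	if (IS_ERR(res))
		return (int)PTR_ERR(res);
	return res->refcount;
}
```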

Lustre-change: https://review.whamcloud.com/44738
Lustre-commit: 3e0aa9ca6e0a9a6981b9a3ad5f556cd6554a6b5b

Change-Id: Ib85a62140a39744e85989c9a9c8aa2ed771d70d1
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46016
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoEX-4270 kernel: increase kernel version to ddn16
Andreas Dilger [Sat, 8 Jan 2022 06:17:20 +0000 (23:17 -0700)]
EX-4270 kernel: increase kernel version to ddn16

Increase kernel build version to -ddn16 due to new kernel patch.

Lustre-change: https://review.whamcloud.com/45869
Lustre-commit: 9cb39cdf470f444decaf183af7b4b6f6a79f80bf

Fixes: afd8b0df0aba ("EX-4270 snapshot: avoid call quota op recursively")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Icf0d404ea5ebfb1009078a286585d837b37417ea
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/46023
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoEX-4270 snapshot: avoid call quota op recursively
Hongchao Zhang [Tue, 30 Nov 2021 10:11:01 +0000 (18:11 +0800)]
EX-4270 snapshot: avoid call quota op recursively

In ext4_snapshot_test_and_cow(), if a quota call is already in
progress, the snapshot code can deadlock when it recursively calls a
quota function to allocate space.

[only the change to snapshot-jbd2-rhel7.7.patch]

Lustre-change: https://review.whamcloud.com/45680
Lustre-commit: 4722f1a0ca9d24bf6fa2678659ccf2cb1be5cdf1

Test-Parameters: trivial
Change-Id: Iac354744fcee8955d8e41020f9cee6d433f38e80
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46009
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
3 years agoLU-14793 hsm: record index for further HSM action scanning
Qian Yingjin [Fri, 25 Jun 2021 08:22:35 +0000 (16:22 +0800)]
LU-14793 hsm: record index for further HSM action scanning

There is contention between HSM archive requests and the "hsm_cdtr"
kernel thread:
->mdt_hsm_request()
  ->mdt_hsm_add_actions()
    ->mdt_hsm_register_hal()
      ->mdt_agent_record_add()
        ->down_write(&cdt->cdt_llog_lock)
        ->llog_cat_add()
        ->up_write(&cdt->cdt_llog_lock)

->mdt_coordinator()
  ->cdt_llog_process()
    ->down_write(&cdt->cdt_llog_lock);
    ->llog_cat_process()
    ->up_write(&cdt->cdt_llog_lock);

HSM archive requests and the HSM cat llog scanning in the kernel
daemon "hsm_cdtr" both contend for the llog write lock to add to or
update the "hsm_actions" llog.

The testing uses max_requests = 1000000.
In the current implementation, this means the kernel daemon thread
"hsm_cdtr" needs to scan nearly the whole "hsm_actions" llog from the
beginning with the llog write lock held.
This slows down the HSM archive requests, which are contending for
the llog write lock.

As the llog is append-only, record the latest handled position in
the llog, so that the next scan can start from the previously recorded
position (llog index) and does not need to start from the beginning.
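
The resume-from-index idea can be sketched with a plain array standing
in for the append-only llog. The names (action_log, log_scan, sum_cb)
are hypothetical, locking is left out, and the sketch ignores records
that need to be revisited, so it is a simplification of what the
coordinator actually has to handle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical append-only log with a remembered scan position. */
struct action_log {
	const int *records;
	size_t count;		/* records appended so far */
	size_t last_index;	/* first record not yet handled */
};

/*
 * Visit only the records appended since the previous scan and
 * remember where this scan stopped. Returns the number visited.
 */
static size_t log_scan(struct action_log *log,
		       void (*cb)(int rec, void *arg), void *arg)
{
	size_t visited = 0;

	for (size_t i = log->last_index; i < log->count; i++) {
		cb(log->records[i], arg);
		visited++;
	}
	log->last_index = log->count;
	return visited;
}

/* Example callback: accumulate record values into a long. */
static void sum_cb(int rec, void *arg)
{
	*(long *)arg += rec;
}
```

The time spent holding the lock then scales with the number of new
records rather than with the total length of the log.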

Another way to mitigate this problem:
when the llog scanner finds that other processes are contending for
the llog lock, it stops scanning and releases the llog write lock
for incoming HSM archive requests.

With this patch applied and 200000 HSM actions in the llog, the time
to queue 10000 HSM archive requests drops from 10 seconds to 4
seconds.

Lustre-change: https://review.whamcloud.com/44077
Lustre-commit: a15a5432f8063e3a04a87d74eafac0060a8f9d26

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I2e92daf34844605ee648787daf859143335c68bf
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46013
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14724 nrs: TBF rule list broken when change rule rank
Qian Yingjin [Fri, 28 May 2021 03:56:12 +0000 (11:56 +0800)]
LU-14724 nrs: TBF rule list broken when change rule rank

When changing the rank of two adjacent rules in the TBF rule list,
@nrs_tbf_rule_change_rank() calls:
list_move(&rule->tr_linkage, next_rule->tr_linkage.prev);

The previous pointer of @next_rule is @rule itself, so using
list_move() directly breaks the rule list.
This patch uses list_del + list_add to replace list_move and
avoid breaking the TBF rule list.
Also add a test case, sanityn test_77o, for this bug.
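
The failure can be reproduced with a small userspace model. The list
helpers below are minimal re-implementations modelled on the kernel's
list_head API, and place_before_buggy()/place_before_fixed() are
hypothetical names, not the functions in the patch. In the buggy
variant the head argument (next_rule->prev) is evaluated before the
entry is unlinked, so when it is the entry itself the entry ends up
linked to itself.

```c
#include <assert.h>

struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink entry; its own pointers are left untouched. */
static void list_del(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

static void list_move(struct list_head *entry, struct list_head *head)
{
	list_del(entry);
	list_add(entry, head);
}

/* Buggy: if rule == next_rule->prev, this is list_move(rule, rule). */
static void place_before_buggy(struct list_head *rule,
			       struct list_head *next_rule)
{
	list_move(rule, next_rule->prev);
}

/* Fixed: unlink first, then read next_rule->prev. */
static void place_before_fixed(struct list_head *rule,
			       struct list_head *next_rule)
{
	list_del(rule);
	list_add(rule, next_rule->prev);
}

/* Count entries reachable from head, giving up after limit steps. */
static int list_len(const struct list_head *head, int limit)
{
	int n = 0;

	for (const struct list_head *p = head->next;
	     p != head && n < limit; p = p->next)
		n++;
	return n;
}
```

With a list head -> a -> b, the fixed variant leaves both entries
reachable, while the buggy variant drops a from the forward chain and
leaves b's prev pointer referring to the detached entry.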

Lustre-change: https://review.whamcloud.com/43925
Lustre-commit: e688f29275deeadc0ef4faa01f166986bade301f

Fixes: aa14b0b9a152 ("LU-8006 ptlrpc: specify ordering of TBF policy rules")
Change-Id: Ica30d3329f07914657ac2c4089d66f934021b763
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46017
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14713 llite: mend the trunc_sem_up_write()
Bobi Jam [Tue, 3 Aug 2021 06:38:46 +0000 (14:38 +0800)]
LU-14713 llite: mend the trunc_sem_up_write()

The original lli_trunc_sem replace change (commit e5914a61ac) fixed a
lock scenario:

  t1 (page fault)          t2 (dio read)              t3 (truncate)
|- vm_mmap_pgoff()       |- vvp_io_read_start()     |- vvp_io_setattr
 |- down_write(mmap_sem)  |- down_read(trunc_sem)            _start()
  |- do_map()              |- ll_direct_IO_impl()
   |- vvp_io_fault_start    |- ll_get_user_pages()

                                                     |- down_write(
                             |- down_read(mmap_sem)        trunc_sem)
    |- down_read(trunc_sem)

t1 waits for the read side of trunc_sem, which is blocked by t3,
since t3 is waiting for the write side while t2 holds the read side;
t2 in turn is waiting for mmap_sem, which has been taken by t1,
and a deadlock ensues.

Commit e5914a61ac changed the down_read(trunc_sem) in the page fault
path to trunc_sem_down_read_nowait(), which ignores a waiting
down_write(trunc_sem) and takes the read semaphore as long as no
writer holds it, breaking the deadlock.

But there is a subtlety in using wake_up_var(): wake_up_var()->
__wake_up_bit()->waitqueue_active() tests locklessly for waiters on the
queue. If it is called without an explicit smp_mb(), the
waitqueue_active() check can be hoisted before the condition store, so
that the waker observes an empty wait list while the waiter does not
observe the condition, and the waiter is never woken up afterwards.

Lustre-change: https://review.whamcloud.com/43844
Lustre-commit: 39745c8b5493159bbca62add54ca9be7cac6564f

Fixes: e5914a61ac ("LU-12460 llite: replace lli_trunc_sem")
Change-Id: Ifdda2c1c8a4171466be1723923c136e84de8ce0e
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46014
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
3 years agoLU-14854 mdd: proper handle error in mdd_swap_layouts()
Bobi Jam [Thu, 15 Jul 2021 18:22:38 +0000 (02:22 +0800)]
LU-14854 mdd: proper handle error in mdd_swap_layouts()

Only restore object's HSM xattr on error if it's for
SWAP_LAYOUTS_MDS_HSM.

Lustre-change: https://review.whamcloud.com/44319
Lustre-commit: 7648c1c905b0976fc789cfd9c6bac382389385ee

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I9d4c58cd3107c3900e72a0946d0ec7d7286dd43f
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46021
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
3 years agoLU-14895 brw: log T10 GRD tags during checksum calcs
Andreas Dilger [Wed, 4 Aug 2021 08:08:12 +0000 (02:08 -0600)]
LU-14895 brw: log T10 GRD tags during checksum calcs

Log the T10 guard tags during checksum calculation on the client and
target to help identify where checksum errors are being introduced.
The added debugging is only active on RPC resend, so will not add
overhead during the normal IO path.

Lustre-change: https://review.whamcloud.com/44655
Lustre-commit: 75ebfb994fb0bce8a0f0400429f04127ead50ea4

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia4f14f2f2296da096acf629c74558386e7ce7057
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46053
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14598 ofd: fix for IDIF sequence at ofd_preprw_write
Alexander Boyko [Thu, 8 Apr 2021 08:23:54 +0000 (04:23 -0400)]
LU-14598 ofd: fix for IDIF sequence at ofd_preprw_write

During recovery, a write operation could create and load a sequence
if it arrives before the creation request from MDT0. ofd_preprw_write()
uses the wrong logic to get the sequence for IDIF FIDs. If the oid
overflows 32 bits and spills into the IDIF sequence, the write request
loads the wrong ofd sequence, which is then used for other IO. The next
create from MDT0 causes an error:
Too many FIDs to precreate OST replaced or reformatted...

Test 122b reproduces the issue of an OST using the wrong sequence for
MDT0 IDIF. The error requires object IDs greater than 32 bits and a
write request during recovery that is processed before the create
request from MDT0.
For the error to be visible on the console, the last object ID should be
(1 << 32) + (OST_MAX_PRECREATE * 5). The error is:
lustre-OST0000: Too many FIDs to precreate OST replaced or
    reformatted: LFSCK will clean up

Lustre-change: https://review.whamcloud.com/43248
Lustre-commit: 747fed818be5a4e09281ab1d9fd5b3a13763ab40

HPE-bug-id: LUS-9595
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I09e6f88b1f0d03fec59b24ef096cbc7baa5388ae
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46015
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
3 years agoLU-14951 llite: protect fd_{lease_}och
Bobi Jam [Wed, 18 Aug 2021 13:32:21 +0000 (21:32 +0800)]
LU-14951 llite: protect fd_{lease_}och

Access to ll_file_data::fd_och and fd_lease_och needs lli_och_mutex
protection.

Lustre-change: https://review.whamcloud.com/44700
Lustre-commit: b275ccd9787753b9cbf4368d8611c2ac94726e2e

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ie9136aa345c6bf015aa73067acdaecf1a765b9f6
Reviewed-on: https://review.whamcloud.com/46030
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
3 years agoLU-15156 kernel: back port patch for rwsem issue
Yang Sheng [Tue, 11 Jan 2022 17:06:05 +0000 (01:06 +0800)]
LU-15156 kernel: back port patch for rwsem issue

RHEL7 includes a defect in rwsem that can cause a
thread to hang on the rwsem indefinitely. Backport
commit 5c1ec49b60cdb31e51010f8a647f3189b774bddf
to fix this issue.

Lustre-commit: 85362faed8f5ee94ffee1f3f6330beee57ea9284
Lustre-change: https://review.whamcloud.com/45383

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Ic5c469ce744ad5882c13163a9bfe14faef8fd446
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46041
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14734 ldiskfs: improve message for large_dir
Andreas Dilger [Tue, 11 Jan 2022 17:49:41 +0000 (09:49 -0800)]
LU-14734 ldiskfs: improve message for large_dir

Make it more clear that the large_dir feature has already been
enabled, rather than making the admin think that they need to
enable the feature themselves.

Lustre-change: https://review.whamcloud.com/45046
Lustre-commit: 2a24b6ec67da9224e1cb6226166cde3a9c95431d

Test-Parameters: trivial
Fixes: f5967b06aac5 ("LU-14734 osd-ldiskfs: enable large_dir automatically")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ica59d3370148ed277d3541c05be065c4638daf8d
Reviewed-on: https://review.whamcloud.com/46045
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
3 years agoLU-13397 llite: support fallocate() on selected mirror
Mikhail Pershin [Sun, 22 Aug 2021 19:41:33 +0000 (22:41 +0300)]
LU-13397 llite: support fallocate() on selected mirror

- add ability to do fallocate() on designated mirror in
  FLR file
- add missing FALLOC_FL_KEEP_SIZE flag to fallocate() call
  in llapi_hole_punch(). Without that flag the call was
  silently not working
- add corresponding test_50d in sanity-flr.sh

Lustre-change: https://review.whamcloud.com/44721
Lustre-commit: 89736d502cc99f095237dde7520fc4ca86191882

Fixes: 4126fbb30c ("LU-13397 lfs: mirror resync to keep sparseness")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I8d700fce904c84458a50650f1d3cb09d23989eba
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46032
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
3 years agoEX-3626 build: build ptlrpc_gss during ubuntu dkms
Minh Diep [Mon, 9 Aug 2021 19:45:45 +0000 (12:45 -0700)]
EX-3626 build: build ptlrpc_gss during ubuntu dkms

include ptlrpc_gss in dkms.conf

Lustre-change: https://review.whamcloud.com/44539

Change-Id: I952a7019b2bc5687507fdb1f274c100152dae6cd
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46018
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
3 years agoLU-14807 lfsck: fix race in lfsck_pos_fill
Hongchao Zhang [Sun, 27 Jun 2021 21:00:20 +0000 (05:00 +0800)]
LU-14807 lfsck: fix race in lfsck_pos_fill

There is a race for lfsck->li_di_dir between lfsck_di_dir_put() and
lfsck_pos_fill(), which could cause lfsck_pos_fill() to use the freed
lfsck->li_di_dir (struct osd_it_ea) and trigger a GPF.

Lustre-change: https://review.whamcloud.com/44130
Lustre-commit: 911f638bd6c547591e784fcec668fe9811916e21

Change-Id: Iedadf03ac15d128bb051aea8aafa24dbcd2704fb
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46020
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>