Whamcloud - gitweb
fs/lustre-release.git
12 months ago LU-13199 lustre: remove cl_{offset,index,page_size} helpers 26/37426/4
Wang Shilong [Tue, 4 Feb 2020 10:35:20 +0000 (18:35 +0800)]
LU-13199 lustre: remove cl_{offset,index,page_size} helpers

These helpers can be replaced with direct PAGE_SIZE and PAGE_SHIFT
calculations, which avoids CPU overhead.
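
As a rough illustration of the direct calculation, the sketch below converts
between byte offsets and page indices with plain shifts. DEMO_PAGE_SHIFT,
DEMO_PAGE_SIZE and both function names are hypothetical stand-ins (4 KiB
pages are assumed); the kernel code would use PAGE_SHIFT and PAGE_SIZE.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's PAGE_SHIFT and PAGE_SIZE;
 * 4 KiB pages are assumed here. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  (UINT64_C(1) << DEMO_PAGE_SHIFT)

/* Byte offset -> page index: a single right shift. */
static inline uint64_t demo_page_index(uint64_t offset)
{
	return offset >> DEMO_PAGE_SHIFT;
}

/* Page index -> byte offset of the start of that page. */
static inline uint64_t demo_page_offset(uint64_t index)
{
	/* Guard against shifting bits off the top of the 64-bit value. */
	assert(index <= (UINT64_MAX >> DEMO_PAGE_SHIFT));
	return index << DEMO_PAGE_SHIFT;
}
```

With these definitions, offset 4096 falls in page 1 and page 3 starts at
byte 12288; no helper call or object lookup is involved.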

Change-Id: I624136d4399a03e599f09f00a77b86de045f19e9
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/37426
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
12 months ago LU-9859 lnet: convert selftest to use workqueues 91/36991/26
Mr NeilBrown [Fri, 17 Jul 2020 23:11:29 +0000 (19:11 -0400)]
LU-9859 lnet: convert selftest to use workqueues

Instead of the cfs workitem library, use workqueues.

As lnet wants to provide a cpu mask of allowed cpus, it
needs to be a WQ_UNBOUND work queue so that tasks can
run on cpus other than where they were submitted.
We use alloc_ordered_workqueue for lst_sched_serial (now called
lst_serial_wq) - "ordered" means the same as "serial" did.
We use cfs_cpt_bind_queue() for the other workqueues which sets up the
CPU mask as required.

An important difference with workqueues is that there is no equivalent
to cfs_wi_exit() which can be called in the action function and which
will ensure the function is not called again - and that the item is no
longer queued.

To provide similar semantics we treat swi_state == SWI_STATE_DONE as
meaning that the wi is complete and any further calls must be no-op.
We also call cancel_work_sync() (via swi_cancel_workitem()) before
freeing or reusing memory that held a work-item.

To ensure the same exclusion that cfs_wi_exit() provided the state is
set and tested under a lock - either crpc_lock, scd_lock, or tsi_lock
depending on which structure the wi is embedded in.

Another minor difference is that with workqueues the action function
returns void, not an int.

Also change SWI_STATE_* from #define to an enum.  The only place these
values are ever stored is in one field in a struct.
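
The "DONE means no-op" rule described above can be sketched in user space as
follows. This is only an illustration under stated assumptions: the demo_*
names are hypothetical, a pthread mutex stands in for crpc_lock, scd_lock or
tsi_lock, and the cancel_work_sync() step is not modelled.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical names: the commit's enum is SWI_STATE_*, and the lock
 * stands in for crpc_lock, scd_lock, or tsi_lock. */
enum demo_swi_state {
	DEMO_SWI_STATE_NEW,
	DEMO_SWI_STATE_RUNNING,
	DEMO_SWI_STATE_DONE,
};

struct demo_workitem {
	pthread_mutex_t     lock;
	enum demo_swi_state state;
	int                 runs;	/* times the action did real work */
};

static void demo_wi_init(struct demo_workitem *wi)
{
	assert(wi != NULL);
	pthread_mutex_init(&wi->lock, NULL);
	wi->state = DEMO_SWI_STATE_NEW;
	wi->runs = 0;
}

/* Action function: returns void, and is a no-op once the item is DONE. */
static void demo_wi_action(struct demo_workitem *wi)
{
	pthread_mutex_lock(&wi->lock);
	if (wi->state != DEMO_SWI_STATE_DONE) {
		wi->state = DEMO_SWI_STATE_RUNNING;
		wi->runs++;
	}
	pthread_mutex_unlock(&wi->lock);
}

/* Mark the item complete; the state is set under the same lock that
 * the action function tests it under. */
static void demo_wi_done(struct demo_workitem *wi)
{
	pthread_mutex_lock(&wi->lock);
	wi->state = DEMO_SWI_STATE_DONE;
	pthread_mutex_unlock(&wi->lock);
}
```

Calling demo_wi_action() again after demo_wi_done() leaves the run counter
unchanged, which is the behaviour cfs_wi_exit() used to guarantee.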

Linux-commit: 6106c0f82481e686b337ee0c403821fb5c3c17ef
Linux-commit: 3fc0b7d3e0a4d37e4c60c2232df4500187a07232
Linux-commit: 7d70718de014ada7280bb011db8655e18ed935b1

Test-Parameters: trivial testlist=lnet-selftest
Change-Id: I5ccf1399ebbfdd4cab3696749bd1ec666147b757
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/36991
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months ago LU-12273 obd: Reserve metadata overstriping flags 07/49707/5
Patrick Farrell [Thu, 19 Jan 2023 18:48:07 +0000 (13:48 -0500)]
LU-12273 obd: Reserve metadata overstriping flags

Reserve flag bits for metadata overstriping.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: I894b9420a4b08cceaccca6b3184ecb3bd22a680c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49707
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
12 months ago LU-137 osd-ldiskfs: pass through resize ioctl 61/20161/16
Andreas Dilger [Wed, 23 Nov 2022 23:13:12 +0000 (16:13 -0700)]
LU-137 osd-ldiskfs: pass through resize ioctl

Pass through the EXT4_IOC_RESIZE_FS ioctl to the underlying ldiskfs
code so that it is possible to online resize MDT and OST filesystems.

When running resize2fs against a filesystem, it compares st_rdev of
the block device against st_dev of the mounted filesystem, so the
mounted Lustre stub filesystem needs to return proper stat information
from the ldiskfs root directory.  Add in a server_getattr() method
to the server inode_operations.  Using generic_fillattr() is enough,
we don't need the added complexity of calling ext4_getattr() (which
does not exist on directories for all kernel versions).

Change the OSD API from returning the superblock with dt_mnt_sb_get()
to returning the vfsmount with dt_mnt_get(), since it also contains
the superblock, but is more useful for calling some inode methods.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I934ae1f495bd15c6435be81b51ed04f0986c0322
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/20161
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months ago LU-6142 obd: change lmd flags to bitmap 12/49912/9
James Simmons [Sun, 30 Apr 2023 13:06:44 +0000 (09:06 -0400)]
LU-6142 obd: change lmd flags to bitmap

Change lmd flags to an enum that is accessible with the Linux
bitmap API. This lays the foundation for creating a match table
for the server options for mounting.
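
A user-space sketch of the idea, assuming nothing about the real flag names:
the enum holds bit indices rather than mask values, and small helpers play
the role of the kernel's set_bit()/test_bit(). All demo_* and DEMO_*
identifiers are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags: each enumerator is a bit index, not a mask. */
enum demo_lmd_flag {
	DEMO_FLG_ALPHA,
	DEMO_FLG_BETA,
	DEMO_FLG_GAMMA,
	DEMO_FLG_NUM,		/* keep last: number of flags */
};

#define DEMO_BITS_PER_LONG    (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_BITS_TO_LONGS(n) \
	(((n) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

struct demo_mount_data {
	unsigned long flags[DEMO_BITS_TO_LONGS(DEMO_FLG_NUM)];
};

/* Set one flag bit, in the spirit of the kernel's set_bit(). */
static void demo_flag_set(struct demo_mount_data *d, enum demo_lmd_flag f)
{
	assert(d != NULL && f < DEMO_FLG_NUM);
	d->flags[f / DEMO_BITS_PER_LONG] |= 1UL << (f % DEMO_BITS_PER_LONG);
}

/* Test one flag bit, in the spirit of the kernel's test_bit(). */
static bool demo_flag_test(const struct demo_mount_data *d,
			   enum demo_lmd_flag f)
{
	assert(d != NULL && f < DEMO_FLG_NUM);
	return (d->flags[f / DEMO_BITS_PER_LONG] >>
		(f % DEMO_BITS_PER_LONG)) & 1UL;
}
```

Because the enumerators are consecutive indices, they can also serve as keys
in a table that maps option strings to flags, which is the match table the
commit message has in mind.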

Change-Id: If7906a9a3ba177b67d0cfbaa276a00a6ba9b7b6d
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49912
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months ago LU-15934 lod: renew the update llog 69/49569/3
Yang Sheng [Fri, 6 Jan 2023 13:10:35 +0000 (21:10 +0800)]
LU-15934 lod: renew the update llog

Skip and renew the update llog file when it is
corrupted.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I3491858dce42b4a8ed11db55ebbf8a12ef5f521d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49569
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
12 months ago LU-11623 mdt: return UPDATE|PERM on open 85/33585/30
Oleg Drokin [Tue, 4 Jun 2019 04:41:40 +0000 (00:41 -0400)]
LU-11623 mdt: return UPDATE|PERM on open

This patch includes the following changes:
* try lock UPDATE|PERM ibits on open to speed up subsequent stat.
* open returns PR lock to client by default, because CR mode
  UPDATE|PERM ibits don't make sense since all modifications take
  PW lock.
* don't lock UPDATE|PERM ibits for PCC attach, otherwise revoking these
  ibits will cause file detach.
* update sanity-pcc 13a to make it fail on single client test if
  anything went wrong.
* update sanity-lfsck 31d because previously a CR UPDATE lock was
  fetched, so the test passed by mistake.

This should help common workloads with open followed by a stat
or other such operation.

Benchmark results:

This patch can significantly improve open-create + stat on the same
client.

This patch in combination with two others:

https://review.whamcloud.com/32157
https://review.whamcloud.com/33584

Improves the 'stat' side of open-create + stat by >10x.

Without patches (master branch commit 26a7abe):

mpirun -np 24 --allow-run-as-root /work/tools/bin/mdtest -n 50000 -d
/cache1/out/ -F -C -T -v -w 32k

   Operation                      Max            Min           Mean
   ---------                      ---            ---           ----
   File creation     :       3838.205       3838.204       3838.204
   File stat         :      33459.289      33459.249      33459.271
   File read         :          0.000          0.000          0.000
   File removal      :          0.000          0.000          0.000
   Tree creation     :       3146.841       3146.841       3146.841
   Tree removal      :          0.000          0.000          0.000

With the three patches:

mpirun -np 24 --allow-run-as-root /work/tools/bin/mdtest -n 50000 -d
/cache1/out/ -F -C -T -v -w 32k
SUMMARY rate: (of 1 iterations)
   Operation                      Max            Min           Mean
   ---------                      ---            ---           ----
   File creation     :       3822.440       3822.439       3822.440
   File stat         :     350620.140     350615.980     350617.193
   File read         :          0.000          0.000          0.000
   File removal      :          0.000          0.000          0.000
   Tree creation     :       2076.727       2076.727       2076.727
   Tree removal      :          0.000          0.000          0.000

Note 33K stats/second vs 350K stats/second.

ls -l time of the mdtest directory is also reduced from 23.5 seconds
to 5.8 seconds.
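
The two headline figures can be checked with a few lines of arithmetic on the
mean values reported above. demo_ratio() is a hypothetical helper used only
for this check.

```c
#include <assert.h>

/* Ratio of two measurements; the denominator must be positive. */
static double demo_ratio(double numerator, double denominator)
{
	assert(denominator > 0.0);
	return numerator / denominator;
}
```

demo_ratio(350617.193, 33459.271) is roughly 10.48, consistent with the
">10x" claim for stat, and demo_ratio(23.5, 5.8) is roughly 4.05 for the
ls -l time.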

Change-Id: Ib3410629c190de6f74246a4a92f8216537fa2b95
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/33585
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
12 months ago LU-15529 mdt: optimize dir migration locking 91/40891/25
Lai Siyao [Sun, 16 May 2021 06:46:21 +0000 (14:46 +0800)]
LU-15529 mdt: optimize dir migration locking

Optimize dir migration locking and fix some deadlocks:
* don't lock all stripes of parent, but source and target parent
  stripes only.
* use mdt_rename_source_lock() to lock source, because directory
  stripes are not changed in migration.
* refactor migrate links locking code.
* pass spobj and tpobj to mdo_migrate() interface to avoid parsing
  parent directory layout in MDD layer again.
* never lock the same FID twice, which may lead to deadlock:
  . if link parent is local, don't hold local LOOKUP lock, but revoke
    only, because later we need to lock other ibits of sobj.
  . if sobj is plain directory, unlock sobj before locking tobj,
    because sobj will become a stripe of tobj during migration.
* enable striped directory migration in racer test.
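
The "never lock the same FID twice" invariant can be pictured with the sketch
below. It is not the MDT code, which avoids double locking by choosing what
to lock and when to unlock; the sketch only shows the invariant as a
membership check. struct demo_fid, struct demo_lock_set and both functions
are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical file identifier used only for this illustration. */
struct demo_fid {
	uint64_t seq;
	uint32_t oid;
	uint32_t ver;
};

#define DEMO_MAX_LOCKS 8

/* FIDs currently locked by one operation. */
struct demo_lock_set {
	struct demo_fid held[DEMO_MAX_LOCKS];
	size_t          count;
};

static bool demo_fid_eq(const struct demo_fid *a, const struct demo_fid *b)
{
	return a->seq == b->seq && a->oid == b->oid && a->ver == b->ver;
}

/* Record @fid as locked. Returns false, without recording anything, if
 * this operation already holds it, so the caller never locks twice. */
static bool demo_lock_once(struct demo_lock_set *set,
			   const struct demo_fid *fid)
{
	size_t i;

	for (i = 0; i < set->count; i++)
		if (demo_fid_eq(&set->held[i], fid))
			return false;

	assert(set->count < DEMO_MAX_LOCKS);
	set->held[set->count++] = *fid;
	return true;
}
```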

Also update sanityn 80b to migrate directory and access it for one
minute and verify filesystem is not broken, though both migration and
directory access may fail in this period.

Test-Parameters: env=SLOW=y mdscount=2 mdtcount=4 testlist=racer,racer,racer
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ie9859df244529c986c2f3f032a49e3f9c89a2747
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/40891
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months ago LU-16221 build: modify kmodtool for rhel9 65/50865/3
Minh Diep [Fri, 5 May 2023 03:32:38 +0000 (20:32 -0700)]
LU-16221 build: modify kmodtool for rhel9

Customize kmodtool to use our lbuild location.

Test-Parameters: trivial \
clientdistro=el9.1 serverdistro=el9.1 testlist=sanity

Change-Id: I0573db09fa33a77d93d052fa12d5b07300d7eff6
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50865
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months ago LU-13081 tests: skip sanity test_151/test_156 77/50777/3
Alex Deiter [Wed, 26 Apr 2023 22:04:01 +0000 (02:04 +0400)]
LU-13081 tests: skip sanity test_151/test_156

Skip both sanity test_151 and test_156 during interop testing,
since they only test server-side functionality (OSS caching
behavior). Otherwise the client version of the test can become
inconsistent with the caching behavior/tunables on the OSS,
and the failures don't mean anything. There is enough
non-interop testing to catch any regressions in the OSS
cache behavior.

Test-Parameters: trivial
Signed-off-by: Alex Deiter <adeiter@tintri.com>
Change-Id: I39a8b54894d5b0c7573e6c56d1f8e1ba02b3e3fe
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50777
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months ago LU-16756 kernel: new kernel [RHEL 9.2 5.14.0-283.el9] 45/50745/4
Jian Yu [Tue, 25 Apr 2023 18:32:24 +0000 (11:32 -0700)]
LU-16756 kernel: new kernel [RHEL 9.2 5.14.0-283.el9]

This patch makes changes to support new RHEL 9.2 release
for Lustre client.

Test-Parameters: trivial env=SANITY_EXCEPT=27J clientdistro=el9.2 testlist=sanity

Change-Id: I4886bbf30d6d6a93c4adbfb68871e9d91f5b64de
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50745
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
12 months ago LU-10733 tests: increase conf-sanity/106 OST size 32/50732/2
Andreas Dilger [Thu, 20 Apr 2023 22:13:54 +0000 (16:13 -0600)]
LU-10733 tests: increase conf-sanity/106 OST size

conf-sanity test_106 is trying to create ~64k files, but OST0000
only has about 48k objects in this case, so the file creates are
failing during the test.  This makes the test somewhat unreliable,
and it hits errors unrelated to what was originally intended
(llog wrap handling).

Increase the OSTSIZE for this test to handle the number of objects
needed by the test so it can run more reliably.

Test-Parameters: trivial
Test-Parameters: testlist=conf-sanity env=ONLY=106
Test-Parameters: testlist=conf-sanity env=ONLY=106
Test-Parameters: testlist=conf-sanity env=ONLY=106
Test-Parameters: testlist=conf-sanity env=ONLY=106
Test-Parameters: testlist=conf-sanity env=ONLY=106
Test-Parameters: testlist=conf-sanity env=ONLY=106
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie33825801172ea565d9d1d5fb81595d2cad65677
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50732
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months ago LU-16758 krb: use Kerberos machine principal in client 09/50709/2
Sebastien Buisson [Fri, 21 Apr 2023 13:55:21 +0000 (15:55 +0200)]
LU-16758 krb: use Kerberos machine principal in client

In addition to having Lustre client rely on the
lustre_root/<hostname>@REALM principal to authenticate, support the
more standard Kerberos machine principal host/<hostname>@REALM.
That avoids the need for additional keytab entries, and brings Lustre
in line with other services such as OpenSSH and NFS.
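
The two principal forms differ only in the service component. A minimal
sketch of building either one is shown below; demo_format_principal() is a
hypothetical helper, not part of Lustre or of any Kerberos library.

```c
#include <assert.h>
#include <stdio.h>

/* Build "<service>/<hostname>@<realm>" into @buf.
 * Returns 0 on success, -1 if the result does not fit in @len bytes. */
static int demo_format_principal(char *buf, size_t len, const char *service,
				 const char *hostname, const char *realm)
{
	int n;

	assert(buf != NULL && service != NULL);
	assert(hostname != NULL && realm != NULL);

	n = snprintf(buf, len, "%s/%s@%s", service, hostname, realm);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Passing "lustre_root" as the service gives the Lustre-specific principal,
and passing "host" gives the standard machine principal that already exists
in most host keytabs.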

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Id50cef1a3a94248b958ce9ea42b5ae356f29cbf1
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50709
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Jonathan Calmels <jcalmels@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months ago LU-16286 ldiskfs: add ext4_find_delayed_extent patch to more series 20/50820/2
Jian Yu [Mon, 1 May 2023 16:34:58 +0000 (09:34 -0700)]
LU-16286 ldiskfs: add ext4_find_delayed_extent patch to more series

Add rhel8.4/ext4-optimize-find_delayed_extent.patch to RHEL 8.7
and RHEL 8.8 ldiskfs patch series.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.7 serverdistro=el8.7 testlist=sanity

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Change-Id: I9940ed20e0addfcf2c34db955bd6d36844a268df
Fixes: 3dd73b5c5d61 ("LU-16286 ldiskfs: reimplement nodelalloc optimization")
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50820
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months ago LU-16755 kernel: RHEL 8.8 client and server support 08/50708/4
Jian Yu [Sun, 23 Apr 2023 00:02:47 +0000 (17:02 -0700)]
LU-16755 kernel: RHEL 8.8 client and server support

This patch makes changes to support RHEL 8.8 release
with kernel 4.18.0-477.el8 for Lustre client and server.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Change-Id: Ie47f131e0340a601c8a5d748ecf9b1b73d4baa1f
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50708
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
12 months ago LU-16751 docs: consolidate and cleanup READMEs 03/50703/6
Timothy Day [Fri, 21 Apr 2023 04:23:15 +0000 (04:23 +0000)]
LU-16751 docs: consolidate and cleanup READMEs

A number of the in-tree READMEs are very outdated. Consolidating
these into the top-level README will make it more likely that the
information is read (since this file gets rendered by many git hosts)
and more likely to be kept up-to-date.

The information in the consolidated README has been updated.

Some docs which don't seem relevant anymore are simply deleted.

The llverfs.txt doc has been converted to a proper man page. The
descriptive comment in llverfs.c has been redirected towards the
man page instead, to reduce the risk of these becoming out-of-sync.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I57a6f13056913551d96363ffdbce76beed5c9486
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50703
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months ago LU-16739 uapi: make lustre_disk.h buildable in user land 41/50641/12
James Simmons [Wed, 26 Apr 2023 15:12:48 +0000 (11:12 -0400)]
LU-16739 uapi: make lustre_disk.h buildable in user land

The rbac work introduced a regression that makes the lustre_disk.h
UAPI header no longer buildable in user land. This is causing
sanity test 400b to fail with:

lustre_disk.h:266:18: error: 'LUSTRE_NODEMAP_NAME_LENGTH' undeclared here (not in a function)
  char   ncr_name[LUSTRE_NODEMAP_NAME_LENGTH + 1];
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~
lustre_disk.h:267:20: error: 'ncr_flags' is narrower than values of its type [-Werror]
  enum nm_flag_bits ncr_flags:8;
                    ^~~~~~~~~
lustre_disk.h:267:20: error: field 'ncr_flags' has incomplete type
lustre_disk.h:268:21: error: 'ncr_flags2' is narrower than values of its type [-Werror]
  enum nm_flag2_bits ncr_flags2:8;
                     ^~~~~~~~~~
lustre_disk.h:268:21: error: field 'ncr_flags2' has incomplete type
lustre_disk.h:277:2: error: unknown type name 'lnet_nid_t'
  lnet_nid_t nrr_start_nid;
  ^~~~~~~~~~
lustre/lustre_disk.h:278:2: error: unknown type name 'lnet_nid_t'
  lnet_nid_t nrr_end_nid;
  ^~~~~~~~~~

To fix this, move several pieces of nodemap handling from lustre_idl.h
to lustre_disk.h.

The git commit 5e6a51787fef20b849682d8c49ec9c2beed5c373 for Linux
kernel version 6.2.0-rc5 made guid_t only available for kernel code.
The only UAPI data structure left is uuid_le. Thankfully MCE requires
this otherwise even uuid_le would be removed. We will need to keep
an eye on this.

Test-Parameters: trivial testlist=sanity envdefinitions=ONLY=400b
Fixes: 5e48ffca322 ("LU-16524 nodemap: add rbac property to nodemap")
Change-Id: I4b962572ec2bf76159a17807c564390ded00d630
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50641
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months ago LU-16713 llite: add __GFP_NORETRY for read-ahead page 25/50625/3
Qian Yingjin [Thu, 13 Apr 2023 12:28:26 +0000 (08:28 -0400)]
LU-16713 llite: add __GFP_NORETRY for read-ahead page

We need __GFP_NORETRY for read-ahead pages, otherwise the reading
process can be OOM killed when it reaches the cgroup memory limit.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: If699429d5d5cd29bd895d8455296113aa67645fc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50625
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months ago LU-16649 llite: EIO is possible on a race with page reclaim 44/50344/9
Patrick Farrell [Mon, 20 Mar 2023 21:21:32 +0000 (17:21 -0400)]
LU-16649 llite: EIO is possible on a race with page reclaim

We must clear the 'uptodate' page flag when we delete a
page from Lustre, or stale reads can occur.  However,
generic_file_buffered_read requires any pages returned from
readpage() be uptodate.

So, we must retry reading if page truncation happens in
parallel with the read.

This implements the same fix as:
https://review.whamcloud.com/49647
b4da788a819f82d35b685d6ee7f02809c05ca005

did for the mmap path.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iae0d1eb343f25a0176135347e54c309056c2613a
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50344
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months ago Revert "LU-14541 llite: Check vmpage in releasepage" 54/49654/8
Patrick Farrell [Mon, 20 Mar 2023 21:49:06 +0000 (17:49 -0400)]
Revert "LU-14541 llite: Check vmpage in releasepage"

This reverts commit c524079f4f59a39b99467d9868ee4aafdcf033e9,
because it breaks releasepage for Lustre and does not
completely fix the data consistency issue in LU-14541.

Breaking releasepage matters because it prevents direct I/O
from working if there is page cache data present, and
because it causes similar issues with GDS, which must be
able to flush page cache pages before doing I/O.

With patches:
"LU-16160 llite: SIGBUS is possible on a race with page reclaim"/
d9c23a7934747eb19e23470b30806482a1aa60f8
and
"LU-14541 llite: Check for page deletion after fault"/
19678e30147f50f813e72e8216cfb0453fe0ca6e
LU-14541 is fully resolved, so we can revert this patch.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I613bdb4f27161ffc3638d1d8ea38827af5a7bd47
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49654
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months ago LU-16508 netlink: resv_start_op for [lnet|lustre]_family 00/49800/13
Shaun Tancheff [Fri, 10 Feb 2023 18:02:41 +0000 (12:02 -0600)]
LU-16508 netlink: resv_start_op for [lnet|lustre]_family

Linux v5.0-11693-g3b0f31f2b8c9
  genetlink: make policy common to family

struct genl_family adds policy and resv_start_op members

Linux v6.1-rc2-63-g4fa86555d1cd
  genetlink: piggy back on resv_op to default to a reject policy

struct genl_family needs to set a policy and/or indicate the
commands that should be validated.

Set the resv_start_op higher than the largest command accepted
to avoid a default policy of NLA_REJECT.

When genl_family has a policy member, provide one.
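
The rule "resv_start_op higher than the largest command accepted" can be
sketched with a mock family structure. struct demo_family and the DEMO_CMD_*
commands are hypothetical; only the resv_start_op field name is taken from
the kernel's struct genl_family, and the predicate models the behaviour as
this commit message describes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command set for one generic netlink family. */
enum demo_cmd {
	DEMO_CMD_UNSPEC,
	DEMO_CMD_GET,
	DEMO_CMD_SET,
	DEMO_CMD_COUNT,		/* keep last */
};
#define DEMO_CMD_MAX (DEMO_CMD_COUNT - 1)

/* Mock of the one field that matters here. */
struct demo_family {
	const char   *name;
	unsigned int  resv_start_op;
};

/* One past the largest accepted command, as the commit describes. */
static const struct demo_family demo_fam = {
	.name          = "demo",
	.resv_start_op = DEMO_CMD_MAX + 1,
};

/* Commands at or above resv_start_op would get the reject-by-default
 * treatment; every command the family accepts stays below it. */
static bool demo_cmd_gets_reject_default(const struct demo_family *family,
					 unsigned int cmd)
{
	assert(family != NULL);
	return cmd >= family->resv_start_op;
}
```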

HPE-bug-id: LUS-11454
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: If38fbe4c9358bb4f9b57e7d25b8a6df1fba63452
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49800
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: xinliang <xinliang.liu@linaro.org>
12 months ago LU-16768 lfs: copy optarg string other than using it directly 33/50733/2
Bobi Jam [Tue, 25 Apr 2023 02:15:11 +0000 (10:15 +0800)]
LU-16768 lfs: copy optarg string other than using it directly

Copy the optarg string for fp_format_printf_str lest it be modified
later.
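
A minimal sketch of the pattern, assuming a hypothetical parameter structure
(struct demo_find_param and demo_set_format() are illustrative names, not
the lfs code): the option argument is duplicated so that later changes to
the caller's buffer cannot affect the stored format string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for a format string taken from the command line. */
struct demo_find_param {
	char *format_str;	/* owned copy, released with free() */
};

/* Store a private copy of @arg instead of keeping the caller's pointer.
 * Returns 0 on success, -1 if memory could not be allocated. */
static int demo_set_format(struct demo_find_param *p, const char *arg)
{
	size_t len;
	char *copy;

	assert(p != NULL && arg != NULL);

	len = strlen(arg) + 1;		/* include the terminating NUL */
	copy = malloc(len);
	if (copy == NULL)
		return -1;
	memcpy(copy, arg, len);

	free(p->format_str);		/* drop any earlier copy */
	p->format_str = copy;
	return 0;
}
```

The structure must start out with format_str set to NULL, and the owner is
responsible for the final free().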

Fixes: 6b8e97b76c ("LU-10378 utils: add formatted printf to lfs find")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ib32883d3261ae921adf0fdd7b05bcbf728de7557
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50733
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Rick Mohr <mohrrf@ornl.gov>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months ago LU-10391 lnet: set msg field for lnet message header 16/50716/2
James Simmons [Sun, 23 Apr 2023 00:12:35 +0000 (20:12 -0400)]
LU-10391 lnet: set msg field for lnet message header

During testing, messages sent for larger NID setups were missing
the actual message. Fill in the header msg field to properly
send the total message.

Test-Parameters: trivial testlist=sanity-lnet
Test-Parameters: serverversion=2.12 serverdistro=el7.9 testlist=runtests
Test-Parameters: clientversion=2.12 testlist=runtests
Fixes: 7b31ef0bbac ("LU-10391 socklnd: add hello message version 4")
Change-Id: I36ef48a239a64a9002f8dc2683437bc3c57492e6
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50716
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months ago LU-16657 build: don't need to use zfs kmodtool on RHEL9.1 46/50746/2
Minh Diep [Tue, 25 Apr 2023 18:44:46 +0000 (11:44 -0700)]
LU-16657 build: don't need to use zfs kmodtool on RHEL9.1

Test-Parameters: trivial fstype=zfs \
clientdistro=el9.1 serverdistro=el9.1 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.7 serverdistro=el8.7 testlist=sanity

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el9.1 serverdistro=el9.1 testlist=sanity

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.7 serverdistro=el8.7 testlist=sanity

Change-Id: I11ca6a01cb9ddc6e59bf91e1bee35a3ceccb725a
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50746
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months ago LU-16759 o2ib: MOFED 5.5+ ib_dma_virt_map_sg 11/50711/4
Shaun Tancheff [Sun, 23 Apr 2023 12:19:11 +0000 (07:19 -0500)]
LU-16759 o2ib: MOFED 5.5+ ib_dma_virt_map_sg

MOFED 5.5 fails with:
  ERROR: "ib_dma_virt_map_sg" [.../ko2iblnd.ko] undefined!

See if we have a broken ib_dma_map_sg() and provide
a suitable replacement for the missing functionality.

Test-Parameters: trivial
HPE-bug-id: LUS-11587
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I3b4454fcafe4640c15b13385a0209ed71f51a3d0
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50711
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: xinliang <xinliang.liu@linaro.org>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months ago LU-11407 obdclass: init osc.*.rpc_stats start_time 34/50734/2
Andreas Dilger [Tue, 25 Apr 2023 06:30:50 +0000 (00:30 -0600)]
LU-11407 obdclass: init osc.*.rpc_stats start_time

Add missing start_time initialization for osc.*.rpc_stats.

Test-Parameters: trivial
Fixes: ea2cd3af7b ("LU-11407 obdclass: add start time to stats files")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I998b5337ccebc4d3ec18260d259f39c7893ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50734
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
12 months ago LU-13706 tests: remove test 119d 31/50731/2
Patrick Farrell [Mon, 24 Apr 2023 21:49:18 +0000 (17:49 -0400)]
LU-13706 tests: remove test 119d

The fail_loc used by test 119d was removed in Lustre 2.0.
The fail_loc tests for a bug which should be obvious - a
serious delay when doing DIO writes - and is definitely
fixed in current versions.  (Bugzilla 15950)

And without the fail_loc, the test isn't doing anything
interesting.  But the timer based aspect of it fails
occasionally due to hardware delays.  So let's just remove
the test.

test-parameters: trivial

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I2bc18869258e26dad99c72006f55f31315e67bdd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50731
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months ago LU-16217 iokit: update .gitignore files 94/50594/3
Timothy Day [Tue, 11 Apr 2023 01:45:07 +0000 (01:45 +0000)]
LU-16217 iokit: update .gitignore files

Add missing .gitignore file to new iokit directory.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I4d292c26971f4bd805dc0f2a35e4a281646405ee
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50594
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months ago LU-16650 kernel: update RHEL 7.9 [3.10.0-1160.88.1.el7] 53/50553/2
Jian Yu [Thu, 6 Apr 2023 06:37:16 +0000 (23:37 -0700)]
LU-16650 kernel: update RHEL 7.9 [3.10.0-1160.88.1.el7]

Update RHEL 7.9 kernel to 3.10.0-1160.88.1.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: I4119595943940cca94d1853b59c94a02fed8cb71
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50553
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months agoLU-15615 target: Free t10pi crypto state on error 39/50539/3
Oleg Drokin [Fri, 4 Mar 2022 22:10:25 +0000 (17:10 -0500)]
LU-15615 target: Free t10pi crypto state on error

When an error happens we forgot to release the crypto state, which
not only leaks memory directly but can potentially tie up in-memory
pages too.

Change-Id: Ia0870ccbb194e4e9ca8701e1c01d519745c236df
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50539
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
12 months agoLU-16659 build: Detect the mofed path based on running kernel 17/50517/2
Gaurang Tapase [Tue, 4 Apr 2023 05:45:35 +0000 (11:15 +0530)]
LU-16659 build: Detect the mofed path based on running kernel

Test-Parameters: trivial

Change-Id: I519e93e8c26807da6143e2cf4d825ccf4a4180e4
Signed-off-by: Gaurang Tapase <gtapase@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50517
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
12 months agoLU-14692 tests: wait for osp in conf-sanity/84 77/50477/4
Li Dongyang [Thu, 30 Mar 2023 12:45:04 +0000 (23:45 +1100)]
LU-14692 tests: wait for osp in conf-sanity/84

Wait for osp to change the first IDIF SEQ to a
normal SEQ, before using replay_barrier.
Otherwise the SEQ change could get lost and we
will trigger LASSERT during replay.

Change-Id: I32daa49d6329902b84eebb00090ae3cebe4a71b0
Test-Parameters: trivial testlist=conf-sanity env=ONLY=84,ONLY_REPEAT=10
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50477
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
12 months agoLU-16427 lfs: rmfid does not print anything on error 88/50388/11
Arshad Hussain [Mon, 3 Apr 2023 21:48:15 +0000 (03:18 +0530)]
LU-16427 lfs: rmfid does not print anything on error

This patch:

01. Improve rmfid

    Adds llapi_root_path_open(). This function accepts the device
    or path and returns an open fd for it. This was done so
    that it is called only _once_ and not at every lookup.

    Adds llapi_rmfid_at(). This function makes the final IOCTL to
    rmfid. After llapi_root_path_open() we already have a valid
    fd and a populated 'fid_structure', so we can isolate this step
    and not call the former llapi_rmfid().

02. Fix rmfid silently accepting fid without fsname
    or lustre root mount point. Make it correctly
    fail if required arguments are not provided.

    After Patch:
    ~~~~~~~~~~~~
    $ lfs rmfid 0x200000402:0x1:0x0
    lfs rmfid: missing <fsname|rootpath> or <fid>
    Remove file(s) by FID(s)
    usage: rmfid <fsname|rootpath> <fid> ...

    Before Patch:
    ~~~~~~~~~~~~
    $ lfs rmfid 0x200000402:0x1:0x0
    $

03. Fix rmfid memory leak

    After Patch:
    ~~~~~~~~~~~~
    $ valgrind --leak-check=full lfs  rmfid lustre 0x200000402:0x1:0x0
    ==33793== HEAP SUMMARY:
    ==33793==     in use at exit: 0 bytes in 0 blocks
    ==33793==   total heap usage: 4 allocs, 4 frees, 1,567 bytes allocated
    ==33793==
    ==33793== All heap blocks were freed -- no leaks are possible

    Before Patch:
    ~~~~~~~~~~~~
    $ valgrind --leak-check=full lfs rmfid lustre 0x200000402:0x1:0x0
    ==30812== LEAK SUMMARY:
    ==30812==    definitely lost: 48 bytes in 1 blocks
    ==30812==    indirectly lost: 0 bytes in 0 blocks
    ==30812==      possibly lost: 0 bytes in 0 blocks
    ==30812==    still reachable: 0 bytes in 0 blocks
    ==30812==         suppressed: 0 bytes in 0 blocks

04. Update/Add Man pages

    Update llapi_rmfid.3 and lustreapi.7 man pages
    Add new llapi_rmfid_at.3 and llapi_root_path_open.3 man pages

Test-Parameters: trivial testlist=sanity
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Idda9313c97e48e9f7bf6486894b6ae3c74d71981
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50388
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Thomas Bertschinger <bertschinger@lanl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months agoLU-16631 llmount: improve usability 60/50260/3
Timothy Day [Fri, 10 Mar 2023 16:56:59 +0000 (16:56 +0000)]
LU-16631 llmount: improve usability

Add some simple help messages to both llmount.sh and
llmountcleanup.sh, similar to what auster has. This
will help unfamiliar people understand the use of these
scripts.

Add option to disable client setup for llmount.sh. Add
options for llmount.sh environment variables.

Fix a couple small shellcheck warnings.

Update the file headers to have the SPDX license and
use the standard format.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I50dbb30bad8c8bc0479585293056d61a25aa001d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50260
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months agoLU-16462 utils: handle lack of newer nla_attrs 08/49608/10
James Simmons [Tue, 18 Apr 2023 01:07:47 +0000 (21:07 -0400)]
LU-16462 utils: handle lack of newer nla_attrs

Some platforms like SUSE12SP5 have an older version of libnl3.
Those versions lack proper support for newer nlattrs like NLA_S64.
Without proper support for these newer nlattrs, nla_validate()
will treat them as invalid. We need to fill in this missing
support on older platforms.

The "lctl ping" command will loop forever in jt_ptl_ping() if the
netlink yaml parser doesn't work, instead of falling back to the
"old_api" interface.  However, because IOC_LIBCFS_PING was also
deleted, this doesn't work either.  Use lustre_lnet_ping_nid() to
fall back to IOC_LIBCFS_PING_PEER (since v2_10_52_0-21-g7a36afd9df).

Also, jt_ptl_ping() was passing argv[1] directly to the kernel as
the NID to ping, without doing name resolution on it first, which
broke using "lctl ping HOSTNAME" instead of only numeric IP NIDs.
Ensure it returns a non-zero error code in case of failure.

Restore IOC_LIBCFS_GET_NI for compatibility until there have been
some releases with netlink support, so that "lctl list_nids" works.

Also, sanity test_217 that was testing "lctl ping" was always being
skipped, because "lctl list_nids" is never returning the hostname
with embedded '-', only numeric IP addresses.  Change it to prefer
testing the hostname if it resolves to a NID, otherwise ping the
numeric NID anyway, to confirm that "lctl ping" is still working.

Test-Parameters: trivial clientdistro=sles12sp5 testlist=sanity
Test-Parameters: clientdistro=sles12sp5 testlist=conf-sanity env=ONLY=43+50+70+91+115+130
Fixes: 86ba46c244 ("LU-9680 obdclass: user netlink to collect devices info")
Fixes: d137e9823c ("LU-10003 lnet: use Netlink to support LNet ping commands")
Fixes: 3e4061862e ("LU-864 test: Hostname name doesn't equal NID")
Change-Id: Ia2dfd84c2d1782578ceff1c2dc6f74d7aa9b458b
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49608
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
12 months agoLU-16351 llite: Linux 6.1 prandom, folios_contig, vma_iterator 32/49232/19
Shaun Tancheff [Fri, 27 Jan 2023 06:54:42 +0000 (00:54 -0600)]
LU-16351 llite: Linux 6.1 prandom, folios_contig, vma_iterator

Linux commit v4.10-rc3-6-gc440408cf690
  random: convert get_random_int/long into get_random_u32/u64
Linux commit v6.0-11338-gde492c83cae0
  prandom: remove unused functions

prandom_u32 is a wrapper around get_random_u32, change users
of prandom_u32 to get_random_u32 and provide a fallback
to prandom_u32 when get_random_u32 is not available.

Linux commit v6.0-rc1-2-g25885a35a720
  Change calling conventions for filldir_t
Add a test for the new filldir_t signature
Provide wrappers for transition from int (error code) to bool

Linux commit v6.0-rc3-94-g35b471467f88
  filemap: add filemap_get_folios_contig()
Provide a wrapper and fallback to find_get_pages_contig

Linux commit v6.0-rc3-225-gf39af05949a4
  mm: add VMA iterator
Use vma_iterator and for_each_vma when available.

Test-Parameters: trivial
HPE-bug-id: LUS-11377
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I23dc23d0252e1995555b6685f5cf7c207edf642b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49232
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: xinliang <xinliang.liu@linaro.org>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
12 months agoLU-15550 ptlrpc: retry mechanism for overflowed batched RPCs 40/46540/9
Qian Yingjin [Thu, 17 Feb 2022 03:42:14 +0000 (22:42 -0500)]
LU-15550 ptlrpc: retry mechanism for overflowed batched RPCs

Before sending a batched RPC, the client has no idea about the
actual reply buffer size. The reply buffer prepared by the
client may be smaller than the reply buffer size needed.
We already have a patch to grow the reply buffer properly in
most cases.

However, when the reply buffer size grows larger than
BUT_MAXREPSIZE (1000 * 1024), the server returns the -EOVERFLOW
error code. At this point, the server has executed only some of
the sub requests in the batched RPC. The overflowed sub requests
are not handled.

This patch adds a retry mechanism for overflowed batched RPCs.
When the client finds that the reply buffer overflowed, it
rebuilds the batched RPC for the unhandled sub requests and uses
the work queue mechanism to resend the new batched RPC to the
server to execute them.
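
The retry flow described above can be sketched as a small simulation.
This is only an illustrative Python model, not the ptlrpc code: the
names serve_batch() and send_with_retry() and the per-sub-request reply
sizes are hypothetical. Only BUT_MAXREPSIZE (1000 * 1024) and
-EOVERFLOW are taken from the description.

```python
import errno

BUT_MAXREPSIZE = 1000 * 1024  # reply buffer limit named in the commit message


def serve_batch(reply_sizes):
    """Hypothetical server: execute sub requests in order until the
    accumulated reply would exceed BUT_MAXREPSIZE.

    Returns (rc, handled), where handled is the number of sub requests
    executed and rc is -EOVERFLOW if some were left unhandled.
    """
    used = 0
    handled = 0
    for size in reply_sizes:
        if used + size > BUT_MAXREPSIZE:
            return -errno.EOVERFLOW, handled
        used += size
        handled += 1
    return 0, handled


def send_with_retry(reply_sizes):
    """Hypothetical client: resend the unhandled tail of the batch until
    every sub request has been executed. Returns the number of RPCs sent."""
    pending = list(reply_sizes)
    rpcs = 0
    while pending:
        rc, handled = serve_batch(pending)
        rpcs += 1
        if rc == -errno.EOVERFLOW and handled == 0:
            # A single sub request larger than the limit can never succeed.
            raise OverflowError("sub request reply exceeds BUT_MAXREPSIZE")
        # Rebuild the batch from the sub requests the server did not handle.
        pending = pending[handled:]
    return rpcs
```

In this model the client simply loops; the patch itself is described as
resending through a work queue, which the sketch does not attempt to
reproduce.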

Add the test case sanity test_123f to verify it for large LOV
stripes with overstriping.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: If84fad32f2026bd34ffb47b3e163f84a9d950dbb
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46540
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months agoLU-8704 osp: update local fldb cache 20/44720/30
Alex Zhuravlev [Sat, 21 Aug 2021 15:45:53 +0000 (18:45 +0300)]
LU-8704 osp: update local fldb cache

Update the local FLDB cache during precreate. This avoids
a situation where LOD is generating LOVEA and has to do a lookup
in FLDB, which in turn may lead to an RPC being sent while a local
transaction is running. That must be avoided.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Iadf89eddcef88750d234b0139b67e04715e68855
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/44720
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
12 months agoLU-16634 misc: standardize iocontrol param handling 14/50314/10
Andreas Dilger [Mon, 20 Mar 2023 02:22:10 +0000 (20:22 -0600)]
LU-16634 misc: standardize iocontrol param handling

Validate uarg and karg early in iocontrol processing where needed.
This needs kernel interop for 4.20+ kernels for access_ok(), but
this can be checked by #ifdef and does not need an autoconf test.

Fix incorrect definition of OBD_IOC_BARRIER to match reality.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reported-by: Tao Lyu <tao.lyu@epfl.ch>
Change-Id: I1a0d2f839949debf346aa15c65b0f407e9ce7057
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50314
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months agoLU-15515 contrib: Some style fixes for epython scripts 07/50607/4
Oleg Drokin [Tue, 11 Apr 2023 21:10:24 +0000 (17:10 -0400)]
LU-15515 contrib: Some style fixes for epython scripts

tighten formatting, remove spurious if() constructs,
add int conversion for shifts

Test-Parameters: trivial
Change-Id: Id89bc8ca898d0d5870ca6955b185571060eafd58
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50607
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months agoLU-16751 misc: remove unused small files 99/50699/2
Timothy Day [Thu, 20 Apr 2023 05:12:27 +0000 (05:12 +0000)]
LU-16751 misc: remove unused small files

There are a number of small files which don't seem to serve any
purpose anymore. This patch removes them.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: Iddada509bfe1aaefbffdd0ec623a4efd4fc289e6
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50699
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months agoLU-15123 tests: check quota reintegration after recovery 88/50688/3
Alex Zhuravlev [Wed, 19 Apr 2023 07:20:33 +0000 (10:20 +0300)]
LU-15123 tests: check quota reintegration after recovery

The 4th step of quota reintegration (reconciliation) waits for recovery
completion. So the tests (like sanity-quota/7a) should wait for
recovery completion before checking reintegration results.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Id0aa5db01658621103d94ad6dafe91b2960b3a33
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50688
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
12 months agoLU-12610 contrib: spelling warning for OBD_ macros 65/50665/4
Timothy Day [Tue, 18 Apr 2023 03:13:57 +0000 (03:13 +0000)]
LU-12610 contrib: spelling warning for OBD_ macros

Add a spelling warning for OBD_ macros, which are
going to be removed in follow-up patches.

Now, checkpatch.pl will suggest the CFS_ alternatives:

 WARNING: 'OBD_FAILED' may be misspelled - perhaps 'CFS_FAILED'?

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I97180894d30c24acd03eb2784270879a7ec80c0d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50665
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
12 months agoLU-16735 test: cancel MDC locks and wait recovery 38/50638/3
Hongchao Zhang [Fri, 14 Apr 2023 11:12:21 +0000 (19:12 +0800)]
LU-16735 test: cancel MDC locks and wait recovery

During test_35, the MDC LDLM locks should also be cancelled to
flush the pending operations, and the test should wait for
recovery to complete before checking the quota.

Test-Parameters: trivial testlist=sanity-quota mdscount=2 mdtcount=4
Change-Id: I6508644976be77ad2895107abf90144b51790cfe
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50638
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
12 months agoLU-10391 lustre: obd_connect and reconnect now use large nid 97/50097/5
Mr NeilBrown [Wed, 12 Apr 2023 12:58:01 +0000 (08:58 -0400)]
LU-10391 lustre: obd_connect and reconnect now use large nid

The 'localdata' argument for o_connect and o_reconnect when called
server-side is now a 'struct lnet_nid *' rather than an
'lnet_nid_t *'.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I1ce72ec11a5d2463fb90ab2686410e2dd96118e2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50097
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 months agoLU-14541 llite: Check for page deletion after fault 53/49653/12
Patrick Farrell [Mon, 20 Mar 2023 21:43:43 +0000 (17:43 -0400)]
LU-14541 llite: Check for page deletion after fault

Before completing a page fault and returning to the kernel,
we lock the page and verify it has not been truncated.  But
we must also verify the page has not been deleted from
Lustre, or we can return a disconnected (i.e., not tracked by
Lustre) page to the kernel.

We mark deleted pages !uptodate, but this doesn't matter
for faulted pages, because the kernel assumes they are
returned uptodate, and maps them into the process address
space.  Once mapped, the page state is not checked until
the page is unmapped.

But because the page is referenced by the mapping, it stays
in the page cache even though it's been disconnected from
Lustre.

Because the page is disconnected from Lustre, it will not
be found and cancelled on lock cancellation.  This can
result in stale data reads.

This is particularly an issue with releasepage (called from
drop_caches or under memory pressure), which can delete
pages separately from cancelling covering locks.

If releasepage is disabled, which is effectively what
"LU-14541 llite: Check vmpage in releasepage"
does, this is not an issue.  But disabling releasepage
causes other problems and is incorrect anyway.

Fixes: c524079f4f ("LU-14541 llite: Check vmpage in releasepage")
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If1164db8f8e92a1cf811431d56d15f30d8eb3faa
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49653
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
12 months agoLU-16661 build: improve lustre.spec.in Requires 97/50397/14
Andreas Dilger [Fri, 24 Mar 2023 02:31:59 +0000 (20:31 -0600)]
LU-16661 build: improve lustre.spec.in Requires

Add Suggests: bash-completion for lustre-client and lustre for
lctl and lfs sub-command completion.

Move perl from Requires to Recommends, since there are only some
uncommonly used tools (llstat, llobdstat) that are using perl.
Remove a couple of ancient obsolete test scripts that used perl.

lustre-iokit incorrectly Required perl instead of python3.

Set minimum kernel version for client to be 3.10 or later.

Change "netstat" to "ss" in tests to avoid dependency issues.
Fix sanity.sh and conf-sanity.sh tests for sles12sp5 issues.

Test-Parameters: trivial testlist=runtests clientdistro=sles12sp5
Test-Parameters: trivial testlist=runtests clientdistro=el9.1
Test-Parameters: trivial testlist=runtests clientdistro=sles15sp3
Fixes: 7521473bdd ("LU-16382 spec: add more dependencies for lustre-tests")
Fixes: fd734cffb3 ("b=18443 tests: remove obsolete tests scripts")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I48c6819596c81cb044e983bc64f1edf1ee3ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50397
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: xinliang <xinliang.liu@linaro.org>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
12 months agoLU-14189 contrib: docker example to build Lustre 00/40900/6
Alexey Zhuravlev [Mon, 7 Dec 2020 14:25:48 +0000 (17:25 +0300)]
LU-14189 contrib: docker example to build Lustre

can be easily modified for any target like specific RHEL, CentOS, etc

Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I9f1abf653e0a89c5d115c53abc21df8d87b86e79
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/40900
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-14139 statahead: add test for batch statahead processing 20/41220/22
Qian Yingjin [Thu, 14 Jan 2021 08:07:24 +0000 (16:07 +0800)]
LU-14139 statahead: add test for batch statahead processing

This patch adds sanity test_123ad() for batched statahead
processing, to verify that it works as expected.

Test-Parameters: trivial testlist=sanity env=ONLY=123ad,ONLY_REPEAT=50
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I742dcac7140ad842fff2afadeb2d948661365b95
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/41220
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-11912 tests: SEQ rollover fixes 78/50478/10
Li Dongyang [Thu, 30 Mar 2023 13:02:33 +0000 (00:02 +1100)]
LU-11912 tests: SEQ rollover fixes

To avoid changing SEQ after replay_barrier, we
use force_new_seq when starting the test suites that heavily
use replay_barrier, e.g. replay-single.
However, when there are fewer OSTs, the default SEQ width of
16384 may not last the entire test suite, so SEQ rollover
could still happen randomly after replay_barrier.

To overcome this, change the default OSTSEQWIDTH to
65536, and divide by number of OSTs, so the SEQ width is
larger with fewer OSTs. For 8 OSTs, the SEQ width is 16384,
and make sure we don't go under it.
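
The width calculation described above can be sketched as follows. This
is a hedged Python illustration, not the test-framework code: the
function name ost_seq_width() is hypothetical, and the constants are
taken literally from this message (default 65536, floor 16384).

```python
DEFAULT_OSTSEQWIDTH = 65536  # default stated in the commit message
MIN_SEQ_WIDTH = 16384        # floor stated in the commit message


def ost_seq_width(ost_count, ostseqwidth=DEFAULT_OSTSEQWIDTH):
    """Divide the configured width by the OST count, but never go
    below MIN_SEQ_WIDTH."""
    if ost_count < 1:
        raise ValueError("ost_count must be at least 1")
    return max(ostseqwidth // ost_count, MIN_SEQ_WIDTH)


for count in (1, 2, 4, 8):
    print(count, ost_seq_width(count))
```

With these values the floor already applies at 8 OSTs, since
65536 / 8 is below 16384.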

Use force_new_seq_all for the test suites using replay_barrier
on MDTs other than mds1.

Add force_new_seq_all for replay-ost-single, which uses
replay_barrier on the OST. If SEQ rollover happens after that,
the SEQ range update on ofd is lost due to replay_barrier, and
the next attempt to allocate a new SEQ will end up with an
old one.

Use force_new_seq_all for the test cases (namely sanity-pfl/0b
0c 1c 16b sanity/27Cd) checking for number of stripes created
with overstriping, to make sure we have enough objects
in the precreate pool.

Test-Parameters: trivial ostcount=4 testlist=replay-single
Test-Parameters: ostcount=2 testlist=replay-single
Test-Parameters: mdtcount=2 testlist=conf-sanity env=ONLY=122a,ONLY_REPEAT=10
Test-Parameters: testlist=sanity,sanity-pfl
Test-Parameters: testlist=sanity-scrub,replay-single,obdfilter-survey,replay-ost-single,large-scale
Fixes: 0ecb2a167c ("LU-11912 ofd: reduce LUSTRE_DATA_SEQ_MAX_WIDTH")
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Ic65111199f042405d6db8acb729b2cddf91138af
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50478
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-11170 tests: don't fail sanity/415 in VM 54/50654/2
Andreas Dilger [Mon, 17 Apr 2023 08:34:12 +0000 (02:34 -0600)]
LU-11170 tests: don't fail sanity/415 in VM

Don't fail sanity test_415 when running in a VM due to variable
runtimes for the tests.

A proper solution would be to examine the logs to determine if
the renames are blocked or just all slow due to VM contention.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1a5d0f601705c9ec8559e760c4ec27c7f83ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50654
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Vikentsi Lapa <vlapa@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-15690 tests: skip replay-ost-single/12a on old server 01/50701/4
hxing [Thu, 20 Apr 2023 07:46:29 +0000 (15:46 +0800)]
LU-15690 tests: skip replay-ost-single/12a on old server

Skip 12a of replay-ost-single for older server version.

Test-Parameters: trivial testlist=replay-ost-single env=ONLY=12a
Fixes: 28769c65987c ("LU-15195 ofd: missing OST object")
Signed-off-by: Xing Huang <hxing@ddn.com>
Change-Id: I473452a5326691f4394c9e3ab2ab5dfecbc6ec58
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50701
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
13 months agoLU-14139 ptlrpc: grow PtlRPC properly when prepare sub request 07/43707/14
Qian Yingjin [Fri, 14 May 2021 14:53:52 +0000 (22:53 +0800)]
LU-14139 ptlrpc: grow PtlRPC properly when prepare sub request

This patch prepares and grows the PtlRPC reply buffer
properly for SUB batch requests in @req_capsule_server_pack().

At the same time, it adds a limit of reply buffer size with
BUT_MAXREPSIZE = (1000 * 1024).

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I4277974b3b0e9cd19fd0d18ae7c029cccaa9c558
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/43707
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16465 tests: update sanity test 806 to use save/restore_opencache 06/50606/3
Oleg Drokin [Tue, 11 Apr 2023 20:22:38 +0000 (16:22 -0400)]
LU-16465 tests: update sanity test 806 to use save/restore_opencache

There are existing primitives to do this, so no need to opencode them

Fixes: dfb08bbf77 ("LU-16465 llite: fix LSOM blocks for ftruncate and close")
Test-Parameters: trivial testlist=sanity env=ONLY=806
Change-Id: Ibc0d34050999bfda8f56384752ede09cacc4df91
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50606
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
13 months agoLU-16734 gss: fix lookup_user_key() bug 23/50623/4
Aurelien Degremont [Fri, 31 Mar 2023 09:30:37 +0000 (11:30 +0200)]
LU-16734 gss: fix lookup_user_key() bug

With more recent kernels, like on Ubuntu 22.04, trying to
delete some keyring resources triggers a kernel warning message
and the cleanup is not successful, leading to stuck resources
and warning messages being regularly printed.

This is because Linux 5.8, in commit 8c0637e, introduced an API
change for lookup_user_key() that was not taken into account.

Update the lookup_user_key() call from _user_key() to fix it.

Change-Id: I34ef4dac3f56cbb4aac6bc5a3bad36feb66b8675
Signed-off-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50623
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Jonathan Calmels <jcalmels@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
13 months agoLU-16573 lnet: Check empty list in cfs_match_nid_net 76/50576/2
Chris Horn [Fri, 7 Apr 2023 16:46:29 +0000 (10:46 -0600)]
LU-16573 lnet: Check empty list in cfs_match_nid_net

cfs_match_nid_net() needs to check whether the list of range
expressions describing the address is empty. Otherwise, for numeric
based addresses, we may hit the assert in libcfs_num_match().
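
The shape of the check can be illustrated with a small Python sketch.
Everything here is a stand-in: num_match() and match_nid_net() are
hypothetical models of libcfs_num_match() and cfs_match_nid_net(), and
returning "no match" for an empty list is an assumption, since the
message does not say what the real function returns in that case.

```python
def num_match(addr, ranges):
    """Stand-in for a numeric matcher that asserts a non-empty list."""
    assert ranges, "range list must not be empty"
    return any(lo <= addr <= hi for lo, hi in ranges)


def match_nid_net(addr, ranges):
    """Check for an empty range list before calling the matcher, so the
    assert in num_match() is never reached with an empty list."""
    if not ranges:
        return False  # assumed result for an empty expression list
    return num_match(addr, ranges)
```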

Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-11480
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Id2460137607f5564751e729ee7b716d9151b5d37
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50576
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16721 lfs: Fix path2fid segfault 57/50557/5
Arshad Hussain [Thu, 6 Apr 2023 07:00:45 +0000 (03:00 -0400)]
LU-16721 lfs: Fix path2fid segfault

This patch fixes a segfault when calling path2fid
in interactive mode.

Before Patch
============
lfs > path2fid /mnt/lustre/a
[0x200000401:0x1:0x0]
Segmentation fault

After Patch
============
$ lfs
lfs > path2fid /mnt/lustre/a
[0x200000401:0x1:0x0]

lfs > path2fid --parent /mnt/lustre/a
[0x200000007:0x1:0x0]/a
lfs >

Testcase sanity/154h added.

Test-Parameters: trivial testlist=sanity
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I4693e0d476e9e7f570f45fd7a31d275d549245aa
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50557
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16714 utils: Clarify fd/fdv naming 46/50546/5
Patrick Farrell [Wed, 5 Apr 2023 16:27:18 +0000 (12:27 -0400)]
LU-16714 utils: Clarify fd/fdv naming

The migrate code uses the deeply opaque 'fd' and 'fdv'
(file-descriptor-volatile) to refer to the source and
destination file descriptors.

Replace these with something that tells a casual reader
what's going on.

Test-parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I56f2dfe81bcd4eff1c168f093e8580b373dc1984
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50546
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16706 kernel: update RHEL 9.1 [5.14.0-162.22.2.el9_1] 12/50512/2
Jian Yu [Mon, 3 Apr 2023 22:43:46 +0000 (15:43 -0700)]
LU-16706 kernel: update RHEL 9.1 [5.14.0-162.22.2.el9_1]

Update RHEL 9.1 kernel to 5.14.0-162.22.2.el9_1.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el9.1 serverdistro=el9.1 testlist=sanity

Change-Id: Ib5186e6f0dcd89660b7000db7f37c0c5a29f944f
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50512
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16694 tests: replace resolveip script 91/50491/4
Timothy Day [Fri, 31 Mar 2023 04:40:38 +0000 (04:40 +0000)]
LU-16694 tests: replace resolveip script

The resolveip script can be replaced with a bash one-liner,
using getent and awk.
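
As a rough illustration of what such a lookup does, the Python sketch below mirrors a pipeline of the form `getent hosts <name> | awk '{print $1; exit}'`. The exact one-liner used by the patch is not quoted in this message, so the command form and the helper names `first_address` and `resolve` are assumptions made for the example.

```python
import subprocess


def first_address(getent_output: str) -> str:
    """Return the first field of the first non-empty line of getent output."""
    for line in getent_output.splitlines():
        fields = line.split()
        if fields:
            return fields[0]
    return ""


def resolve(host: str) -> str:
    """Hypothetical helper: run `getent hosts <host>` and keep the first address."""
    result = subprocess.run(["getent", "hosts", host],
                            capture_output=True, text=True, check=False)
    return first_address(result.stdout)
```

`first_address` is kept separate from the subprocess call so the parsing step can be checked without a resolver.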

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I207ea011e43b7b236d5082994ffb51654d8d782c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50491
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16693 lod: ENODEV on setstripe with wrong OST# 87/50487/3
Vitaly Fertman [Fri, 2 Dec 2022 19:41:21 +0000 (22:41 +0300)]
LU-16693 lod: ENODEV on setstripe with wrong OST#

ENODEV should not be returned, as it is a recoverable error and
the RPC would simply be resent with the same set of parameters.

Also, make getstripe print out the object list for dirs
if it is MAGIC_SPECIFIC, to see the obdidx.

HPE-bug-id: LUS-11395
Signed-off-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Change-Id: Idffbf3c2b525c4e00c4b662c948460e3735445fc
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50487
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <scherementsev@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16684 utils: reset buf_len in print_out_devices() 75/50475/6
Alex Zhuravlev [Thu, 30 Mar 2023 10:10:23 +0000 (13:10 +0300)]
LU-16684 utils: reset buf_len in print_out_devices()

so that print_out_devices() doesn't hit false buffer overflow
leading to missing devices in lctl device_list output.

Fixes: ba0d5ffc1c ("LU-9680 utils: new llapi_param_display_value().")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ia0ebbd0150651d7c631348201c26ea8b3d1ff704
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50475
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
13 months agoLU-16672 tests: auster node.yml labels Alma and Rocky as CentOS 42/50442/5
Charlie Olmstead [Mon, 27 Mar 2023 20:08:57 +0000 (14:08 -0600)]
LU-16672 tests: auster node.yml labels Alma and Rocky as CentOS

release() assumes a node with /etc/centos-release is CentOS. This patch
removes that assumption and uses the name in the centos-release file.
Corrected the os-release code to strip off the last word if present.
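
A minimal sketch of the idea, assuming the release file holds a single line of the form `<name> release <version> (<codename>)`. The function name and the sample lines are illustrative only; the real release() helper in the test framework is shell code and may label nodes differently.

```python
def distro_from_release_line(line: str) -> str:
    """Return the distribution name that precedes the word 'release'."""
    text = line.strip()
    name, sep, _rest = text.partition(" release ")
    # Fall back to the whole line if the expected keyword is missing.
    return name if sep else text
```

With this approach a node carrying /etc/centos-release is named after the file's content rather than assumed to be CentOS.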

Test-Parameters: trivial

Change-Id: Ia5acbce3351ca23f4d9265d1aaf8d952a2c8b502
Signed-off-by: Charlie Olmstead <charlie@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50442
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Timothy Day <timday@amazon.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16666 doc: remove unmaintained doxyfiles 15/50415/2
Timothy Day [Fri, 24 Mar 2023 17:24:52 +0000 (17:24 +0000)]
LU-16666 doc: remove unmaintained doxyfiles

These Doxygen related files had old urls, some using http rather
than https. Some code scanning tools will flag this. Rather than
update these, remove all of the old doxyfiles. They are very out
of date, Doxygen throws many errors when you try to use them, and
they do not seem to generate usable documentation.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I5a24d4754582ecee558c4e87385b8835d2675adc
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50415
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-13340 mgs: convert class_parse_nid4 to class_parse_nid 94/50094/5
Mr NeilBrown [Wed, 12 Apr 2023 13:27:16 +0000 (09:27 -0400)]
LU-13340 mgs: convert class_parse_nid4 to class_parse_nid

All callers of class_parse_nid4() now use class_parse_nid()
and so must handle a large nid.

do_lcfg_nid() is introduced to help with this.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I502fa16871d689a8248e4243679918d58464efcd
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50094
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-13340 mgs: change record_add_uuid() to take large nids 93/50093/8
Mr NeilBrown [Wed, 12 Apr 2023 13:24:09 +0000 (09:24 -0400)]
LU-13340 mgs: change record_add_uuid() to take large nids

This just changes the function signature and does not really
change any functionality, because it always just converts the
struct lnet_nid back to lnet_nid_t.

Future patches will make more sense of this.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ib3427ba7cfe54fc00427232835e7378922d8f616
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50093
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
13 months agoLU-16382 spec: SUSE OBS requires kernel-source requirement. 65/49365/9
Mr NeilBrown [Mon, 12 Dec 2022 04:31:16 +0000 (15:31 +1100)]
LU-16382 spec: SUSE OBS requires kernel-source requirement.

The SUSE OBS creates a virtual environment containing
ONLY the stated BuildRequires requirements, and some defaults.
To be able to build ldiskfs we need the kernel source, so we
need "BuildRequires: kernel-source".

However, when contrib/lbuild is used to build lustre, it finds the
source by other means and fails on that BuildRequires.  So it must be
conditional on running under abuild (the OBS build tool).

When abuild extracts these BuildRequires, it cannot parse %() so
conditions using that all appear to be "false".  But %() is the only
way to detect abuild, by looking in the environment.

So a dance is needed to fit with all these odd requirements.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I2fe9ecaf857ecbd5fda7e857b661b5b756501190
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49365
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-15374 tests: check FULL and IDLE for client import state 98/49298/4
Jian Yu [Thu, 13 Apr 2023 06:16:51 +0000 (23:16 -0700)]
LU-15374 tests: check FULL and IDLE for client import state

The client-to-OST import state can be FULL or IDLE.

Test-Parameters: trivial testgroup=review-dne-part-3

Test-Parameters: trivial env=SLOW=no,FAILURE_MODE=HARD \
clientcount=4 mdtcount=1 mdscount=2 osscount=2 \
austeroptions=-R failover=true iscsi=1 \
testlist=recovery-mds-scale

Fixes: 25606a2ce1 ("LU-15342 tests: escape "|"")
Fixes: 3da8f014fd ("LU-12857 tests: allow clients to be IDLE after recovery")
Fixes: 5a6ceb664f ("LU-7236 ptlrpc: idle connections can disconnect")

Change-Id: I3582ceb273d241ee71fe907f6d1423746e453faa
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49298
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-14139 ptlrpc: grow reply buffer properly for batch request 45/40945/18
Qian Yingjin [Fri, 11 Dec 2020 10:38:03 +0000 (18:38 +0800)]
LU-14139 ptlrpc: grow reply buffer properly for batch request

This patch adds support for growing the reply buffer for a batch
PtlRPC request.
With this support, statahead sanity-pfl test for test_16b will
pass for large LOV stripes with overstriping.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Iaa7eb88b49d6ee068ec1fd9666a8bac2839b5041
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/40945
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
13 months agoLU-14139 statahead: add stats for batch RPC requests 43/40943/20
Qian Yingjin [Fri, 11 Dec 2020 03:20:38 +0000 (11:20 +0800)]
LU-14139 statahead: add stats for batch RPC requests

This patch adds stats for batch PtlRPC requests. They can show
statistical information such as how many subreqs are in a batch RPC.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I2f71ff5d01ab1070bd8d771a72edd786ad27f03c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/40943
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-12885 mdd: only bottom of clf_flags in changelog 22/36522/8
Andreas Dilger [Fri, 18 Oct 2019 13:39:50 +0000 (22:39 +0900)]
LU-12885 mdd: only bottom of clf_flags in changelog

Make it clear at the callers that only the bottom 12 bits of
clf_flags are stored in the Changelog record.
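
To make the 12-bit limit concrete, here is a minimal sketch of the masking. The constant names are chosen for this example and are not taken from the patch; only the width of 12 bits comes from the text above.

```python
CLF_FLAG_BITS = 12                         # width stated in the commit message
CLF_FLAG_MASK = (1 << CLF_FLAG_BITS) - 1   # 0x0fff


def stored_flags(clf_flags: int) -> int:
    """Return the part of clf_flags that would survive in the record."""
    return clf_flags & CLF_FLAG_MASK
```

Any bit at position 12 or above is dropped, which is why callers need to be explicit about what they pass in.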

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I23c0844e7ee59dd007b83f37d591df6fbe3ebbe5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/36522
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
13 months agoLU-10391 ptlrpc: change tc_nid in nrs to be struct lnet_nid 01/50101/6
Mr NeilBrown [Thu, 26 May 2022 04:42:44 +0000 (14:42 +1000)]
LU-10391 ptlrpc: change tc_nid in nrs to be struct lnet_nid

switch to struct lnet_nid and adjust accordingly.

Test-Parameters:trivial testlist=sanityn envdefinitions=ONLY="77"
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Iaa73bc5b95d78f1d69500ea8856ae0d5cd442f8b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50101
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Nikitas Angelinas <nikitas.angelinas@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-10391 lustre: change cfs_match_nid to take large nid. 98/50098/7
Mr NeilBrown [Thu, 19 May 2022 01:45:19 +0000 (11:45 +1000)]
LU-10391 lustre: change cfs_match_nid to take large nid.

Large nids are now used in more places.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I181ab0345a4bf2f9bb5c4b27eafb794968e8ef7e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50098
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-10391 obdclass: remove class_parse_nid4() 95/50095/5
Mr NeilBrown [Wed, 12 Apr 2023 13:29:58 +0000 (09:29 -0400)]
LU-10391 obdclass: remove class_parse_nid4()

class_parse_nid4() is not used any more.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Iefff39a5eaeed8a8ba6fd016e3392db9e92d5422
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50095
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-10391 libcfs: add large-nid string conversion functions. 92/50092/11
Mr NeilBrown [Mon, 20 Feb 2023 01:06:41 +0000 (12:06 +1100)]
LU-10391 libcfs: add large-nid string conversion functions.

The user-space libcfs now has functions to convert between strings and
large nids.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I554e2da0c3d56397ebe60fc84fad28fec6704a18
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50092
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-14976 nrs: change nrs policies at run time 23/48523/14
Etienne AUJAMES [Mon, 12 Sep 2022 11:13:13 +0000 (13:13 +0200)]
LU-14976 nrs: change nrs policies at run time

This patch takes extra references on a policy to avoid stopping an NRS
policy that still has pending/queued requests in it.

It uses a new refcount_t "pol_start_ref" for this purpose to keep
track of policy usage in the started state. This makes it possible to
safely stop a policy without "nrs_lock" and avoids sleeping under the
spinlock.

It adds a wait queue field "pol_wq" in "struct ptlrpc_nrs_policy" to
wait for all queued requests in a stopping policy to be drained when
restarting the policy with a different argument.
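
The interplay of a start reference count and a wait queue can be modelled in user space as below. This is only a toy sketch: `PolicyModel` and its methods are invented for illustration, a `threading.Condition` stands in for both the kernel refcount and the wait queue, and none of the real NRS locking is reproduced.

```python
import threading


class PolicyModel:
    """Toy model of a policy that tracks queued requests while started."""

    def __init__(self):
        self._cond = threading.Condition()
        self._start_ref = 1      # one reference held for the started state
        self.stopped = False

    def get(self) -> bool:
        """Take a reference for a queued request; refused once stopping began."""
        with self._cond:
            if self.stopped:
                return False
            self._start_ref += 1
            return True

    def put(self) -> None:
        """Drop a request reference and wake any waiter when none remain."""
        with self._cond:
            self._start_ref -= 1
            if self._start_ref == 0:
                self._cond.notify_all()

    def stop(self, timeout=None) -> bool:
        """Drop the started reference, then wait for queued requests to drain."""
        with self._cond:
            if not self.stopped:
                self.stopped = True
                self._start_ref -= 1
            return self._cond.wait_for(lambda: self._start_ref == 0, timeout)
```

In this model `stop()` reports whether the policy drained within the timeout, so a caller restarting the policy with a new argument can wait until it returns True.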

Add test sanityn 77r for this use case.

Test-Parameters: testlist=sanityn env=ONLY=77r,ONLY_REPEAT=20
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I1425f52324f755f1b76ea8210de52647c072a592
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48523
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-15300 mdt: refresh LOVEA with LL granted 13/46413/70
Alex Zhuravlev [Tue, 1 Feb 2022 20:50:02 +0000 (23:50 +0300)]
LU-15300 mdt: refresh LOVEA with LL granted

this change tries to fix three problems:
1) mdt_reint_open() fetches LOVEA before layout lock is taken.
   this may race with another process changing the layout and
   may result in a stale layout returned with a granted layout
   lock - re-fetch LOVEA once layout lock is granted

2) lov_layout_change() should not apply old layouts which
   can get through when MDS doesn't take layout lock

3) LFSCK shouldn't ignore layout version stored on MDS to avoid
   a situation when version degrades compared to client's copy.

This patch misses an optimization and can result in a number of
useless calls to OSD to fetch LOVEA. To be fixed in a followup
patch.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Idee1101d152ab09947faf6d75574a8761a7690a5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/46413
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Zhenyu Xu <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-8367 osp: remove unused fail_locs from sanity/27S,822 43/50543/4
Sergey Cheremencev [Wed, 5 Apr 2023 10:24:51 +0000 (13:24 +0300)]
LU-8367 osp: remove unused fail_locs from sanity/27S,822

OBD_FAIL_OST_GET_LAST_FID and OBD_FAIL_OSP_GET_LAST_FID
are not used anymore since sanity test_27S has been removed.
This may lead to failures in interoperability testing.
For example, 27S caused osp_precreate_cleanup_orphans to get
stuck because the opd_pre_recovering flag was not cleared (normally,
it should be set to 0 before calling osp_precreate_reserve).
This was the reason for the failures of tests 27U, 27R, 39r and 51d
in sanity.sh.

The patch also removes OBD_FAIL_NET_ERROR_RPC and
OBD_FAIL_OSP_PRECREATE_PAUSE that were used in sanity-822
"test precreate failure". This test also has been removed
in 63e17799a3.

Fixes: 63e17799a3 (LU-8367 osp: enable replay for precreation request)
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: Ia78ce39adb9d59e476eb36e5e69954cc26353b27
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50543
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
13 months agoLU-15072 lod: extra status for spilling 53/45153/7
Alex Zhuravlev [Thu, 7 Oct 2021 16:07:14 +0000 (19:07 +0300)]
LU-15072 lod: extra status for spilling

to debug possible issues

Test-Parameters: trivial
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ibec0173c5fd70ecf835e96a372352989df1e1f92
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/45153
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexandre Ioffe <aioffe@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16655 build: add wirecheck/wiretest for disk structs 82/50482/3
Andreas Dilger [Fri, 31 Mar 2023 23:43:54 +0000 (17:43 -0600)]
LU-16655 build: add wirecheck/wiretest for disk structs

Move OI scrub_file and related constants to lustre/lustre_disk.h
to ensure these on-disk structures do not change in the future.

Add lr_server_data and lsd_client_data structs from last_rcvd file.
Add replay_data_v1 and replay_data_v2 structs from recovery.

Move struct ost_layout together with lustre_ost_attrs where it is
used, and add PFID_STRIPE_IDX_BITS/PFID_STRIPE_COUNT_MASK checks
for the usage of this struct.

Change uuid.h header includes to allow uuid_t in lustre_disk.h.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie992ca383f24c29d2449ee22849a3c476096551c
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50482
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16655 scrub: change sf_uuid to guid_t 96/50496/2
Andreas Dilger [Fri, 31 Mar 2023 23:38:38 +0000 (17:38 -0600)]
LU-16655 scrub: change sf_uuid to guid_t

Change the type of sf_uuid from uuid_t to guid_t.  The sizes are
identical, but the benefit is that guid_t is usable in userspace
with the <linux/uuid.h> header, unlike uuid_t.

Change the accessors to use the corresponding guid_*() functions,
but no functional changes are needed.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I3048cfee20a5e4ea2c0c2203d22eb76c1437577b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50496
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16163 tests: skip racer_on_nfs for NFSv3 79/50579/4
Alex Deiter [Fri, 7 Apr 2023 19:49:23 +0000 (23:49 +0400)]
LU-16163 tests: skip racer_on_nfs for NFSv3

Export ALWAYS_EXCEPT env for child NFS test

Fixes: 513eb670b0 ("LU-16163 tests: skip racer_on_nfs for NFSv3")
Test-Parameters: trivial testlist=parallel-scale-nfsv3
Signed-off-by: Alex Deiter <adeiter@tintri.com>
Change-Id: Ibb4a9916166f13ab9bd2374b33d4313453972276
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50579
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
13 months agoLU-16728 misc: fix typos in comments 05/50605/4
Oleg Drokin [Tue, 11 Apr 2023 20:17:03 +0000 (16:17 -0400)]
LU-16728 misc: fix typos in comments

destroied -> destroyed
preapre -> prepare

Change-Id: Id00be2bb4a219d70bb4a69b90f624ef2cc0d6712
Test-Parameters: trivial
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50605
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
13 months agoLU-16510 build: include unsafe_memcpy definition 73/50573/2
Patrick Farrell [Fri, 7 Apr 2023 18:13:03 +0000 (14:13 -0400)]
LU-16510 build: include unsafe_memcpy definition

The original LU-16510 missed a key part of the
unsafe_memcpy code from the upstream kernel, and so we
weren't actually defining unsafe_memcpy() as intended.

Thanks to Aurelien Degremont <adegremont@nvidia.com> for
pointing this out.

Fixes: 919b93b9 ("LU-16510 build: fortified memcpy from linux 6.1")
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ib9e2d56ed0b3691f1ab9fcd25403fa86ac784b6d
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50573
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16708 tests: fix zconf_mount_clients for SSK 23/50523/2
Sebastien Buisson [Tue, 4 Apr 2023 14:53:22 +0000 (16:53 +0200)]
LU-16708 tests: fix zconf_mount_clients for SSK

When SHARED_KEY is in use, zconf_mount_clients can add 'skpath' mount
option to load nodemap-specific keys. But this must not overwrite
already specified mount options.
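
The intended behaviour, appending rather than replacing, can be sketched as follows. The function name and the key path used in the usage note are hypothetical; the real zconf_mount_clients is a shell function in the test framework.

```python
def add_mount_option(existing: str, extra: str) -> str:
    """Append `extra` to a comma-separated option string, keeping what is there."""
    options = [opt for opt in existing.split(",") if opt]
    if extra not in options:
        options.append(extra)
    return ",".join(options)
```

For example, `add_mount_option("flock,user_xattr", "skpath=/tmp/keys")` keeps both original options and adds the key path at the end.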

Fixes: 05e5cb0b0c ("LU-16683 tests: fix sanity-sec test_61 for SSK")
Test-Parameters: trivial
Test-Parameters: testlist=sanity-sec env=ONLY=61
Test-Parameters: testlist=sanity-sec env=SHARED_KEY=true,ONLY=61
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I3dbc6ffe72722659ead77fb1c2c2675873c7aff2
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50523
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
13 months agoLU-16694 tests: remove deprecated h2tcp, h2o2ib 13/50513/2
Timothy Day [Fri, 31 Mar 2023 04:12:46 +0000 (04:12 +0000)]
LU-16694 tests: remove deprecated h2tcp, h2o2ib

These functions were deprecated about 6 years ago. It's
probably safe to remove them now.

Test-Parameters: trivial
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I6b16d0ad1dc1ea6f00fc730cb95672ef3c07f8a5
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50513
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16700 pcc: reserve flags for PCC-RO 04/50504/2
Qian Yingjin [Mon, 3 Apr 2023 09:38:03 +0000 (05:38 -0400)]
LU-16700 pcc: reserve flags for PCC-RO

This patch reserves flags for PCC-RO. It also adds wire check /
test for these new flags.

Test-Parameters: trivial
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I7086843770d16a6a7d2fd6b6ad3c77d43e04bf3b
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50504
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16674 obdclass: optimize job_stats reads 59/50459/7
Etienne AUJAMES [Tue, 28 Mar 2023 19:46:24 +0000 (21:46 +0200)]
LU-16674 obdclass: optimize job_stats reads

This patch has 2 objectives:

1/ limit the lock time on ojs_list (list of job stats)

"lctl get_param mdt.*.job_stats" can not dump job_stats in a single
read (seq_file buffer is limited to 4k). So, several reads are needed
to dump the full job list.
For each read, we have to find the job entry corresponding to the file
offset. For now, we walk ojs_list from the beginning to get this
entry.

This patch saves the last known entry and the corresponding offset so
that the next read can start from there.

2/ avoid the lock contention when reading job_stats

This patch replaces the read lock on ojs_lock by RCU locking, so that
userspace processes reading job_stats do not interfere with the
kernel target threads.
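
The first objective, resuming a list walk from the previously served entry, can be sketched as below. This is a simplified model under stated assumptions: `Node`, `JobListReader` and `build` are invented names, the list is a plain singly linked list, and the RCU protection of the second objective is not modelled at all.

```python
class Node:
    def __init__(self, job_id, next_node=None):
        self.job_id = job_id
        self.next = next_node


class JobListReader:
    """Resumes a list walk from the last position served instead of the head."""

    def __init__(self, head):
        self.head = head
        self.last_pos = 0        # offset of the cached entry
        self.last_node = head    # entry found by the previous lookup
        self.steps = 0           # nodes stepped over, to make the saving visible

    def seek(self, pos):
        """Return the node at offset `pos`, or None past the end of the list."""
        if self.last_node is not None and pos >= self.last_pos:
            node, cur = self.last_node, self.last_pos
        else:
            node, cur = self.head, 0   # backward seek: restart from the head
        while node is not None and cur < pos:
            node = node.next
            cur += 1
            self.steps += 1
        if node is not None:
            self.last_node, self.last_pos = node, pos
        return node


def build(job_ids):
    """Build a linked list holding job_ids in order and return its head."""
    head = None
    for job_id in reversed(job_ids):
        head = Node(job_id, head)
    return head
```

With sequential reads each call only walks the entries added since the previous offset, instead of rescanning from the head every time.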

Add the stress test sanity 205g to check for possible races.
Add stack_trap in sanity test 205a and 205e to restore jobid_name and
jobid_var.

* Performance *

The following command is used to capture records:
$ time grep -c job_id /proc/fs/lustre/mdt/lustrefs-MDT0000/job_stats

- job_stats dump with no fs activity

Here are results after ending sanity test 205g with slow mode and
job_cleanup_interval=300s.
               ___________________________________
              | nbr of job | time | rate          |
 _____________|____________|______|_______________|
|without patch| 14749      | 1.3s | 11345 jobid/s |
|_____________|____________|______|_______________|
|with patch   | 22209      | 0.6s | 37015 jobid/s |
|_____________|____________|______|_______________|
|diff %       | +43%       | -54% | +226%         |
|_____________|____________|______|_______________|

- job_stats dump with fs activity

Here are results before ending sanity test 205g with slow mode and
job_cleanup_interval=300s.
               ___________________________________
              | nbr of job | time | rate          |
 _____________|____________|______|_______________|
|without patch| 14849      | 2.3s | 6428  jobid/s |
|_____________|____________|______|_______________|
|with patch   | 22776      | 1.2s | 18823 jobid/s |
|_____________|____________|______|_______________|
|diff %       | +53%       | -47% | +192%         |
|_____________|____________|______|_______________|

Test-Parameters: testlist=sanity env=SLOW=yes,ONLY=205g,ONLY_REPEAT=10
Test-Parameters: testlist=sanity env=ONLY=205g serverversion=2.15.2
Test-Parameters: testlist=sanity env=SLOW=yes,ONLY=205
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: Ic4cd90965720af76eff0ed4e00ca897518bfbc66
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50459
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16670 enc: make sure DoM files are correctly decrypted 29/50429/5
Sebastien Buisson [Mon, 27 Mar 2023 08:46:07 +0000 (10:46 +0200)]
LU-16670 enc: make sure DoM files are correctly decrypted

Make sure DoM files are decrypted upon read by loading their
associated encryption context, via llcrypt_prepare_readdir()/
llcrypt_get_encryption_info().

Fix sanity-sec test_50 accordingly.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ie9ef3cbb08d2295a2fd10b9e9ab0862119c7723e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50429
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16096 tgt: improve messages for reply_data 39/49939/7
Andreas Dilger [Tue, 7 Feb 2023 23:55:00 +0000 (16:55 -0700)]
LU-16096 tgt: improve messages for reply_data

Due to the format change in reply_data for WBC, it is not possible
to downgrade the MDT filesystem to an older release after mounting
it with 2.16.0 and later.

On the path to fixing that, improve the error messages printed by
tgt_reply_data_init() so that it is possible to see why the mount
is failing.  The next step is to avoid upgrading the replay_data
file if it is not necessary (no clients mounting with WBC enabled).

Test-Parameters: trivial testlist=conf-sanity env=ONLY=32
Fixes: bbf0017fdea5 ("LU-16096 recovery: upgrade reply data after recovery finish")
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Change-Id: Idd176031fef99b16de9278f05516dc45e817a819
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49939
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16483 ptlrpc: Track highest reply XID 07/49807/10
Chris Horn [Fri, 27 Jan 2023 20:47:25 +0000 (14:47 -0600)]
LU-16483 ptlrpc: Track highest reply XID

Keep track of the highest XID that we've received a reply for.
When an OBD_PING expires, do not disconnect the import if the failed
XID is less than or equal to the last reply XID. This avoids a situation
where a lost OBD_PING RPC causes a reconnect even though we've
completed other RPCs in the meantime.
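
The rule can be summarised in a few lines. The class and method names below are made up for this sketch and do not correspond to fields in the ptlrpc import structure.

```python
class ImportModel:
    """Toy model of an import that remembers the highest replied XID."""

    def __init__(self):
        self.last_reply_xid = 0

    def on_reply(self, xid: int) -> None:
        # Replies may arrive out of order, so only ever move forward.
        self.last_reply_xid = max(self.last_reply_xid, xid)

    def ping_timeout_should_disconnect(self, ping_xid: int) -> bool:
        # A newer reply proves the connection worked after the ping was sent.
        return ping_xid > self.last_reply_xid
```

A ping with XID 95 that times out after a reply for XID 100 has been seen would therefore not trigger a disconnect in this model.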

HPE-bug-id: LUS-11474
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I7e66bcc1368fa41ec86ffd843abac676f8d29254
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49807
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-930 doc: add lctl-lcfg man pages 87/48887/7
Timothy Day [Wed, 29 Mar 2023 20:16:19 +0000 (20:16 +0000)]
LU-930 doc: add lctl-lcfg man pages

Add "lctl lcfg_clear", "lctl lcfg_erase", and "lctl lcfg_fork"
as aliases for "clear_conf", "erase_lcfg", and "fork_lcfg"
respectively, for more consistent naming.

Add separate man pages for lctl-lcfg_clear, lctl-lcfg_erase,
lctl-lcfg_fork, lctl-erase_lcfg, lctl-fork_lcfg, lctl-clear_conf.

Remove the combined man page for lctl-lcfg.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Timothy Day <timday@amazon.com>
Change-Id: I9daf2efc0cf0f9a2b549d8513098729151178207
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48887
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16077 tbf: pb_uid/pb_gid ptlrpc_body fields for TBF rules 35/48235/7
Etienne AUJAMES [Fri, 12 Aug 2022 19:41:08 +0000 (21:41 +0200)]
LU-16077 tbf: pb_uid/pb_gid ptlrpc_body fields for TBF rules

The file UID/GID are packed inside bulk IO because the requests are
sent asynchronously (cannot use the current thread UID/GID).

This is an issue for TBF rules if the file/inode UID/GID doesn't
match the process ones (e.g: reading common libraries): we can't limit
the user RPCs doing the IOs in that case.

This patch packs the UID/GID for TBF rules inside ptlrpc_body (
pb_padding64_2 -> (pb_uid, pb_gid)) to be independent of quota
interactions: it stores the client process UID/GID instead of the
values of the file attrs.
Moreover, it makes it possible to track requests that naturally carry
no UID/GID, like ldlm_flock_enqueue.

This patch saves the process UID/GID inside the ll_inode_info struct.
Then it restores these values when sending a bulk IO from a ptlrpc
thread (like for jobids).

Add sanityn test_77jb to verify.

Fixes: e0cdde1 ("LU-9658 ptlrpc: Add QoS for uid and gid in NRS-TBF")
Test-Parameters: testlist=sanityn env=ONLY=77,ONLY_REPEAT=20
Test-Parameters: serverversion=2.12.9 testlist=sanityn env=ONLY=77
Test-Parameters: clientversion=2.12.9 testlist=sanityn env=ONLY=77
Signed-off-by: Etienne AUJAMES <etienne.aujames@cea.fr>
Change-Id: I61e42267ae568f9a1eb6fe57937a5e96f1824010
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/48235
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
13 months agoLU-13132 osd: osd-zfs to cache dbufs for llog objects 22/37222/55
Alex Zhuravlev [Wed, 18 Aug 2021 08:09:49 +0000 (11:09 +0300)]
LU-13132 osd: osd-zfs to cache dbufs for llog objects

The working set for llog objects is tiny and very predictable, so
osd-zfs can cache a couple of dbufs (the first block, which stores the
header, and the last block, which receives new records).

For sanity/60a (llog test) this gives 5939307 hits and 5776 misses,
while the average osd_write() time goes down from 1.09 usec to 0.27 usec.
Total time for sanity/60a: before - 153s, after - 101s.

This approach can be used in a few other cases, such as last_rcvd.
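
The caching idea above can be sketched in C as follows. This is only an
illustration under stated assumptions, not the osd-zfs code: the names
blk_ref, llog_cache and llog_cache_get are hypothetical, a "held dbuf" is
reduced to a block number, and block 0 is assumed to hold the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a held dbuf: only remembers the block it maps. */
struct blk_ref {
	uint64_t blkid;
	int	 valid;
};

/* Two slots: slot 0 for the header block, slot 1 for the current last block. */
struct llog_cache {
	struct blk_ref	slot[2];
	uint64_t	hits;
	uint64_t	misses;
};

/* Return 1 on a cache hit, 0 on a miss (the block is cached afterwards). */
static int llog_cache_get(struct llog_cache *c, uint64_t blkid)
{
	/* Assumption for this sketch: block 0 is the llog header. */
	int idx = (blkid == 0) ? 0 : 1;

	assert(c != NULL);
	if (c->slot[idx].valid && c->slot[idx].blkid == blkid) {
		c->hits++;
		return 1;
	}
	/* Miss: real code would release the old buffer and hold the new one. */
	c->slot[idx].blkid = blkid;
	c->slot[idx].valid = 1;
	c->misses++;
	return 0;
}
```

Because appends keep landing in the same tail block until it fills up, a
single tail slot is replaced only when the log grows into a new block, which
is consistent with the high hit count reported above.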

Change-Id: Icc0126658894085d33ef79ae41ac6c1ed4140f4c
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/37222
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
13 months agoLU-980 mount: improve mount/unmount messages 11/50511/2
Andreas Dilger [Mon, 3 Apr 2023 20:31:26 +0000 (14:31 -0600)]
LU-980 mount: improve mount/unmount messages

In some cases, unmount errors are printed in multiple places, or
are expected so printing them on the console log is not necessary.

Conversely, some status messages such as mounting or unmounting the
whole target should not be rate limited.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id55f6ee3af5ad3cbe44a380314aa4b31f6b4bad3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50511
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16609 target: top_trans_create cannot alloc memory 76/50176/5
Andrew Perepechko [Tue, 10 Jan 2023 21:53:38 +0000 (16:53 -0500)]
LU-16609 target: top_trans_create cannot alloc memory

top_trans_create() requests __GFP_IO memory allocation,
which does not allow direct reclaim. However, if the
memory shortage is temporary, direct reclaim is reasonable.
GFP_NOFS is __GFP_IO with additional reclaim bits.
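
The relationship between the two flag sets can be sketched in C. The X_
macros below are hypothetical stand-ins with illustrative bit values, not
the kernel's definitions (those vary by kernel version); only the
composition, GFP_NOFS being __GFP_IO plus the reclaim bits, mirrors the
description above.

```c
#include <assert.h>

/* Illustrative bit values only; the kernel's real values differ. */
#define X_GFP_IO		0x1u
#define X_GFP_DIRECT_RECLAIM	0x2u
#define X_GFP_KSWAPD_RECLAIM	0x4u

/* Composition modelled on the kernel's __GFP_RECLAIM and GFP_NOFS. */
#define X_GFP_RECLAIM	(X_GFP_DIRECT_RECLAIM | X_GFP_KSWAPD_RECLAIM)
#define X_GFP_NOFS	(X_GFP_RECLAIM | X_GFP_IO)

/* Non-zero if an allocation with these flags may enter direct reclaim. */
static int may_direct_reclaim(unsigned int gfp)
{
	return (gfp & X_GFP_DIRECT_RECLAIM) != 0;
}
```

With these definitions, may_direct_reclaim(X_GFP_IO) is 0 while
may_direct_reclaim(X_GFP_NOFS) is 1, which is the difference the commit
relies on.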

Change-Id: I2c84d9d74188660063c948573780745a2b59a688
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
HPE-bug-id: LUS-11293
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50176
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
13 months agoLU-16732 ldiskfs: update for ext4-delayed-iput for RHEL9.1 15/50615/2
Shaun Tancheff [Wed, 12 Apr 2023 12:54:15 +0000 (07:54 -0500)]
LU-16732 ldiskfs: update for ext4-delayed-iput for RHEL9.1

The ext4-delayed-iput patch does not apply cleanly to the RHEL9.1
kernel.

Resolve the minor conflict in ext4_put_super().

Test-Parameters: trivial
Fixes: 616fa9b581 ("LU-15404 ldiskfs: use per-filesystem workqueues to avoid deadlocks")
HPE-bug-id: LUS-11570
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Id286904177cb444aa12e3c16e134d5acc17030f3
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50615
Reviewed-by: jsimmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
13 months agoLU-930 doc: update Alex Deiter contact info 58/50458/3
Alex Deiter [Tue, 28 Mar 2023 17:05:42 +0000 (21:05 +0400)]
LU-930 doc: update Alex Deiter contact info

Update my email address to my corporate email address.

Test-Parameters: trivial
Change-Id: Ic540c5a289e2862212cc0164155b5f85cbc0d96c
Signed-off-by: Alex Deiter <adeiter@tintri.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50458
Reviewed-by: Peter Jones <pjones@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16646 krb: improve lookup of user's credentials 77/50377/7
Sebastien Buisson [Wed, 22 Mar 2023 16:09:58 +0000 (17:09 +0100)]
LU-16646 krb: improve lookup of user's credentials

Rather than only looking up the user's credentials in the hard-coded
FILE:/tmp/krb5cc_<uid>, first try the default ccache on the system,
and then fall back to files matching /tmp/*krb5cc* and
/run/user/<uid>/*krb5cc*.
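
The fallback step can be sketched in C with glob(3). This is a minimal
illustration, not the patch itself: the function names and the 64-byte
pattern buffers are hypothetical, and the first step (asking the Kerberos
library for the default ccache) is left out so the example needs no
Kerberos headers.

```c
#include <assert.h>
#include <glob.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

#define PAT_LEN 64

/* Build the two fallback patterns; returns 0 on success, -1 on truncation. */
static int ccache_fallback_patterns(uid_t uid, char pat[2][PAT_LEN])
{
	int n;

	n = snprintf(pat[0], PAT_LEN, "/tmp/*krb5cc*");
	if (n < 0 || n >= PAT_LEN)
		return -1;
	n = snprintf(pat[1], PAT_LEN, "/run/user/%u/*krb5cc*",
		     (unsigned int)uid);
	if (n < 0 || n >= PAT_LEN)
		return -1;
	return 0;
}

/* Count the files matching either fallback pattern for this uid. */
static size_t ccache_count_candidates(uid_t uid)
{
	char pat[2][PAT_LEN];
	size_t count = 0;
	glob_t g;
	int i;

	if (ccache_fallback_patterns(uid, pat) != 0)
		return 0;
	for (i = 0; i < 2; i++) {
		/* Zero the result first so globfree() is safe on any outcome. */
		memset(&g, 0, sizeof(g));
		if (glob(pat[i], 0, NULL, &g) == 0)
			count += g.gl_pathc;
		globfree(&g);
	}
	return count;
}
```

A real lookup would also need to check that each candidate file belongs to
the requesting user and holds usable credentials before selecting it; the
sketch only enumerates names.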

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ic2bedb4cc12e9adad0ce63bd0617b2e0ec13907e
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50377
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Jonathan Calmels <jcalmels@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months agoLU-16646 krb: use system ccache for Lustre services 42/50342/7
Sebastien Buisson [Fri, 17 Mar 2023 16:30:07 +0000 (17:30 +0100)]
LU-16646 krb: use system ccache for Lustre services

Instead of hard-coding a FILE credentials cache for Lustre services,
comply with the system configuration in place and use the default
ccache.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ib89fe117e9f1d937925a02c7ed786a81cd8954cb
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50342
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Jonathan Calmels <jcalmels@nvidia.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>