Whamcloud - gitweb
fs/lustre-release.git
22 months ago LU-16633 obdclass: fix rpc slot leakage
Alex Zhuravlev [Fri, 10 Mar 2023 17:47:05 +0000 (20:47 +0300)]
LU-16633 obdclass: fix rpc slot leakage

obd_get_mod_rpc_slot() can race with obd_put_mod_rpc_slot():
finishing wait_woken() resets WQ_FLAG_WOKEN (which is set
when the corresponding thread gets a slot, incrementing
cl_mod_rpcs_in_flight). Then another thread executing
__wake_up_locked_key() may find that wq_entry again and call
claim_mod_rpc_function() one more time, again incrementing
cl_mod_rpcs_in_flight. Thus it is incremented twice for a
single obd_get_mod_rpc_slot().

  #1: obd_get_mod_rpc_slot()      #2: obd_put_mod_rpc_slot()
    flags &= ~WQ_FLAG_WOKEN
    list_add()
    wait_woken()
      schedule
                                    claim_mod_rpc_function()
                                      cl_mod_rpcs_in_flight++
                                      wake_up()
    flags &= ~WQ_FLAG_WOKEN
                                  #3: obd_put_mod_rpc_slot()
                                    claim_mod_rpc_function()
                                      cl_mod_rpcs_in_flight++
                                      wake_up()
    list_del()

The patch introduces a replacement for WQ_FLAG_WOKEN which is never
reset once set.
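
The double accounting can be illustrated with a toy model. This is only a
sketch: WaitEntry, claim() and run() are hypothetical stand-ins for the
wait-queue entry and claim_mod_rpc_function(), and no real locking or
scheduling is modeled.

```python
class WaitEntry:
    """Hypothetical stand-in for one queued waiter."""

    def __init__(self):
        self.woken = False  # models WQ_FLAG_WOKEN, or its sticky replacement


def claim(entry, counters):
    """Waker side: grant a slot unless this entry already holds one."""
    if entry.woken:
        return False
    counters["in_flight"] += 1
    entry.woken = True
    return True


def run(reset_flag_after_wakeup):
    counters = {"in_flight": 0}
    entry = WaitEntry()          # thread #1 is queued and sleeping
    claim(entry, counters)       # thread #2 releases a slot and wakes #1
    if reset_flag_after_wakeup:
        entry.woken = False      # models wait_woken() clearing the flag
    claim(entry, counters)       # thread #3 finds the entry still queued
    return counters["in_flight"]


leaky = run(reset_flag_after_wakeup=True)   # slot counted twice
fixed = run(reset_flag_after_wakeup=False)  # sticky flag: counted once
```

With the flag reset, the single waiter is charged two slots; with a flag
that is never cleared, the second claim attempt is a no-op.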

Lustre-change: https://review.whamcloud.com/50261
Lustre-commit: 91a3726f313df33e099320d171039f8371fec27f

Fixes: 5243630b09 ("LU-15947 obdclass: improve precision of wakeups for mod_rpcs")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I29371c8c85414413c5a8e41dec3632f64ad127bb
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51658
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
22 months ago LU-15947 obdclass: improve precision of wakeups for mod_rpcs
Mr NeilBrown [Mon, 21 Jun 2021 03:25:42 +0000 (13:25 +1000)]
LU-15947 obdclass: improve precision of wakeups for mod_rpcs

There is a limit on the number of in-flight mod rpcs with a
complication that a 'close' rpc is always permitted if there are no
other close rpcs in flight, even if that would exceed the limit.

When a non-close request completes, we just wake the first waiting
request and assume it will use the slot we released.  When a
close request completes, the first waiting request may not find a slot
if the close was using the 'extra' slot.  So in that case we wake all
waiting requests and let them fight it out.  This is wasteful and
unfair.

To correct this we revise the wait/wake approach to use a dedicated
wakeup function which atomically checks if a given task can proceed,
and updates the counters when permission to proceed is given.  This
means that once a task has been woken, it has already been accounted
and it can proceed.

To minimise locking, cl_mod_rpcs_lock is discarded and
cl_mod_rpcs_waitq.lock is used to protect the counters.  For the
fast-path where the max has not been reached, this means we take and
release that spinlock just once.  We call wake_up_locked while still
holding the lock, and if that woke the process, then we don't drop the
spinlock to wait, but proceed directly to the remainder of the task.

When the last 'close' rpc completes, the wake function will iterate
the whole wait queue until it finds a task waiting to submit a close
request.  When any other rpc completes, the queue will only be
searched until the maximum is reached.
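
A rough model of the accounting described above, as a sketch only. The
class and method names are hypothetical, the wait queue is a plain deque,
and for simplicity put() always scans the whole queue instead of stopping
once the maximum is reached.

```python
from collections import deque


class ModRpcSlots:
    """Toy model: counters are updated at the moment a waiter is woken."""

    def __init__(self, max_in_flight):
        self.max_in_flight = max_in_flight
        self.in_flight = 0
        self.close_in_flight = 0
        self.waiters = deque()  # (name, is_close) in arrival order

    def _try_claim(self, is_close):
        # A slot is available below the limit, or for a close request
        # when no other close request is in flight.
        ok = (self.in_flight < self.max_in_flight
              or (is_close and self.close_in_flight == 0))
        if ok:
            self.in_flight += 1
            if is_close:
                self.close_in_flight += 1
        return ok

    def get(self, name, is_close=False):
        if self._try_claim(is_close):
            return True
        self.waiters.append((name, is_close))
        return False

    def put(self, is_close=False):
        self.in_flight -= 1
        if is_close:
            self.close_in_flight -= 1
        woken, remaining = [], deque()
        for name, waiter_is_close in self.waiters:
            if self._try_claim(waiter_is_close):
                woken.append(name)   # already accounted when woken
            else:
                remaining.append((name, waiter_is_close))
        self.waiters = remaining
        return woken


slots = ModRpcSlots(max_in_flight=2)
slots.get("a")
slots.get("b")
slots.get("close1", is_close=True)  # granted via the extra close slot
slots.get("c")                      # queued
slots.get("close2", is_close=True)  # queued: a close is already in flight
woken = slots.put(is_close=True)    # close1 completes
```

In this run only the waiting close request is woken when close1 completes;
the regular waiter stays queued because the limit is still reached.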

Lustre-change: https://review.whamcloud.com/44041
Lustre-commit: 5243630b09d22e0b576d81390d604774881f63f7

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Iff094c3188a3bd8a04edc1d5d98ec3014e2b059b
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51657
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
22 months ago LU-15398 tests: Use remote peers for health tests
Chris Horn [Tue, 4 Jan 2022 20:42:26 +0000 (14:42 -0600)]
LU-15398 tests: Use remote peers for health tests

LNet health may take different action depending on whether a NID
belongs to the local host or a remote peer. As such, the test cases
need to be careful to use remote or local NIs appropriately.

Introduce helper functions to create and cleanup LNet peers that are
needed for these tests. Convert existing test cases to use the new
helpers.

A new function, lnet_if_list(), is added to test-framework.sh to
facilitate configuration of remote interfaces. do_rpc_nodes() is
modified to recognize the '--quiet' flag to ease parsing of
lnet_if_list() output.

Tests 204 and 206 were reworked to check the health state after each
simulated error. lnet_health_post() is modified to reset peer and
local NI health so they are at the max value when each error condition
is simulated.

Tests 214, 215, and 250 were using hardcoded "eth0" names. These were
switched to use the INTERFACES variable.

The lnet_recovery_limit parameter is deprecated so remove lines that
were setting that parameter.

Lustre-change: https://review.whamcloud.com/45975
Lustre-commit: 3166a201e0a5cbc173ca110f64dc21f32ec10c8c

Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-10661
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I685fda8a84bcce024a765ddfc81c085acf24607a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51682
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
22 months ago LU-14773 tests: quiet down some verbose messages
Andreas Dilger [Fri, 18 Jun 2021 21:35:29 +0000 (15:35 -0600)]
LU-14773 tests: quiet down some verbose messages

Don't print anything into the test logs for normal background
operations that are run as part of run_one(), so that they
don't clutter the test output with repeated/useless messages.

Lustre-change: https://review.whamcloud.com/44034
Lustre-commit: 86f16910645d9d9cad17c0f53ca1a375121e3f4c

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib6a49fc268e4cd0ad92c71a391865ce2d73ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51686
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
22 months ago LU-16916 tests: fix client_evicted() not to ignore EOPNOTSUPP
Jian Yu [Sat, 15 Jul 2023 14:36:13 +0000 (22:36 +0800)]
LU-16916 tests: fix client_evicted() not to ignore EOPNOTSUPP

After a RHEL 9.x or Ubuntu 22.04 client is evicted, "lfs df" returns
error code 95 (EOPNOTSUPP), which is ignored in check_lfs_df_ret_val()
and then causes client_evicted() to ignore that error.

This patch fixes client_evicted() to check the return value
from "lfs df" directly so as not to ignore EOPNOTSUPP.
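
The difference can be sketched as follows. This is a Python model of the
shell helpers, not their actual code: the function bodies are assumptions
based only on the description above, and the _old/_new names are
hypothetical.

```python
EOPNOTSUPP = 95  # error code quoted in the commit message


def check_lfs_df_ret_val(rc):
    """Compatibility helper: report EOPNOTSUPP as success."""
    return 0 if rc == EOPNOTSUPP else rc


def client_evicted_old(lfs_df_rc):
    """Before the fix: the filtered value hides error 95."""
    return check_lfs_df_ret_val(lfs_df_rc) != 0


def client_evicted_new(lfs_df_rc):
    """After the fix: the raw "lfs df" return value is checked."""
    return lfs_df_rc != 0
```

With a return code of 95, the old check reports "not evicted" while the
new one reports "evicted"; both agree when "lfs df" returns 0.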

Lustre-change: https://review.whamcloud.com/51667
Lustre-commit: a5a9ded43b72238c2df8e0a74f03151ea3d4ce99

Test-Parameters: trivial clientdistro=el9.2 testlist=replay-vbr
Test-Parameters: trivial clientdistro=el8.8 testlist=replay-vbr
Test-Parameters: trivial clientdistro=ubuntu2204 testlist=replay-vbr

Change-Id: I633ae8769fc563b8068f433e2afae29463ac5553
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51691
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
22 months ago LU-8962 lfs: Handle non-lustre and multiple args
Arshad Hussain [Sat, 15 Jul 2023 14:25:54 +0000 (22:25 +0800)]
LU-8962 lfs: Handle non-lustre and multiple args

This patch addresses:

01: Handle multiple filesystems provided to 'lfs df'.
02: Correctly report 'EOPNOTSUPP' for filesystems which
    are non-Lustre.
03: Make changes to test-framework.sh to handle the modified
    return value from 'lfs df'. For compatibility reasons,
    this change ignores EOPNOTSUPP and masquerades it as success.

The final return value is 0 if _all_ arguments succeed, or the
value of the first failure if even a single failure is seen
during argument processing.
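
The return value rule can be written down as a short sketch. The function
name is hypothetical and the input is simply the list of per-argument
results in command-line order.

```python
def lfs_df_exit_status(per_path_rc):
    """Return 0 if every path succeeded, else the first non-zero code."""
    for rc in per_path_rc:
        if rc != 0:
            return rc
    return 0
```

For example, results of [0, 95, 2] give 95, because the first failure
wins even though a later argument failed with a different code.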

sanity/56e Test-case added.

Lustre-change: https://review.whamcloud.com/42126
Lustre-commit: 2d714041ba718853be700960b76769a8fb44cf51

LU-15465 tests: conf-sanity failed with code 95

conf-sanity tests 27b, 47 and 84 (the tests execute 'fail mds1' and
then 'cleanup' at the end of test) failed with code EOPNOTSUPP because
of 'set -e' and lfs df <non_lustre> return code 95.
The scenario:
test_27b () {
  facet_failover $SINGLEMDS
    change_active mds1
  ...
  cleanup -> umount_client $MOUNT
}
formatall
  stopall
    activemds=`facet_active mds1`
    if [ $activemds != "mds1" ]; then
       fail mds1
         clients_up
           lfs_df_check
             + local clients=fre0111,fre0112
             + local rc
             + [ -z fre0111,fre0112 ]
             + pdsh -S -w fre0111,fre0112
                 /usr/bin/lfs df /mnt/lustre << lustre not mounted
pdsh@fre0111: fre0111: ssh exited with exit code 95
pdsh@fre0111: fre0112: ssh exited with exit code 95

To reproduce the issue just run:
  ONLY="27b" sh conf-sanity.sh or:
  ONLY="47" sh conf-sanity.sh or:
  ONLY="84" sh conf-sanity.sh

Lustre-change: https://review.whamcloud.com/46236
Lustre-commit: 242fc2ccbacaf171159a20d59c9633707d8fbf66

Fixes: 2d714041ba ("LU-8962 lfs: Handle non-lustre and multiple args")
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-10680

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I73287d21792d89b8cde672acdaf9c9caf829522f
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51690
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
22 months ago LU-16651 llite: hold invalidate_lock when invalidate cache pages
Qian Yingjin [Tue, 21 Mar 2023 08:53:00 +0000 (04:53 -0400)]
LU-16651 llite: hold invalidate_lock when invalidate cache pages

Newer kernels (such as the one in Ubuntu 22.04) introduce a new
member, invalidate_lock, in struct address_space.
The filesystem must acquire invalidate_lock exclusively before
invalidating page cache in truncate / hole punch (and thus calling
into ->invalidatepage) to block races between page cache
invalidation and page cache filling functions (fault, read, ...).

However, the current Lustre client does not hold this lock when
removing pages from the page cache due to the revocation of the
extent DLM lock protecting them.
If a client has two overlapping PR DLM extent locks, i.e.:
- L1 = <PR, [1M, 4M - 1]>
- L2 = <PR, [3M, 5M - 1]>
A reader process holds L1 and reads data in range [3M, 4M - 1].
L2 is being revoked due to a conflicting access.
Then the pages read in by the reader may be invalidated and deleted
from the page cache by the revocation of L2 (in the lock blocking AST).
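
A small worked computation of the range at risk in this example. The
helper name is hypothetical, and the 4 KiB page size is an assumption used
only to count pages.

```python
MiB = 1 << 20
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def overlap(a, b):
    """Return the inclusive byte range shared by two extents, or None."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


L1 = (1 * MiB, 4 * MiB - 1)
L2 = (3 * MiB, 5 * MiB - 1)
shared = overlap(L1, L2)
pages_at_risk = (shared[1] - shared[0] + 1) // PAGE_SIZE
```

The shared range is [3M, 4M - 1], exactly the range the reader is working
on under L1, so revoking L2 can drop pages that are still in use.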

Older kernels check each page after the read to see whether it was
invalidated and deleted from the page cache. If so, they retry the
page read.

Newer kernels remove this check and retry.
Instead, they introduce a new rw_semaphore in the address_space,
invalidate_lock. It is held shared to protect adding pages to the
page cache for page faults / reads / readahead, and held exclusive
to protect invalidating pages and removing them from the page cache
for truncate / hole punch.

Thus, this patch holds invalidate_lock exclusively in newer kernels
when removing pages from the page cache due to the revocation of
an extent DLM lock protecting them. Otherwise, the newly added test
case sanity/833 fails with -EIO errors or partial reads.

Lustre-change: https://review.whamcloud.com/50371
Lustre-commit: bba59b1287c9cd8c30a85fafb4fd5788452bd05c

Test-parameters: clientdistro=ubuntu2204 testlist=sanity env=ONLY=833,ONLY_REPEAT=10
Change-Id: If3a27002b89636b9fd4d7b5ea50afa9aeac5d121
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/50353
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
22 months ago LU-16913 quota: fix ASSERTION(lqe->lqe_gl)
Sergey Cheremencev [Tue, 11 Jul 2023 14:28:12 +0000 (18:28 +0400)]
LU-16913 quota: fix ASSERTION(lqe->lqe_gl)

It is possible to add an lqe to qmt_reba_list a second time while
handling of its first addition is not yet finished. There is a
small window in qmt_id_lock_glimpse when lqe_link is empty but
lqe_gl is not set.

Lustre-change: https://review.whamcloud.com/51629
Lustre-commit: 5df0459712fe1af2bc9459b4ce1b5a1220682c26

Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I1168903bff88df7e5106186b082e8065a6480367
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51685
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
22 months ago EX-7423 tests: skip recovery-mds-scale/failover_mds
Alex Deiter [Wed, 19 Jul 2023 18:52:35 +0000 (22:52 +0400)]
EX-7423 tests: skip recovery-mds-scale/failover_mds

Add recovery-mds-scale/failover_mds to the always_except
list until LU-16671 has been fixed.

Test-Parameters: trivial testlist=recovery-mds-scale
Signed-off-by: Alex Deiter <adeiter@tintri.com>
Change-Id: Id6bf929a71c3e7d6a190d4c971120f0b93159393
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51718
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
22 months ago EX-7938 tests: skip sanity-compr sanity/100
Andreas Dilger [Sat, 22 Jul 2023 17:27:49 +0000 (11:27 -0600)]
EX-7938 tests: skip sanity-compr sanity/100

Skip test_100 due to constant failures and unclear relation to
anything related to PFL/compression.

Test-Parameters: trivial testlist=sanity-compr
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic426498c120f8d97def44fd2531930e0a183e74f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51746
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months ago EX-6826 tests: wait before unmounting pcc device if busy
Lei Feng [Tue, 11 Jul 2023 03:13:01 +0000 (11:13 +0800)]
EX-6826 tests: wait before unmounting pcc device if busy

Wait at most 10 seconds before unmounting the PCC device
if it is busy.
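
The bounded wait can be sketched like this. It is a generic polling loop,
not the test script itself: wait_until_idle() and its parameters are
hypothetical, and the clock and sleep functions are injectable so the
example runs without real delays.

```python
import time


def wait_until_idle(is_busy, timeout=10.0, interval=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll is_busy() until it returns False or the timeout expires."""
    deadline = clock() + timeout
    while is_busy():
        if clock() >= deadline:
            return False  # still busy after the timeout
        sleep(interval)
    return True


# Demonstration with a fake clock: the device is busy for three checks.
state = {"now": 0.0, "checks": 0}


def fake_clock():
    return state["now"]


def fake_sleep(seconds):
    state["now"] += seconds


def busy_for_three_checks():
    state["checks"] += 1
    return state["checks"] <= 3


became_idle = wait_until_idle(busy_for_three_checks,
                              sleep=fake_sleep, clock=fake_clock)
elapsed = state["now"]
never_idle = wait_until_idle(lambda: True,
                             sleep=fake_sleep, clock=fake_clock)
```

The caller can then unmount once the function returns, whether or not the
device became idle within the timeout.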

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=sanity-pcc env=ONLY=25,ONLY_REPEAT=100
Change-Id: I77ec018d33d14af99bdc5d5c5c94c8fa0dafdb61
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51623
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months ago LU-16953 tests: wait longer in replay-dual/test_31
Lei Feng [Mon, 10 Jul 2023 07:02:32 +0000 (15:02 +0800)]
LU-16953 tests: wait longer in replay-dual/test_31

Wait until the file has been created in replay-dual/test_31.

Lustre-change: https://review.whamcloud.com/51621
Lustre-commit: TBD (from eed10c2ef36c7e1aebec27ce943b80bd0174ddf0)

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial testlist=replay-dual env=ONLY=31,ONLY_REPEAT=100
Change-Id: I847beb51d53e667f1599c9693aa5eb099dcf9435
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51622
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months ago EX-7786 test: get size before fallback to generic
Mikhail Pershin [Tue, 4 Jul 2023 19:13:58 +0000 (22:13 +0300)]
EX-7786 test: get size before fallback to generic

If ll_file_seek() falls back to generic_file_llseek_size(),
then make sure the inode size is reliable.

Test-Parameters: testlist=sanity env=ONLY=460f,ONLY_REPEAT=4 clientdistro=el9.1
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ieb8c90e66fb19675ba41f0147c0d9cdaf29ea20a
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51564
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months ago LU-16943 tests: fix replay-single/135 under hard failure mode
Jian Yu [Fri, 14 Jul 2023 06:22:25 +0000 (14:22 +0800)]
LU-16943 tests: fix replay-single/135 under hard failure mode

This patch fixes replay-single test_135() to load libcfs module
on the failover partner node to avoid 'fail_val' setting error.
It also fixes the issue that not all of the OSTs are mounted after
failing back ost1.

Lustre-change: https://review.whamcloud.com/51574
Lustre-commit: TBD (from 1b73b6465b77744992bb1f6d782362bf0cf7f409)

Test-Parameters: trivial testlist=replay-single

Test-Parameters: trivial env=FAILURE_MODE=HARD \
    clientcount=4 mdtcount=1 mdscount=2 osscount=2 \
    austeroptions=-R failover=true iscsi=1 \
    testlist=replay-single

Change-Id: Id46c722a6db9d832829a739f41f7462b32a6d9d9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51607
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
23 months ago LU-16868 tests: add check for replace_nid
Alex Deiter [Fri, 30 Jun 2023 18:02:49 +0000 (22:02 +0400)]
LU-16868 tests: add check for replace_nid

Add a check for replace_nid operations and return an
error to prevent module reload errors and timeouts
when unmounting targets.

Lustre-change: https://review.whamcloud.com/51524
Lustre-commit: TBD (from a822c82326821c5c30e14d9620cd2976d5438714)

Test-Parameters: trivial
Test-Parameters: testlist=conf-sanity env=ONLY=32a serverdistro=el7.9
Test-Parameters: testlist=conf-sanity env=ONLY=32a serverdistro=el8.7
Signed-off-by: Alex Deiter <adeiter@tintri.com>
Change-Id: I29a5de826ac8f0040dd671e502d30bac4a082c43
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51604
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months ago LU-15046 tests: skip sanity-flr/test_200c for old MDS
Alex Deiter [Sun, 9 Jul 2023 12:35:53 +0000 (16:35 +0400)]
LU-15046 tests: skip sanity-flr/test_200c for old MDS

Skip sanity-flr test_200c for old MDS missing the fix
for synchronized replicas and its corresponding test.

Lustre-change: https://review.whamcloud.com/51649
Lustre-commit: TBD (from f152fe6e313843067c4c32299acf066d41896d61)

Fixes: b7ec0d2390 ("LU-15046 osp: precreate thread vs connect race")
Test-Parameters: trivial
Test-Parameters: testlist=sanity-flr env=ONLY=200c serverversion=2.14.0-ddn85
Test-Parameters: testlist=sanity-flr env=ONLY=200c serverversion=2.14.0-ddn89
Signed-off-by: Alex Deiter <adeiter@tintri.com>
Change-Id: I87cd7d6b767086f993a27ce6905b05f87e325474
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51613
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months ago LU-16341 tests: skip sanity-quota/test_1b for old MDS
Alex Deiter [Sun, 9 Jul 2023 14:01:43 +0000 (18:01 +0400)]
LU-16341 tests: skip sanity-quota/test_1b for old MDS

Skip sanity-quota test_1b for old MDS missing the fix
for LU-16341 kernel NULL in qmt_site_recalc_cb.

Lustre-change: https://review.whamcloud.com/51648
Lustre-commit: TBD (from a47128600fce1dd5135af610d0b31dafe1baa9d0)

Fixes: d965d63415 ("LU-16341 quota: fix panic in qmt_site_recalc_cb")
Test-Parameters: trivial
Test-Parameters: testlist=sanity-quota env=ONLY=1b serverversion=2.14.0-ddn85
Test-Parameters: testlist=sanity-quota env=ONLY=1b serverversion=2.14.0-ddn87
Signed-off-by: Alex Deiter <adeiter@tintri.com>
Change-Id: I1b1bc3fdfa8f36b0c20a9a06721735c8e02c034c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51612
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months ago LU-16351 llite: Linux 6.1 prandom, folios_contig, vma_iterator
Shaun Tancheff [Fri, 27 Jan 2023 06:54:42 +0000 (00:54 -0600)]
LU-16351 llite: Linux 6.1 prandom, folios_contig, vma_iterator

Linux commit v4.10-rc3-6-gc440408cf690
  random: convert get_random_int/long into get_random_u32/u64
Linux commit v6.0-11338-gde492c83cae0
  prandom: remove unused functions

prandom_u32 is a wrapper around get_random_u32. Change users
of prandom_u32 to get_random_u32 and provide a fallback
to prandom_u32 when get_random_u32 is not available.

Linux commit v6.0-rc1-2-g25885a35a720
  Change calling conventions for filldir_t
Add a test for the new filldir_t signature
Provide wrappers for transition from int (error code) to bool

Linux commit v6.0-rc3-94-g35b471467f88
  filemap: add filemap_get_folios_contig()
Provide a wrapper and fallback to find_get_pages_contig

Linux commit v6.0-rc3-225-gf39af05949a4
  mm: add VMA iterator
Use vma_iterator and for_each_vma when available.

Lustre-change: https://review.whamcloud.com/49232
Lustre-commit: ca992899d55fd13e65b75ace02931daaa29c18bd

Test-Parameters: trivial
HPE-bug-id: LUS-11377
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I23dc23d0252e1995555b6685f5cf7c207edf642b
Reviewed-by: xinliang <xinliang.liu@linaro.org>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51628
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
23 months ago LU-16328 llite: migrate_folio, vfs_setxattr
Shaun Tancheff [Sun, 2 Apr 2023 11:47:51 +0000 (06:47 -0500)]
LU-16328 llite: migrate_folio, vfs_setxattr

Linux commit v5.19-rc3-392-g5490da4f06d1
 fs: Add aops->migrate_folio

Linux commit v5.19-rc4-52-ge33c267ab70d
  mm: shrinkers: provide shrinkers with names

From Linux commit v5.19-rc5-17-g0c5fd887d2bb
  acl: move idmapped mount fixup into vfs_{g,s}etxattr()
Until Linux commit v6.0-rc3-6-g6344e66970c6
  xattr: constify value argument in vfs_setxattr()
Cast away const when required.

Linux commit v5.19-10313-geba2d3d79829
  get rid of non-advancing variants
iov_iter_get_pages_alloc2() replaces iov_iter_get_pages_alloc()

Lustre-change: https://review.whamcloud.com/49265
Lustre-commit: 0006eb36440dcb4dc06aa61c35db40bf7dec0ddc

Test-Parameters: trivial
HPE-bug-id: LUS-11358
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Id1fa6db94172c0a61008ba4b066907950bdd6473
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: xinliang <xinliang.liu@linaro.org>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51624
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
23 months ago LU-16709 lnet: fix locking multiple peer NIDs
Serguei Smirnov [Tue, 20 Jun 2023 19:21:42 +0000 (12:21 -0700)]
LU-16709 lnet: fix locking multiple peer NIDs

If Lustre identifies the same peer with multiple NIDs,
as a result of peer discovery it is possible that
the discovered peer is found to contain a NID which is locked
as primary by a different existing peer record.
In this case it is safe to merge the peer records,
but the NID which got locked the earliest should be
kept as primary.

This allows for the first of the two locked NIDs
to stay primary as intended for the purpose of communicating
with Lustre even if peer discovery succeeded
using a different NID of MR peer.
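
The merge rule can be sketched as below. This is a model only: the Peer
class, the locked_at ordering stamp and the example NIDs are all
hypothetical, and the real peer records carry far more state.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    primary_nid: str
    locked_at: int  # hypothetical stamp: lower means locked earlier
    nids: list = field(default_factory=list)


def merge_peers(a, b):
    """Merge two records; the earliest-locked primary NID is kept."""
    keep, drop = (a, b) if a.locked_at <= b.locked_at else (b, a)
    merged = list(keep.nids)
    for nid in drop.nids:
        if nid not in merged:
            merged.append(nid)
    return Peer(keep.primary_nid, keep.locked_at, merged)


existing = Peer("10.0.0.1@tcp", locked_at=1, nids=["10.0.0.1@tcp"])
discovered = Peer("10.0.0.2@o2ib", locked_at=2,
                  nids=["10.0.0.2@o2ib", "10.0.0.1@tcp"])
merged = merge_peers(discovered, existing)
```

Here discovery succeeded through the second NID, yet the merged record
keeps the first locked NID as primary and lists both NIDs.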

This patch adds updates to the original port because the master
version of this change has evolved since the port landed.

Lustre-change: https://review.whamcloud.com/50530
Lustre-commit: 3b7a02ee4d656b7b3e044713681da2f56dddb152

Test-parameters: trivial testlist=sanity-lnet

Fixes: 1a2db3e14b78 ("EX-7251 lnet: fix locking multiple NIDs")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I3303e618b37a76c30be6426972e7853bb31ae497
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51384
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months ago LU-13343 gss: no sec flavor on loopback connection
Sebastien Buisson [Fri, 4 Mar 2022 15:45:59 +0000 (16:45 +0100)]
LU-13343 gss: no sec flavor on loopback connection

When using a local client, i.e. a client mounted on a server node,
there is no benefit from a security standpoint to enforce an SSK or
KRB flavor, since the data does not go over the network.
So force the 'null' security flavor for connections on 0@lo,
independently of the currently defined srpc flavor.
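
As a sketch of the rule, assuming a hypothetical helper that receives the
peer NID and the configured flavor as strings (the flavor names in the
usage are illustrative):

```python
LOOPBACK_NID = "0@lo"


def effective_flavor(peer_nid, configured_flavor):
    """Force 'null' on the loopback NID, else keep the configured flavor."""
    return "null" if peer_nid == LOOPBACK_NID else configured_flavor
```

So effective_flavor("0@lo", "skpi") yields "null", while a connection to
a remote NID keeps whatever flavor is configured.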

Lustre-change: https://review.whamcloud.com/46704
Lustre-commit: e3e91ea95fd96a5eafc598e3812390b4cbac05c3

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: If25d69bb1e67735cb0544ca954e49175f7471248
Reviewed-by: Aurelien Degremont <adegremont@nvidia.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51610
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months ago LU-16747 llapi: fix race in get_root_path_slow()
Etienne AUJAMES [Sat, 15 Jul 2023 11:29:30 +0000 (19:29 +0800)]
LU-16747 llapi: fix race in get_root_path_slow()

The patch bdf7788d ("LU-8585 llapi: use open_by_handle_at in
llapi_open_by_fid") caches the Lustre root fd to avoid re-opening
it each time an ioctl() is needed on the fs.

For now, only 1 entry is stored. If a llapi call is performed on
another mountpoint, llapi needs to close the old root fd and open a
new one.

A race condition exists at startup, when root_cached.fd is not
initialized yet. Several threads try to determine root information at
the same time (in get_root_path_slow()). Those threads will close(),
open() and update different "root_cached.fd".
The usage of a closed root fd will return EBADFD (e.g: in
llapi_open_by_fid(), llapi_hsm_request() or llapi_fid2path()).

This patch checks if the fs is the same before updating the root
entry. If so, the root entry (and cached root fd) will not be changed.
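
The check can be modeled as follows. This is a sketch under assumptions:
RootCache and its fields are hypothetical, the lock is an assumption of
the model rather than a statement about liblustreapi, and closing the old
fd is only recorded, not performed.

```python
import threading


class RootCache:
    """Single-entry cache of (fsname, root fd)."""

    def __init__(self):
        self._lock = threading.Lock()  # assumption of this model
        self.fsname = None
        self.fd = None
        self.closed = []  # fds this model would have closed

    def update(self, fsname, open_root):
        with self._lock:
            if self.fsname == fsname and self.fd is not None:
                return self.fd           # same fs: keep the cached fd
            if self.fd is not None:
                self.closed.append(self.fd)  # stands in for close(fd)
            self.fsname, self.fd = fsname, open_root()
            return self.fd


opened = []


def fake_open():
    opened.append(len(opened) + 3)  # pretend fd numbers start at 3
    return opened[-1]


cache = RootCache()
threads = [threading.Thread(target=cache.update, args=("lustre", fake_open))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All eight threads end up sharing the one fd opened by whichever thread
got there first, and nothing is closed behind another thread's back.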

Add the regression test sanityn 85 (llapi_root_test).

Lustre-change: https://review.whamcloud.com/50682
Lustre-commit: 9ef1e097d53000233f9ba23319268f467c276173

Test-Parameters: trivial testlist=sanityn env=ONLY=85,ONLY_REPEAT=20
Test-Parameters: testlist=sanityn
Test-Parameters: testlist=sanity

Fixes: bdf7788d ("LU-8585 llapi: use open_by_handle_at in llapi_open_by_fid")
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I681aac7d5715022e700cdb092db94deaa6bf6a8f
Reviewed-by: Guillaume Courrier <guillaume.courrier@cea.fr>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51689
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months ago LU-16101 tests: skip sanity/27J for more kernels
Andreas Dilger [Wed, 5 Jul 2023 20:32:10 +0000 (14:32 -0600)]
LU-16101 tests: skip sanity/27J for more kernels

This is a bug in the kernel that is not present in older kernels
before commit v5.11-10234-gcbd59c48ae2b (5.12), and is fixed with
commit v6.2-rc4-61-g5956592ce337 (6.2).

Move this from ALWAYS_EXCEPT (bug that needs to be fixed) to skip
(test that is known to fail in some configs but has been fixed).
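
The affected range can be expressed as a version check. This is only a
sketch: the function names are hypothetical, the parsing looks at the
upstream major.minor prefix, and distro kernels with backported fixes
would need separate handling.

```python
import re


def kernel_version(release):
    """Extract (major, minor) from a string such as '5.14.0-284.el9'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def should_skip_27J(release):
    """Affected from 5.12 (bug introduced) up to, not including, 6.2."""
    return (5, 12) <= kernel_version(release) < (6, 2)
```

A 4.18 or 5.11 kernel predates the bug and a 6.2 kernel has the fix, so
only releases in between are skipped.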

Lustre-change: https://review.whamcloud.com/51567
Lustre-commit: b711af7d243f3773cec3a37f64c0e0aa8bbc363f

Fixes: af6f49698a18 ("LU-16101 tests: add sanity/27J to always_except")
Test-Parameters: trivial testlist=sanity clientdistro=el9.2 env=ONLY=27J
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8ec0a6d25a90e05672b039cd6c2b2fbf8a3ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51583
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months ago LU-15519 quota: fallocate does not increase projectid usage
Arshad Hussain [Mon, 14 Feb 2022 08:36:47 +0000 (14:06 +0530)]
LU-15519 quota: fallocate does not increase projectid usage

fallocate() was not accounting for projectid quota usage.
This was happening for two reasons: 1) the projectid
was not properly passed to md_op_data in ll_set_project()
and 2) the OBD_MD_FLPROJID flag was not set to receive the
projectid.

This patch addresses the above reasons.

Test-case: sanity-quota/78a added

Lustre-change: https://review.whamcloud.com/46676
Lustre-commit: 5fc934ebbbe665f24e2f11fe224065dd8e9a08ba

Fixes: 48457868a02a ("LU-3606 fallocate: Implement fallocate preallocate operation")
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I3ed44e7ef7ca8fe49a08133449c33b62b1eff500
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51639
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months ago RM-620 build: New tag 2.14.0-ddn93
Andreas Dilger [Thu, 6 Jul 2023 05:21:01 +0000 (23:21 -0600)]
RM-620 build: New tag 2.14.0-ddn93

New tag 2.14.0-ddn93

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic6530f389228232cf1f763810d2abcbebc9b51d7

23 months ago LU-16937 utils: avoid lctl shmget() if not needed
Andreas Dilger [Fri, 30 Jun 2023 10:29:49 +0000 (04:29 -0600)]
LU-16937 utils: avoid lctl shmget() if not needed

lctl is dynamically allocating an IPC shared memory segment
during every startup, even though it is only needed for a
small number of uncommon debug commands:

    shmget(IPC_PRIVATE, 65680, 0600)        = 196641
    shmat(196641, NULL, 0)                  = 0x7f752b1c5000
    shmctl(196641, IPC_RMID, NULL)          = 0

This setup can be moved to sub-commands that actually need it.
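
The idea is ordinary lazy initialization. A sketch, assuming a
hypothetical DebugBuffer wrapper and using a bytearray in place of the
shared memory segment:

```python
class DebugBuffer:
    """Allocate the segment on first use instead of at startup."""

    def __init__(self, allocate):
        self._allocate = allocate
        self._segment = None

    def segment(self):
        if self._segment is None:
            self._segment = self._allocate()
        return self._segment


allocations = []


def fake_shm_alloc():
    allocations.append(65680)  # size taken from the trace above
    return bytearray(65680)


buf = DebugBuffer(fake_shm_alloc)
at_startup = len(allocations)  # a common command never touches the buffer
buf.segment()                  # a debug command needs it
buf.segment()                  # later uses share the same segment
```

Commands that never call segment() pay nothing, and the commands that do
call it share a single allocation.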

Lustre-change: https://review.whamcloud.com/51526
Lustre-commit: TBD (from 309713169fde9e162c26e909bc83cb43cccd67ba)

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I41c790ce7cba2d9c48c1ec06eb23eb94aa548242
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51516
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months ago LU-16934 kernel: update RHEL 8.8 [4.18.0-477.15.1.el8_8]
Jian Yu [Fri, 30 Jun 2023 11:32:03 +0000 (19:32 +0800)]
LU-16934 kernel: update RHEL 8.8 [4.18.0-477.15.1.el8_8]

Update RHEL 8.8 kernel to 4.18.0-477.15.1.el8_8.

Lustre-change: https://review.whamcloud.com/51517
Lustre-commit: TBD (from 7174f706328cb4e6a52c898c1cd7719b81e26c0d)

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Change-Id: I66365dce63065a0a07958a182a3c705e9948d424
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51519
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months ago EX-7683 utils: always try to use our own lz4/lz4hc
Sebastien Buisson [Tue, 27 Jun 2023 15:50:03 +0000 (15:50 +0000)]
EX-7683 utils: always try to use our own lz4/lz4hc

lz4/lz4hc provided by the kernel do not grok a compression level.
The built-in lz4/lz4hc do, so always build them as dedicated kernel
modules llz4.ko and llz4hc.ko, with the same .cra_name but with a
slightly higher .cra_priority = 110, so that they are preferred over
the in-kernel modules if any.

And try to manually load the llz4/llz4hc kernel modules when a file
requires compression with the corresponding alg. This is a "one-shot"
try that allows us to prefer our modules that have level support, but
continues to at least compress/decompress files even if our own
modules are not available.
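
As a rough illustration of the selection rule described above (this is
not the kernel Crypto API), the Python sketch below picks, among
implementations registered under the same name, the one with the highest
priority. The driver names and the lower priority value of 100 are
assumptions made for the example; only the value 110 comes from this
commit message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alg:
    name: str      # generic algorithm name, e.g. "lz4"
    driver: str    # hypothetical driver name, for illustration only
    priority: int  # plays the role of .cra_priority


def pick(registry, name):
    """Return the highest-priority implementation registered under name, or None."""
    candidates = [alg for alg in registry if alg.name == name]
    return max(candidates, key=lambda alg: alg.priority, default=None)


registry = [
    Alg("lz4", "lz4-in-kernel", 100),  # assumed lower priority
    Alg("lz4", "lz4-lustre", 110),     # priority quoted in this commit
]
```

If the higher-priority entry is absent from the registry, the same lookup
falls back to the remaining implementation, which mirrors the "continue to
compress/decompress anyway" behaviour described above.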

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I0bdf267f998e21df81e460250a653aed34e3215d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51474
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoLU-8582 tests: skip sanity/905 for old OSTs
Andreas Dilger [Wed, 5 Jul 2023 01:04:10 +0000 (19:04 -0600)]
LU-8582 tests: skip sanity/905 for old OSTs

The fail_loc used in sanity test_905 does not exist in older OSTs.
Skip this subtest for older OSTs.

Lustre-change: https://review.whamcloud.com/51568
Lustre-commit: TBD (from 2ced1e0898aacd741c95c25d44350dfefa953853)

Fixes: 566edb8c43 ("LU-8582 target: send error reply on wrong opcode")
Test-Parameters: trivial testlist=sanity serverversion=2.12.9 env=ONLY=905
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8fa50ec0f66afd9f24d562e0be57a416c04d8ba8
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51569
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months agoLU-15235 tests: skip sanity/56od in interop
Andreas Dilger [Wed, 5 Jul 2023 20:13:42 +0000 (14:13 -0600)]
LU-15235 tests: skip sanity/56od in interop

Sanity test_56oc and test_56od were using the btime_supported()
function to check if "lfs find" supported file birth time, but
this did not properly check whether the MDS supported this option.

Remove the btime_supported() check and just use the version, since
this has been around a few releases already.

Lustre-change: https://review.whamcloud.com/51580
Lustre-commit: TBD (from 8add332bda0c58d9908478b9263e8aea46edc135)

Fixes: 186b97e68abb ("LU-11971 utils: Send file creation time to clients")
Test-Parameters: trivial testlist=sanity serverversion=2.12.9 env=ONLY=56
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0c85103c843d3b993e3e112bf5d0da976d3ebbe5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51581
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months agoEX-7814 tests: fixed shell syntax error
Alex Deiter [Tue, 4 Jul 2023 10:30:57 +0000 (14:30 +0400)]
EX-7814 tests: fixed shell syntax error

Fixed a regression introduced in LU-16399.
Auster assumes that all scripts are written in bash,
so this commit removes hardcoded sh calls.

Test-Parameters: trivial testlist=parallel-scale-nfsv4
Fixes: ea13f42719 ("LU-16399 tests: add subtests setup/cleanup records")
Signed-off-by: Alex Deiter <adeiter@tintri.com>
Change-Id: I20d136db3b763143df038d25a167eb84f646444f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51556
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-7758 test: Add sanity-compr.sh to run sanity with PFL layout
Wei Liu [Mon, 26 Jun 2023 20:33:21 +0000 (13:33 -0700)]
EX-7758 test: Add sanity-compr.sh to run sanity with PFL layout

Add sanity-compr.sh to run sanity and sanityn with PFL layout.
Also fix problems in sanity subtests 56ba, 57b, 65e, 65g, 65n and 204e.

Test-Parameters: trivial testlist=sanity-compr

Signed-off-by: Wei Liu <sarah@whamcloud.com>
Change-Id: Iefdc7757697629eb5c57d7694456249d62a2049e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51577
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoLU-15934 lod: clear up the message
Yang Sheng [Thu, 29 Dec 2022 17:46:56 +0000 (01:46 +0800)]
LU-15934 lod: clear up the message

Print precise information when an llog context error occurs.

Lustre-change: https://review.whamcloud.com/49528
Lustre-commit: 9882d4e933fd8cdbc4a9bc8bf6b29655009f7e03

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I492201cd3ae5eb39ad34f3a873d7bb346b52430f
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51555
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months agoEX-6856 utils: add -Z option for 'lfs getstripe/find'
Bobi Jam [Wed, 28 Jun 2023 16:51:25 +0000 (00:51 +0800)]
EX-6856 utils: add -Z option for 'lfs getstripe/find'

Add support for "lfs getstripe -Z" to get the last instantiated
component compression information.

Add support for "lfs find -Z <type>[:[+-]<level>]" to keep consistent
options with "lfs setstripe -Z".

Fixes: 093bd2f343 ("EX-6856 utils: add 'lfs find' support for compressed file")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Id3e788761656d05604bc9a72fb1e51c5f2a0ad3b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51497
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoLU-15934 tests: add a test case for update llog
Yang Sheng [Sat, 3 Jun 2023 18:47:30 +0000 (02:47 +0800)]
LU-15934 tests: add a test case for update llog

A test case to simulate the update llog corruption
situation. It replaces the catalog file with
random data. Without the fix, recovery of the MDT
would be blocked.

Lustre-change: https://review.whamcloud.com/51208
Lustre-commit: 54301fe4f598eef5aebdbdb0c7f3dddea9541c4e

Fixes: 814691bcff ("LU-15934 lod: renew the update llog")
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I0ade8d0ff33ddc06b622e5e67cf4b4775dfff129
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51570
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months agoLU-15934 lod: renew the update llog
Yang Sheng [Fri, 6 Jan 2023 13:10:35 +0000 (21:10 +0800)]
LU-15934 lod: renew the update llog

Skip and renew the update llog file if it is
corrupted.

Lustre-change: https://review.whamcloud.com/49569
Lustre-commit: 814691bcffab0a19240121740fb85a1912886a3c

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I3491858dce42b4a8ed11db55ebbf8a12ef5f521d
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51552
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months agoEX-6127 tests: test compression without bzip2/HDF5
Andreas Dilger [Sat, 1 Jul 2023 23:57:26 +0000 (17:57 -0600)]
EX-6127 tests: test compression without bzip2/HDF5

Run as much of sanity test_460a compression tests as possible,
even if bunzip2 or the HDF5 file is unavailable.

Print a clear message in test_84 if bunzip2 is unavailable.

Fixes: f43b9ce9af ("EX-6127 osc: osc brw request compression")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I36251834f636600eb9b0194ccd14c8b203da32e5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51532
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months agoLU-16340 quota: notify only global lqe
Sergey Cheremencev [Mon, 25 Apr 2022 18:49:55 +0000 (21:49 +0300)]
LU-16340 quota: notify only global lqe

Don't notify slaves of new limits when setting new
limits on pools. Do this only for lqes that
belong to the global pool.

The fix helps to avoid a case where slaves do not
apply a new limit because the slaves' data version is
greater than the one that comes from the MDT. This was
possible if different limits were set on a PQ many times.
After that, new limits from the global pool could not
be applied:

qsd_upd_schedule()) lustre-OST0000: discarding glb
update from glimpse ver:7 local ver:203

For details about the problem see "check indexes versions"
test in sanity-quota.sh.
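
The discard seen in the log line above can be modelled roughly as
follows. This is a sketch, not the qsd code: the function name is
hypothetical, and the exact comparison used by the slave is an
assumption.

```python
def accept_global_update(update_ver, local_ver):
    """Return True if a slave would apply an update carrying update_ver.

    Assumption: an update whose version is lower than the slave's local
    version is treated as stale and discarded.
    """
    return update_ver >= local_ver
```

With the versions from the log message (update version 7, local version
203) the update is rejected, which is the situation the fix avoids by not
bumping slave versions for pool-only limit changes.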

Add test 25 "check indexes versions" into sanity-quota.
Without the fix it reproduces the above problem.

Fix checkpatch so that it doesn't print "Invalid vsprintf pointer
extension" for %px.

HPE-bug-id: LUS-10705
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Lustre-change: https://review.whamcloud.com/c/fs/lustre-release/+/49239/
Lustre-commit: 513b1cdbca58913249eb524a37374c418fdec44f

Change-Id: Idb091a10894e9db9f67d215baef2926723d6c65d
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51551
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
23 months agoEX-7818 osc: don't do decomp. with compression disabled
Patrick Farrell [Tue, 4 Jul 2023 18:44:22 +0000 (14:44 -0400)]
EX-7818 osc: don't do decomp. with compression disabled

decompress_request likely has a significant performance
cost for checking if data is compressed, so we should not
call it when compression is disabled.

This is a stop-gap solution for the preview until we can
have the server tell the client if data is compressed, as
described in EX-7818.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iba8feba3ab0fe620d0594f59c2c6ddea25faeb4f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51563
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-7807 osc: don't discard decompress_request error
Artem Blagodarenko [Sat, 1 Jul 2023 22:13:20 +0000 (23:13 +0100)]
EX-7807 osc: don't discard decompress_request error

The error handling for decompress_request is unusual: non-zero
returns are just discarded, and rc2 is discarded as well.
The read() doesn't fail or return a short read.

Fix this so that the error is returned if decompression fails.

Reported-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Fixes: f43b9ce9af ("EX-6127 osc: osc brw request compression")
Change-Id: Idd01947c7375c9586a64f064dd6ee0ac2308ea86
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51531
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-7812 llite: use CIT_MISC in check_compression
Patrick Farrell [Mon, 3 Jul 2023 23:57:17 +0000 (19:57 -0400)]
EX-7812 llite: use CIT_MISC in check_compression

ll_mmap_check_compression uses a CIT_FAULT io type and
initializes it with cl_io_rw_init.  This doesn't
initialize the IO correctly and this sometimes results
in the following crash during io_fini, because the ft_page
pointer is initialized to something else by cl_io_rw_init:

[ 1512.276302] BUG: unable to handle kernel NULL pointer dereference at 000000000000004a
[ 1512.280778] RIP: 0010:cl_pagevec_put+0x9f/0x3a0 [obdclass]
[ 1512.288693]  vvp_io_fault_fini+0x21/0x40 [lustre]
[ 1512.289294]  cl_io_fini+0x7a/0x230 [obdclass]
[ 1512.289876]  ll_mmap_check_compression+0x403/0x540 [lustre]

Fixes: 95df962a92 ("EX-6265 llite: disable mmap on compressed files")

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ib1f270f071370d6c045bb9d799ab5b7b41a6c6be
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51553
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-6127 llite: fix chunk_bits usage in readahead
Sebastien Buisson [Thu, 29 Jun 2023 16:06:47 +0000 (18:06 +0200)]
EX-6127 llite: fix chunk_bits usage in readahead

For the minimum compression chunk size, chunk bits is zero,
so we cannot use if (chunk_bits) to determine if we're
doing compression.

This also fixes two other things:
1. A rounding error when rounding to chunk
2. Move rounding of end_idx to before first usage of
end_idx, so calculation of number of pages is correct
Without this, when the user reads 1 page or less, readahead
will calculate the readahead page count as 0 and will exit
without reading the chunk.
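
A worked example of rounding a page range to chunk boundaries before
counting pages is sketched below. It is not the llite readahead code:
the function names and the pages_per_chunk parameter are hypothetical,
and page indices are assumed to be inclusive.

```python
def round_to_chunk(start_idx, end_idx, pages_per_chunk):
    """Expand an inclusive page range so it covers whole chunks."""
    chunk_start = (start_idx // pages_per_chunk) * pages_per_chunk
    chunk_end = (end_idx // pages_per_chunk + 1) * pages_per_chunk - 1
    return chunk_start, chunk_end


def page_count(start_idx, end_idx):
    """Number of pages in an inclusive page range."""
    return end_idx - start_idx + 1
```

For a single-page read of page 5 with 16 pages per chunk, rounding first
gives the range 0..15 and a count of 16 pages, so the whole chunk is read.
Counting pages before rounding would use the unrounded range instead.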

Fixes: c05d5990f4 ("EX-6127 llite: getting stripe info optimization")
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I273506fd4f6ed5f0b8b5020357fd7caf0531e61c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51504
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-6856 utils: support 'lfs find --printf %LZ'
Bobi Jam [Wed, 28 Jun 2023 18:37:22 +0000 (02:37 +0800)]
EX-6856 utils: support 'lfs find --printf %LZ'

Add support for "lfs find --printf %LZ" to print the compression
type:level of the last instantiated component of a file.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Iaf1b6c031b06c70e7b5be51354697aa6bdcc9850
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51498
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoRM-620 build: New tag 2.14.0-ddn92
Andreas Dilger [Sat, 1 Jul 2023 23:06:17 +0000 (17:06 -0600)]
RM-620 build: New tag 2.14.0-ddn92

New tag 2.14.0-ddn92

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Icc54de2ab22bff722ec19c3293767706458f684b

23 months agoEX-6127 osc: add compression to can_merge_pages
Patrick Farrell [Fri, 30 Jun 2023 20:40:51 +0000 (16:40 -0400)]
EX-6127 osc: add compression to can_merge_pages

Some BRW flags are OK to have on only some pages in a BRW,
others are not.  can_merge_pages has a whitelist of the
flags which are safe to have on only some pages in a BRW,
and prints a warning if other flags are seen.

Add compression to the white list, because while all pages
in an niobuf must be compressed, it is normal to have only
some pages in a BRW compressed.

Prior to this patch, this warning was printed during
normal usage of compression.
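
The whitelist idea can be sketched as a bitmask check. The flag names
and values below are placeholders, not the real BRW flags, and the
helper is hypothetical; it only shows how adding a flag to the safe mask
silences the warning for that flag.

```python
# Placeholder flag bits; the real BRW flag names and values are not
# given in this log.
FLAG_A = 0x01
FLAG_B = 0x02
FLAG_COMPRESSED = 0x04
FLAG_OTHER = 0x08

# Flags that may legitimately differ between pages of one BRW.
MERGE_SAFE = FLAG_A | FLAG_B | FLAG_COMPRESSED


def unexpected_flag_diff(flags_a, flags_b, safe_mask=MERGE_SAFE):
    """Return the bits that differ between two pages and are not whitelisted.

    A non-zero result is the case in which a warning would be printed.
    """
    return (flags_a ^ flags_b) & ~safe_mask
```

Passing a mask without FLAG_COMPRESSED reproduces the behaviour before
this patch: a compressed page next to an uncompressed one is reported.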

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ia262d4fc878e5328bd956865047e997aa77946f0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51528
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-6206 lustre: add lgzip kernel module
Sebastien Buisson [Fri, 23 Jun 2023 11:58:39 +0000 (13:58 +0200)]
EX-6206 lustre: add lgzip kernel module

lgzip kernel module implements compression according to the
deflate/zlib algorithm, through the kernel Crypto API.
It provides 2 cipher drivers under the generic name 'deflate':
* deflate-lustre-generic of type compression
* deflate-lustre-scomp of type scomp

Note the 'deflate' name is identical to the in-kernel module, but
lgzip registers it with a slightly higher .cra_priority = 110, so that
it is preferred over the in-kernel module.
Our 'deflate' is also different in that it accepts a compression level
as explained below.

lgzip kernel module sources are copied from linux v6.2-rc5 and renamed
to gzip.c to avoid name collisions. It implements the Crypto API
interface, and relies on the deflate/zlib kernel library for compression
implementation. It has been modified to grok a compression level, as
read from the top 4 bits of the crypto_tfm flags, and pass it to the
underlying library.
The deflate/zlib library sources are also copied from linux v6.2-rc5
and built statically. Headers have also been copied from linux
v6.2-rc5 for consistency, and source files modified to include the
copied headers instead of the system headers.
All aforementioned sources are located in the lustre/gzip directory.
The lgzip module is always built with Lustre.
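
The level encoding mentioned above (top 4 bits of the crypto_tfm flags)
can be illustrated with plain bit arithmetic. The sketch assumes a
32-bit flags word; the constant and function names are hypothetical and
do not come from the lgzip sources.

```python
FLAGS_BITS = 32                        # assumed width of the flags word
LEVEL_BITS = 4                         # "top 4 bits" per the commit message
LEVEL_SHIFT = FLAGS_BITS - LEVEL_BITS  # 28
LEVEL_MASK = ((1 << LEVEL_BITS) - 1) << LEVEL_SHIFT
WORD_MASK = (1 << FLAGS_BITS) - 1


def set_level(flags, level):
    """Store a compression level in the top 4 bits, keeping the other bits."""
    if not 0 <= level < (1 << LEVEL_BITS):
        raise ValueError("level does not fit in 4 bits")
    return (flags & ~LEVEL_MASK & WORD_MASK) | (level << LEVEL_SHIFT)


def get_level(flags):
    """Read the compression level back from the top 4 bits."""
    return (flags & LEVEL_MASK) >> LEVEL_SHIFT
```

Four bits allow levels 0 through 15, and the lower 28 bits stay available
for other flags.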

This patch enhances the test kernel module kcompr.ko to exercise the
compression level of the provided 'deflate' module.
It also tries to manually load the lgzip kernel module when a file
requires compression with the 'deflate' alg. This is a "one-shot" try
that allows us to prefer our module that has level support, but
continues to at least compress/decompress files even if our own module
is not available.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I2b0386457bc781d91816172dea6b52ce3dd273f4
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51422
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-7784 tests: skip sanity 460 on 64k PAGE_SIZE
Artem Blagodarenko [Thu, 29 Jun 2023 13:49:11 +0000 (14:49 +0100)]
EX-7784 tests: skip sanity 460 on 64k PAGE_SIZE

Skip this test on the system with 64k PAGE_SIZE.

Test-Parameters: trivial testlist=sanity clientdistro=el8.7 clientarch=aarch64
Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: I9324ae975bfc3c4f08d5048d7c977447ef62cc78
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51501
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoLU-14301 revert: client: use EOPNOTSUPP instead of ENOTSUPP
Andreas Dilger [Sat, 1 Jul 2023 10:22:53 +0000 (04:22 -0600)]
LU-14301 revert: client: use EOPNOTSUPP instead of ENOTSUPP

This reverts commit 3deafa8a39a964c67f533173a5b2f90d0d2d7730.
Accidentally landed and is likely to cause problems.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-7386 build: Update lipe tag to 2.24
Andreas Dilger [Sat, 1 Jul 2023 10:09:49 +0000 (04:09 -0600)]
EX-7386 build: Update lipe tag to 2.24

New version tag.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I711231b7ec063ed9bf8a0976e18360bebdc84d3b

23 months agoLU-14301 client: use EOPNOTSUPP instead of ENOTSUPP
Andreas Dilger [Fri, 30 Jun 2023 00:19:06 +0000 (18:19 -0600)]
LU-14301 client: use EOPNOTSUPP instead of ENOTSUPP

Don't return NFS-specific error code ENOTSUPP back to userspace,
instead use EOPNOTSUPP.

Lustre-change: https://review.whamcloud.com/51511
Lustre-commit: TBD (from 1a0d553000d5a869f9039bab74dbdbb20d4259b0)

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iabd07b31069737e8ee7ca2382fd8cff6143ebbe5

23 months agoEX-6265 llite: allow mmap reads of compressed files
Patrick Farrell [Thu, 29 Jun 2023 17:52:01 +0000 (13:52 -0400)]
EX-6265 llite: allow mmap reads of compressed files

mmap reads of compressed files work, so we should only
block writes.

We cannot block the actual fault operations because that
will cause the application to get a SIGBUS, so we check
the file open mode when we go to create the memory mapping.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I636f398fd247ddcd153f94bf8116440540e8469c
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51506
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Qian Yingjin <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-6265 llite: disable mmap on compressed files
Alex Zhuravlev [Thu, 22 Jun 2023 08:36:51 +0000 (11:36 +0300)]
EX-6265 llite: disable mmap on compressed files

disable mmap(2) on compressed files until well tested.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I3464a03b16708edcd0692bc9db337eb8473ea047
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51413
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoLU-16399 tests: add subtests setup/cleanup records
Alex Deiter [Mon, 9 Jan 2023 16:19:26 +0000 (20:19 +0400)]
LU-16399 tests: add subtests setup/cleanup records

* Added setup/cleanup records for subtests

Lustre-change: https://review.whamcloud.com/49582
Lustre-commit: d0ae0079d747d05f74f733fb594d8edb512f8b16

Change-Id: Icb203a864fa8785e423a073b4ee0f02ea3d3ac77
Signed-off-by: Alex Deiter <adeiter@tintri.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51406
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoLU-16836 lnet: ensure dev notification on lnd startup
Serguei Smirnov [Fri, 19 May 2023 02:12:19 +0000 (19:12 -0700)]
LU-16836 lnet: ensure dev notification on lnd startup

Look up device and link state on lnd startup so that
the initial NI state may be set properly.

Reduce code duplication by adding lnet_set_link_fatal_state() and
lnet_get_link_status() functions which are shared across LNDs.
LND-specific versions of these are removed.

This fixes the issue with adding LNet NI using an interface with
cable unplugged which results in the NI state initialized as "up".

Lustre-change: https://review.whamcloud.com/51057
Lustre-commit: 09c6e2b872287c847d15620788f6cf50b3a9f30b

Fixes: c4df48116d ("LU-16563 lnet: use discovered ni status")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I16084092cc21a4e42dfef4624adfbf57eb4fdecb
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51310
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-7509 tests: enable neterror logging in sanity-benchmark/iozone
Jian Yu [Wed, 28 Jun 2023 09:36:19 +0000 (17:36 +0800)]
EX-7509 tests: enable neterror logging in sanity-benchmark/iozone

This patch enables LNet error logging in sanity-benchmark test_iozone()
to gather network errors when connection issues occur.

Test-Parameters: trivial env=SLOW=yes,ENABLE_QUOTA=yes \
 clientdistro=el8.6 serverdistro=el7.9 testlist=sanity-benchmark

Change-Id: I398779abc95525fe5579fc7505e6e6221c32bf90
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51483
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoLU-8582 target: send error reply on wrong opcode
Li Xi [Tue, 21 Jun 2022 12:06:22 +0000 (20:06 +0800)]
LU-8582 target: send error reply on wrong opcode

An unknown opcode does not necessarily mean an insane client. A new client
might send RPCs with new opcodes to an old server. The client might
then be stuck waiting for a reply. So, send an error back
when an RPC has a wrong opcode.

This patch returns EOPNOTSUPP to the client instead of blocking. ENOTSUPP
is not used here since strerror() does not understand ENOTSUPP.

OBD_FAIL_OST_OPCODE=0x253 is added for fault injection test of opcode.
To test whether an invalid opcode is handled properly on OST, use the
following command:

    lctl set_param fail_val=${opcode} fail_loc=0x253

Lustre-change: https://review.whamcloud.com/47761
Lustre-commit: 03c1ddf19c83891683e1726f240a2449941e8b22

Change-Id: I46ca62bc532b92368e06a4f883b102c7186c453c
Signed-off-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51513
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months agoLU-14301 lustre: add ENOTSUPP to spelling.txt
John L. Hammond [Wed, 20 Jan 2021 15:22:54 +0000 (09:22 -0600)]
LU-14301 lustre: add ENOTSUPP to spelling.txt

Add a spelling check for ENOTSUPP to suggest use of EOPNOTSUPP
instead. Note:

ENOTSUPP (524) is defined only in the kernel errno.h and is an
NFSv3-specific errno. If ENOTSUPP is returned to userspace then strerror()
will print "Unknown error 524".

EOPNOTSUPP (95) is defined in kernel and userspace errno.h.

ENOTSUP is defined in userspace errno.h as an alias for EOPNOTSUPP.
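
The difference is easy to observe from userspace. The short Python
sketch below looks up both values through the standard errno and os
modules; the constant ENOTSUPP and the describe() helper are defined
locally for illustration, since 524 has no userspace name. The message
text returned by os.strerror() depends on the C library.

```python
import errno
import os

ENOTSUPP = 524  # kernel-internal value quoted above; not in userspace errno.h


def describe(err):
    """Format an errno value with its symbolic name (if any) and message."""
    name = errno.errorcode.get(err, "unknown")
    return f"{err} ({name}): {os.strerror(err)}"


print(describe(errno.EOPNOTSUPP))
print(describe(ENOTSUPP))
```

The second line has no symbolic name to show, which is why returning
EOPNOTSUPP gives applications a usable error message.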

Lustre-change: https://review.whamcloud.com/41280
Lustre-commit: e00733f0f87659c936039a58ea738cfb070638bc

Test-Parameters: trivial
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I13b0389c9ec0853f43d8ab4a8f6538eb24c8a2ad
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51512
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
23 months agoEX-7683 man: document supported compression types
Sebastien Buisson [Wed, 28 Jun 2023 15:05:18 +0000 (17:05 +0200)]
EX-7683 man: document supported compression types

Update lfs-setstripe to mention the currently supported compression
types: gzip, lz4, lz4fast and lzo.

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9b587516d27b882bec5855da4948f489e5a0041f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51485
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-7775 utils: fix cp_comp_type size
Sebastien Buisson [Wed, 28 Jun 2023 09:34:22 +0000 (09:34 +0000)]
EX-7775 utils: fix cp_comp_type size

cp_comp_type should be 8 bits, as llch_compr_type and all associated
variables are declared as u8.
So remove useless cp_comp_enabled and fix code to test for compressed
component with cp_comp_type against LL_COMPR_TYPE_NONE.
And update LL_COMPR_TYPE_MAX value to 255 to avoid conflicts with
future compression types.

Fixes: f43b9ce9af ("EX-6127 osc: osc brw request compression")
Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ia15868ac0ac003b62942540a57f782226ae8c141
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51481
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-7775 utils: fix LL_COMPR_TYPE wire checks
Sebastien Buisson [Tue, 27 Jun 2023 14:42:20 +0000 (16:42 +0200)]
EX-7775 utils: fix LL_COMPR_TYPE wire checks

Fix LL_COMPR_TYPE* wire checks to avoid duplication.
The value of LL_COMPR_TYPE_UNCHANGED is also declared as 15 so that
it does not conflict with other potential real compression types in
the future.

Fixes: 67d4601737 ("EX-6249 csdc: set compress component for file")
Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iab9830f09f0778e1e1f3b1ea4c9878ce1017de8d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51473
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-7683 utils: update handling of compr level for lz4
Sebastien Buisson [Tue, 20 Jun 2023 15:49:14 +0000 (15:49 +0000)]
EX-7683 utils: update handling of compr level for lz4

The way lz4 compression level or acceleration factor is handled needs
to be adapted in order to match what is provided by the lz4 userspace
tool:
- any level between 0 and 2 is interpreted as the default lz4
  acceleration factor of 1;
- any level from 3 up to 16 is interpreted as a compression level
  for internal lz4hc. Increasing the compression level trades CPU
  time for improved compression ratio;
- acceleration factor can be specified for the lz4fast compression
  type, from 1 to 26. This acceleration factor trades compression
  ratio for faster speed.
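
The rules above can be summarised as a small mapping function. This is a
sketch of the described behaviour only: the function name, the returned
tuple layout and the error handling are hypothetical, while the numeric
ranges are taken from the list above.

```python
def interpret_lz4_level(comp_type, level):
    """Map a (type, level) pair to (backend, parameter kind, value)."""
    if comp_type == "lz4":
        if 0 <= level <= 2:
            # Levels 0..2 mean the default lz4 acceleration factor of 1.
            return ("lz4", "acceleration", 1)
        if 3 <= level <= 16:
            # Levels 3..16 select lz4hc with that compression level.
            return ("lz4hc", "level", level)
        raise ValueError("lz4 level out of range")
    if comp_type == "lz4fast":
        if 1 <= level <= 26:
            # lz4fast takes an acceleration factor from 1 to 26.
            return ("lz4", "acceleration", level)
        raise ValueError("lz4fast acceleration out of range")
    raise ValueError("not an lz4 compression type")
```

For example, interpret_lz4_level("lz4", 9) selects lz4hc at level 9, while
interpret_lz4_level("lz4fast", 9) selects plain lz4 with acceleration 9.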

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I4711217c1a6601f29f78d262567da5998f657fc9
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51380
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-7715 utils: error for invalid compression level
Sebastien Buisson [Wed, 21 Jun 2023 12:00:07 +0000 (12:00 +0000)]
EX-7715 utils: error for invalid compression level

Not all compression types accept a compression level (e.g. lzo in its
kernel implementation). For those, return an error if a compression
level is provided on 'lfs setstripe' command-line.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I6c367c4bfd76cc81c890a89ba9f994f4fd9f4f80
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51394
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-6931 tests: add version check in sanity test_904
Alena Nikitenko [Wed, 28 Jun 2023 13:28:44 +0000 (16:28 +0300)]
EX-6931 tests: add version check in sanity test_904

The patch that added hide virtual projid xattr functionality
to b_es6_0 branch was submitted under a tag of 2.14.0-ddn55.
In case of interop testing with older servers test 904 fails.
At the same time the rest of the test should still run on
older servers, so I moved parts of the code related to hide
virtual projid xattr functionality under version check
condition.

Sanity test 904 was modified.

Fixes: 8acf413647 ("LU-15548 osd-ldiskfs: hide virtual projid xattr")
Test-Parameters: trivial testlist=sanity env=ONLY="904"
Test-Parameters: testlist=sanity env=ONLY="904" serverversion=2.14.0-ddn54
Signed-off-by: Alena Nikitenko <anikitenko@ddn.com>
Change-Id: Ie3bc67cb8a7a83954f3b2048f009f84ab77bf53d
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51484
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-7752 test: hot-pools test_59: not expected flag 'stale'
Bobi Jam [Thu, 29 Jun 2023 15:01:35 +0000 (23:01 +0800)]
EX-7752 test: hot-pools test_59: not expected flag 'stale'

Make the 1st mirror preferable for write for the test purpose.

Test-Parameters: trivial env=ONLY=59 testlist=hot-pools
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I06c33badfc10ff9743df83a6e421e27afc9b6dbd
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51502
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoRM-620 build: New tag 2.14.0-ddn91
Andreas Dilger [Sat, 24 Jun 2023 18:15:33 +0000 (12:15 -0600)]
RM-620 build: New tag 2.14.0-ddn91

New tag 2.14.0-ddn91

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8634b3650bf43aadaf303e48b9ebe2931e6af6ab

23 months agoLU-16923 kernel: update RHEL 8.8 [4.18.0-477.13.1.el8_8]
Jian Yu [Fri, 23 Jun 2023 03:10:35 +0000 (11:10 +0800)]
LU-16923 kernel: update RHEL 8.8 [4.18.0-477.13.1.el8_8]

Update RHEL 8.8 kernel to 4.18.0-477.13.1.el8_8.

Lustre-change: https://review.whamcloud.com/51411
Lustre-commit: TBD (from a04c2ff950a2a6b65e55184113b1513fc9bf0058)

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.8 serverdistro=el8.8 testlist=sanity

Change-Id: I23c6439aedd6f8e9473ddb629ff7e01c50d9c8fc
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51420
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-7680 llite: skip hole/data lseek() on compressed file
Mikhail Pershin [Fri, 23 Jun 2023 22:21:28 +0000 (01:21 +0300)]
EX-7680 llite: skip hole/data lseek() on compressed file

Do not execute a real lseek() on a compressed file with the
SEEK_HOLE/SEEK_DATA origin; instead always treat the file as
non-sparse and do a generic lseek() only.
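
As a rough illustration (a sketch, not the llite code), the generic
behaviour for a file that is treated as having no holes can be modelled
as below. The function name is hypothetical, and the SEEK_DATA/SEEK_HOLE
values are the Linux ones, hard-coded as an assumption of this example.

```python
import errno

SEEK_DATA = 3  # Linux value, hard-coded so the sketch is self-contained
SEEK_HOLE = 4  # Linux value


def nonsparse_lseek(offset, whence, file_size):
    """Model SEEK_DATA/SEEK_HOLE on a file with no holes.

    Returns the new position, or a negative errno on failure.
    """
    if whence not in (SEEK_DATA, SEEK_HOLE):
        raise ValueError("only SEEK_DATA and SEEK_HOLE are modelled")
    # Offsets outside [0, file_size) have neither data nor a hole to find.
    if offset < 0 or offset >= file_size:
        return -errno.ENXIO
    if whence == SEEK_DATA:
        return offset      # everything before EOF counts as data
    return file_size       # the only "hole" is the implicit one at EOF
```

With this model SEEK_DATA leaves the position unchanged and SEEK_HOLE
always lands at end of file, regardless of how the data is stored.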

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I2cd587cc0205e85758e06bbaafafe0e2959e0ade
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51429
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoLU-15671 mds: do not send OST_CREATE transno interop
Andreas Dilger [Thu, 18 May 2023 21:41:47 +0000 (15:41 -0600)]
LU-15671 mds: do not send OST_CREATE transno interop

Send OST_CREATE RPCs from the MDS with no_resend and no_delay
when communicating with an old OST that does not support the
OBD_CONNECT2_REPLAY_RESEND.  Likewise, the OST should not reply
to the MDS RPC with rq_transno set, or this will trigger:

   osp_precreate_send() ASSERTION(req->rq_transno == 0) failed

This can be avoided if the MDS is upgraded before the OSS, but
will always be hit if OSS is upgraded first.

After 2.20.53 the MDS/OSS assume that this is always true, since
rolling upgrades are unsupported for larger version differences.

Lustre-change: https://review.whamcloud.com/51056
Lustre-commit: 9ee1281060d0a00a9c5d715a9a6d9b99c27123ff

Test-Parameters: testgroup=rolling-upgrade-oss
Fixes: 63e17799a3 ("LU-8367 osp: enable replay for precreation request")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I1ab601a2f55540dd75cf24838f7cdb7f823ed42c
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51425
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months agoEX-6127 llite: getting stripe info optimization
Artem Blagodarenko [Wed, 14 Jun 2023 21:10:28 +0000 (22:10 +0100)]
EX-6127 llite: getting stripe info optimization

ll_lov_getstripe_ea_info() is an expensive call and should be avoided
if possible. Use the cached chunk size rather than getting it from
the stripe info every time.

Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: Id08487ec782f797e242e3f673c4a4dd8d526c9cc
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51321
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-7601 llite: restrict overwrites during preview
Patrick Farrell [Tue, 13 Jun 2023 17:26:09 +0000 (13:26 -0400)]
EX-7601 llite: restrict overwrites during preview

EX-7601 is an issue where when modifying a compressed file
we do not correctly read-up and re-write existing
compressed data.

To avoid this, writes which are not aligned to the
compression chunk size are allowed only when they do not
overwrite existing data, i.e. when they extend the
file.

This returns EINVAL for all writes to compressed files
which are not either chunk aligned or extending the file.
This should prevent users from hitting the data corruption
issue but still allows some basic usage.  This is intended
just for the preview period.
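
The rule can be sketched as follows. This is only an illustration under
stated assumptions, not the llite implementation: the function name is
hypothetical, "chunk aligned" is taken to mean that both the start and
the end of the write fall on a chunk boundary, and "extending the file"
is taken to mean that the write starts at or beyond the current file
size.

```python
import errno


def check_compressed_write(offset, length, file_size, chunk_size):
    """Return 0 if the write is allowed, -EINVAL otherwise."""
    if offset < 0 or length <= 0 or chunk_size <= 0:
        raise ValueError("offset must be >= 0, length and chunk_size > 0")
    end = offset + length
    # Assumption: aligned means both ends sit on a chunk boundary.
    aligned = offset % chunk_size == 0 and end % chunk_size == 0
    # Assumption: extending means no existing data is overwritten.
    extending = offset >= file_size
    if aligned or extending:
        return 0
    return -errno.EINVAL
```

For example, with a 64KiB chunk size a 10-byte write at offset 100 of a
200000-byte file is rejected, while the same write at offset 200000 is
accepted because it only appends.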

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I0eea01e2249866a074afd0d0642fe6dce9a49664
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51259
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-6127 llite: DIO fallback on compressed files
Patrick Farrell [Fri, 2 Jun 2023 02:24:15 +0000 (22:24 -0400)]
EX-6127 llite: DIO fallback on compressed files

Fully supporting direct I/O on compressed files is tricky
because we cannot pull the full chunk in to the page cache
(because there is no page cache for DIO).

So instead we fall back to buffered I/O for DIO on
compressed files.

This patch adds the check and a test for this.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I8224ef9b8ad1d912d8a11eccad37d3dff8dd8498
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51200
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-7528 lov: fallocate is not allowed for compressed files
Artem Blagodarenko [Fri, 19 May 2023 09:27:36 +0000 (10:27 +0100)]
EX-7528 lov: fallocate is not allowed for compressed files

Client Side Data Compression allocates blocks after a compression.
It is impossible to preallocate blocks for the whole file, so
fallocate should be disabled in case of compression.

Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: Ie834ace183fdcec0d7d6f747237e0964c3c4120b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51059
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Vitaliy Kuznetsov <vkuznetsov@ddn.com>
23 months agoEX-6127 osc: osc brw request compression
Artem Blagodarenko [Wed, 30 Nov 2022 14:54:57 +0000 (14:54 +0000)]
EX-6127 osc: osc brw request compression

This patch adds client-side compression/decompression.

The client-side data compression project (CSDC) reduces storage and network
utilization by leveraging the more plentiful memory and CPU resources on the
local client. Data is sent compressed over the network, saved directly to
storage on the server side, and decompressed back on the client side.
Uncompressed data is kept in client page cache, all while being functionally
transparent to the end user and application.

As an example, a test file is compressed and decompressed.
The resulting file is compared with the original one.

The test case shows 2.5x compression ratio:
356K /mnt/lustre/d460.sanity/sanity.sh
884K /tmp/cmp-46ofie/decompressed_sanity.sh

Compression must read whole chunks even if the requested
offset and size are not chunk-aligned.

Modify readahead to force reading data with an offset
and size that are multiples of the chunk size.
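
The alignment step amounts to rounding the start down and the end up to
a chunk boundary. A minimal sketch, assuming byte offsets and a
hypothetical helper name (the real readahead code works on pages and is
not shown here):

```python
def align_to_chunks(offset, size, chunk_size):
    """Expand [offset, offset + size) to whole chunks.

    Returns the aligned (offset, size) pair.
    """
    if offset < 0 or size <= 0 or chunk_size <= 0:
        raise ValueError("offset must be >= 0, size and chunk_size > 0")
    start = (offset // chunk_size) * chunk_size
    # Ceiling division, so a partially covered last chunk is read in full.
    end = -(-(offset + size) // chunk_size) * chunk_size
    return start, end - start
```

A request that straddles a chunk boundary by even one byte is expanded
to cover both chunks.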

Test-Parameters: testlist=sanity env=ONLY=460
Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: I9b41ab815db3df9ad7bdea5fca4c093cbda8814b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/49511
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoLU-16837 csdc: reserve connect bits for compressed layout
Bobi Jam [Wed, 24 May 2023 00:25:05 +0000 (08:25 +0800)]
LU-16837 csdc: reserve connect bits for compressed layout

Add connect data bit for compressed layout (OBD_CONNECT2_COMPRESS)
and another connect data bit to be used (OBD_CONNECT2_LARGE_NID).

Also reserve obd_connect_data::ocd_compr_type which is a bitmask of
supported compression type to be negotiated between client and MDS.

Lustre-change: https://review.whamcloud.com/51108
Lustre-commit: 83189aef3b23f18cb8c1deae34994a00f8582039

Test-Parameters: trivial
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I21029c6c3e8a7e690ecc8d489bbb95aec3ab1fa8
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51409
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months agoLU-1904 idl: add checks for OBD_CONNECT flags
Andreas Dilger [Fri, 28 May 2021 08:49:19 +0000 (02:49 -0600)]
LU-1904 idl: add checks for OBD_CONNECT flags

Make it harder to accidentally declare OBD_CONNECT flags without
properly defining their names.  Otherwise, this can cause serious
compatibility problems if two features are using the same flag.

Add the definition lines into spelling.txt so there is *always*
a warning generated, since this always needs proper attention.

Make it clear whom to contact when reserving a new feature flag.

Lustre-change: https://review.whamcloud.com/48053
Lustre-commit: d851381ea6947244842ae6f138cd0bfd399b7ef4

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9a5e2c97c40c39ea57d20979d4b130854edc785a
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51408
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months agoRM-620 build: New tag 2.14.0-ddn90
Andreas Dilger [Fri, 23 Jun 2023 08:01:51 +0000 (02:01 -0600)]
RM-620 build: New tag 2.14.0-ddn90

New tag 2.14.0-ddn90

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8a73e4233db5fc835eb33849e0508204417e7429

23 months agoEX-7593 csdc: don't set compression layout when disabled
Bobi Jam [Thu, 8 Jun 2023 03:42:46 +0000 (11:42 +0800)]
EX-7593 csdc: don't set compression layout when disabled

When llite_enable_compression is disabled
(lctl set_param llite.*.enable_compression=0), we should check
this before sending a compression layout to the MDS, lest we get
a file with a compressed component which we cannot handle.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ib1e2123ffdb239c3e1401d682ae9c2c49e3f4a6f
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51250
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-7714 utils: logical AND for 'lfs find' compression exprs
Sebastien Buisson [Wed, 21 Jun 2023 11:07:36 +0000 (11:07 +0000)]
EX-7714 utils: logical AND for 'lfs find' compression exprs

All search expressions provided to 'lfs find' must be combined as a
logical AND. Fix newly added options for compression support, so that
they comply with this logical AND.

Fixes: 093bd2f343 ("EX-6856 utils: add 'lfs find' support for compressed file")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I3b28cd87c1d304df6d04753b413d46f5abcfe16e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51393
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoLU-16778 tests: sanity-quota_75 fix
Sergey Cheremencev [Tue, 30 May 2023 08:14:47 +0000 (11:14 +0300)]
LU-16778 tests: sanity-quota_75 fix

Change conv=fsync to oflag=direct to avoid
cache write.

Lustre-change: https://review.whamcloud.com/51158
Lustre-commit: TBD (from b1c5e39335820602abeecbb91a3afb86879f84f2)

Test-Parameters: trivial testlist=sanity-quota env=ONLY=75,ONLY_REPEAT=100
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: Iff04bac63f772dc2d0d0ad765d210b2539fbe33e
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51407
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoLU-16922 kernel: update RHEL 9.2 [5.14.0-284.18.1.el9_2]
Jian Yu [Thu, 22 Jun 2023 14:56:23 +0000 (22:56 +0800)]
LU-16922 kernel: update RHEL 9.2 [5.14.0-284.18.1.el9_2]

Update RHEL 9.2 kernel to 5.14.0-284.18.1.el9_2 for Lustre client.

Lustre-change: https://review.whamcloud.com/51410
Lustre-commit: TBD (from 6111eb7a418fb4396e0451d0613af43870951f72)

Test-Parameters: trivial env=SANITY_EXCEPT=27J clientdistro=el9.2 testlist=sanity
Test-Parameters: trivial env=SANITY_EXCEPT=27J clientdistro=el9.2 serverdistro=el8.8 testlist=sanity

Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: Ifa8f13200550e5f473b7d7d641155e349c453c03
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51416
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-7596 tests: don't fail if metadata_csum_seed unset
Andreas Dilger [Fri, 23 Jun 2023 01:05:20 +0000 (19:05 -0600)]
EX-7596 tests: don't fail if metadata_csum_seed unset

Fix logic in the sanity-pcc cache filesystem setup.  mke2fs
1.47.0-wc1 enabled metadata_csum_seed unconditionally, but this
caused problems on el7.9 kernels. 1.47.0-wc2 disabled that
feature, which caused the check for removing the feature to fail.

Since metadata_csum_seed has been available since 1.44.2 it
shouldn't be a problem to force it off during mke2fs.

Test-Parameters: trivial testlist=sanity-pcc env=ONLY=1
Fixes: 7d35dd13273e7 ("EX-7596 tests: disable metadata_csum_seed for pcc cache device")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic04ab4043981dc9b5c32e01c4aa85be343e3f3f8
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51417
Reviewed-by: Feng Lei <flei@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
23 months agoLU-16888 gss: fix ptlrpc_gss automatic loading
Sebastien Buisson [Fri, 9 Jun 2023 12:50:25 +0000 (14:50 +0200)]
LU-16888 gss: fix ptlrpc_gss automatic loading

ptlrpc_gss kernel module is automatically loaded when a GSS security
flavor is enforced. Loading success is recorded in a static variable
in the ptlrpc module, which prevents further reloading in case
ptlrpc_gss is unloaded while keeping ptlrpc loaded.
Get rid of this static variable as it is not required in order to
avoid calling request_module("ptlrpc_gss") when not needed. Indeed,
once loaded, the static array policies[] has an entry at the
SPTLRPC_POLICY_GSS index, indicating that the ptlrpc_gss module is
loaded.
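
The idea can be sketched as below. This is a simplified model, not the
ptlrpc code: the index value, the table size, the helper name and the
request_module callback are all assumptions made for the example.

```python
SPTLRPC_POLICY_GSS = 2            # hypothetical index for this sketch
policies = [None, None, None]     # stands in for the static policies[]


def get_gss_policy(request_module):
    """Return the GSS policy, loading the module only when needed."""
    # The table entry itself records whether ptlrpc_gss is loaded, so
    # no separate "already loaded" flag can go stale after an unload.
    if policies[SPTLRPC_POLICY_GSS] is None:
        request_module("ptlrpc_gss")
    return policies[SPTLRPC_POLICY_GSS]
```

Because unloading the module clears its table entry, the next lookup
triggers a reload, whereas a separate static flag would have suppressed
it.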

Lustre-change: https://review.whamcloud.com/51264
Lustre-commit: b80d6defb7b018250ef4fafccff7c980aed6a444

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9bb100a202fe9c3fc455a2ffba6ee6398e19b9bf
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51374
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months agoLU-16772 quota: access lqe_glbl_data only under mutex
Sergey Cheremencev [Mon, 19 Jun 2023 14:55:09 +0000 (18:55 +0400)]
LU-16772 quota: access lqe_glbl_data only under mutex

Hold mutex to protect lqe_glbl_data against freeing in
qmt_lvbo_update, qmt_alloc_lock_array and qmt_setup_id_desc.
This patch should help against the panic below (LU-14434) and similar ones.

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
  RIP: 0010:qmt_id_lock_cb+0x69/0x100 [lquota]
  qmt_id_lock_cb+0x69/0x100 [lquota]
  qmt_glimpse_lock.isra.19+0x27e/0xfb0 [lquota]
  qmt_reba_thread+0x5da/0x9b0 [lquota]
  kthread+0x112/0x130
  ret_from_fork+0x35/0x40

This is the 2nd part of 50ff4d1da63e8. The latest patchset of
https://review.whamcloud.com/c/fs/lustre-release/+/50748/ has extra
changes, but was accidentally landed without inspection,
so this is not a simple port from upstream.

Fixes: 50ff4d1da63 ("LU-16772 quota: protect lqe_glbl_data in qmt_site_recalc_cb")
Signed-off-by: Sergey Cheremencev <scherementsev@ddn.com>
Change-Id: I5753ace94a739c6df70dd8ea3bde828e2b5ed812
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51368
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoLU-16804 tests: rename 'complete' to 'complete_test'
Andreas Dilger [Tue, 20 Jun 2023 18:57:56 +0000 (12:57 -0600)]
LU-16804 tests: rename 'complete' to 'complete_test'

The test-framework.sh "complete" function conflicts with "complete"
exported from bash_completion, and this causes lustre-initialization
to fail in some configurations now that the lustre test config
is loaded earlier during test-framework.sh init_test_env() setup.

Rename "complete" to "complete_test" to avoid this conflict.

Lustre-change: https://review.whamcloud.com/51383
Lustre-commit: TBD (from 9bf8d1944995e7bf627a602aa1d0523d810c84b6)

Test-Parameters: trivial
Fixes: fdbb2bc849 ("LU-16804 tests: load CONFIG at beginning of init_test_env")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic72d8d5cc4a65feec6bfb2a76ac5f9b9d78e3f75
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51389
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Charlie Olmstead <charlie@whamcloud.com>
23 months agoLU-16587 utils: give lfs migrate a larger buffer
Nathan Rutman [Wed, 22 Feb 2023 22:34:09 +0000 (14:34 -0800)]
LU-16587 utils: give lfs migrate a larger buffer

lfs migrate is slow because it mostly uses a small 1MB buffer. Bigify.

[root@kjlmo4n00 16G]# time lfs migrate -S 1M -p flash 16G.1
real 0m25.341s
[root@kjlmo4n00 16G]# time /root/tools/lfs_nzr migrate -S 1M -p flash 16G.1
real 0m6.526s

Lustre-change: https://review.whamcloud.com/50118
Lustre-commit: 23224e03dc30c89dd449de5a7fe99b0bd3aca495

Signed-off-by: Nathan Rutman <nathan.rutman@hpe.com>
Change-Id: I850ca475fcd0efe2d71d26e4d1544f462c60252a
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51352
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months agoEX-6510 csdc: prefer uncompressed mirror for write
Bobi Jam [Fri, 12 May 2023 03:24:35 +0000 (11:24 +0800)]
EX-6510 csdc: prefer uncompressed mirror for write

When writing to mirrored files with both compressed and uncompressed
mirrors, prefer writing to the uncompressed components, as that is
better for performance, more compatible with older clients, and better
fits the model of compressing files after initial write.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I62a117d5cc3d34e2c0c96d1a9ade8eef0a2d1291
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/50974
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
23 months agoEX-6261 ptlrpc: extend sec bulk functionality
Artem Blagodarenko [Tue, 11 Apr 2023 14:04:49 +0000 (15:04 +0100)]
EX-6261 ptlrpc: extend sec bulk functionality

Client Side Data Compression (CSDC) needs a buffer pool to work
efficiently. Encryption uses ptlrpc sec bulk, but it works with pages.

This patch extends sec bulk functionality to allocate different
size buffers. Memory shrinking and other useful features
should still work as expected.

Signed-off-by: Artem Blagodarenko <ablagodarenko@ddn.com>
Change-Id: I929b4dfdcb0e8197f3804629b000af0d4bd6f2a0
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/50616
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoRM-620 build: New tag 2.14.0-ddn89
Andreas Dilger [Sat, 17 Jun 2023 05:47:21 +0000 (23:47 -0600)]
RM-620 build: New tag 2.14.0-ddn89

New tag 2.14.0-ddn89

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id5516daccbef1da2f342afd79e912bc91bc9da13

2 years agoLU-15046 osp: precreate thread vs connect race
Alex Zhuravlev [Thu, 30 Sep 2021 12:16:57 +0000 (15:16 +0300)]
LU-15046 osp: precreate thread vs connect race

lcs_exp (required for fid client) was initialized in osp_obd_connect()
which races with osp_precreate_thread(). the latter can get stuck if
lcs_exp is not initialized and then the whole precreation logic is
blocked until remount. instead the precreation thread can simply wait
until lcs_exp is initialized properly before proceeding.

Lustre-change: https://review.whamcloud.com/45099
Lustre-commit: 7e0a2b073701e23f6c941d249e034abe1043ccd6
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I7a42bf4b17ce5d46bc25bd548d81eb55f168804b
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51308
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-16850 socklnd: remove ksnr_myiface from ksock_conn_cb
Serguei Smirnov [Fri, 26 May 2023 17:42:23 +0000 (10:42 -0700)]
LU-16850 socklnd: remove ksnr_myiface from ksock_conn_cb

Drop ksnr_myiface: it is no longer needed since socklnd
TCP bonding got removed. There's one interface per
connection cb per peer_ni, and it can be accessed as
net->ksnn_interface.ksni_index.

Fix setting of ksni_nroutes accordingly. Duplication of
interface index in conn_cb and ksnn_interface was causing
the assertion
ASSERTION( net->ksnn_interface.ksni_nroutes == 0 )
in ksocknal_shutdown() to fail if the corresponding
device is deregistered before lnd shutdown.

Modify test_214 of sanity-lnet to create connections so that
the scenario of socklnd shutdown with NI on a deregistered
interface is covered.

Lustre-change: https://review.whamcloud.com/51148
Lustre-commit: f6be07c457385cfacd9b802e4cade9f6f6ab7d6f

Fixes: a7ee03d7ca4185 ("LU-16378 lnet: handles unregister/register events")
Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I4de164c9e64aa770164a8320b9460fadce49aa06
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51326
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-13641 socklnd: remove remnants of tcp bonding
Mr NeilBrown [Thu, 15 Sep 2022 05:32:05 +0000 (15:32 +1000)]
LU-13641 socklnd: remove remnants of tcp bonding

->ksnp_n_passive_ips is now always zero, so remove it and all uses of
it.  ->ksnp_passive_ips is gone too, as is ksocknal_ip2iface().

Lustre-change: https://review.whamcloud.com/48568
Lustre-commit: 3630e1eaf9db562a1de707762cd649db815459c8

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I5de6d027c545087c961673d8704f68c4f3dd5076
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51325
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Frank Sehr <fsehr@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoEX-4130 lipe: Remove old hotpool scripts
Nathaniel Clark [Mon, 5 Jun 2023 13:49:29 +0000 (09:49 -0400)]
EX-4130 lipe: Remove old hotpool scripts

Hotpool scripts now live in EMF repo.
The scripts in lipe/ are unmaintained and non-functional.

Change-Id: I2f288f7f68886015ddb9a85da4e437c9f03c3928
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51215
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Colin Faber <cfaber@ddn.com>
Reviewed-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-by: Artur Novik <anovik@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-15047 gss: gss integrity check with multi-rail
Sebastien Buisson [Mon, 18 Oct 2021 11:26:40 +0000 (13:26 +0200)]
LU-15047 gss: gss integrity check with multi-rail

With multi-rail, a primary NID is used as node identifier, but LNet
decides which NID is actually used for sending/receiving data, on a
per request basis.
For the integrity check mechanism implemented as part of GSS, the
primary NID must be used in order to compute HMAC with the correct
key, independently of the actual NID for the current request.

Lustre-change: https://review.whamcloud.com/45277
Lustre-commit: c8301a65c5672a1d081669343466746df983eabc

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I2bf3974d3aa0e8365a9413dca56c69ee3734c12b
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51274
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-7495 utils: add --links option for lfs find
Thomas Bertschinger [Sun, 7 May 2023 17:34:42 +0000 (13:34 -0400)]
LU-7495 utils: add --links option for lfs find

This adds a "--links" option for lfs find to filter files and
directories by the number of hard links. It also adds a printf format
'%n' to print the number of links for a file.

This commit also fixes '-l' as a short option for '--lazy' which was
added in 11aa7f8704c490b011f60f234c3ac9929ce76948 but the short option
did not work.

Lustre-change: https://review.whamcloud.com/50886
Lustre-commit: f759d6386d5d0edb95d683d97ca8d84c80080c1c

Signed-off-by: Thomas Bertschinger <bertschinger@lanl.gov>
Change-Id: I5d15bc290df8e8a08402f8d5cfa0a7139791b0a4
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Anjus George <georgea@ornl.gov>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51327
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-16799 tests: fix sanity-krb5
Sebastien Buisson [Thu, 4 May 2023 23:10:50 +0000 (01:10 +0200)]
LU-16799 tests: fix sanity-krb5

sanity-krb5.sh needs to be fixed in several ways.
It cannot assume that the Kerberos credentials cache is FILE, and
expect ccache files to be under /tmp/krb5cc_*.
The lsvcgssd daemon must be launched with -vvv flags for easier
debugging.
Keyring needs to be cleared appropriately after using 'lfs flushctx'.

Lustre-change: https://review.whamcloud.com/50864
Lustre-commit: f8f8b3c574e95cb7272310bba19f97fe68cd9b11

Test-Parameters: trivial testlist=sanity-krb5 kerberos=true
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I31ca8d2d97e137c7ba9fa478d5432aeedb5135a8
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51265
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-12514 llite: move client mounting from obdclass to llite
Mr NeilBrown [Mon, 28 Dec 2020 20:56:12 +0000 (15:56 -0500)]
LU-12514 llite: move client mounting from obdclass to llite

Mounting a lustre client is currently handled
in obdclass, using services from llite.
This requires obdclass to load the llite module
and set up inter-module linkage.

The purpose of this was for common code to support both
client and server mounts.  This isn't really a good idea
and needs to go. For lustre servers we already use a
separate filesystem type.

So move the mounting code from obdclass/obd_mount to llite/super25
and remove the inter-module linkages.
Add some EXPORT_SYMBOL() so that llite can access some helpers
that remain in obdclass.

Linux-commit: a989830c88149511ee840356d9c1b34304bac576

Lustre-change: https://review.whamcloud.com/37693
Lustre-commit: 53fa81765750e38f7879ed5092fd729c1bdc8a0f

Change-Id: Ia33bd55a042f90b178156c745a8072b516f00568
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51315
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-16425 tests: skip interop recovery-small/144a/144b
Alex Zhuravlev [Thu, 15 Jun 2023 07:45:08 +0000 (10:45 +0300)]
LU-16425 tests: skip interop recovery-small/144a/144b

Skip recovery-small test_144a and test_144b for old MDS
missing the fix and for its corresponding test.

Lustre-change: https://review.whamcloud.com/49679
Lustre-commit: 64faf832a6128cc55c0f3ffa0595d9715d3bdd25

Fixes: 240938f7b1 ("LU-8367 tests: cleanup_orphans hang reproducer")
Fixes: aa6250b741 ("LU-15724 tests: MDT failover hang reproducer")
Test-Parameters: trivial testlist=recovery-small env=ONLY=144 serverversion=2.14.0
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I77bfdf55d0218aa9e252f742cc90f1c61216d506
Reviewed-by: Sarah Liu <sarah@whamcloud.com>
Reviewed-by: Alex Deiter <alex.deiter@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51328
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-16167 obdclass: fix lctl llog_print with skipped records
Etienne AUJAMES [Mon, 19 Sep 2022 10:23:47 +0000 (12:23 +0200)]
LU-16167 obdclass: fix lctl llog_print with skipped records

Commit 2a5b50d ignores the skipped records in the configuration llog.
But if ioctl OBD_IOC_LLOG_PRINT returns 0 records to display,
jt_llog_print_iter() will stop the processing and ignore the
non-skipped records at the end of the llog.

This patch returns to user space whether the last index processed
(by llog_print_cb) is the last of the llog file. If true,
jt_llog_print_iter() stops the processing.
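
The difference between the two termination conditions can be shown with
a small model. This is a sketch only: the record layout, the batch size
and both function names are hypothetical stand-ins, not the lctl or
kernel code.

```python
def fetch_batch(records, start, count):
    """Model one ioctl call over indices [start, start + count).

    records is a list of (index, skipped) tuples in index order.
    Returns the non-skipped indices in the window and whether the
    window reached the last index of the llog.
    """
    visible = [idx for idx, skipped in records
               if start <= idx < start + count and not skipped]
    last_idx = records[-1][0] if records else 0
    reached_end = start + count > last_idx
    return visible, reached_end


def print_all(records, count):
    """Iterate batches until the end of the llog is reported."""
    out = []
    start = 1
    while True:
        visible, reached_end = fetch_batch(records, start, count)
        out.extend(visible)
        # Stop on the explicit end marker, not on an empty batch: a
        # batch made only of skipped records is empty but not final.
        if reached_end:
            break
        start += count
    return out
```

With records 1 and 5 valid and 2 to 4 skipped, a batch size of 2 yields
an empty middle batch; stopping there would lose record 5, while the
end-of-llog check lets the loop continue to it.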

Add regression test "conf-sanity test_123ai" for this issue.

Lustre-change: https://review.whamcloud.com/48586
Lustre-commit: c6da54aa7546440339265c644538d3d109e46bde

Fixes: 2a5b50d ("LU-15142 lctl: fixes for set_param -P and llog_print")
Test-Parameters: testlist=conf-sanity env=ONLY=123ai,SLOW=yes,ONLY_REPEAT=10
Signed-off-by: Etienne AUJAMES <etienne.aujames@cea.fr>
Change-Id: I78395268c57555e4fd2a4048ccf5b6132ca2877f
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander <alexander.boyko@hpe.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51316
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-16619 build: Ubuntu jammy 5.19 client support
Shaun Tancheff [Mon, 15 May 2023 20:41:08 +0000 (13:41 -0700)]
LU-16619 build: Ubuntu jammy 5.19 client support

Ubuntu 5.19 kernel removed lsmcontext_init() and changed
security_dentry_init_security to require struct context *

Linux kernel linux-hwe-5.19
LSM: Removed scaffolding function lsmcontext_init

Linux kernel linux-hwe-5.19
LSM: security_dentry_init_security with struct lsmcontext

Lustre-change: https://review.whamcloud.com/50210
Lustre-commit: TBD (from d7001a6c68c334d15d3daeb932b8456b101153d2)

Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ib6479a2cd20df5e565ae6203e05df2afa3f3de31
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/51002
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-16327 llite: read_folio, release_folio, filler_t
Shaun Tancheff [Wed, 17 May 2023 09:09:01 +0000 (02:09 -0700)]
LU-16327 llite: read_folio, release_folio, filler_t

Linux commit v5.18-rc5-221-gb7446e7cf15f
  fs: Remove aop flags parameter from grab_cache_page_write_begin()
flags have been dropped from write_begin() and
grab_cache_page_write_begin()

Linux commit v5.18-rc5-241-g5efe7448a142
  fs: Introduce aops->read_folio
Provide a ll_read_folio handler around ll_readpage

Linux commit v5.18-rc5-280-ge9b5b23e957e
  fs: Change the type of filler_t
Affects read_cache_page, provides a wrapper for read_cache_page
and wrappers for filler functions

Linux commit v5.18-rc5-282-gfa29000b6b26
  fs: Add aops->release_folio
Provide an ll_release_folio function based on ll_releasepage

Lustre-change: https://review.whamcloud.com/49199
Lustre-commit: 133ed0cf6f0c84d2b5b84e1de3ff2c54b1fb902d

Test-Parameters: trivial
HPE-bug-id: LUS-11357
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ibd4ec1133c80cd0eb8400c4cd07b50e421dd35c5
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/50977
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>