fs/lustre-release.git
Lai Siyao [Wed, 13 Jan 2021 09:29:50 +0000 (17:29 +0800)]
LU-14119 mdc: set fid2path RPC interruptible

Sometimes OI scrub can't fix an inconsistency between a FID and its
name, and the server returns -EINPROGRESS for the fid2path request. On
such a failure the client keeps resending the request. Make the request
interruptible to avoid a deadlock.

Lustre-change: https://review.whamcloud.com/41219
Lustre-commit: bf475262610671534b1b1a33cebb49d8380b74f7

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I82192cb8a8256064ca632cabfe5581b12e86423b
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44229
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Oleg Drokin [Mon, 7 Jun 2021 19:17:27 +0000 (15:17 -0400)]
LU-14741 obdclass: Wake up queue of reqs on close completion

Origin title:
LU-14741 obdclass: Wake up entire queue of requests on close
completion

Since close requests can get stuck behind normal requests, and close
requests are entitled to more slots, we need to either wake up the
entire accumulated queue waiting for the next modrpc slot or have an
additional waitqueue just for close requests.

This patch goes with the former approach.
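
A minimal userspace sketch of why waking only the head of the queue can strand a close request. The slot model and all names here (struct waiter, wake_head, wake_all) are assumptions for illustration and are not taken from the obdclass code.

```c
#include <stdbool.h>
#include <stddef.h>

struct waiter {
    bool is_close;  /* close requests may use the extra reserved slot */
    bool granted;   /* set once the waiter has obtained a slot */
};

/* A woken waiter re-checks whether a slot it may use is free. */
static bool try_get_slot(struct waiter *w, int *normal_free, int *close_free)
{
    if (*normal_free > 0) {
        (*normal_free)--;
        w->granted = true;
    } else if (w->is_close && *close_free > 0) {
        (*close_free)--;
        w->granted = true;
    }
    return w->granted;
}

/* Wake only the first waiter; returns how many waiters got a slot. */
int wake_head(struct waiter *q, size_t n, int normal_free, int close_free)
{
    if (n == 0)
        return 0;
    return try_get_slot(&q[0], &normal_free, &close_free) ? 1 : 0;
}

/* Wake every waiter so each one re-checks; returns how many got a slot. */
int wake_all(struct waiter *q, size_t n, int normal_free, int close_free)
{
    int granted = 0;

    for (size_t i = 0; i < n; i++)
        if (try_get_slot(&q[i], &normal_free, &close_free))
            granted++;
    return granted;
}
```

With a normal request queued ahead of a close request and only the close-only slot free, wake_head() grants nothing, while wake_all() lets the close request proceed.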

Lustre-change: https://review.whamcloud.com/43941
Lustre-commit: a4e1567d67559b797a5c24ee0bfbca4a52649c47

Fixes: 1fc013f901 ("LU-5319 mdc: manage number of modify RPCs in flight")
Change-Id: Ib4333c7f6731dd435364d5e5f529577a1600a235
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/44288
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Oleg Drokin [Sat, 29 May 2021 02:42:49 +0000 (22:42 -0400)]
LU-14711 tests: Ensure no eviction with long cache discard

Origin title:

LU-14711 tests: Ensure there's no eviction with long cache discard

Just pause execution while doing page processing
for discard if appropriate failloc is set.

Lustre-change: https://review.whamcloud.com/43869
Lustre-commit: TBD (from 3323b40668cddaa1ac6f6644436bd305c189c5ac)

Change-Id: If0d04f3cad267cbeeab63040d63e048dcf03cd6b
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Test-Parameters: trivial testlist=sanity env=ONLY=903
Reviewed-on: https://review.whamcloud.com/44286
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Fri, 7 May 2021 21:03:18 +0000 (15:03 -0600)]
DDN-2042 bio: allow BIO integrity to run on any core

Unbind the bio integrity workqueue so that it can run on any available
core in the system, to improve concurrency for this CPU-bound task if
there are multiple requests being submitted from a single thread.

This is done in the same way in dm-verity-target, which has a similar
CPU profile to bio-integrity.

Update kernel build version to -ddn14.

Test-Parameters: trivial
Reported-by: Greg Edwards <gedwards@ddn.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia299e52db9b0995c8f48372782882324ba3ebbe5
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/44240
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Wang Shilong [Tue, 20 Apr 2021 01:47:25 +0000 (09:47 +0800)]
LU-14616 readahead: fix reserving for unaligned read

If a read covers [2K, 3K] on an x86 platform, we only need to read
one page, but it was calculated as 2 pages.

This is a problem because we reserve more page credits than needed,
while vvp_page_completion_read() only frees the pages actually read,
which causes @ra_cur_pages to leak.
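
The page count for an unaligned byte range can be worked out as in the sketch below. It assumes 4 KiB pages and an inclusive range; the function and macro names are hypothetical and this is not the llite readahead code.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12  /* assumption: 4 KiB pages, as on x86 */

/*
 * Number of pages touched by the inclusive byte range [start, end].
 * The caller must pass start <= end.
 */
uint64_t pages_in_range(uint64_t start, uint64_t end)
{
    return (end >> SKETCH_PAGE_SHIFT) - (start >> SKETCH_PAGE_SHIFT) + 1;
}
```

For the [2K, 3K] read from the message, both offsets fall in page 0, so exactly one page needs to be reserved.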

Lustre-change: https://review.whamcloud.com/43377/
Lustre-commit: 5e7e9240d27a4b74127ea7a26d910ae41a6e1cb1

Fixes: d4a54de84c0 ("LU-12367 llite: Fix page count for unaligned reads")
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I3cf03965196c1af833184d9cfc16779f79f5722c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/44239
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Wed, 13 Jan 2021 09:16:55 +0000 (17:16 +0800)]
LU-14119 lfsck: replace dt_lookup() with dt_lookup_dir()

LFSCK code calls dt_lookup() to look up a sub file under a directory in
many places, but this function requires the directory to be initialized
with dt_try_as_dir() first, which is missing in several places. Since
the overhead is trivial, call dt_lookup_dir() instead.

Lustre-change: https://review.whamcloud.com/41218
Lustre-commit: d525ad4bd0d5d851405e4249859a1c77378f0ee3

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I40bd8d51edece50353af1729cf867572a0abea78
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44228
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
James Nunez [Thu, 10 Jun 2021 18:54:51 +0000 (12:54 -0600)]
LU-14322 tests: skip sanityn 51e for old servers

sanityn test 51e was added to Lustre version 2.13.54.148.
When we run version interop testing with servers less than
this version, the test will fail.

We should skip sanityn test 51e if the server version is
less than 2.13.54.148.
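
The skip condition is a plain version comparison. The real check lives in the shell test framework; the sketch below only illustrates the comparison logic in C, packing each version component into 8 bits. The helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pack major.minor.patch.fix into one comparable integer (8 bits each). */
uint32_t version_code(unsigned major, unsigned minor,
                      unsigned patch, unsigned fix)
{
    return ((uint32_t)major << 24) | ((uint32_t)minor << 16) |
           ((uint32_t)patch << 8) | (uint32_t)fix;
}

/* Skip sanityn 51e when the server predates the version that added it. */
bool should_skip_51e(uint32_t server_version)
{
    return server_version < version_code(2, 13, 54, 148);
}
```

A 2.12.6 server, as used in the Test-Parameters line, would be skipped, while a server at exactly 2.13.54.148 would run the test.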

Lustre-change: https://review.whamcloud.com/43969
Lustre-commit: decdd03cdccdbfdd35f31317c617698198e5ea42

Fixes: 3ea729fe82 ("LU-13693 lfs: check early for MDS_OPEN_DIRECTORY")
Test-Parameters: trivial
Test-Parameters: serverversion=2.12.6 serverdistro=el7.9 env=ONLY=51e testlist=sanityn
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Id2f165b275c97c3a1396a0da18a3f254dbe5efa7
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Vikentsi Lapa <vlapa@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44275
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Mon, 19 Apr 2021 06:10:17 +0000 (09:10 +0300)]
LU-14622 osd: mark pages accessed on reads

to improve cache hit ratio

Lustre-change: https://review.whamcloud.com/43367
Lustre-commit: 7a2011a4ecee773a5f8064e1e00d441f73aa5b15

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: If4850465d118ed62e9da105dc0cf144ff5041fd3
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44265
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Oleg Drokin [Fri, 28 May 2021 02:34:44 +0000 (22:34 -0400)]
LU-14711 osc: Notify server if cache discard takes a long time

Discarding a large number of pages from a mapping under a
single lock can take a really long time (750GB is over 170s).
Since there is no stream of RPCs sent to the server as with
read or write to prolong the DLM lock timeout, the server
may evict the client as it does not see progress is being made.

As such send periodic "empty" RPCs to the server to show the
client is still alive and working on the pages under the lock.

For compatibility reasons the RPC is formed as a one-byte
OST_READ request with a special flag set to avoid doing
actual IO, but older servers actually do the one-byte read.
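
The idea of notifying the server periodically during a long discard can be sketched as follows. This is a userspace illustration under stated assumptions: the clock and the keepalive callback are injected, and every name (discard_pages, fake_now, count_keepalive) is hypothetical rather than part of the osc code.

```c
#include <stddef.h>

/* Hypothetical callback standing in for sending the "empty" RPC. */
typedef void (*keepalive_fn)(void *arg);

/*
 * Discard npages pages, calling send_keepalive whenever at least
 * 'interval' time units have passed since the last notification.
 * Returns the number of keepalives sent.
 */
int discard_pages(size_t npages, long interval, long (*now)(void),
                  keepalive_fn send_keepalive, void *arg)
{
    long last = now();
    int sent = 0;

    for (size_t i = 0; i < npages; i++) {
        /* ... the page would be dropped from the cache here ... */
        long t = now();

        if (t - last >= interval) {
            send_keepalive(arg);
            last = t;
            sent++;
        }
    }
    return sent;
}

/* Demo helpers: a fake clock that advances one unit per call. */
long fake_time;

long fake_now(void)
{
    return ++fake_time;
}

void count_keepalive(void *arg)
{
    (*(int *)arg)++;
}
```

Injecting the clock keeps the sketch deterministic; in a real client the interval would presumably be derived from the lock timeout.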

Lustre-change: https://review.whamcloud.com/43857
Lustre-commit: 564070343ac4ccf4f97843009e1c36f5130ac19c

Change-Id: I4603c83e92c328d93e29adce8cbfac3d561b25d5
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Reviewed-on: https://review.whamcloud.com/44285
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Oleg Drokin [Sat, 29 May 2021 03:45:20 +0000 (23:45 -0400)]
LU-14721 tests: wait_destroy_complete should check MDTs

Ever since destroys handling was moved to MDTs we need to
move waiting for destroys completion to MDTs as well.

Lustre-change: https://review.whamcloud.com/43870
Lustre-commit: 3f8e14163a57ddf51047efc1c0a1b9c15631e2b4

Change-Id: I31440ec048b960206a903387d7050aa13e45008d
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44284
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Qian Yingjin [Wed, 14 Jul 2021 07:27:19 +0000 (15:27 +0800)]
EX-3478 pcc: avoid uninitialized pcc mutex lock in cleanup

Running racer concurrently crashed in the following way:
  RIP: 0010:[...]  [...] __list_add+0x1b/0xc0
  __mutex_lock_slowpath+0xa6/0x1d0
  mutex_lock+0x1f/0x2f
  pcc_inode_free+0x1e/0x60 [lustre]
  ll_clear_inode+0x64/0x6a0 [lustre]
  ll_delete_inode+0x5d/0x220 [lustre]
  evict+0xb4/0x180
  iput+0xfc/0x190
  ll_iget+0x156/0x350 [lustre]
  ll_prep_inode+0x212/0x9b0 [lustre]

After analysis, we found that the mutex @lli_pcc_lock is not
initialized. The reason is that ll_lli_init() is not called to
initialize @lli.
When pcc_inode_free() is called, it calls mutex_lock() on the
uninitialized @lli_pcc_lock, which crashes the kernel.

Test-Parameters: testlist=racer env=DURATION=3600
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I612c79a5b8eb4fa9daeb9e446a457e95c666c04a
Reviewed-on: https://review.whamcloud.com/44300
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Thu, 8 Jul 2021 14:25:51 +0000 (10:25 -0400)]
LU-14826 mdt: getattr_name("..") under striped directory

For getattr_name(".."), it should return FID of the master object for
striped directories. This includes changes on both client and server:
* lmv_getattr_name() should use master object FID if it's looking up
  "..".
* mdt_raw_lookup() should check whether the parent object is a sub
  stripe; if so, it needs to look up again to get the master object FID.
  For old clients without the above change this needs to be checked twice.

This is needed by NFS export, because ll_get_parent() finds the parent
by getattr_name("..").

Reenable check_fhandle_syscall and update sanityn test_102.

Lustre-change: https://review.whamcloud.com/44168
Lustre-commit: TBD (from 1c4ab69260220be049645b4a38d06a671d21d752)

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I72c951293e41656ce3778750147402d7f8ca4cec
Reviewed-on: https://review.whamcloud.com/44289
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Chris Horn [Thu, 22 Apr 2021 19:51:44 +0000 (14:51 -0500)]
LU-14627 lnet: Ensure ref taken when queueing for discovery

Call lnet_peer_queue_for_discovery() in
lnet_discovery_event_handler() to ensure that we take a ref on
the peer when forcing it onto the discovery queue. This also ensures
that the peer state has LNET_PEER_DISCOVERING.

Add a test to sanity-lnet.sh that can trigger the refcount loss bug
in discovery.

Lustre-change: https://review.whamcloud.com/43418
Lustre-commit: 2ce6957b69370b0ce75725d1d91866bf55c07fa8

HPE-bug-id: LUS-7651
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ie2908668c4ffde0f993b5b7ea9aa58acd1d6fa9c
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Reviewed-on: https://review.whamcloud.com/44272
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
James Nunez [Tue, 8 Jun 2021 16:34:29 +0000 (10:34 -0600)]
LU-14327 tests: skip sanity-sec test 55 for older servers

sanity-sec test 55 was added to lustre-master version
2.13.57.12 and to lustre-b2_12 version 2.12.6.3.  When
we run version interop testing with Lustre servers less
than these versions, the test will fail.  Thus, skip
sanity-sec test 55 for Lustre servers less than 2.12.6.3.

Lustre-change: https://review.whamcloud.com/43950
Lustre-commit: 34c5a9e1ec2a82b10f9e85bc54cb2a48da0d5037

Fixes: 355787745f21 ("LU-14121 nodemap: do not force fsuid/fsgid squashing")
Test-Parameters: trivial
Test-Parameters: serverversion=2.12.6 serverdistro=el7.9 env=ONLY=55 testlist=sanity-sec
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ie002c921e853897105396185b38485799df31b7a
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Vikentsi Lapa <vlapa@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44278
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
James Nunez [Thu, 10 Jun 2021 21:05:18 +0000 (15:05 -0600)]
LU-14533 tests: skip sanity-pfl 0d for older servers

sanity-pfl test 0d was added in commit v2_14_0-39-gd645373541.
When we run version interop testing with servers with
version less than this, the test will fail.

We should skip sanity-pfl test 0d if the Lustre server
version is less than 2.14.0.1.

Lustre-change: https://review.whamcloud.com/43971
Lustre-commit: 06d2128b3bd0a3ec45e5d54cad29c310ae3ded7c

Fixes: 83e38bba62 ("LU-14180 utils: verify setstripe comp_end is valid")
Test-Parameters: trivial testlist=sanity-pfl
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I49b45c7a1e4804fece33d53a4fb946b49254de2b
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Vikentsi Lapa <vlapa@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44274
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
James Nunez [Sat, 12 Jun 2021 00:05:08 +0000 (18:05 -0600)]
LU-13716 tests: skip sanity 205b for older servers

Lustre job stats and sanity test 205b were modified in Lustre
version 2.13.54.91.  When we run version interop testing with
servers less than this version and clients that are greater,
the test will fail.

Skip sanity test 205b for Lustre servers with version less than
2.13.54.91 and client greater than that version.

Lustre-change: https://review.whamcloud.com/43993
Lustre-commit: 1ce4d064801f30d3498d7e2c55ef3e699e4ef585

Test-Parameters: trivial
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Icc5d6a6adcf03e5bd16b678596f28590fe31516e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Vikentsi Lapa <vlapa@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44271
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Wed, 3 Feb 2021 03:44:15 +0000 (11:44 +0800)]
LU-14119 osd: add mount option "resetoi"

OI files on zfs are special, and they can't be deleted by user space
tools like rm. Sometimes the OI files may contain stale OI mappings,
which need to be removed for namespace consistency. Add a mount
option 'resetoi' to recreate OI files at mount time, supporting both
ldiskfs and zfs. This should be the standard way to recreate OI files,
rather than mounting as the backend filesystem and unlinking them
manually.

Add sanity-scrub 18.

Lustre-change: https://review.whamcloud.com/41402
Lustre-commit: f37bce8a573dfc5aac1b9f51f4d5c8314ba05d30

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Idc0e4c2f3b81675c49c6c005bc30b61d8fd04503
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44232
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Wed, 23 Jun 2021 08:20:24 +0000 (02:20 -0600)]
LU-14779 utils: no DNS lookups for NID in get_param

Calling libcfs_str2nid() speculatively in "lctl get_param" to see
if there is a NID in the parameter name results in multiple DNS
lookups for invalid hostnames (e.g. "exports.192.168.0.10"). That
may take a very long time if there are a large number of connected
clients, and if the DNS server is overloaded or is having problems.

Instead of doing these speculative NID conversions, skip the whole
NID string in the parameter name for the two known parameters that
may contain a NID ("*.exports.<NID>.*" and "*.MGC<NID>.*").  This
is considerably faster since it is only working on a local string.

If new parameters are added that contain a NID (unlikely, but
possible), then "clean_path()" would need to be updated as part
of that change.
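
A rough sketch of skipping the NID while converting a parameter name to a path, working only on the local string. It assumes that a NID always has the form <addr>@<net> and that the component following it starts at the next '.'. The function name clean_path_sketch() and the example parameter names are illustrative, not the actual clean_path() implementation.

```c
#include <string.h>

/* Convert "a.b.c" into "a/b/c", leaving the dots inside a NID untouched. */
void clean_path_sketch(char *path)
{
    char *p = path;

    while (*p != '\0') {
        char *nid = NULL;

        /* Only look for the markers at the start of a path component. */
        if (p == path || p[-1] == '/') {
            if (strncmp(p, "exports.", 8) == 0) {
                p[7] = '/';
                nid = p + 8;
            } else if (strncmp(p, "MGC", 3) == 0) {
                nid = p + 3;
            }
        }

        if (nid != NULL) {
            char *at = strchr(nid, '@');

            if (at != NULL) {
                /* Jump past "<addr>@<net>" without touching its dots. */
                p = at + 1 + strcspn(at + 1, ".");
                continue;
            }
            /* No '@' found: fall back to plain conversion. */
            p = nid;
            continue;
        }

        if (*p == '.')
            *p = '/';
        p++;
    }
}
```

No hostname resolution is involved, so the cost is proportional to the length of the parameter name only.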

Lustre-change: https://review.whamcloud.com/44056
Lustre-commit: f21c507fbca2afab5a5d97d4e816696a69d1c593

Test-Parameters: trivial
Fixes: 85cbe1a3ee69 ("LU-5030 util: migrate lctl params functions to use cfs_get_paths()")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I51f865e4ce3a7bc4879f9d688c4b3a68d731810f
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44281
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Alex Zhuravlev [Thu, 1 Apr 2021 07:43:17 +0000 (10:43 +0300)]
EX-2952 utils: add libzpool

as libmount_utils uses get_system_hostid() provided by libzpool

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Id07de07c9e0bb0efb598ccf8e7abcb389a875318
Reviewed-on: https://review.whamcloud.com/43193
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44236

Andreas Dilger [Sat, 27 Mar 2021 19:50:33 +0000 (13:50 -0600)]
LU-14175 osd: print inode number with FID in OI scrub

When debugging OI Scrub problems, also print the inode number
with the FID so that it is easier to find the problematic inode.
Otherwise, if the OI is broken it is not easy to find the inode
in question without a full filesystem scan.

Lustre-change: https://review.whamcloud.com/43153
Lustre-commit: 5bab4acf8320b46076c81f32f7954f91dae21bc9

Test-Parameters: trivial testlist=sanity-scrub,sanity-lfsck
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I217624ff2116326f86e053bcfacc6f19873ebbe5
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44235
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Andreas Dilger [Fri, 16 Apr 2021 06:03:00 +0000 (00:03 -0600)]
LU-6142 mdc: include linux/idr.h for referenced code

Include the <linux/idr.h> header in files that reference IDR
functionality.  Don't depend on its indirect inclusion elsewhere.

Lustre-change: https://review.whamcloud.com/43346
Lustre-commit: 3589a3141a4b9f94887b3ac5d6202233b06b8996

Test-Parameters: trivial
Fixes: 66172e3274ca ("LU-13238 ofd: add OFD access logs")
Fixes: d0423abc1adc ("LU-12506 changelog: support large number of MDT")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Icdd03e15d31eabc4a1363d1757fc4db7723ebbe5
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-on: https://review.whamcloud.com/44233
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Wed, 24 Feb 2021 03:31:06 +0000 (11:31 +0800)]
LU-14119 osd: delete stale OI mapping entry

Once the LMA check shows that an OI mapping entry is stale, delete it
from the OI table, which avoids removing whole OI files.

Don't add an OI mapping into the cache until osd_fid_lookup(), because
the mapping in the OI is not trustworthy until the FID in the LMA is
checked; otherwise it may mislead LFSCK.

Lustre-change: https://review.whamcloud.com/41741
Lustre-commit: 99d00b97ef5f209a002f250e7772055ff1a6d6d6

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I4b50dcc02149d485e4bf4a361ca2994daa280feb
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44231
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Lai Siyao [Tue, 19 Jan 2021 13:37:50 +0000 (21:37 +0800)]
LU-14119 osd-zfs: enable LUDA_VERIFY

In osd_dir_it_rec(), if the dirent is successfully fetched and the FID
in the dirent is sane, it returns right away. However, if
LUDA_VERIFY|LUDA_VERIFY_DRYRUN is set, the FID in the dirent should be
compared with the FID in the LMA, and replaced with the latter if
they are different.
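
The verification step described above might look roughly like this. The struct, the flag values and the function name are stand-ins chosen for the sketch, and reading "LUDA_VERIFY|LUDA_VERIFY_DRYRUN is set" as "either flag is set" is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

struct fid_sketch {
    uint64_t seq;
    uint32_t oid;
    uint32_t ver;
};

enum {
    SKETCH_VERIFY        = 1 << 0, /* stand-in for LUDA_VERIFY */
    SKETCH_VERIFY_DRYRUN = 1 << 1, /* stand-in for LUDA_VERIFY_DRYRUN */
};

static bool fid_equal(const struct fid_sketch *a, const struct fid_sketch *b)
{
    return a->seq == b->seq && a->oid == b->oid && a->ver == b->ver;
}

/*
 * Returns true if the dirent FID disagreed with the LMA FID and was
 * replaced by it. Without a verify flag the dirent FID is kept as-is.
 */
bool verify_dirent_fid(struct fid_sketch *dirent_fid,
                       const struct fid_sketch *lma_fid, unsigned flags)
{
    if (!(flags & (SKETCH_VERIFY | SKETCH_VERIFY_DRYRUN)))
        return false;
    if (fid_equal(dirent_fid, lma_fid))
        return false;
    *dirent_fid = *lma_fid;
    return true;
}
```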

Lustre-change: https://review.whamcloud.com/41274
Lustre-commit: f5136e81957e4b67ae6ed7764d378b817fac5ee2

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I35e2a4d4606044cd37cc5847cffc577740918988
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44230
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Wed, 9 Dec 2020 17:53:12 +0000 (18:53 +0100)]
LU-14204 tests: make sure we have a single import

In sanity, retrieve the exact name of the import being used on the
client, in order to properly get information such as lock_count
or lru_size.

Lustre-change: https://review.whamcloud.com/41758
Lustre-commit: 9bbc45d3f48acf79a1ad0a1161af832e040ee52f

Lustre-change: https://review.whamcloud.com/42019
Lustre-commit: 4541d5424ebcac028864f454af0e650f8ee9468b

Change-Id: I065b7da7990c7171d5baa24f3400c5f8ffc12fc9
Test-Parameters: trivial
Test-Parameters: env=SHARED_KEY=true testlist=sanity
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41855
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Yang Sheng [Thu, 18 Feb 2021 15:22:09 +0000 (23:22 +0800)]
LU-14048 obd: fix race between connect vs disconnect

The export nid hash is removed in class_disconnect(), but a race
window still exists in target_handle_connect() that can add it back.
The cleanup process will then wait forever.

Lustre-commit: 8081979e76b3c07f629e0943fcd6d8b0285719e3
Lustre-change: https://review.whamcloud.com/41687

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I9ad3edbd040b81e2aef7ae22494302d9a478d65b
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44226
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
John L. Hammond [Wed, 31 Mar 2021 17:45:05 +0000 (12:45 -0500)]
LU-14575 ofd: suppress errors on missing parent FID

In ofd_access(), if the parent FID is zero then skip adding an entry
to the OFD access log.

Lustre-change: https://review.whamcloud.com/43184
Lustre-commit: 4a6ed7d6351e4fffd8af5745bfe7cbb161c46858

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ib518dc1f181a820d99021dd58ab548916e16f29d
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44225
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Chris Horn [Fri, 26 Mar 2021 16:28:18 +0000 (11:28 -0500)]
LU-14566 lnet: Skip discovery in LNetPrimaryNID if DD disabled

If discovery is disabled locally then the discovery thread will not
modify any peer objects as a result of the discovery process. Thus,
the primary NID of any peer we're asked to discover will not change
as a result of discovery. Therefore, we do not need to actually
perform discovery in LNetPrimaryNID() if discovery is disabled
locally. Since this routine can result in long client mount times
when a Lustre server is down we should avoid this unnecessary
discovery.

Lustre-change: https://review.whamcloud.com/43141
Lustre-commit: 16264da9e3c43a6368a25b6ded4113e8cfa57427

Test-Parameters: trivial
HPE-bug-id: LUS-9887
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I6d188e16422ad47a146d52bb24cdd1b77a30aa71
Reviewed-on: https://review.whamcloud.com/43141
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44224
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Li Dongyang [Fri, 19 Mar 2021 10:21:58 +0000 (21:21 +1100)]
LU-14536 o2iblnd: don't try to reconnect if there's no listener

For each discovery we try to reconnect up to retry_count times,
which defaults to 5. While the MDT mount processes the config log,
multiple discoveries are made for each OST.
If the OSTs are not up, the mount will take a long time to time out.

Lustre-change: https://review.whamcloud.com/42111
Lustre-commit: 0ab06eb9d865a47ea3e09880a41a9e8f0a78b6a6

Change-Id: If1d854216d2f26089c52d3fb501092b7f48a444d
Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44223
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Li Xi [Tue, 13 Jul 2021 09:22:39 +0000 (17:22 +0800)]
LU-14536 o2iblnd: don't resend if there's no listener

If there's no listener at the remote peer, we get
IB_CM_REJ_INVALID_SERVICE_ID. Currently we try to
resend, which makes discovery take longer than
necessary when connecting to a node which is not up.
Use -EHOSTUNREACH instead of -ECONNREFUSED,
so we don't end up queued for resend.
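
The effect of the errno change can be sketched in userspace as below. The enum is a hypothetical stand-in for the connection manager reject reasons, and both the default mapping and the resend policy are assumptions made for illustration only.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for connection-manager reject reasons. */
enum rej_reason_sketch {
    REJ_CONSUMER_DEFINED,
    REJ_INVALID_SERVICE_ID,  /* no listener at the remote peer */
};

/* Map a reject reason to the error reported for the connection. */
int rej_reason_to_errno(enum rej_reason_sketch reason)
{
    switch (reason) {
    case REJ_INVALID_SERVICE_ID:
        return -EHOSTUNREACH;
    default:
        return -ECONNREFUSED;
    }
}

/* Assumed policy: only refused connections are queued for resend. */
bool queued_for_resend(int err)
{
    return err == -ECONNREFUSED;
}
```

Under these assumptions a peer with no listener fails immediately instead of going through further resend attempts.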

Lustre-change: https://review.whamcloud.com/42109
Lustre-commit: 65e3e4050ec5bb371c1c343fca49a605286a086e

Change-Id: Ifaf14bc3ada2e2469669285917e366af669817e2
Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44222
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
John L. Hammond [Wed, 21 Apr 2021 15:28:26 +0000 (10:28 -0500)]
EX-3046 lipe: remove IML sockets

Remove the untested and unmaintained IML sockets from lamigo and
lpurge.

Lustre-change: https://review.whamcloud.com/43374
Lustre-commit: 1e58e57300484952da3e4e7801c8f0d1a9aa5b84

Test-Parameters: trivial
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I75346b7a1478b9f5902e4b139f2ee9e2fca67ffc
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44221
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Gaurang Tapase [Tue, 13 Jul 2021 07:19:18 +0000 (12:49 +0530)]
EX-3468: Include mlnx-tools when we build MoFED

Test-Parameters: trivial

Change-Id: I388810611435ed080b951ff3abae966116bafdf8
Signed-off-by: Gaurang Tapase <gtapase@ddn.com>
Reviewed-on: https://review.whamcloud.com/44220
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Andreas Dilger [Tue, 13 Jul 2021 21:42:25 +0000 (15:42 -0600)]
RM-620 build: New tag 2.14.0-ddn5

New tag 2.14.0-ddn5

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I53a25184ebf79bb984fa4ae4e40266d4634c4cb1

Wang Shilong [Mon, 12 Apr 2021 08:43:22 +0000 (16:43 +0800)]
EX-2723 kernel: fix potential infinite loop

In dquot_writeback_dquots(), we write back dquots from the dirty dquots
list. There is a potential infinite loop if ->write_dquot() fails and
the dquot is not removed from the list. This patch clears the dirty bit
anyway to avoid it.

Snapshot destroy might dirty the quota list, and umount will hang if the
filesystem has been mounted read-only because of a corrupted image.
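
The loop hazard can be modelled in userspace as below. This is a simplified sketch, not the kernel's dquot_writeback_dquots(): the struct, the pass limit and the clear_on_failure switch are all hypothetical and exist only to contrast the two behaviours.

```c
#include <stdbool.h>
#include <stddef.h>

struct dquot_sketch {
    bool dirty;
    bool write_fails;  /* simulates a ->write_dquot() failure */
};

/*
 * Write back dirty entries until none remain, giving up after
 * max_passes. Returns the number of passes used, or -1 if the
 * dirty list never drained.
 */
int writeback_sketch(struct dquot_sketch *dq, size_t n,
                     bool clear_on_failure, int max_passes)
{
    for (int pass = 1; pass <= max_passes; pass++) {
        bool any_dirty = false;

        for (size_t i = 0; i < n; i++) {
            if (!dq[i].dirty)
                continue;
            /* Clear the dirty bit on success, or always with the fix. */
            if (!dq[i].write_fails || clear_on_failure)
                dq[i].dirty = false;
            else
                any_dirty = true;
        }
        if (!any_dirty)
            return pass;
    }
    return -1;
}
```

Without clear_on_failure a single entry whose write keeps failing stays dirty on every pass, which corresponds to the hang on a read-only filesystem described in the message.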

Linux-commit: dd5f6279732e8885061d7455b9d86fdcfdf7f183

Change-Id: If5e9db82eacc3a6a621566fb612b55071e51da25
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/43732
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Sebastien Buisson [Wed, 12 May 2021 06:55:29 +0000 (09:55 +0300)]
LU-14430 mdd: use own buffer for changelog

Use a dedicated persistent buffer for changelog needs so that it does
not interfere with the generic big_buf in the MDD thread info, which
may already be in use.

Lustre-change: https://review.whamcloud.com/43672
Lustre-commit: c352b89dc981e5ebe73c8bd2d9e0949094c828b2

Fixes: f3d03bc38a ("LU-14430 mdd: fix inheritance of big default ACLs")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I4f4692b5556eaa98e2e23d7b58c925e33401e4e5
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/43731
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Sebastien Buisson [Tue, 11 May 2021 08:59:03 +0000 (10:59 +0200)]
LU-14673 sec: annotate algorithms taking optional key

Crypto algorithms implementing a ->setkey() method but that can also
be used without a key must set the CRYPTO_ALG_OPTIONAL_KEY flag if
defined in the kernel.
In Lustre, the adler32 implementation defines a ->setkey() method, but
its "key" is not actually a cryptographic key.

Linux-commit: a208fa8f33031b9e0aba44c7d1b7e68eb0cbd29e

Lustre-change: https://review.whamcloud.com/43656
Lustre-commit: b161e7b777e63bb4328aeab9e50560f919fedc31

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I362211d1b1aa3763fe1481cebb3629b255f29e41
Reviewed-on: https://review.whamcloud.com/43860
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-14747 kernel: kernel update RHEL7.9 [3.10.0-1160.31.1.el7]
Jian Yu [Mon, 21 Jun 2021 18:43:24 +0000 (11:43 -0700)]
LU-14747 kernel: kernel update RHEL7.9 [3.10.0-1160.31.1.el7]

Update RHEL7.9 kernel to 3.10.0-1160.31.1.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: I19c82c5b323ae5f097bc891f75beb69ecfea706d
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44045
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
3 years agoLU-14775 kernel: kernel update SLES12 SP5 [4.12.14-122.74.1]
Jian Yu [Mon, 21 Jun 2021 18:53:02 +0000 (11:53 -0700)]
LU-14775 kernel: kernel update SLES12 SP5 [4.12.14-122.74.1]

Update SLES12 SP5 kernel to 4.12.14-122.74.1 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 430c 817" testlist=sanity

Change-Id: I98952c097b14c68f744a570e5558fb21d9392ad2
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44046
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
3 years agoLU-14195 llite: add force_uaccess_{begin,end} helpers
Jian Yu [Tue, 6 Jul 2021 23:00:20 +0000 (16:00 -0700)]
LU-14195 llite: add force_uaccess_{begin,end} helpers

Linux kernel version 5.10 adds force_uaccess_begin() and
force_uaccess_end() helpers to wrap get_fs() and set_fs()
for undoing any damage done by set_fs(KERNEL_DS).

Lustre-change: https://review.whamcloud.com/44165
Lustre-commit: TBD (from 597a8a1e0c4c09b86b7d4e860cdcd6a3fedcb6dc)

Fixes: a84af70dcab ("LU-12358 pcc: add project quota support on PCC backend")
Change-Id: I68745a8a1e26312ffe6ee8388f962b9c834df97b
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44160
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
3 years agoLU-14195 libcfs: switch to kfree_sensitive
Mr NeilBrown [Tue, 6 Jul 2021 00:15:47 +0000 (17:15 -0700)]
LU-14195 libcfs: switch to kfree_sensitive

In Linux 5.10, kzfree() has been renamed kfree_sensitive().

So switch to the new name and provide back-compat support for older
kernels.

Lustre-change: https://review.whamcloud.com/40908
Lustre-commit: 67d17dd590f913643f5adc8aced369221faccf05

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: If665168477a0b6241a8ddf31a111cd465fe97783
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-on: https://review.whamcloud.com/44144
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
3 years agoLU-14195 lustre: remove 'fs' from 'struct lvfs_run_ctxt'
Mr NeilBrown [Tue, 6 Jul 2021 00:10:15 +0000 (17:10 -0700)]
LU-14195 lustre: remove 'fs' from 'struct lvfs_run_ctxt'

The code protected by push_ctxt() and pop_ctxt() never tries to access
any user-space data, so calling set_fs(KERNEL_DS) is not needed.

So remove the 'fs' field and related code.

In linux-5.10 this code fails to compile as set_fs() is deprecated.

Lustre-change: https://review.whamcloud.com/40910
Lustre-commit: de60e7767c0e3ba38f4de37e46328012780b6d19

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Idb2744d656dc4228375b6da54673e38cc1c112f5
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44143
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
3 years agoLU-14195 osd: don't use set_fs() for ->fiemap() calls.
Mr NeilBrown [Tue, 6 Jul 2021 00:06:18 +0000 (17:06 -0700)]
LU-14195 osd: don't use set_fs() for ->fiemap() calls.

->fiemap() only accesses kernel-space data, so does not need, and
never has needed, set_fs() calls.
In Linux 5.10, these calls are deprecated.
So remove the unnecessary code.

Lustre-change: https://review.whamcloud.com/40909
Lustre-commit: d0337cab8e845efcdbfb9e26e573feb18f28e303

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Id336855b4787ddbf656dfa3b8d0b12f663564795
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44142
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
3 years agoLU-14195 build: Adjust Makefile for Linux build changes.
Mr NeilBrown [Tue, 6 Jul 2021 00:03:32 +0000 (17:03 -0700)]
LU-14195 build: Adjust Makefile for Linux build changes.

Since v5.10-rc1~51^2~19, "KBUILD_BUILTIN" has been unset
for module builds.  This means that "targets-for-builtin"
isn't built, and that is how "extra-y" is built.

So we need another way to force LUSTRE_KERNEL_TEST to be built.

Since v5.6-rc1~1^2~5 any target listed in "always-y" will always get
built.  So we can assign LUSTRE_KERNEL_TEST to this macro.

Assigning both macros is safe, even for those kernels which include
both in the list of targets.

Lustre-change: https://review.whamcloud.com/40907
Lustre-commit: 9b9e19ca50491f2b74a9bb99f63591147b91bdd5

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I508b3710579c068dec93baf81ee383f3f03bd370
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44141
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
3 years agoLU-14073 ofd: remove use of smp_read_barrier_depends()
Mr NeilBrown [Mon, 5 Jul 2021 23:59:55 +0000 (16:59 -0700)]
LU-14073 ofd: remove use of smp_read_barrier_depends()

Linux 5.9 removes smp_read_barrier_depends(), so lustre must stop
using it.

There is only one use: in ofd_access_log.c.
This use is unnecessary and can simply be removed.

The code is based on "Documentation/core-api/circular-buffers.rst"
which gives no indication that this barrier is needed.

The comment says its purpose is to ensure the index is read before the
data is read. This is unnecessary.
The data is written in oal_write_entry(), then a barrier is issued
(smp_store_release) before the ->head is written.
oal_read_entry() issues a barrier (smp_load_acquire()) before reading
that head.
'tail' is read without a barrier, but it is then compared against ->head
in CIRC_CNT().  Even if reading ->tail was racy, the fact that
comparing it with ->head succeeded means that the data written at
->tail must have been safely written, and we can now read it without
any further barrier.
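
The ordering argument above can be sketched in userspace, with C11
atomics standing in for smp_store_release() and smp_load_acquire().
This is not the ofd_access_log.c code: struct ring, ring_put() and
ring_get() are hypothetical names, and the sketch assumes a single
producer and a single consumer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 8u /* must be a power of two for CIRC_CNT() */
#define CIRC_CNT(head, tail, size) (((head) - (tail)) & ((size) - 1))

/* Hypothetical ring; head is written only by the producer,
 * tail only by the consumer. */
struct ring {
	int data[RING_SIZE];
	atomic_uint head;
	atomic_uint tail;
};

/* Producer: write the entry first, then publish it via ->head. */
static bool ring_put(struct ring *r, int value)
{
	unsigned int head, tail;

	assert(r);
	head = atomic_load_explicit(&r->head, memory_order_relaxed);
	tail = atomic_load_explicit(&r->tail, memory_order_acquire);
	if (CIRC_CNT(head, tail, RING_SIZE) == RING_SIZE - 1)
		return false; /* full */

	r->data[head] = value;
	/* Release: the data store above is visible before the new head. */
	atomic_store_explicit(&r->head, (head + 1) & (RING_SIZE - 1),
			      memory_order_release);
	return true;
}

/* Consumer: acquire ->head, then read the entry at ->tail. */
static bool ring_get(struct ring *r, int *value)
{
	unsigned int head, tail;

	assert(r && value);
	/* Acquire: pairs with the release store in ring_put(). */
	head = atomic_load_explicit(&r->head, memory_order_acquire);
	tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	if (CIRC_CNT(head, tail, RING_SIZE) == 0)
		return false; /* empty */

	/* No further barrier: a non-zero count implies the entry at
	 * tail was written before the head value we acquired. */
	*value = r->data[tail];
	atomic_store_explicit(&r->tail, (tail + 1) & (RING_SIZE - 1),
			      memory_order_release);
	return true;
}
```

The only ordering the reader relies on is the acquire load of head,
which is why a separate dependency barrier adds nothing here.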

Lustre-change: https://review.whamcloud.com/40394
Lustre-commit: 9d2776f02b67354b58e9ff93bd7fe5b5495ee288

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I9d0f0aeb67e1188d2012f4ae2e14b3656211c3e2
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44140
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
3 years agoEX-3292 pcc: avoid to specify ID for every attach
Qian Yingjin [Tue, 8 Jun 2021 09:54:49 +0000 (17:54 +0800)]
EX-3292 pcc: avoid to specify ID for every attach

This patch avoids the need to specify "-i <attach_id>" for
every attach as in the very common case there is only a single
cache for that client.
If the attach ID is not specified, the first dataset on the client
is selected as the PCC backend.

The new format of the PCC state output is as follows:
file: /mnt/lustre/f42.sanity-pcc, type: readonly, PCC_file:
/d42.sanity-pcc/0402/0x200000401:0x3:0x0, open_count: 0, flags: 0

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Icd23eda5dca4711f9bb7af940f6cef5ddb97ce69
Reviewed-on: https://review.whamcloud.com/43946
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Feng, Lei <flei@whamcloud.com>
3 years agoLU-14690 kernel: new kernel [RHEL 8.4 4.18.0-305.3.1.el8_4]
Jian Yu [Sun, 6 Jun 2021 07:47:08 +0000 (00:47 -0700)]
LU-14690 kernel: new kernel [RHEL 8.4 4.18.0-305.3.1.el8_4]

This patch makes changes to support new RHEL 8.4 release
for Lustre client.

Lustre-change: https://review.whamcloud.com/43725
Lustre-commit: f269497ac7a730880e590eb9e8405f082522c5e0

Test-Parameters: trivial clientdistro=el8.4
Change-Id: I47d4706f9175d489ef0e6226492af20f44f0677e
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43768
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoEX-2908 build: include ucx when we build MOFED
Minh Diep [Thu, 20 May 2021 16:02:11 +0000 (09:02 -0700)]
EX-2908 build: include ucx when we build MOFED

openmpi depends on ucx

Change-Id: If0967a655c29003939758d1099f316fc02896fe2
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43752
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
(cherry picked from commit fd1004074434f79d4086eca49e7eb5a326f58d0e)
Reviewed-on: https://review.whamcloud.com/43848

4 years agoEX-3191 pcc: add test for mmap | write | detach racer
Qian Yingjin [Thu, 27 May 2021 07:52:29 +0000 (15:52 +0800)]
EX-3191 pcc: add test for mmap | write | detach racer

This patch adds the mmap racer among: (write | read | mmap_cat |
detach | unlink).

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I5db160851a95937275fea6ae32f40dcd0fe69f46
Reviewed-on: https://review.whamcloud.com/43842
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoEX-2579 pcc: support a flatter HSM archive format
Qian Yingjin [Thu, 29 Apr 2021 13:10:05 +0000 (21:10 +0800)]
EX-2579 pcc: support a flatter HSM archive format

Add versioning (v1 and v2) to the HSM (PCC) archive format (directory
layout):
v1: (oid & 0xffff)/-/-/-/-/-/FID
v2: ((oid ^ seq) & 0xffff)/FID

v1 is the original layout and the default. v2 is the new layout which
should be selected for new installs.
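
As a rough illustration of the v2 layout only, the relative path can
be computed as below. struct fid and archive_path_v2() are
hypothetical names, and the four-digit hex directory and 0x-prefixed
FID rendering are assumptions, chosen because they appear consistent
with the PCC_file path shown in the EX-3292 message
(0402/0x200000401:0x3:0x0). The v1 layout is not sketched because its
intermediate directories are not spelled out here.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a Lustre FID: sequence, object ID, version. */
struct fid {
	uint64_t seq;
	uint32_t oid;
	uint32_t ver;
};

/*
 * Format the v2 relative path "((oid ^ seq) & 0xffff)/FID" into buf.
 * Returns the snprintf() result, i.e. the length the full path needs.
 */
static int archive_path_v2(const struct fid *f, char *buf, size_t len)
{
	unsigned int dir;

	assert(f && buf);
	/* One directory level, derived from the low 16 bits of oid ^ seq. */
	dir = (unsigned int)((f->oid ^ f->seq) & 0xffff);

	return snprintf(buf, len, "%04x/0x%" PRIx64 ":0x%" PRIx32 ":0x%" PRIx32,
			dir, f->seq, f->oid, f->ver);
}
```

For seq 0x200000401 and oid 0x3 the directory is
(0x3 ^ 0x200000401) & 0xffff = 0x0402, so every FID lands one level
below the archive root instead of six.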

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: If660f3cf4c02469bb23e65a44f86f0346367adf6
Reviewed-on: https://review.whamcloud.com/43493
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Feng, Lei <flei@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-14359 hsm: support a flatter HSM archive format
John L. Hammond [Fri, 22 Jan 2021 16:56:06 +0000 (10:56 -0600)]
LU-14359 hsm: support a flatter HSM archive format

Add versioning (v1 and v2) to the HSM archive format (directory
layout):
  v1: (oid & 0xffff)/-/-/-/-/-/FID
  v2: ((oid ^ seq) & 0xffff)/FID

v1 is the original layout and the default. v2 is the new layout which
should be selected for new installs.

Add an option --archive-format to select the archive format.

Add YAML configuration file support to lhsmtool_posix with properties
archive_format and archive_path. Add an option --config to set the
config file.

Adapt sanity-hsm and test-framework to allow testing of both archive
formats.

Lustre-change: https://review.whamcloud.com/41312
Lustre-commit: 65062463199fa76b6313e9452e3ab9590cbedaa2

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I6d6bd0c8817a491848b554fa76078d876549cc1f
Reviewed-on: https://review.whamcloud.com/43490
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoRM-620 build: New tag 2.14.0-ddn4
Andreas Dilger [Wed, 19 May 2021 02:36:27 +0000 (20:36 -0600)]
RM-620 build: New tag 2.14.0-ddn4

New tag 2.14.0-ddn4

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ife64820c72a134ce5ae749d5c61cbc8511b3a9de

4 years agoLU-14502 lov: fault page update cp_lov_index
Bobi Jam [Tue, 9 Mar 2021 09:15:20 +0000 (17:15 +0800)]
LU-14502 lov: fault page update cp_lov_index

In fault IO, vvp_io_fault_start() could find an existing cl_page
associated with the vmpage covering the fault index, and the page
may still refer to another mirror of an old IO.

This patch updates the fault page's cp_lov_index in lov_io_fault_start().

Lustre-commit: e9bac5fa455eab5371cdfb141b73a3beb0cc8d9c
Lustre-change: https://review.whamcloud.com/41954

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I50639700159a76061437fd2f1a09dadf25cfd33f
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43454
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-14604 kernel: kernel update RHEL8.3 [4.18.0-240.22.1.el8_3]
Jian Yu [Wed, 5 May 2021 17:36:09 +0000 (10:36 -0700)]
LU-14604 kernel: kernel update RHEL8.3 [4.18.0-240.22.1.el8_3]

Update RHEL8.3 kernel to 4.18.0-240.22.1.el8_3.

Test-Parameters: trivial \
clientdistro=el8.3 serverdistro=el8.3 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.3 serverdistro=el8.3 testlist=sanity

Change-Id: I1a3152d95822a74e05f9b44f590a6cdb1f8b02b6
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43547
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-14671 kernel: kernel update SLES15 SP2 [5.3.18-24.61.1]
Jian Yu [Mon, 10 May 2021 21:18:49 +0000 (14:18 -0700)]
LU-14671 kernel: kernel update SLES15 SP2 [5.3.18-24.61.1]

Update SLES15 SP2 kernel to 5.3.18-24.61.1 for Lustre client.

Test-Parameters: trivial \
env=SANITY_EXCEPT="100 130 136 817" \
clientdistro=sles15sp2 serverdistro=el7.9 \
testlist=sanity

Change-Id: Ie0aab7cc7200796ed8e4d75862ceaef020943c08
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43631
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-14670 kernel: kernel update RHEL7.9 [3.10.0-1160.25.1.el7]
Jian Yu [Mon, 10 May 2021 19:38:11 +0000 (12:38 -0700)]
LU-14670 kernel: kernel update RHEL7.9 [3.10.0-1160.25.1.el7]

Update RHEL7.9 kernel to 3.10.0-1160.25.1.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: Ic846d648c45476cc4886ce86577605bf3e66d935
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43628
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-14672 kernel: kernel update SLES12 SP5 [4.12.14-122.66.2]
Jian Yu [Mon, 10 May 2021 21:59:02 +0000 (14:59 -0700)]
LU-14672 kernel: kernel update SLES12 SP5 [4.12.14-122.66.2]

Update SLES12 SP5 kernel to 4.12.14-122.66.2 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 430c 817" testlist=sanity

Change-Id: Ib2bf4795ccb21dbd0bb9202228ff32d73a203eee
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43634
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoEX-2659 tests: add sanity-lipe.sh to test LiPE utilities
Jian Yu [Tue, 4 May 2021 17:41:23 +0000 (10:41 -0700)]
EX-2659 tests: add sanity-lipe.sh to test LiPE utilities

This patch adds sanity-lipe.sh test script to test
lipe_find and lipe_scan utilities for LiPE.

Lustre-change: https://review.whamcloud.com/42151
Lustre-commit: 5d67c987c8d2dc393b1e0952fe01d33978efdea0

Test-Parameters: trivial testlist=sanity-lipe

Change-Id: I69d82f7e3675becb4e38915ff363e853d0accb77
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43536
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoEX-3176 build: remove extra_version from kernel rpm
Minh Diep [Fri, 14 May 2021 16:56:06 +0000 (09:56 -0700)]
EX-3176 build: remove extra_version from kernel rpm

This will allow us to use the same kernel for both
ES5 and ES6

Change-Id: I7e49f97b28d6e74ab6fe79f0438900c3ebd665df
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43708
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-14055 lmv: reduce struct lmv_obd size
Andreas Dilger [Tue, 11 May 2021 07:11:36 +0000 (00:11 -0700)]
LU-14055 lmv: reduce struct lmv_obd size

The lmv_obd struct contains lmv_mdt_descs which is large enough
to reference 512 * 512 = 262144 targets, but there can be only
65536 OSTs or MDTs in a single filesystem today.

Shrink the allocation size to match the current limits, reducing
the size of obd_device.u since this is the largest union member.

This reduces the size of each obd_device from 6752 to 4568 bytes.

Lustre-change: https://review.whamcloud.com/41162
Lustre-commit: e11deeb1e6d114608eac4ee998d4cea22e30b0f5

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I752b021bdb5d02e3ead3bb266121be5dbf3ebbe5
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43651
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-13783 libcfs: support absence of account_page_dirtied
Mr NeilBrown [Tue, 11 May 2021 07:05:12 +0000 (00:05 -0700)]
LU-13783 libcfs: support absence of account_page_dirtied

Some kernels export neither account_page_dirtied nor
kallsyms_lookup_name.
For these kernels we need to use __set_page_dirty() and suffer the
cost of dropping and reclaiming the page-tree lock.

Lustre-change: https://review.whamcloud.com/40827
Lustre-commit: 6be4b3118c16039cff52e9a781b7d1852489a969

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I69d934480832f3909d3ec103f11e1d62489d70d7
Reviewed-on: https://review.whamcloud.com/43650
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-13783 libcfs: use lsmcontext in security_release_secctx
Jian Yu [Tue, 11 May 2021 07:02:23 +0000 (00:02 -0700)]
LU-13783 libcfs: use lsmcontext in security_release_secctx

Kernel linux-hwe-5.8 (5.8.0-22.23~20.04.1) introduces
struct lsmcontext and uses it in security_release_secctx(),
which reduces the arguments from 2 to 1.

Lustre-change: https://review.whamcloud.com/43284
Lustre-commit: c9e644add7091299d030a96e46384912ac2bef50

Change-Id: I37e185493001d335b40ea0a6102db593cb18beb3
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43649
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-13783 libcfs: add cfs_kallsyms_lookup_name()
Jian Yu [Tue, 11 May 2021 06:51:33 +0000 (23:51 -0700)]
LU-13783 libcfs: add cfs_kallsyms_lookup_name()

The inline kallsyms_lookup_name() added by
commit d7249d9d70a caused the following failures:

libcfs/include/libcfs/linux/linux-misc.h:150:21:
error: conflicting types for ‘kallsyms_lookup_name’
  150 | static inline void *kallsyms_lookup_name(char *func)
      |                     ^~~~~~~~~~~~~~~~~~~~

include/linux/kallsyms.h:76:15:
note: previous declaration of ‘kallsyms_lookup_name’ was here
   76 | unsigned long kallsyms_lookup_name(const char *name);
      |               ^~~~~~~~~~~~~~~~~~~~

This patch removes the inline kallsyms_lookup_name() definition
from linux-misc.h and adds cfs_kallsyms_lookup_name() to wrap
kallsyms_lookup_name() if it is exported, or return NULL if
kallsyms_lookup_name() is not exported.
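
A minimal sketch of such a wrapper follows. It assumes a
configure-style macro named HAVE_KALLSYMS_LOOKUP_NAME, and
symbol_available() is a hypothetical caller added only to show how the
NULL case is handled; neither is taken from the patch itself. With the
macro undefined, the sketch builds in userspace and always takes the
fallback branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed to be set by the build system when the kernel exports it. */
#ifdef HAVE_KALLSYMS_LOOKUP_NAME
unsigned long kallsyms_lookup_name(const char *name);
#endif

/*
 * Distinctly named wrapper, so it cannot conflict with the kernel's
 * own declaration of kallsyms_lookup_name().
 */
static inline void *cfs_kallsyms_lookup_name(const char *name)
{
	assert(name);
#ifdef HAVE_KALLSYMS_LOOKUP_NAME
	return (void *)kallsyms_lookup_name(name);
#else
	return NULL; /* symbol lookup is unavailable on this kernel */
#endif
}

/* Hypothetical caller: must cope with the lookup returning NULL. */
static bool symbol_available(const char *name)
{
	return cfs_kallsyms_lookup_name(name) != NULL;
}
```

Giving the wrapper its own name avoids the "conflicting types" error
quoted above, because nothing redefines the kernel's symbol.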

Lustre-change: https://review.whamcloud.com/43296
Lustre-commit: 783002035ae9612b5b0aa80f2342a2ee9e81c374

Fixes: d7249d9d70a ("LU-13783 libcfs: provide fallback kallsyms_lookup_name()")
Change-Id: I4b2d4499948a8586b48db68484491ec76c3a609d
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43648
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-13783 libcfs: provide fallback kallsyms_lookup_name()
Mr NeilBrown [Tue, 11 May 2021 06:41:37 +0000 (23:41 -0700)]
LU-13783 libcfs: provide fallback kallsyms_lookup_name()

Since Linux 5.7, kallsyms_lookup_name() is no longer exported, so we
cannot rely on it.

So test for this, and when not available provide a fallback which just
returns NULL.

As this was the only way to access apply_workqueue_attrs() in recent
kernels, we need to cope with the absence of that function.

Lustre-change: https://review.whamcloud.com/40826
Lustre-commit: d7249d9d70ac0fcfa665ece78634b495bc9a22cd

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I09cc00047ec163a9395c5acd415505a8586e4e99
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43647
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-13783 libcfs: don't depend on sysctl support for debugfs
Mr NeilBrown [Tue, 11 May 2021 06:38:41 +0000 (23:38 -0700)]
LU-13783 libcfs: don't depend on sysctl support for debugfs

Since Linux v5.8-rc1~55^2~6 sysctl support routines like
proc_dointvec() expect a pointer to kernel-space, not userspace.

So stop using these functions for debugfs files, and instead
provide bespoke functions.

Lustre-change: https://review.whamcloud.com/40832
Lustre-commit: d707b390aec5e95a1aec9910fb3c8248c231cbfb

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I340a748bbfbd066054a73299ce32698aa39a0e2d
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-on: https://review.whamcloud.com/43646
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-13783 libcfs: support __vmalloc with only 2 args.
Mr NeilBrown [Tue, 11 May 2021 06:35:09 +0000 (23:35 -0700)]
LU-13783 libcfs: support __vmalloc with only 2 args.

Since v5.8-rc1~201^2~19 Commit 88dca4ca5a93 ("mm: remove the pgprot
argument to __vmalloc") __vmalloc only takes 2 arguments.

So introduce __ll_vmalloc which takes 2 args, and calls
__vmalloc with correct number of args.

Lustre-change: https://review.whamcloud.com/40328
Lustre-commit: 2a32eaa35dd7b96bb29f6a17991f48fe07fa833e

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I2c89512a12e28b27544a891620e448a9b752b089
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-on: https://review.whamcloud.com/43645
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-13783 libcfs: support removal of kernel_setsockopt()
Mr NeilBrown [Tue, 11 May 2021 06:27:46 +0000 (23:27 -0700)]
LU-13783 libcfs: support removal of kernel_setsockopt()

Linux 5.8 removes kernel_setsockopt() and kernel_getsockopt(), and
provides some helper functions for some accesses that are
not trivial.

This patch adds those helpers to libcfs when they are not available,
and changes (nearly) all calls to kernel_[gs]etsockopt() to
either use direct access or a helper call.

->keepalive() is not available before v4.11-rc1~94^2~43^2~14
and there is no helper function, so for SO_KEEPALIVE we
need to have #ifdef code in the C file.

TCP_BACKOFF* settings are not converted as they are not available in
any upstream kernel, so no conversion is possible.

Also include some minor style fixes and change lnet_sock_setbuf() and
lnet_sock_getbuf() to be 'void' functions.

Lustre-change: https://review.whamcloud.com/39259
Lustre-commit: 99d9638d6c074b48f1c21c5c94d6dfe347eed3ee

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I539cf8d20555ddb3565fa75130fdd3acf709c545
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43644
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-13783 libcfs: switch from ->mmap_sem to mmap_lock()
Mr NeilBrown [Tue, 11 May 2021 06:13:38 +0000 (23:13 -0700)]
LU-13783 libcfs: switch from ->mmap_sem to mmap_lock()

In Linux 5.8, ->mmap_sem is gone and the preferred interface
for locking the mmap is the suite of mmap*lock() functions.

So provide those functions when not available, and use them
as needed in Lustre.

Lustre-change: https://review.whamcloud.com/40288
Lustre-commit: 5309e108582c692f3b60705818fddc4a3b3b1345

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I4ce3959f9e93eae10a7b7db03e2b0a1525723138
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43643
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-13344 libcfs: Abstract proc_fs with proc_ops
Shaun Tancheff [Tue, 11 May 2021 04:22:12 +0000 (21:22 -0700)]
LU-13344 libcfs: Abstract proc_fs with proc_ops

Linux 5.6 introduces proc_ops with v5.5-8862-gd56c0d45f0e2
proc: decouple proc from VFS with "struct proc_ops"

Map proc_ops and its members to file_operations and
the appropriate members for older kernels.

One remaining 'PROC_OWNER()' macro is left to deal with
proc_ops being unable to sensibly map the owner member.
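
The mapping idea can be sketched with the preprocessor as below.
struct file_operations is mocked here so the example builds outside
the kernel, HAVE_PROC_OPS is an assumed configure-style macro, and
demo_open() and demo_proc_ops are hypothetical; the real headers may
differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace mock of the older-kernel structure (illustration only). */
struct file_operations {
	void *owner;
	int (*open)(void *inode, void *file);
	long (*read)(void *file, char *buf, size_t count, long *ppos);
	int (*release)(void *inode, void *file);
};

#ifndef HAVE_PROC_OPS
/* Older kernels: spell proc_ops and its members as file_operations. */
#define proc_ops		file_operations
#define proc_open		open
#define proc_read		read
#define proc_release		release
/* file_operations has an owner member, so keep it. */
#define PROC_OWNER(_owner)	.owner = (_owner),
#else
/* proc_ops has no owner member, so the macro expands to nothing. */
#define PROC_OWNER(_owner)
#endif

static int demo_open(void *inode, void *file)
{
	(void)inode;
	(void)file;
	return 0;
}

/* Written once in the new style; builds against either structure. */
static const struct proc_ops demo_proc_ops = {
	PROC_OWNER(NULL)
	.proc_open = demo_open,
};

static_assert(sizeof(struct proc_ops) == sizeof(struct file_operations),
	      "without HAVE_PROC_OPS, proc_ops is file_operations");
```

Callers are then written against the proc_ops names only, and the
owner field is the one place that still needs a dedicated macro.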

Lustre-change: https://review.whamcloud.com/37873
Lustre-commit: 13cd0f9f667c6e138a8cb235d4920f8b749cb154

Test-Parameters: trivial
HPE-bug-id: LUS-8589
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I3d8940a91b331c4f6bb31a9432194cc082c9cecd
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43642
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-13344 all: Separate debugfs and procfs handling
Shaun Tancheff [Tue, 11 May 2021 04:16:20 +0000 (21:16 -0700)]
LU-13344 all: Separate debugfs and procfs handling

Linux 5.6 introduces proc_ops with v5.5-8862-gd56c0d45f0e2
proc: decouple proc from VFS with "struct proc_ops"

Separate debugfs usage and procfs usage to prepare for the divergence
of debugfs using file_operations and procfs using proc_ops

Lustre-change: https://review.whamcloud.com/37834
Lustre-commit: 76626d6c52b19b5cca04007c4b1656cc52a487c1

HPE-bug-id: LUS-8589
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I1746e563b55a9e89f90ac01843c304fe6b690d8b
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43641
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-13485 libcfs: FIELD_SIZEOF macro removed
Shaun Tancheff [Tue, 11 May 2021 03:59:41 +0000 (20:59 -0700)]
LU-13485 libcfs: FIELD_SIZEOF macro removed

Linux v4.15-rc2-5-g4229a470175b introduced sizeof_field() macro
Linux v5.5-rc4-1-g1f07dcc459d5 removed FIELD_SIZEOF() macro

Provide a sizeof_field() macro in terms of FIELD_SIZEOF()
when sizeof_field() is not provided.
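
A minimal sketch of that fallback is shown below. The FIELD_SIZEOF()
definition is included only so the example builds in userspace, and
struct example_rec is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the older kernel macro; newer kernels do not define it. */
#ifndef FIELD_SIZEOF
#define FIELD_SIZEOF(t, f) (sizeof(((t *)0)->f))
#endif

/* Provide the newer name in terms of the older one when it is missing. */
#ifndef sizeof_field
#define sizeof_field(type, member) FIELD_SIZEOF(type, member)
#endif

/* Hypothetical structure used only to exercise the macro. */
struct example_rec {
	uint32_t id;
	uint64_t offset;
	char name[16];
};

/* sizeof is not evaluated, so the null pointer is never dereferenced. */
static_assert(sizeof_field(struct example_rec, name) == 16,
	      "sizeof_field() reports the size of the member");
```

Code can then use sizeof_field() everywhere, regardless of which of
the two macros the running kernel's headers supply.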

Lustre-change: https://review.whamcloud.com/39710
Lustre-commit: 03b7befcc0a9308cbac91370046f6c00e5cf1005

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I48ca9abb931d58919d788199e5089984c9e854dd
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43640
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-6142 lustre: change super/file/inode operations to const
Mr NeilBrown [Tue, 11 May 2021 03:55:32 +0000 (20:55 -0700)]
LU-6142 lustre: change super/file/inode operations to const

All 'struct file_operations', 'struct inode_operations', 'struct
export_operations' and 'struct super_operations' are changed to
'const'.  This potentially allows them to be placed in read-only
memory, and ensures they are never changed.

Lustre-change: https://review.whamcloud.com/39394
Lustre-commit: 140b9e6d736a8c11d660094fc11ee61a89264b13

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I8b236f0248eca11f91f11da02fe18be3f6d2e17c
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43639
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-6142 lustre: make various 'struct file_operations' static
Mr NeilBrown [Tue, 11 May 2021 03:47:57 +0000 (20:47 -0700)]
LU-6142 lustre: make various 'struct file_operations' static

These 'struct file_operations' are only used locally, so make them
static.
Except lprocfs_evict_client_fops() which isn't used at all and doesn't
exist, so discard the declaration.

Lustre-change: https://review.whamcloud.com/39741
Lustre-commit: 950200a21fb0636c53eefc9b6337bf1d10ad121e

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ib6c51683c1e765db202b3f72d2accebe17191303
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43638
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-930 misc: limit CDEBUG console message frequency
Andreas Dilger [Sat, 7 Nov 2020 07:53:28 +0000 (00:53 -0700)]
LU-930 misc: limit CDEBUG console message frequency

Some CDEBUG() messages have variable message levels, but when printed
to the console they are not rate limited like CWARN() and CERROR():

 server_bulk_callback()) event type 5, status -110
 server_bulk_callback()) event type 5, status -110
 server_bulk_callback()) event type 5, status -110
 :

Instead, use CDEBUG_LIMIT() for those messages to limit them.
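
The effect of limiting a repeated console message can be sketched as
follows. This is a generic rate limiter in Python for illustration only,
not the CDEBUG_LIMIT() implementation; the class name RateLimitedLog and
the fixed min_interval are hypothetical.

```python
import time


class RateLimitedLog:
    """Print a message at most once per `min_interval` seconds (hypothetical helper)."""

    def __init__(self, min_interval=1.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self.last_printed = None  # time of the last message that was let through
        self.skipped = 0          # messages suppressed since then

    def log(self, message):
        """Return the text that was printed, or None if it was suppressed."""
        now = self.clock()
        if (self.last_printed is not None
                and now - self.last_printed < self.min_interval):
            self.skipped += 1
            return None
        text = message
        if self.skipped:
            text += " (%d similar messages skipped)" % self.skipped
        self.last_printed = now
        self.skipped = 0
        print(text)
        return text
```

Calling log() in a tight loop with the same "event type 5, status -110"
text prints the first occurrence and then at most one line per interval,
with a count of what was dropped.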

Lustre-change: https://review.whamcloud.com/40571
Lustre-commit: 7462e8cad730897f459da31886c57585654f26b8

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9081398c7d014b2873e764dc283ce2f4623ebbe5
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43400
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoEX-3080 pcc: avoid dead lock for auto attach in PCC-RO
Qian Yingjin [Wed, 12 May 2021 03:43:28 +0000 (11:43 +0800)]
EX-3080 pcc: avoid dead lock for auto attach in PCC-RO

This patch releases the pcc inode lock when calling
ll_layout_refresh() in pcc_try_auto_attach(), as holding it may cause
the following deadlock:
1. The client is writing or truncating a file in readonly mode.
   At this time, it will send a write layout intent lock to clear
   the readonly state on the layout on MDT.
2. A read process tries to auto attach the file with the pcc inode
   lock held. In the course of auto attach, it will call
   ll_layout_refresh(). If the client-side enqueue request for a
   layout lock returns a blocked lock, it will sleep and wait for
   the lock to be granted.
3. MDT will take the EX layout lock to cancel all cached layout locks
   on the client, in order to change the layout and clear the PCC-RO
   state.
4. When the client handles the revocation of the layout lock, it needs
   to invalidate the PCC state, which must be done under the
   protection of the pcc inode lock.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I18890d19d03726a5991c923505e8c5363382fdc2
Reviewed-on: https://review.whamcloud.com/43668
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoEX-3124 build: liblnetconfig.so.4 is needed by liblustreapi.so
Minh Diep [Wed, 12 May 2021 05:58:28 +0000 (22:58 -0700)]
EX-3124 build: liblnetconfig.so.4 is needed by liblustreapi.so

Need to include llnetconfig
libssh >= 0.8.0 does not provide libssh_thread.so anymore

Change-Id: Ia3884dd1c45712c099ab1e03739f6ba684c11ae1
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43670
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
4 years agoEX-3144 pcc: revalidate the pointer after attach
Yang Sheng [Tue, 11 May 2021 16:57:47 +0000 (00:57 +0800)]
EX-3144 pcc: revalidate the pointer after attach

We need to refresh the pointer again since the lock may
be released in pcc_try_readonly_open_attach().

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I470358dfde525e08e7110e862b30b527e5db94fe
Reviewed-on: https://review.whamcloud.com/43662
Reviewed-by: Yingjin Qian <qian@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoEX-3133 pcc: keep PCC copy when it is being attached
Qian Yingjin [Tue, 11 May 2021 03:38:36 +0000 (11:38 +0800)]
EX-3133 pcc: keep PCC copy when it is being attached

When detaching a file from the PCC backend via FID, if the file is
being attached, do not purge the corresponding PCC copy from the
PCC backend. Just keep the PCC copy so the attach process can finish.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I8a8f7c6986d51eaf9b2516e5dd5a6f21aa38b7db
Reviewed-on: https://review.whamcloud.com/43637
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoEX-2861 pcc: don't reopen mountpoint for each cache file
Qian Yingjin [Fri, 19 Mar 2021 08:45:26 +0000 (16:45 +0800)]
EX-2861 pcc: don't reopen mountpoint for each cache file

When scanning and processing files in the PCC cache filesystem
(e.g. in llapi_pcc_scan_detach()), the code was looking up the Lustre
mountpoint and reopening it for every file processed.

This patch changes it to open the Lustre mountpoint only once,
then reuse the file handle for all of the later calls. The file
handle is closed when processing has finished.

This patch also switches to llapi_fid_parse() to get the FID from
a given string.
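
The "open once, reuse the handle" pattern can be sketched as below. This
is a Python illustration under stated assumptions, not the liblustreapi
code: process_cache_files() and the handle_one callback are hypothetical
names, and a plain directory stands in for the Lustre mountpoint.

```python
import os


def process_cache_files(mountpoint, names, handle_one):
    """Open `mountpoint` once and pass the same descriptor to every call.

    `handle_one(fd, name)` is a hypothetical per-file callback that
    receives the shared directory descriptor instead of reopening it.
    """
    fd = os.open(mountpoint, os.O_RDONLY | os.O_DIRECTORY)
    try:
        return [handle_one(fd, name) for name in names]
    finally:
        # Closed exactly once, after all files have been processed.
        os.close(fd)
```

The try/finally keeps the descriptor from leaking if one of the per-file
calls raises.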

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Iad92c216262296096e30ca4a4c6b2765dfd3afaa
Reviewed-on: https://review.whamcloud.com/42107
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoEX-2872 pcc: mtime rule for 'lctl pcc add'
Andreas Dilger [Sat, 20 Mar 2021 11:14:04 +0000 (05:14 -0600)]
EX-2872 pcc: mtime rule for 'lctl pcc add'

Add an "mtime>N" rule so that PCC-RO auto-attach only applies to files
that were created or modified more than N seconds ago, and more
recently modified files are skipped.  Otherwise, files may be added to
the PCC cache before they have finished being written, or when they
will be modified again quickly after creation.
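
A minimal sketch of such an age check, reading "mtime>N" as matching
files whose last modification is more than N seconds old. The function
names are hypothetical and this is not the rule parser used by
'lctl pcc add'.

```python
import os
import time


def mtime_rule_matches(mtime, threshold_seconds, now=None):
    """Return True if the file was last modified more than `threshold_seconds` ago."""
    if now is None:
        now = time.time()
    return (now - mtime) > threshold_seconds


def should_auto_attach(path, threshold_seconds):
    """Apply the age check to a real file using its st_mtime."""
    return mtime_rule_matches(os.stat(path).st_mtime, threshold_seconds)
```

With a threshold of 30, a file written 100 seconds ago matches, while a
file written 10 seconds ago does not and is left alone for now.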

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ibb99bff5b483717ae6e5b83f82f1bcd86c3ebbe5
Reviewed-on: https://review.whamcloud.com/42122
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoEX-2860 pcc: test interoperability with 2.14.0
Qian Yingjin [Thu, 25 Mar 2021 02:44:16 +0000 (10:44 +0800)]
EX-2860 pcc: test interoperability with 2.14.0

Against Lustre 2.14.0 servers, many of the subtests that are
PCC-RO specific fail.
In this patch, each subtest related to PCC-RO adds a connect
flag check and is skipped when run against old servers without
PCC-RO support.

Test-Parameters: serverversion=2.14 testlist=sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ie4fc41b2dc51a038027009fbcc6e86f9d61cd54f
Reviewed-on: https://review.whamcloud.com/43104
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoEX-2873 pcc: don't fallback sync attach for EINPROGRESS error
Qian Yingjin [Wed, 31 Mar 2021 09:44:06 +0000 (17:44 +0800)]
EX-2873 pcc: don't fallback sync attach for EINPROGRESS error

When a file is being attached read-only into the PCC backend in the
background by a thread in asynchronous mode, other threads trying to
open attach the same file will get the -EINPROGRESS error code. This
error should be tolerated instead of falling back to synchronous
attach mode.

For asynchronous open attach, the Lustre file handle can not be
reused directly for data copy when the file is opened for read,
as the file position in the file handle can not be shared by the
user thread and the asynchronous attach thread running in the kernel
in the background. The file needs to be reopened without the O_DIRECT
flag, and the new Lustre file handle used to copy data from Lustre
OSTs to the PCC copy.

i_size_read(inode) without a stat() call sometimes returns zero
instead of the actual file size, which may result in the wrong
open attach action. It is also not known whether the lazysize is
always going to be set. Thus, this patch uses max(lazysize,
i_size_read(inode)) to determine whether to do open attach in the
background asynchronously.
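
The size selection can be illustrated as follows. This sketch assumes
that asynchronous attach is chosen for files at or above some size
threshold; that threshold, and the names attach_size() and
use_async_attach(), are hypothetical and not taken from the patch.

```python
def attach_size(lazysize, i_size):
    """Use the larger of the two size sources, since either may be zero."""
    return max(lazysize, i_size)


def use_async_attach(lazysize, i_size, async_threshold):
    """Decide on background attach from the combined size estimate.

    `async_threshold` is an assumed tunable, in bytes.
    """
    return attach_size(lazysize, i_size) >= async_threshold
```

If i_size reads as 0 but lazysize is set, the larger value still drives
the decision, and the same holds the other way round.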

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I80b88a8ba05af4af45433ba9be5b87854e116b10
Reviewed-on: https://review.whamcloud.com/43180
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-14597 flr: allow multiple primary mirrors
Bobi Jam [Fri, 9 Apr 2021 04:53:07 +0000 (12:53 +0800)]
LU-14597 flr: allow multiple primary mirrors

Users can set "prefer" flag on any mirror/component, so the IO should
not report error if multiple mirrors are encountered.

Rename lod_mirror_entry::lme_primary to lme_prefer to avoid confusion.

Lustre-change: https://review.whamcloud.com/43247
Lustre-commit: 93258b9d93611e75b79c30f3ddfc2c9c21f25917

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I45748e56e38985a0d9028792ba3d976a4e03efb8
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43535
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
4 years agoLU-14468 utils: improve 'lfs rmfid' error messages
John L. Hammond [Tue, 23 Feb 2021 15:40:08 +0000 (09:40 -0600)]
LU-14468 utils: improve 'lfs rmfid' error messages

In lfs_rmfid_and_show_errors(), convert the error messages printed by
'lfs rmfid' from the format
  rmfid([0x20001a9f5:0x159:0x0]): rc = -39
to
  lfs rmfid: cannot remove [0x20001a9f5:0x155:0x0]: Directory not empty

Simplify the logic and swap rc and rc2 to follow conventions.
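
The new message format can be reproduced with a short sketch. This is a
Python illustration of the formatting only, not the C code in
lfs_rmfid_and_show_errors(); the helper name rmfid_error_message() is
hypothetical.

```python
import errno
import os


def rmfid_error_message(fid, rc):
    """Format a negative errno-style `rc` for `fid` in the new message style."""
    return "lfs rmfid: cannot remove %s: %s" % (fid, os.strerror(-rc))
```

On Linux, where ENOTEMPTY is 39, passing rc = -39 yields the
"Directory not empty" text shown above instead of the bare number.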

Lustre-commit: 6560ae08a788b3779118640837f68b499a99ee8c
Lustre-change: https://review.whamcloud.com/41727

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Iccd9e1054ed8842fc4f65dd601077cfdeaa1320c
Reviewed-on: https://review.whamcloud.com/41727
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43452
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-14550 libcfs: fix setting of debug_path
Andreas Dilger [Thu, 25 Mar 2021 06:39:07 +0000 (00:39 -0600)]
LU-14550 libcfs: fix setting of debug_path

While it was possible to set "lctl set_param debug_path=path" or
"echo path > /sys/module/libcfs/parameters/libcfs_debug_file_path"
this change did not affect the path used to dump debug logs.

Connect these parameters to the pathname used for the debug log.

Lustre-commit: f7392c7c4a16bc1127ee448f937ba81c50dcdfd5
Lustre-change: https://review.whamcloud.com/43109

Test-Parameters: testlist=sanity env=ONLY=60f,ONLY_REPEAT=30
Fixes: 7092309f325 ("LU-8066 libcfs: migrate to debugfs")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic18b5b24d1ac939c09637e66a342f5e3622367c3
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43450
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-13730 lod: don't confuse stale with primary flag
Alex Zhuravlev [Thu, 11 Mar 2021 05:47:34 +0000 (08:47 +0300)]
LU-13730 lod: don't confuse stale with primary flag

There can be a few in-sync replicas which are not primary.

Lustre-commit: 571f3cf1115973d0fdaf6d5244bfeee230b52989
Lustre-change: https://review.whamcloud.com/42003

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I8b984463a2665bc88f2f76247df5366a68d74ea6
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43448
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-13073 osp: don't block waiting for new objects
Alex Zhuravlev [Fri, 16 Oct 2020 16:09:04 +0000 (19:09 +0300)]
LU-13073 osp: don't block waiting for new objects

If an OST is down, it's possible that a few threads trying
to get an already precreated object will get stuck. Even worse,
all QoS-based allocations are then serialized by a single
semaphore, even those that wouldn't try to allocate on the
failed OST.

The patch introduces a noblock flag in the allocation hint
which is passed to OSP. The QoS code then tries to allocate
objects in a non-blocking manner.

Lustre-commit: 2112ccb3c48ccf86aaf2a61c9f040571a6323f9c
Lustre-change: https://review.whamcloud.com/40274

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I38e66d7569aefecf800dbc32f1049ac87853439e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/43148
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-14588 o2ib: make config script aware of the ofed symbols
Serguei Smirnov [Tue, 6 Apr 2021 22:54:01 +0000 (15:54 -0700)]
LU-14588 o2ib: make config script aware of the ofed symbols

LNet o2ib configuration script needs to be aware of the external
ofed dkms symbols when testing for availability of o2ib features
by building "conftest" kernel objects. If this is not done,
symbols from the core kernel are used by default which is
different from what is used when actually building LNet,
at least on Ubuntu. This patch adds the check for external symbols.

Lustre-commit: bcc5d784826d2d7a8eece28e96fab8b0fa02ab17
Lustre-change: https://review.whamcloud.com/43223

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Iea566f8a3feb86b8bef2f4501a3abc968d76451a
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43459
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-14506 hsm: correct default stripe offset in import
John L. Hammond [Wed, 10 Mar 2021 15:20:29 +0000 (09:20 -0600)]
LU-14506 hsm: correct default stripe offset in import

In lhsmtool_posix, when calling llapi_hsm_import(), pass a stripe
offset of -1 rather than 0 to select the default. Add sanity-hsm
test_11c() to check that a file may be imported to a directory with a
default striping specifying a pool that does not include OST0000.

Lustre-commit: ea964031d7bdc6f31fccb7f136591b682eb35087
Lustre-change: https://review.whamcloud.com/41978

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I40636c0620b2f9314eb13bf23a8cf6d02990f851
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/43457
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoRM-620 build: New tag 2.14.0-ddn3
Andreas Dilger [Wed, 5 May 2021 04:14:41 +0000 (22:14 -0600)]
RM-620 build: New tag 2.14.0-ddn3

New tag 2.14.0-ddn3

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia872cc5544e97a281a1854b138aae19acb3ebbe5

4 years agoLU-14405 mdt: read LMV with mdt_stripe_get()
Lai Siyao [Tue, 9 Feb 2021 14:09:09 +0000 (22:09 +0800)]
LU-14405 mdt: read LMV with mdt_stripe_get()

mdt_path_current() reads LMV into mdt_thread_info.mti_xattr_buf,
whose size is static, and will return -ERANGE if LMV contains too
many stripes. Instead it should call mdt_stripe_get(), which
allocates memory for the LMV dynamically.

Lustre-change: https://review.whamcloud.com/41452
Lustre-commit: 9dbfa36d3dd2434cfcffa13f76beb89fa3516586

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I1ed78f7a7f951fa5984e604a8773143a70b419e7
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/41966
Tested-by: jenkins <devops@whamcloud.com>
4 years agoLU-13440 utils: fix handling of lsa_stripe_off -1
Andreas Dilger [Tue, 4 May 2021 01:25:23 +0000 (19:25 -0600)]
LU-13440 utils: fix handling of lsa_stripe_off -1

Use LMV_OFFSET_DEFAULT instead of "-1" for parsing lfs_setdirstripe()
since parse_targets() will return "(__u32)-1" to the caller for the
stripe index, but lsa_stripe_off is a signed long long so it is
interpreted as 4294967295.  This causes the parsing to fail when
"lfs setdirstripe -i -1 --max-inherit-rr 1" is used.

Update sanity test_413a/413c to also specify "-i -1" to verify this.
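
The width mismatch can be demonstrated with ctypes. This is a Python
illustration of the integer conversion only, not the lfs code; it
assumes LMV_OFFSET_DEFAULT equals (__u32)-1 as the message implies, and
the helper names are hypothetical.

```python
import ctypes

# Assumed value: (__u32)-1, i.e. all 32 bits set.
LMV_OFFSET_DEFAULT = ctypes.c_uint32(-1).value


def store_stripe_index(index):
    """Mimic returning a __u32 and storing it in a signed 64-bit field."""
    as_u32 = ctypes.c_uint32(index).value
    return ctypes.c_longlong(as_u32).value


def is_default_offset(lsa_stripe_off):
    """Compare against the unsigned constant rather than the literal -1."""
    return lsa_stripe_off == LMV_OFFSET_DEFAULT
```

Storing -1 this way gives 4294967295, so a comparison with -1 fails
while a comparison with the unsigned constant succeeds.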

Lustre-change: https://review.whamcloud.com/43530
Lustre-commit: TBD (from 792fa045a1975a1a18af0d72470134e5bf997d6a)

Fixes: 01d34a6b3b2e ("LU-13440 lmv: add default LMV inherit depth")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic934f859173155b1b2df56fcd315c8da633ebbe5
Reviewed-on: https://review.whamcloud.com/43524
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-13439 lmv: qos stay on current MDT if less full
Andreas Dilger [Sun, 25 Apr 2021 11:02:19 +0000 (05:02 -0600)]
LU-13439 lmv: qos stay on current MDT if less full

Keep "space balanced" subdirectories on the parent MDT if it is less
full than average, since it doesn't make sense to select another MDT
which may occasionally be *more* full.  This also reduces random
"MDT jumping" and needless remote directories.

Reduce the QOS threshold for space balanced LMV layouts, so that the
MDTs don't become too imbalanced before trying to fix the problem.

Change the LUSTRE_OP_MKDIR opcode to be 1 instead of 0, so it can
be seen that a valid opcode has been stored into the structure.
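
The "stay on the parent MDT if it is less full than average" rule above
can be sketched as a simple comparison. This sketch assumes fullness is
expressed as a used fraction per MDT; the function name and that
representation are hypothetical and not the LMV QoS code.

```python
def stay_on_parent_mdt(parent_index, used_fraction):
    """Return True if the parent MDT is less full than the average MDT.

    `used_fraction` is a list of per-MDT fullness values in [0.0, 1.0],
    an assumed representation chosen for this sketch.
    """
    average = sum(used_fraction) / len(used_fraction)
    return used_fraction[parent_index] < average
```

With MDTs at 20%, 50% and 80% full, a parent on the first MDT keeps its
new subdirectory locally, while a parent on the third does not.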

Lustre-change: https://review.whamcloud.com/43445
Lustre-commit: 3f6fc483013da443b1494d81efe2d271ac67f901

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iab34c7eade03d761aa16b08f409f7e5d69cd70bd
Reviewed-on: https://review.whamcloud.com/43431
Tested-by: jenkins <devops@whamcloud.com>
4 years agoLU-13440 lmv: add default LMV inherit depth
Lai Siyao [Mon, 15 Mar 2021 03:57:36 +0000 (11:57 +0800)]
LU-13440 lmv: add default LMV inherit depth

A new field "__u8 lum_max_inherit" is added into struct lmv_user_md,
which represents the inherit depth of default LMV. It will be
decreased by 1 for subdirectories.

The valid value of lum_max_inherit is [0, 255]:
* 0 means unlimited inherit.
* 1 means inherit end.
* 250 is the max inherit depth.
* [251, 254] are reserved.
* 255 means it's not set.

A new field "__u8 lum_max_inherit_rr" is added. If the default stripe
offset is -1, lum_max_inherit_rr is non-zero, and the system is
balanced, new directories are created in a round-robin manner.
Otherwise they are created on the MDT where their parents are located,
to avoid creating remote directories. Similarly, this value will be
decreased by 1 for each level of subdirectories.

The valid value of lum_max_inherit_rr is different:
* 0 means not set.
* 1 means inherit end.
* 250 is the max inherit depth.
* [251, 254] are reserved.
* 255 means unlimited inherit.

However for the user interface of "lfs", the valid value is [-1, 250]:
* -1 means unlimited inherit.
* 0 means not set.
* others are the same.
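
The mapping from the "lfs" value to the two stored fields, as listed
above, can be written out as a sketch. The function names are
hypothetical and this is not the lfs source; only the numeric values
come from the lists above.

```python
def lfs_to_max_inherit(user_value):
    """Map an lfs value in [-1, 250] to the stored lum_max_inherit byte."""
    if not -1 <= user_value <= 250:
        raise ValueError("valid lfs values are -1..250")
    if user_value == -1:
        return 0      # unlimited inherit
    if user_value == 0:
        return 255    # not set
    return user_value  # 1 = inherit end, up to 250 = max depth


def lfs_to_max_inherit_rr(user_value):
    """Map an lfs value in [-1, 250] to the stored lum_max_inherit_rr byte."""
    if not -1 <= user_value <= 250:
        raise ValueError("valid lfs values are -1..250")
    if user_value == -1:
        return 255    # unlimited inherit
    return user_value  # 0 = not set, 1 = inherit end, up to 250
```

The two fields swap the meaning of 0 and 255, which is why the user
interface hides both behind -1 and 0.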

Add sanity 413c.

Lustre-change: https://review.whamcloud.com/43131
Lustre-commit: 01d34a6b3b2e34f7414f627e4f87993322dafa78

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I98ccad8556a0469f83bd7d79f5086a2184d5b115
Reviewed-on: https://review.whamcloud.com/43429
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-14366 mdt: lfs mkdir should return -EEXIST if exists
Lai Siyao [Sat, 23 Jan 2021 10:28:26 +0000 (18:28 +0800)]
LU-14366 mdt: lfs mkdir should return -EEXIST if exists

'lfs setdirstripe' will try to restripe if the target exists. However,
it's confusing to get -ENOTSUPP or -EALREADY for 'lfs mkdir', since
the latter invokes the same function as 'lfs setdirstripe'.

Pack the MDS_OPEN_CREAT flag in the request for 'lfs mkdir', and MDT
won't try to restripe if it's set.

Add sanity 230s.

Lustre-change: https://review.whamcloud.com/41329
Lustre-commit: 65e3e4050ec5bb371c1c343fca49a605286a086e

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I7b7ed04ee0b150253ff4d13bbdf1fe847d8f577c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/43428
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-13440 obdclass: server qos penalty miscaculated
Lai Siyao [Wed, 21 Apr 2021 12:05:52 +0000 (20:05 +0800)]
LU-13440 obdclass: server qos penalty miscaculated

The server qos penalty calculation uses the active target count, but
it should use the server count. This makes the penalty larger than
expected, so the weights of targets are often 0, and as a result MDT0
is often chosen in qos allocation.

Lustre-change: https://review.whamcloud.com/43385
Lustre-commit: 0ccce7ecb72f847f4235a513424d90119edad7ca

Fixes: 45222b2ef ("LU-12624 obdclass: lu_tgt_descs cleanup")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I1982363e4ff74c7344dd5e07d04e29214afa8a7f
Reviewed-on: https://review.whamcloud.com/43399
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-13212 osc: fall back to vmalloc for large RPCs
Andreas Dilger [Mon, 12 Apr 2021 18:53:07 +0000 (12:53 -0600)]
LU-13212 osc: fall back to vmalloc for large RPCs

For large RPC sizes (16MB+) the page array (4096 brw_page) can
become very large (128KB+ with fscrypt) and should fall back to
vmalloc() if kmalloc() fails due to memory fragmentation.

The mdc/mdt allocations are currently limited to 1MB for readdir
RPCs, but it doesn't hurt to prepare them for larger RPCs from
clients in the future if this limit is increased.

Lustre-commit: 037a9e2cf6d5b8d6fdbcde02c1c22e22272c5c07
Lustre-change: https://review.whamcloud.com/43281

Fixes: 51b32ac2b9b8 ("LU-7990 rpc: increase bulk size")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I56805f5701d6850412664ce0681a1456b9405580
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43460
Tested-by: jenkins <devops@whamcloud.com>
4 years agoEX-2992 tests: add sleep to verify lamigo and lpurge params
Jian Yu [Tue, 4 May 2021 06:23:36 +0000 (23:23 -0700)]
EX-2992 tests: add sleep to verify lamigo and lpurge params

This patch improves verify_one_lamigo_param() and
verify_one_lpurge_param() in hot-pools.sh to try
more times while verifying lamigo and lpurge params
in case there is a latency time for the param(s)
to be updated.
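
The retry-with-sleep idea can be sketched as below. The real helpers
are shell functions in hot-pools.sh; this Python version only
illustrates the pattern, and wait_for_param() with its attempts and
delay arguments is hypothetical.

```python
import time


def wait_for_param(read_param, expected, attempts=5, delay=1.0, sleep=time.sleep):
    """Poll `read_param()` until it returns `expected` or attempts run out."""
    for attempt in range(attempts):
        if read_param() == expected:
            return True
        # Only sleep if another attempt will follow.
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A parameter that only shows its new value on the third read is accepted
after two sleeps instead of failing on the first check.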

Test-Parameters: trivial testlist=hot-pools \
env=HOT_POOLS_EXCEPT="56"
Test-Parameters: trivial clientdistro=el8.3 \
testlist=hot-pools env=HOT_POOLS_EXCEPT="56"
Test-Parameters: trivial clientdistro=sles15sp2 \
testlist=hot-pools env=HOT_POOLS_EXCEPT="56"
Test-Parameters: trivial testgroup=review-dne-part-2 \
env=SANITY_LFSCK_EXCEPT="30",HOT_POOLS_EXCEPT="56"

Lustre-change: https://review.whamcloud.com/43256
Lustre-commit: b465db8b9b99e217d175f31230d04e10a9a17906

Change-Id: I0f4818baa1c2cd87920ff3189461b45b53871e90
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43529
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoEX-2718 tests: remove lpurge mds validation
John L. Hammond [Tue, 4 May 2021 06:04:05 +0000 (23:04 -0700)]
EX-2718 tests: remove lpurge mds validation

The lpurge changes for EX-2718 (use local mountpoint
for purge operations) landed with commit dfa0760e7d8.
However, the hot-pools.sh changes were missing somehow.
This patch adds the changes back to remove lpurge
mds validation.

Lustre-change: https://review.whamcloud.com/43140
Lustre-commit: 7b00329e09ef73335e33dd8e83bb7993c39990e9

Fixes: dfa0760e7d8 ("EX-2718 lpurge: use local mountpoint for purge operations")
Change-Id: I55a40dfd958e40859ddb5e98b3f76ad568d0095b
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43528
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoEX-3108 build: update kernel to -ddn13
Andreas Dilger [Tue, 4 May 2021 15:09:35 +0000 (09:09 -0600)]
EX-3108 build: update kernel to -ddn13

Update the kernel version to -ddn13 to match the version used
on b_es5_2 so that it is possible to just upgrade the Lustre
RPMs when moving from EXA5.2.2 to EXA6.0.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8845fa2c797769b94971e60dc92cdfb2c79bb570
Reviewed-on: https://review.whamcloud.com/43534
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: Minh Diep <mdiep@whamcloud.com>
4 years agoEX-1135 lipe: add support to build against centos8
Gu Zheng [Fri, 8 May 2020 07:16:42 +0000 (03:16 -0400)]
EX-1135 lipe: add support to build against centos8

There's a huge difference between the CentOS 8 and CentOS 7 series,
especially the strict distinction between python2 and python3, which
also applies to the related python rpms and pypi packages.
The following changes are introduced to add support for building
against CentOS 8:
1. restrict the python platform to python2 (python2.7)
2. use 'pip2' instead of 'pip' for pypi
3. improve the dependency package list (rpm and pypi modules) so that
it is acceptable to both CentOS 7.x and CentOS 8.x
4. fix code style issues to make pylint/pep8 on CentOS 8 happy
5. set encoding via environ "PYTHONIOENCODING" if sys.setdefaultencoding
is gone (python2.7 on CentOS 8)
6. improve the lipe.spec to support "make rpms" against CentOS 8

Change-Id: Id172e2a6aa29f382c4d12ff0d2e748e8b0cde444
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Reviewed-on: https://review.whamcloud.com/43483
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Minh Diep <mdiep@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Gaurang Tapase <gtapase@ddn.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
4 years agoEX-3082 lipe: posix scan cannot get projid
Lei Feng [Mon, 26 Apr 2021 08:28:57 +0000 (16:28 +0800)]
EX-3082 lipe: posix scan cannot get projid

Only regular files and directories can have a projid. So if an entry
is neither a regular file nor a directory, set its projid to 0.
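
The check can be sketched with the standard file type tests. This is a
Python illustration, not the lipe scanner code; effective_projid() is a
hypothetical name.

```python
import stat


def effective_projid(st_mode, projid):
    """Keep `projid` for regular files and directories, else report 0."""
    if stat.S_ISREG(st_mode) or stat.S_ISDIR(st_mode):
        return projid
    return 0
```

Symlinks, FIFOs and device nodes therefore always report projid 0,
whatever value was read for them.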

Change-Id: Id9e7dd471513817ac1cb9d146563b369f9ebe2eb
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/43447
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>