Whamcloud - gitweb
fs/lustre-release.git
5 years ago New RC 2.10.7-RC2 2.10.7-RC2 v2_10_7-RC2
Andreas Dilger [Thu, 21 Mar 2019 21:58:37 +0000 (15:58 -0600)]
New RC 2.10.7-RC2

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
5 years ago LU-12073 tests: fix conf-sanity test_123 startup 28/34428/2
Andreas Dilger [Thu, 14 Mar 2019 22:19:52 +0000 (16:19 -0600)]
LU-12073 tests: fix conf-sanity test_123 startup

The startup of conf-sanity test_123aa() et al. shouldn't try to use
all of the OSTs if they have not been formatted into the filesystem
by the previous tests.  It may run after test_109b() in a non-DNE
config, which is only using "startup" and "cleanup", so it should
do the same.

Test-Parameters: trivial testlist=conf-sanity
Test-Parameters: testgroup=review-dne-zfs-part-1 testlist=conf-sanity
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ibf76895ec51c881df0df257ccd680af3213ebbe5
Reviewed-on: https://review.whamcloud.com/34428
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
5 years ago LU-12065 lnd: increase CQ entries 27/34427/6
Amir Shehata [Wed, 20 Mar 2019 18:10:34 +0000 (11:10 -0700)]
LU-12065 lnd: increase CQ entries

Several sites have reported RDMA timeouts. Most of the timeouts
are occurring for transmits on the active_tx queue. Transmits are
placed on the active_tx queue until a completion is received. If
there aren't enough CQ entries available, it's possible for
completion events to be delayed, causing these timeouts.

Lustre-change: https://review.whamcloud.com/34473
Lustre-commit: bf3fc7f1a7bf82c02181c64810bc3f93b3303703

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I9edad734b5860ce20af4977b4c1cdc07f25f078e
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/34427
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years ago LU-10070 tests: Fix replay-single test_85b 06/34406/3
Patrick Farrell [Tue, 4 Dec 2018 21:16:50 +0000 (15:16 -0600)]
LU-10070 tests: Fix replay-single test_85b

test_85b of replay-single sets a default striping on $DIR
and does not remove it.  This makes it impossible to
correctly test self-extending layouts, so fix this first.

This patch is back-ported from:
Lustre-commit: 0b9fb772e68db7cbf0c8a755092c1d8b5de6b83d
Lustre-change: https://review.whamcloud.com/33777

Cray-bug-id: LUS-2528
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I0057c8403e3dae2437cf0c8810af8086e2971c35
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-on: https://review.whamcloud.com/34406
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago New RC 2.10.7-RC1 2.10.7-RC1 v2_10_7-RC1
Oleg Drokin [Fri, 8 Mar 2019 06:02:35 +0000 (01:02 -0500)]
New RC 2.10.7-RC1

Change-Id: I87e4b30c7552468bf21408054e7ad9c9e55aa7c8
Signed-off-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11652 ldiskfs: remove ext4-mmp-dont-mark-bh-dirty.patch 63/34363/3
Minh Diep [Sat, 2 Mar 2019 02:31:32 +0000 (18:31 -0800)]
LU-11652 ldiskfs: remove ext4-mmp-dont-mark-bh-dirty.patch

Remove ext4-mmp-dont-mark-bh-dirty.patch because this
fix is already included in the SLES12 SP3 kernel update.

Test-Parameters: trivial clientdistro=sles12sp3 ossdistro=sles12sp3 mdsdistro=sles12sp3

Fixes: e1445f24c5a0 ("LU-11652 kernel: kernel update [SLES12 SP3 4.4.162-94.69]")
Change-Id: I4a8e737b58587e4a40b512fa3171e1c46ae6b6f6
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34363
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
5 years ago LU-12015 build: kernel update Ubuntu16.04 4.4.0-142-generic 20/34320/5
Minh Diep [Mon, 25 Feb 2019 20:12:25 +0000 (12:12 -0800)]
LU-12015 build: kernel update Ubuntu16.04 4.4.0-142-generic

Update Ubuntu16.04 build to 4.4.0-142-generic.

Change-Id: I98ffe73b4f6ac7610f6b24560d8487c95e6ac168
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34320
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
5 years ago LU-10900 osd: wrong assertion in osd_transfer_project 93/33993/7
Li Xi [Wed, 11 Apr 2018 10:10:38 +0000 (06:10 -0400)]
LU-10900 osd: wrong assertion in osd_transfer_project

When the project ID feature is not enabled on ldiskfs, the project
ID of any inode should be zero. osd_transfer_project() asserted
the opposite.

Lustre-change: https://review.whamcloud.com/31947
Lustre-commit: ee9a90eafe173e22ad3c60a407a28f082fc95341

Fixes: LU-10565 osd: move ext4_tranfer_project to osd
Test-Parameters: trivial testlist=sanity-quota testgroup=review-ldiskfs \
mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs
Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I8c065b9453e0e2b3f9f26e39fc82e8e73902df91
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33993
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years ago LU-11943 llog: Reset current log on ENOSPC 46/34346/4
Patrick Farrell [Thu, 28 Feb 2019 17:56:41 +0000 (12:56 -0500)]
LU-11943 llog: Reset current log on ENOSPC

The original LU-10527 patch:
"LU-10527 obdclass: don't recycle loghandle upon ENOSPC"
https://review.whamcloud.com/33850

Kept the current log on ENOSPC.

This appears to cause llog corruption on failover, and the
other part of the original patch (removing an incorrect
assert) should be sufficient to fix the original issue.

Fixes: 51e962be60cf ("LU-10527 obdclass: don't recycle loghandle upon ENOSPC")

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I5b9de69b8e737ca540e9a5af9ff6d1181213a0eb
Reviewed-on: https://review.whamcloud.com/34346
Tested-by: Jenkins
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-9810 lnet: fix build with M-OFED 4.1 22/34322/2
Alexey Lyashkov [Mon, 4 Sep 2017 14:25:54 +0000 (17:25 +0300)]
LU-9810 lnet: fix build with M-OFED 4.1

Add the uapi path to the include paths so that the build succeeds.

Lustre-change: https://review.whamcloud.com/28277
Lustre-commit: 344b6fd6934b30665e7ea172b5793c3f4f5adc57

Seagate-bug-id: MRP-4508
Test-Parameters: trivial
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@seagate.com>
Change-Id: If9c61a303de24c78261a7b6fdafec77f52efa0d3
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34322
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-7558 ptlrpc: connect vs import invalidate race 90/34290/5
Andriy Skulysh [Wed, 22 Aug 2018 19:22:49 +0000 (22:22 +0300)]
LU-7558 ptlrpc: connect vs import invalidate race

A connect request can't be sent while import invalidation is
in progress, which leaves the import in an uninitialized
state.

Don't allow reconnect in the evicted state.

Lustre-change: https://review.whamcloud.com/33718
Lustre-commit: b1827ff1da829ae5f320a417217757221eedda5f

Change-Id: I79a1a1eb05fede30e100ba09b6f3f98636a46213
Cray-bug-id: LUS-6322
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34290
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11244 build: apply IB_OPTIONS to debian rules 23/34323/2
Jinshan Xiong [Tue, 14 Aug 2018 03:33:33 +0000 (20:33 -0700)]
LU-11244 build: apply IB_OPTIONS to debian rules

IB_OPTIONS should be honored when building the Debian package.

Lustre-change: https://review.whamcloud.com/32996
Lustre-commit: 65904fd6fbfbd9dc9f8d3498950b77e81961075f

Signed-off-by: Jinshan Xiong <jinshan.xiong@uber.com>
Change-Id: Ibc16a5428d47f072499c39a62ea457c922ae7352
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Thomas Stibor <t.stibor@gsi.de>
Reviewed-by: Martin Schroeder <martin.h.schroeder@intel.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34323
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11130 osd-ldiskfs: create non-empty local agent symlinks 73/33973/9
Alexander Zarochentsev [Sat, 7 Jul 2018 21:21:36 +0000 (00:21 +0300)]
LU-11130 osd-ldiskfs: create non-empty local agent symlinks

e2fsck doesn't like zero-sized symlink inodes created by
osd_create_local_agent_inode().  Store the FID of the remote
inode as the symlink target so that it is possible to debug
where this symlink comes from in case there is some problem
in the future.

It would be better to just migrate the whole symlink instead
of creating a remote symlink, in the very common case where
there is not a hard link to the symlink (which POSIX allows,
but is extremely uncommon).  For the short term, we
keep the remote symlinks to ensure on-disk consistency with
this simple patch, until the migration code can be fixed.

Lustre-change: https://review.whamcloud.com/32797
Lustre-commit: c3a836364892cacbc4737645893b094971c6ec49

Cray-bug-id: LUS-6189
Change-Id: Ida43616c51b6903f0a51aeec05a9a2dd189efe31
Signed-off-by: Alexander Zarochentsev <c17826@cray.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33973
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11689 lfs: make sure project proceed all dirs 03/34103/4
Wang Shilong [Thu, 22 Nov 2018 01:23:36 +0000 (09:23 +0800)]
LU-11689 lfs: make sure project proceed all dirs

Leftover fix for "LU-10986 lfs: make lfs project tolerant errors".
We should proceed to the other directories if we hit errors; otherwise,
a directory tree like the following will fail if aaaa does not exist.

testdir/
├── subdir
│   └── 1
├── bbbb -> aaaa
└── cccc

Also remove the extra error output, since every action function
already prints its own failure message.
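
The intended behaviour can be sketched roughly as follows. This is not the lfs code: `apply_to_tree` and `action` are hypothetical names, and Python is used only for illustration. A failing entry is remembered, and the walk continues with the remaining entries.

```python
import os


def apply_to_tree(root, action):
    """Apply action(path) to root and to everything below it.

    Hypothetical helper: a failing entry (for example a dangling
    symlink) is recorded and the walk continues with the other entries.
    Returns 0 if every entry succeeded, otherwise the first errno seen.
    """
    first_err = 0
    stack = [root]
    while stack:
        path = stack.pop()
        try:
            action(path)
            # Descend into real directories only, never through symlinks.
            if os.path.isdir(path) and not os.path.islink(path):
                for name in sorted(os.listdir(path)):
                    stack.append(os.path.join(path, name))
        except OSError as exc:
            if first_err == 0:
                first_err = exc.errno or 1
    return first_err
```

With the testdir layout above and an action that fails on the dangling link bbbb, subdir/1 and cccc would still be processed, and the error would only show up in the return value.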

Lustre-change: https://review.whamcloud.com/33707
Lustre-commit: e022922fb4a2429d0c2488a13ad8127c068aa2b8

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I0062dbc3f4d1925c9e9e1a509ee35ac569bd9b74
Reviewed-on: https://review.whamcloud.com/34103
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-10124 lnet: Correctly add peer MR value while importing 55/32255/3
Sonia Sharma [Thu, 1 Feb 2018 23:40:03 +0000 (15:40 -0800)]
LU-10124 lnet: Correctly add peer MR value while importing

When adding a peer using lnetctl import, the MR value of the
peer is not imported correctly.

Check for MR values other than True/False in
handle_yaml_config_peer():
1. No value provided - use True as the default
2. Value other than True/False - error out
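
A minimal sketch of the two checks, written in Python for illustration only. `parse_multi_rail` is a hypothetical name, and the case-insensitive comparison is an assumption; the actual logic lives in handle_yaml_config_peer().

```python
def parse_multi_rail(value):
    """Interpret the Multi-Rail field of an imported peer entry.

    Hypothetical helper mirroring the two checks listed above:
    a missing value defaults to True, and anything other than
    True/False is rejected. Case-insensitive matching is an assumption.
    """
    if value is None:
        return True
    text = str(value).strip().lower()
    if text == "true":
        return True
    if text == "false":
        return False
    raise ValueError("invalid Multi-Rail value: %r" % (value,))
```

Rejecting unknown values, rather than silently falling back to the default, keeps a typo in the YAML from changing the peer configuration unnoticed.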

Change-Id: I02a21e35086f1c6f29081b464dd1a63aba692cbc
Test-Parameters: trivial
Signed-off-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-on: https://review.whamcloud.com/32255
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-9706 dt: remove dt_txn_hook_commit() 74/34274/2
Alex Zhuravlev [Mon, 18 Feb 2019 13:10:16 +0000 (16:10 +0300)]
LU-9706 dt: remove dt_txn_hook_commit()

It's not used, and it's not safe because dt_txn_callback_del()
and dt_txn_callback_add() can race with commit callbacks.

Lustre-change: https://review.whamcloud.com/#/c/34212/

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I1744aaa621e28cb3f7e812db5695aa42e8d596cd
Reviewed-on: https://review.whamcloud.com/34274
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-10986 lfs: make lfs project tolerant errors 02/34102/5
Wang Shilong [Wed, 2 May 2018 08:54:15 +0000 (16:54 +0800)]
LU-10986 lfs: make lfs project tolerant errors

This patch tries to fix the following problems:
1) The command hangs on a pipe file, reproduced by the following steps:
 $ mkfifo tmp/pipe
 $ lfs project -srp 500 tmp -->this will never finish.

The problem is that opening a pipe file blocks by default
unless the O_NONBLOCK or O_NDELAY flag is given.

2) If a symbolic link with a missing target exists, the command
returns an error and does not process the remaining entries.

We should fix this problem by allowing the command to proceed
even if it hits some errors.

3) Fix a wrong check for MAX_PATH.
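
The FIFO hang in 1) can be illustrated with a short Python sketch. `open_without_blocking` is a hypothetical name, and how lfs itself was changed is not shown here; the sketch relies only on the POSIX behaviour that a read-only open() of a FIFO returns immediately when O_NONBLOCK is set.

```python
import os


def open_without_blocking(path):
    """Open path read-only without waiting for a FIFO writer.

    Hypothetical helper. Without O_NONBLOCK, opening a FIFO for
    reading blocks until some process opens it for writing.
    """
    return os.open(path, os.O_RDONLY | os.O_NONBLOCK)
```

The caller still owns the returned descriptor and should close it with os.close() when done.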

Lustre-change: https://review.whamcloud.com/32243
Lustre-commit: d189024bd3065c69c51ba90f6228c3ea28419dd0

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I7d08a7547e6b1351a1eff23063da6cd9c4cdc5e3
Reviewed-on: https://review.whamcloud.com/34102
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-6632 mgs: dont remove EXCLUDE records on lctl replace_nids 46/34046/3
Vladimir Saveliev [Wed, 7 Mar 2018 12:01:49 +0000 (06:01 -0600)]
LU-6632 mgs: dont remove EXCLUDE records on lctl replace_nids

conf-sanity.sh:test_66 is modified to illustrate the problem:
  add EXCLUDE records to the config file. lctl replace_nids removes
  those records, which leads to mounting problems.
Fix: remove records marked as SKIP instead of EXCLUDE ones.
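
As a rough illustration of the intended filter (not the MGS code), records can be modelled as (flags, payload) pairs. `drop_skipped` and the set-of-names representation are assumptions made for this sketch.

```python
def drop_skipped(records):
    """Return the records that are not marked SKIP.

    Each record is modelled as a (flags, payload) pair, where flags
    is a set of marker names. Records marked EXCLUDE are kept.
    """
    return [rec for rec in records if "SKIP" not in rec[0]]
```

The point of the fix is visible in the predicate: only the SKIP marker decides whether a record is dropped, so EXCLUDE records survive the rewrite.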

Lustre-change: https://review.whamcloud.com/14921
Lustre-commit: 00c89bf0148a105cb7145475194136dc672d1623

Change-Id: Ica4b23a74870d8ebcb09b240313df4d4c33bbbde
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Signed-off-by: Alyona Romanenko <alyona.romanenko@seagate.com>
Cray-bug-id: MRP-2105
Cray-bug-id: MRP-2766
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Artem Blagodarenko <c17828@cray.com>
Test-Parameters: trivial envdefinitions=ONLY=66 testlist=conf-sanity
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34046
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11652 kernel: kernel update [SLES12 SP3 4.4.162-94.69] 61/33761/2
Jian Yu [Fri, 30 Nov 2018 08:56:52 +0000 (00:56 -0800)]
LU-11652 kernel: kernel update [SLES12 SP3 4.4.162-94.69]

Update SLES12 SP3 kernel to 4.4.162-94.69.

Test-Parameters: clientdistro=sles12sp3 ossdistro=sles12sp3 mdsdistro=sles12sp3

Change-Id: I28b9aee30e0c73b3b49656355d59a3735abea720
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33761
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11723 kernel: kernel update RHEL7.6 [3.10.0-957.1.3.el7] 21/33821/2
Jian Yu [Mon, 10 Dec 2018 21:21:07 +0000 (13:21 -0800)]
LU-11723 kernel: kernel update RHEL7.6 [3.10.0-957.1.3.el7]

Update RHEL7.6 kernel to 3.10.0-957.1.3.el7.

Test-Parameters: clientdistro=el7.6 serverdistro=el7.6

Change-Id: I79558783991da3acaae76b33a9caab7d92642c06
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33821
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11870 kernel: kernel update RHEL6.10 [2.6.32-754.10.1.el6] 54/34054/2
Jian Yu [Thu, 17 Jan 2019 07:00:21 +0000 (23:00 -0800)]
LU-11870 kernel: kernel update RHEL6.10 [2.6.32-754.10.1.el6]

Update RHEL6.10 kernel to 2.6.32-754.10.1.el6 for Lustre client.

Test-Parameters: clientdistro=el6.10

Change-Id: I1dde7a41aae80653ac28a0917a0e8eb519c51ff0
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34054
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-10507 tests: use {save,restore}_layout() in test 59/34159/3
Jinshan Xiong [Fri, 1 Feb 2019 08:19:10 +0000 (00:19 -0800)]
LU-10507 tests: use {save,restore}_layout() in test

Revised test cases sanity:test_{27A,65i,65j,65m,406}(),
sanity-pfl:test_10() to use new interfaces to save and restore
layout.

This patch is back-ported from the following one:
Lustre-commit: 7b980e101e172d7d8b43a0db2dcaabc8c8c6c855
Lustre-change: https://review.whamcloud.com/30858

Test-Parameters: trivial testlist=sanity-pfl,sanity

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I11f4e5dcd486d4f7d08666c462d056041e125365
Reviewed-on: https://review.whamcloud.com/34159
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
5 years ago LU-11739 lod: don't inherit default layout from root directory 39/34139/6
Jian Yu [Thu, 7 Feb 2019 06:15:50 +0000 (22:15 -0800)]
LU-11739 lod: don't inherit default layout from root directory

There is no need to inherit the default directory layout from
the root directory when subdirectories are created therein.
This consumes xattr space on the subdirectories, and makes it
more complex to change the filesystem default layout in the future.

This patch fixes the above issue in lod_ah_init() to check if
the parent directory is the root directory and not copy
the default layout xattr to the new subdirectory.

Lustre-change: https://review.whamcloud.com/33956
Lustre-commit: 0a988cae95f99fee1a9c0d489ce00d0954d2a68e

Lustre-change: https://review.whamcloud.com/34175
Lustre-commit: ad1a74527f0ec59510bfa124b8280617a2b93840

Change-Id: Ie0d286785bdbcd73e2ae60b429e66d5d54b44eef
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34139
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11566 utils: fix lctl llog_print for large configs 63/33863/6
Andreas Dilger [Mon, 10 Dec 2018 10:18:19 +0000 (03:18 -0700)]
LU-11566 utils: fix lctl llog_print for large configs

If "lctl llog_print" is called for a large configuration, it will
overflow the 8KB buffer limit for OBD ioctl commands.  The kernel
snprintf calls try to overflow the supplied buffer.  Avoid that.
If the configuration is large, fetch the configuration records in
chunks and print them incrementally.

Add --start and --end options to llog_print and deprecate the use of
positional parameters, since positional parameters are increasingly
complex to parse as options are added, and are harder to use.
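
The chunked fetch can be sketched as below. This is an illustration in Python, not the lctl implementation: `fetch_in_chunks`, `fetch`, and `chunk` are hypothetical names, and the real code bounds each request by the ioctl buffer size rather than by a record count.

```python
def fetch_in_chunks(fetch, start, end, chunk):
    """Yield records with indices start..end (inclusive) in bounded requests.

    fetch(first, last) is a hypothetical callable returning the records
    in that index range; no single request covers more than `chunk`
    indices.
    """
    if chunk <= 0:
        raise ValueError("chunk must be positive")
    first = start
    while first <= end:
        last = min(first + chunk - 1, end)
        for rec in fetch(first, last):
            yield rec
        first = last + 1
```

For example, a range of 1..10 with a chunk of 4 is served by three requests: 1..4, 5..8 and 9..10.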

The callback for the configuration records will allow "lctl pool_*"
commands to be processed directly on the MGS.

Move existing llog_print test_60aa, test_60ab to conf-sanity as
test_123aa and test_123ab.
Add new test_123ac and test_123ad for the new llog_print --start and
--end param, and update test_123aa to test old positional parameters.

Lustre-commit: 3783aa285b15a811081a8de829d52f7f83e91209
Lustre-change: https://review.whamcloud.com/33815

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib7d2ae893033bd4594646c980b7d0ddbd2b3a089
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33863
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-11947 scripts: handle ZFS targets in Lustre RA 71/34271/2
Nathaniel Clark [Fri, 8 Feb 2019 18:02:28 +0000 (13:02 -0500)]
LU-11947 scripts: handle ZFS targets in Lustre RA

Fixes a regression introduced in LU-11461.
This handles the case of realpath of target being an empty string.

Lustre-change: https://review.whamcloud.com/#/c/34217/
Lustre-commit: 7c446895595c626d731bb52113bb2df420347279

Fixes: c36d70272541 ("LU-11461 scripts: Support symlink target")
Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I1bcb85908019e968ac0d69e437db217594a6565e
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34271
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Joe Grund <jgrund@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-10143 osd-zfs: allocate sequence in advance 67/34267/3
Alex Zhuravlev [Sun, 20 Jan 2019 05:38:26 +0000 (08:38 +0300)]
LU-10143 osd-zfs: allocate sequence in advance

Allocate the sequence in advance on the controller, so that we
have it ready before any potential read-only makeup. This is what
osd-ldiskfs is doing already.

Lustre-change: https://review.whamcloud.com/34069
Lustre-commit: 51c449b73994f2bba98ee27ac77f90c9aa846e88

Change-Id: I3d27f112b0d013ac923c5d250b296b5528b8112d
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34267
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
5 years ago LU-9785 lov: take lov layout lock for I/O with ignore_layout 65/33965/5
Jinshan Xiong [Sun, 15 Oct 2017 19:19:30 +0000 (12:19 -0700)]
LU-9785 lov: take lov layout lock for I/O with ignore_layout

A rule of thumb for taking the lov layout configuration lock: if I/O
is initiated from the LLITE layer, it should grab the lock. If an I/O
starts from the OSC layer, the lock isn't necessary, because if the OSC
object exists, layout reconfiguration will move forward.

Right now CIT_MISC + ci_ignore_layout can identify I/O from the
OSC layer, so lov_io_init() uses this for that purpose. In the
future, an explicit bit may be defined for this.
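
The rule can be written down as a small predicate. This is a Python sketch of the decision described above, not the lov_io_init() code; `needs_layout_lock` is a hypothetical name and the I/O type is passed as a plain string.

```python
def needs_layout_lock(io_type, ignore_layout):
    """Decide whether an I/O should take the lov layout lock.

    Sketch of the rule described above: only CIT_MISC I/O with
    ci_ignore_layout set is treated as coming from the OSC layer
    and skips the lock; everything else takes it.
    """
    from_osc = (io_type == "CIT_MISC") and bool(ignore_layout)
    return not from_osc
```

An explicit "initiated from OSC" bit, as suggested for the future, would replace the combined test with a single flag check.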

Lustre-commit: e43b0e5c0ccbacd8adf30713babd865b5a7c58c7
Lustre-change: https://review.whamcloud.com/29638

Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Change-Id: I2fe37a957b5fb4161c4c723062f6469b915c1dd5
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-on: https://review.whamcloud.com/33965
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11790 ldiskfs: add terminating u32 when expanding inodes 70/34270/2
Li Dongyang [Wed, 19 Dec 2018 03:03:14 +0000 (14:03 +1100)]
LU-11790 ldiskfs: add terminating u32 when expanding inodes

In ext4_expand_extra_isize_ea(), we calculate the total size of the
xattr header, plus the xattr entries so we know how much of the
beginning part of the xattrs to move when expanding the inode extra
size.  We need to include the terminating u32 at the end of the xattr
entries, or else if there are uninitialized, non-zero bytes after the
xattr entries and before the xattr values, the list of xattr entries
won't be properly terminated.
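
A simplified model of the size calculation, in Python for illustration. The layout constants are assumptions made for the model (a 16-byte fixed entry part whose first byte is the name length, 4-byte rounding, no xattr header), not values taken from the patch.

```python
import struct

ENTRY_FIXED = 16   # assumed size of the fixed part of one entry
TERMINATOR = 4     # the u32 of zeros that ends the entry list


def entry_len(name_len):
    """Length of one entry: fixed part plus name, rounded up to 4 bytes."""
    return (ENTRY_FIXED + name_len + 3) & ~3


def entries_size(buf):
    """Bytes used by the entry list in buf, including the terminating u32."""
    off = 0
    # A zero u32 where the next entry would start marks the end of the list.
    while struct.unpack_from("<I", buf, off)[0] != 0:
        off += entry_len(buf[off])   # first byte of an entry is the name length
    return off + TERMINATOR
```

If only the bytes before the terminator were moved into an area holding non-zero garbage, a later walk of the list in this model would run past the last real entry, which is the failure the commit message describes.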

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I247b935b3cf315481dc4658133a7eee02b6350e9
Lustre-change: https://review.whamcloud.com/33893
Lustre-commit: 7c800e460661972925a7acab51f023d0b38161b5
Tested-by: Jenkins
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34270
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-8727 mgs: remove skip records from config file 45/34045/4
Vladimir Saveliev [Mon, 19 Feb 2018 13:14:28 +0000 (16:14 +0300)]
LU-8727 mgs: remove skip records from config file

Configuration logs are append-only files of limited size.  Over the
course of time the logs may grow beyond the size limit.  Usually,
configuration logs keep needless records marked as SKIP. The new lctl
command "clear_conf" is added to allow administrators to clear
configuration files by removing these SKIP records. The lctl man page
is updated.
A conf-sanity test (for ldiskfs only) is added to test the new command.

Lustre-change: https://review.whamcloud.com/23245
Lustre-commit: 2a9518b1f820f833cbfcff42c6606413a1f5d3e7

Change-Id: I274cb48138c16e536cfca56836c3313e944eba56
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Cray-bug-id: MRP-2091
Reviewed-by: Artem Blagodarenko <c17828@cray.com>
Reviewed-by: Alexey Leonidovich Lyashkov <c17817@cray.com>
Tested-by: Elena V. Gryaznova <c17455@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34045
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11720 spec: srpm should be free of kernel requiements 68/34268/2
Nathaniel Clark [Mon, 3 Dec 2018 19:04:37 +0000 (14:04 -0500)]
LU-11720 spec: srpm should be free of kernel requiements

This moves the fix for LU-9731 into spec file and out of lbuild.
This lets "make rpms" benefit from the fix.
This also prevents the srpm from being incorrectly locked to the
kernel present when lbuild was used to create it (via
kmp-lustre.preamble).

Test-Parameters: trivial

Lustre-change: https://review.whamcloud.com/33771
Lustre-commit: 3c280a95736a884bc2f36dad674505f1d5b00982

Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I15f61c0e37182c0efbea3566d43b1e89f180d3e5
Tested-by: Jenkins
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34268
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-10945 ldlm: fix l_last_activity usage 31/34131/3
Oleg Drokin [Tue, 29 Jan 2019 17:45:48 +0000 (12:45 -0500)]
LU-10945 ldlm: fix l_last_activity usage

When a race happens between ldlm_server_blocking_ast() and
ldlm_request_cancel(), at_measured() is called with a wrong
value equal to the current time. Even worse, ldlm_bl_timeout() can
return current_time*1.5.
Before the time functions were fixed for 64-bit by LU-9019 (e920be681),
this race led to ETIMEDOUT in ptlrpc_import_delay_req() and
client eviction while sending the blocking AST. The wrong type conversion
takes place in ptlrpc_send_limit_expired() at cfs_time_seconds().

We should not take cancels into account if the BLAST has not been sent,
because last_activity is not properly initialised in that case, and
doing so destroys the AT completely.
The patch divides l_last_activity into the client l_activity and the
server l_blast_sent for better understanding. l_blast_sent is
used for blocking ASTs only, to measure the time between the BLAST and
the cancel request.

For example:
 server cancels blocked lock after 1518731697s
 waiting_locks_callback()) ### lock callback timer expired after 0s:
 evicting client
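
The arithmetic behind the example can be sketched in Python. `blast_to_cancel_delay` is a hypothetical name, and treating 0 as "blocking AST not sent" is an assumption of this sketch; the 1518731697s figure above looks like what results from subtracting an unset timestamp from the current time.

```python
def blast_to_cancel_delay(now, blast_sent):
    """Seconds between sending the blocking AST and receiving the cancel.

    Returns None when no blocking AST was sent (blast_sent == 0), so the
    caller can skip the adaptive-timeout update instead of measuring
    `now - 0`.
    """
    if blast_sent == 0:
        return None
    return now - blast_sent
```

Keeping a separate server-side timestamp that is only set when the BLAST goes out makes the "not sent" case explicit instead of relying on a field shared with the client.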

Signed-off-by: Alexander Boyko <c17825@cray.com>
Change-Id: I44962d2b3675b77e09182bbe062bdd78d6cb0af5
Cray-bug-id: LUS-5736
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34131
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-7631 tests: add debug info to conf-sanity 82a 21/34121/2
James Nunez [Tue, 20 Nov 2018 15:38:26 +0000 (08:38 -0700)]
LU-7631 tests: add debug info to conf-sanity 82a

In the routine check_stripe_count, the different error
messages need to be modified so when an error occurs,
a user can tell what error was hit. Also, print precreated
object information at the beginning of the test and on
error.

Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=conf-sanity envdefinitions=ONLY=82a
Test-Parameters: mdscount=1 mdtcount=1 testlist=conf-sanity envdefinitions=ONLY=82a
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ifc75d52d38d9cb401118ef7baa4014bddf6298f2
Lustre-change: https://review.whamcloud.com/33689
Lustre-commit: e76683a5bd540cacd2271a969aa9acd9bf790ccf
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34121
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-10565 osd: move ext4_tranfer_project to osd 92/33992/7
Yang Sheng [Wed, 14 Mar 2018 08:10:10 +0000 (16:10 +0800)]
LU-10565 osd: move ext4_tranfer_project to osd

Move ext4_tranfer_project from ldiskfs to osd, since
upstream has accepted the other projid patches
except this part.

Lustre-change: https://review.whamcloud.com/31647
Lustre-commit: d57fcaf893bcaa5a08c298bb421f41bd628add0a

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: I6e5acdf68ab9f7bc964d79f29132cee45e2fd3ac
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33992
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-10565 osd: bi_error, pagevec_init, PAGE_CACHE_SHIFT changes 89/33989/2
Yang Sheng [Wed, 14 Mar 2018 09:36:48 +0000 (17:36 +0800)]
LU-10565 osd: bi_error, pagevec_init, PAGE_CACHE_SHIFT changes

 - bi_error is replaced by bi_status in bio
 - pagevec_init takes one parameter
 - PAGE_CACHE_SHIFT is removed

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: Ia04124d6d636d132550a63e1f8144c26cab39f8e
Lustre-change: https://review.whamcloud.com/31644
Lustre-commit: dd070524bb3c42dd6dbb90519fe292a68a69b636
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33989
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-11579 llite: remove cl_file_inode_init() LASSERT 06/33506/4
Andreas Dilger [Mon, 29 Oct 2018 06:42:46 +0000 (00:42 -0600)]
LU-11579 llite: remove cl_file_inode_init() LASSERT

If there is some corruption or other reason that the file layout
cannot be used, the first call to cl_file_inode_init() will fail.
If it is called a second time on the same file then it will hit
an LASSERT() since I_NEW is no longer set on the inode.

It would be good to handle the error in lov_init_raid0() better,
but we still want to avoid this LASSERT() if there is an error.

Convert the LASSERT() in cl_file_inode_init() into a CERROR() and
error return.  This is being triggered due to corruption on the
server, but that shouldn't cause the client to assert.

    lov_dump_lmm_common() oid 0xdf4e:311367, magic 0x0bd10bd0
    lov_dump_lmm_common() stripe_size 1048576, stripe_count 4
    lov_dump_lmm_objects() stripe 0 idx 10 subobj 0x0:151194471
    lov_dump_lmm_objects() stripe 1 idx 12 subobj 0x0:152477530
    lov_dump_lmm_objects() stripe 2 idx 25 subobj 0x0:151589797
    lov_dump_lmm_objects() stripe 3 idx 2 subobj 0x0:150332564
    lov_init_raid0() fsname-clilov: OST0019 is not initialized
    cl_file_inode_init() Failure to initialize cl object
        [0x20004c047:0xdf4e:0x0]: -5

    cl_file_inode_init() ASSERTION(inode->i_state & (1 << 3) ) failed
    cl_file_inode_init() LBUG
    Pid: 37233, comm: ll_sa_4709 3.10.0-862.14.4.el7.x86_64 #1 SMP
    Call Trace:
    libcfs_call_trace+0x8c/0xc0 [libcfs]
    lbug_with_loc+0x4c/0xa0 [libcfs]
    cl_file_inode_init+0x2ac/0x300 [lustre]
    ll_update_inode+0x315/0x600 [lustre]
    ll_iget+0x163/0x350 [lustre]
    ll_prep_inode+0x232/0xc80 [lustre]
    sa_handle_callback+0x3a4/0xf70 [lustre]
    ll_statahead_thread+0x40e/0x2080 [lustre]

Return an IO error instead of killing the client.
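A minimal userspace sketch of the change described above, assuming a simplified inode with only a state word. The names sketch_inode, SKETCH_I_NEW and sketch_file_inode_init() are hypothetical stand-ins, not the Lustre code; the only detail taken from the trace is that the failed check tested bit 3 of i_state.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's I_NEW state bit; the trace
 * shows the failed check as i_state & (1 << 3). */
#define SKETCH_I_NEW (1UL << 3)

struct sketch_inode {
	unsigned long i_state;
};

/* Returns 0 on success, or -EIO if the inode is no longer new. */
int sketch_file_inode_init(const struct sketch_inode *inode)
{
	assert(inode != NULL);

	if (!(inode->i_state & SKETCH_I_NEW)) {
		/* This used to be an assertion; now log and fail the call. */
		fprintf(stderr, "inode is not new, i_state %#lx\n",
			inode->i_state);
		return -EIO;
	}

	/* ... layout initialisation would happen here ... */
	return 0;
}
```

With this shape, a second call on an inode that already failed initialisation gives the caller an error to propagate instead of stopping the client.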

Lustre-change: https://review.whamcloud.com/33505
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If8a6eb24df09e7e158b61f02e2517132893ebbe5
Reviewed-on: https://review.whamcloud.com/33506
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11647 ptlrpc: always unregister bulk 98/33798/4
Hongchao Zhang [Thu, 15 Nov 2018 16:21:15 +0000 (11:21 -0500)]
LU-11647 ptlrpc: always unregister bulk

In ptlrpc_check_set, the bulk should be unregistered before
ptl_send_rpc in any case.

Lustre-change: https://review.whamcloud.com/22378
Lustre-commit: e34a4cf031a2b83259cee8e05c2f646b5652b6a9

Change-Id: Icf963002f934b43ccbb9d6ef02ba7f9d11f297f8
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33798
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11579 lov: quiet lov_dump_lmm_ console messages 22/34122/2
Andreas Dilger [Tue, 30 Oct 2018 10:20:43 +0000 (04:20 -0600)]
LU-11579 lov: quiet lov_dump_lmm_ console messages

Limit messages in lov_dump_lmm_objects() and lov_dump_lmm_common()
printing to the console repeatedly when D_ERROR is used.  Change
CDEBUG() to CDEBUG_LIMIT() so that rate-limiting is applied.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I32fd70cdf6422222ab0a8f599aa60bc2f6da229e
Lustre-change: https://review.whamcloud.com/33513
Lustre-commit: d9ef75eb8226f22660a7e57241125956daf7fde1
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34122
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11481 utils: disable lfs migrate -m 60/33960/7
Olaf Faaland [Fri, 4 Jan 2019 19:27:15 +0000 (11:27 -0800)]
LU-11481 utils: disable lfs migrate -m

Lustre 2.11 and earlier have bugs in directory migration that risk
data loss.  The fixes landed to 2.12 are too complex to backport.
This patch prevents an unaware user from risking their data by using
lfs migrate -m.  This is not a backport from master.

Skip all the tests that issue lfs migrate -m as they now fail.

Change-Id: I6d620429a0e10941f88285fbcf178797a71be3a6
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-on: https://review.whamcloud.com/33960
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11625 ofd: handle upgraded filter_fid properly 58/33958/5
Alex Zhuravlev [Fri, 21 Dec 2018 09:27:31 +0000 (12:27 +0300)]
LU-11625 ofd: handle upgraded filter_fid properly

Since there have been several iterations of struct filter_fid stored
on disk, the current code wasn't checking for all of the possible
cases when trying to decide what action to take when accessing and
upgrading the xattr for new capabilities.

Properly check for the various different struct filter_fid sizes and
handle them appropriately.  Add a more verbose description of the
various cases so that this is more clear to others in the future.

Add decoding of filter_fid fields added for FLR in 2.11.

We should already be testing for upgrading the filter_fid xattr
from different OST versions in conf-sanity test_32d.

Lustre-change: https://review.whamcloud.com/33627
Lustre-commit: 51a88bc3f04d0bc76272055ed4fe63138539ebd7

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ifef2292296236cb06ff7e8cd50caff4b133ebbe5
Reviewed-on: https://review.whamcloud.com/33958
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-4939 obdclass: llog_print params file 50/34250/2
Ben Evans [Fri, 9 Mar 2018 20:51:26 +0000 (15:51 -0500)]
LU-4939 obdclass: llog_print params file

Allow llog_print to handle the params file in yaml

Lustre-commit: 320e736191cce766cd0838ebbd5e76f1aefa1a6f
Lustre-change: https://review.whamcloud.com/31620

Test-Parameters: trivial testlist=sanity
Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: Icf286bca7a1466bf3c8d9084971e58d2e8b8a651
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/34250
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-9906 osd: use pagevec for putting pages 88/33988/2
Patrick Farrell [Mon, 5 Feb 2018 12:16:58 +0000 (06:16 -0600)]
LU-9906 osd: use pagevec for putting pages

Using a pagevec instead of individual page puts is much
more efficient.  This should reduce contention on the page
cache allocation/freeing, which becomes a bottleneck with
high speed OSTs.

Cray-bug-id: LUS-5670

Lustre-change: https://review.whamcloud.com/30531
Lustre-commit: 2a2adfd04245a24148d8de29b8558cd98c92bffa

Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: Ic15cb8e30887ec55e9348e50af307bfd7108c7e4
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33988
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years agoLU-10776 osc: Do not request more than 2GiB grant 51/34051/2
Patrick Farrell [Mon, 5 Mar 2018 16:24:32 +0000 (10:24 -0600)]
LU-10776 osc: Do not request more than 2GiB grant

The server enforces a grant limit of 2 GiB, which the
client must honor.  The existing client code combined with
16 MiB RPCs make it possible for the client to ask for
more than this limit.

Make this limit explicit, and also fix an overflow bug in
o_undirty calculation in osc_announce_cached.  (o_undirty
is a 32 bit value and 16 MiB*256 rpcs_in_flight = 4 GiB.
4 GiB + extra grant components overflows o_undirty.)
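The overflow in the parenthetical can be reproduced in a few lines. This is only a sketch: the function names are hypothetical, and SKETCH_MAX_GRANT is an assumed cap of exactly 2 GiB, which may not match the constant used in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cap of 2 GiB; the real constant may differ. */
#define SKETCH_MAX_GRANT (UINT64_C(1) << 31)

static_assert(SKETCH_MAX_GRANT <= UINT32_MAX, "cap must fit in 32 bits");

/* 32-bit arithmetic: 16 MiB * 256 wraps to 0 before extra is added. */
uint32_t sketch_undirty_wrapping(uint32_t rpc_bytes, uint32_t rpcs_in_flight,
				 uint32_t extra)
{
	return rpc_bytes * rpcs_in_flight + extra;
}

/* Compute in 64 bits, clamp to the cap, then narrow to 32 bits. */
uint32_t sketch_undirty_clamped(uint32_t rpc_bytes, uint32_t rpcs_in_flight,
				uint32_t extra)
{
	uint64_t want = (uint64_t)rpc_bytes * rpcs_in_flight + extra;

	if (want > SKETCH_MAX_GRANT)
		want = SKETCH_MAX_GRANT;
	return (uint32_t)want;
}
```

Doing the multiplication in a 64-bit type before clamping is the design point: clamping after a 32-bit multiply would be too late, because the wrapped value already looks small.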

Cray-bug-id: LUS-5750
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: Ifcb8a9ea7529eae4cd209dc72223ed039c6f6a0d
Reviewed-on: https://review.whamcloud.com/31533
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-on: https://review.whamcloud.com/34051
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years agoLU-9474 tests: fix quoting in stack_trap 58/34158/2
Quentin Bouget [Fri, 1 Feb 2019 08:05:39 +0000 (00:05 -0800)]
LU-9474 tests: fix quoting in stack_trap

stack_trap() mishandled single quotes. This patch is not the cleanest
of fixes, but it works.

(sanity-hsm is the only test suite that uses the function, for now)

This patch is back-ported from the following one:
Lustre-commit: 99420a1830b89a8aba6350b095065d65107f7c0f
Lustre-change: https://review.whamcloud.com/30490

Test-Parameters: trivial testlist=sanity-hsm

Signed-off-by: Quentin Bouget <quentin.bouget@cea.fr>
Change-Id: Ia43219e57079abdbfc75485105d572bbfa85caba
Reviewed-on: https://review.whamcloud.com/34158
Tested-by: Jenkins
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11158 mdt: grow lvb buffer to hold layout 49/34049/4
Bobi Jam [Thu, 19 Jul 2018 15:19:43 +0000 (23:19 +0800)]
LU-11158 mdt: grow lvb buffer to hold layout

A write intent RPC could generate a layout bigger than the initial
mdt_max_mdsize, so that the new layout could not be returned to the
client. This patch fixes the issue by:

* fixing a glitch in lod_use_defined_striping(), where v3 should be
  updated along with v1.
* changing lvbo_fill() to return -ERANGE in this case, and to store
  the needed buffer size in its @buflen parameter.
* in ldlm_handle_enqueue0(), when ldlm_lvbo_fill() returns -ERANGE,
  growing the corresponding RMF_DLM_LVB buffer and retrieving the
  layout again to refill the buffer.
* defining a new MAX_MD_SIZE to hold a reasonable composite layout,
  and keeping the old MAX_MD_SIZE as MAX_MD_SIZE_OLD.
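The -ERANGE handshake in the list above follows a common "tell me how much you need, then retry" pattern. A userspace sketch under stated assumptions: sketch_lvbo_fill() and sketch_enqueue_fill() are hypothetical names, the layout is just a byte string, and malloc/realloc stand in for growing the reply buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fill routine: copies the layout into buf, or returns
 * -ERANGE and stores the needed size in *buflen if buf is too small. */
int sketch_lvbo_fill(const char *layout, size_t layout_len,
		     void *buf, size_t *buflen)
{
	assert(buflen != NULL);

	if (*buflen < layout_len) {
		*buflen = layout_len;
		return -ERANGE;
	}
	memcpy(buf, layout, layout_len);
	*buflen = layout_len;
	return 0;
}

/* Caller side: try the initial size, grow once on -ERANGE, refill.
 * Returns a buffer the caller must free, or NULL on failure. */
void *sketch_enqueue_fill(const char *layout, size_t layout_len,
			  size_t initial, size_t *used)
{
	size_t len = initial;
	void *buf = malloc(len ? len : 1);
	int rc;

	if (buf == NULL)
		return NULL;

	rc = sketch_lvbo_fill(layout, layout_len, buf, &len);
	if (rc == -ERANGE) {
		/* len now holds the size reported by the fill routine. */
		void *bigger = realloc(buf, len);

		if (bigger == NULL) {
			free(buf);
			return NULL;
		}
		buf = bigger;
		rc = sketch_lvbo_fill(layout, layout_len, buf, &len);
	}
	if (rc != 0) {
		free(buf);
		return NULL;
	}
	*used = len;
	return buf;
}
```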

lustre-review: https://review.whamcloud.com/32847
lustre-commit: e5abcf83c0575b8a79594c1eb9ea727739d91522

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I255b954195b3e64c3edd416c0cb209df0d9fc43a
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34049
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11620 lfsck: change llsd_rb_lock to rwsemaphore 79/33979/2
Lai Siyao [Sat, 20 Oct 2018 20:50:49 +0000 (04:50 +0800)]
LU-11620 lfsck: change llsd_rb_lock to rwsemaphore

llsd_rb_lock is taken in ->init and released in ->fini, but during
this period it may call getxattr, which can sleep. Change it to an
rwsemaphore.

Lustre-change: https://review.whamcloud.com/33603
Lustre-commit: 925ce153979d6ac793a65e193181ec14a8281640

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Idc68eb886e60dc45ccfb7ac9bf5bf06db42d690d
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33979
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11783 utils: fix warnings when lustre_user.h included 64/34064/2
Andreas Dilger [Fri, 14 Dec 2018 22:53:25 +0000 (15:53 -0700)]
LU-11783 utils: fix warnings when lustre_user.h included

Checking for lustre/lustre_user.h in a configure script
generates a warning because of the included <sys/quota.h>

  checking lustre/lustre_user.h usability... no
  checking lustre/lustre_user.h presence... yes
  WARNING: present but cannot be compiled
  WARNING: check for missing prerequisite headers?
  WARNING: see the Autoconf documentation
  WARNING: section "Present But Cannot Be Compiled"
  WARNING: proceeding with the preprocessor's result
  WARNING: in the future, the compiler will take precedence

Looking into config.log it shows:

  In file included from /usr/include/lustre/lustre_user.h:59,
                   from conftest.c:91:
  /usr/include/sys/quota.h:221: error: expected declaration
    specifiers or '...' before 'caddr_t'

Since we don't really need much from the <sys/quota.h> header,
add conditional #defines for the few needed fields.

The FASYNC constant is not declared everywhere in userspace,
provide a compat declaration if unavailable.

Lustre-change: https://review.whamcloud.com/33876
Lustre-commit: db0592145574c5ad22a7b7372b06ba2da7d85a60

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9cd2b0fcbaf16fe8a5a4a7a0309aada3a72cab07
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34064
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11156 scrub: skip project quota inode 71/33971/6
Alexander Boyko [Wed, 18 Jul 2018 14:17:16 +0000 (10:17 -0400)]
LU-11156 scrub: skip project quota inode

An error happened when scrub tried to process the project quota inode.
Scrub thinks that it is an IGIF inode because it has no LMA FID, so it
starts to create O/inum/{LAST_ID,d0-d31} and fails with not enough
credits. The project quota inode s_prj_quota_inum should be skipped
during scrub iteration.

Lustre-change: https://review.whamcloud.com/32829
Lustre-commit: d01fc74e8a347a0c8ebfcf92a49c7f71809cd0ad

Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: LUS-6197
Change-Id: I38c347377a1c648ac3dd3e3ff4c4d65ee34cde39
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33971
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11605 osp: max_create_count and create_count changes 61/34161/2
Sergey Cheremencev [Wed, 29 Aug 2018 19:20:36 +0000 (22:20 +0300)]
LU-11605 osp: max_create_count and create_count changes

Setting max_create_count to 0 also sets create_count to 0. Set
create_count to OST_MIN_PRECREATE when max_create_count is set
back to a non-zero value.
Without the patch, create_count remains 0 even after
max_create_count is changed to something != 0.
This causes creates to get stuck in osp_precreate_reserve
because osp_precreate_send doesn't send a new request to the OST:
to decide how many objects to precreate (grow), it
uses opd_pre_create_count, which is 0.

Lustre-commit: a531ab5f38a6da1de7948df979ae839aa847a370
Lustre-change: https://review.whamcloud.com/33559

Cray-bug-id: LUS-6435
Change-Id: I940c48f91e9c7d49b766bd85ea271ce229424c7f
Signed-off-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Artem Blagodarenko <c17828@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34161
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11010 tests: remove calls to return after skip() 99/34199/2
James Nunez [Wed, 6 Feb 2019 19:17:41 +0000 (12:17 -0700)]
LU-11010 tests: remove calls to return after skip()

The skip() routine now contains a call to exit. All calls
to skip() and skip_env() should be reviewed and calls to
return that followed skip should be removed.

A problem with the skip message not being printed is corrected.

This patch is partial backport (sanity-pfl is not modified) from:
Lustre-commit: ea76de56fccfdc6f8bd0087adc36c19257e414d3
Lustre-change: https://review.whamcloud.com/32346

Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I1a52e9bd79a71de4ab4c0cea9c569f379115a603
Reviewed-on: https://review.whamcloud.com/34199
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
5 years agoLU-8402 ldlm: assume OBD_CONNECT_IBITS 27/34027/2
Hongchao Zhang [Wed, 26 Dec 2018 17:21:47 +0000 (12:21 -0500)]
LU-8402 ldlm: assume OBD_CONNECT_IBITS

Clients and MDSs have supported and required OBD_CONNECT_IBITS since
before 1.6 so remove obsolete code to handle clients that do not
support this flag.

Lustre-change: https://review.whamcloud.com/30009
Lustre-commit: c7a833830de691967081cd7a42199b924ea7efdc

Change-Id: I9233bc3cdc5b4e2543c25e44e68acbecf77ff81d
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Reviewed-on: https://review.whamcloud.com/30009
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34027
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11187 ldiskfs: update rhel7.6 series 63/34063/2
Minh Diep [Fri, 18 Jan 2019 18:19:40 +0000 (10:19 -0800)]
LU-11187 ldiskfs: update rhel7.6 series

Previous landing missed rhel7.6 patch series

Test-Parameters: trivial

Change-Id: I4e7a943446ab1f789ebffb7db5f6697304f79584
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34063
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10565 osd: unify interface for vfs 91/33991/2
Yang Sheng [Fri, 26 Jan 2018 17:26:26 +0000 (01:26 +0800)]
LU-10565 osd: unify interface for vfs

Some VFS changes were applied to other parts of the code but not
to the OSD, so apply them to the OSD layer as well.

Lustre-change: https://review.whamcloud.com/31646
Lustre-commit: 55ed739f7efb7029b02fe50999547f9aac40af72

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: Ia3e907964d6321571f52e4c24a46a8ab64e4d056
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/33991
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
5 years agoLU-11737 lfsck: do not ignore dryrun 27/33827/2
Alex Zhuravlev [Tue, 11 Dec 2018 11:28:28 +0000 (14:28 +0300)]
LU-11737 lfsck: do not ignore dryrun

lfsck_layout_recreate_lovea() shouldn't ignore dryrun.

Lustre-change: https://review.whamcloud.com/#/c/33826/
Lustre-commit: Ia8bafc13f148b03573dee5db26b6aff9386b5b5f

Change-Id: I686a59e87b89de666bc2042f4e655a0164d1f030
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33827
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years agoLU-11006 lnet: fix show peer yaml tree with no peer 18/32318/3
Sonia Sharma [Tue, 8 May 2018 03:27:14 +0000 (20:27 -0700)]
LU-11006 lnet: fix show peer yaml tree with no peer

When no peer exists, the root created for the peer
yaml tree should be deleted, and lnetctl peer show
should not display anything.

Currently lnetctl peer show shows the root string "peer"
even when there is no peer. This creates issues when
starting lnet using an /etc/lnet.conf derived from the
existing configuration.

Lustre-change: https://review.whamcloud.com/32320
Lustre-commit: 40295e5ca3e5ed51c8236a2e641627d687b7d59c

Change-Id: I58a337233c4dcea80e78c9dfd6eaabab33766f04
Test-Parameters: trivial
Signed-off-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-on: https://review.whamcloud.com/32318
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11757 lod: use calculated stripe count 81/33981/2
Andriy Skulysh [Mon, 3 Dec 2018 14:45:18 +0000 (16:45 +0200)]
LU-11757 lod: use calculated stripe count

lod_prep_md_striped_create() tries to allocate a big
chunk of memory because
lum->lum_stripe_count == -1 is converted to __u32.

ldo_dir_stripe_count was already calculated in lod_ah_init().
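To see why the conversion matters, the sketch below computes an allocation size from a stripe count. The function name and the per-stripe size are hypothetical; only the -1 to __u32 conversion comes from the message above.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for stripe_count entries of per_stripe bytes each,
 * computed in 64 bits so the result itself cannot wrap here. */
uint64_t sketch_stripe_alloc_bytes(uint32_t stripe_count, uint32_t per_stripe)
{
	assert(per_stripe != 0);
	return (uint64_t)stripe_count * per_stripe;
}
```

Passing (uint32_t)-1, which is 4294967295, with an 8-byte entry asks for roughly 32 GiB, whereas a calculated count of 4 asks for 32 bytes.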

Lustre-change: https://review.whamcloud.com/33829
Lustre-commit: 622a94d5e27ed3e596918863c08b304a6be9a646

Change-Id: Id99d9e024638dfb1b34262840d2e543c808a9cdc
Cray-bug-id: LUS-6694
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33981
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11736 utils: don't set max_sectors_kb on MDT/MGT 16/34016/2
Andreas Dilger [Fri, 11 Jan 2019 19:35:06 +0000 (12:35 -0700)]
LU-11736 utils: don't set max_sectors_kb on MDT/MGT

The max_sectors_kb tunable should not be applied to MDT and MGT
devices. This tuning is needed for efficiency of large IOs for
spinning disks, but is not needed for SSDs or regular IO. It can
cause problems with DM Multipath configurations for minimal
benefits, so should be limited to OST devices.

This only applies to ldiskfs backend filesystems, no such tuning
is currently done for any ZFS devices.

Lustre-change: https://review.whamcloud.com/33796
Lustre-commit: 2f8d7b4679de3fa467040aa61733f262714e39c9

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I496603da13aae042f63cc37c0dea221a393ebbe5
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-on: https://review.whamcloud.com/34016
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11196 tests: clean up after conf-sanity test_101 95/33495/3
Andreas Dilger [Fri, 26 Oct 2018 17:39:15 +0000 (10:39 -0700)]
LU-11196 tests: clean up after conf-sanity test_101

conf-sanity test_101() creates up to 50000 files in the top-level
test directory, which can sometimes cause the later test_103() setup
to fail, because "rm -rf" fails with "Argument list too long" when
trying to clean up the test directory.

Create the test_101 files in a subdirectory for cleanliness, and
remove them at the end of the test so that we don't run out of space.

This patch is back-ported from the following one:
Lustre-commit: ed44caee9af0eeb32bce7e7de5f5146fa6dc1d00
Lustre-change: https://review.whamcloud.com/33268

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1167339d03340e1cc545d2855c4b32eef18cab07
(cherry picked from commit ed44caee9af0eeb32bce7e7de5f5146fa6dc1d00)
Reviewed-on: https://review.whamcloud.com/33495
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
5 years agoLU-11049 ssk: correctly handle null byte by lgss_sk 58/33358/2
Sebastien Buisson [Tue, 22 May 2018 15:50:53 +0000 (17:50 +0200)]
LU-11049 ssk: correctly handle null byte by lgss_sk

lgss_sk must include null byte with fsname and nodemap info taken from
command line.

Lustre-change: https://review.whamcloud.com/32510
Lustre-commit: 10173500fdb7332c369e72dc00e365f329d20f20

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I7bd41ba3fc9fc56e4049f8080c0ee95ba26334a9
Reviewed-on: https://review.whamcloud.com/33358
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10516 misc: require ldiskfsprogs-1.42.13.wc6 or later 67/33767/4
Andreas Dilger [Sat, 1 Dec 2018 18:16:44 +0000 (11:16 -0700)]
LU-10516 misc: require ldiskfsprogs-1.42.13.wc6 or later

Require a current version of ldiskfsprogs to include many bug fixes.
e2fsprogs-1.42.13.wc4 release was recommended for Lustre 2.10.0, and
1.42.13.wc6 has been released since 2017-05-05.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I24eeb1c30d5c7b1daa1ad7d5d2f6032273054035
Reviewed-on: https://review.whamcloud.com/33767
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-7988 hsm: added coordinator housekeeping flag 08/28908/3
Frank Zago [Fri, 8 Apr 2016 17:59:06 +0000 (13:59 -0400)]
LU-7988 hsm: added coordinator housekeeping flag

When the coordinator is not performing housekeeping, only the requests
in the ARS_WAITING state will be processed as they are new
requests. The other requests, in states ARS_FAILED, ARS_CANCELED,
ARS_SUCCEED and ARS_STARTED can wait a few more seconds until the
housekeeping starts.

Also, when not performing housekeeping, as soon as hsd.request is
full, exit from the loop as there is enough potential work queued;
there's no need to examine all the HSM records, thus shortening the
time spent in cdt_llog_process() holding the critical lock
cdt_llog_lock.
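One way to picture the early exit is the loop below. It is a sketch only: the enum values, sketch_scan() and its parameters are hypothetical, and the handling of non-waiting records during housekeeping is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_ars {
	SK_ARS_WAITING,
	SK_ARS_STARTED,
	SK_ARS_FAILED,
	SK_ARS_CANCELED,
	SK_ARS_SUCCEED
};

/* Walks the records and queues waiting ones up to request_cap.
 * Returns how many records were examined. */
size_t sketch_scan(const enum sketch_ars *recs, size_t n, size_t request_cap,
		   bool housekeeping, size_t *queued)
{
	size_t examined = 0;

	assert(recs != NULL && queued != NULL);

	for (size_t i = 0; i < n; i++) {
		examined++;

		/* New requests are always candidates for queueing. */
		if (recs[i] == SK_ARS_WAITING && *queued < request_cap)
			(*queued)++;

		/* Other states are left for housekeeping passes. When not
		 * housekeeping, stop as soon as the request array is full. */
		if (!housekeeping && *queued == request_cap)
			break;
	}
	return examined;
}
```

The fewer records examined, the shorter the time the lock would be held in the real code.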

Lustre-change: https://review.whamcloud.com/19582
Lustre-commit: afc9ff6caff7d572041cabf0a957dc8749fce49d

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: frank zago <fzago@cray.com>
Change-Id: Ib73c97d29ca2f86b912aeb8d055c004cff14d5cf
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Minh Diep <minh.diep@intel.com>
Reviewed-on: https://review.whamcloud.com/28908
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-8895 target: limit grant allocation 07/32907/5
Vladimir Saveliev [Fri, 15 Dec 2017 09:33:17 +0000 (12:33 +0300)]
LU-8895 target: limit grant allocation

tgt_grant_alloc() is missing a check for the amount of space already
granted to a client. If the client submits a number of RPCs
simultaneously while its grant is below its maximum, the server
may give the client an amount of grant substantially exceeding
the amount requested in one RPC. With a decent number of clients,
that may lead to ENOSPC long before the disk space is really
exhausted.

Limit grants given to a client to asked amount plus grants for 2 full
write RPCs.
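One possible reading of this limit, as a sketch: the client's total grant is capped at the requested amount plus two full write RPCs, and the function returns how much more may still be handed out. The name sketch_grant_room() and its parameters are hypothetical, and the real check in tgt_grant_alloc() may be structured differently.

```c
#include <assert.h>
#include <stdint.h>

/* Returns how much additional grant may be given to a client that
 * already holds already_granted bytes and asks for want bytes. */
uint64_t sketch_grant_room(uint64_t already_granted, uint64_t want,
			   uint64_t rpc_bytes)
{
	/* Assumed ceiling: the asked amount plus two full write RPCs. */
	uint64_t ceiling = want + 2 * rpc_bytes;

	assert(ceiling >= want);	/* guard against 64-bit wrap */

	if (already_granted >= ceiling)
		return 0;
	return ceiling - already_granted;
}
```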

A test to illustrate the issue is included.
The test lowers the debug level so that dd provides sufficient I/O
throughput.

Lustre-change: https://review.whamcloud.com/24096
Lustre-commit: 82e494a36e9ea4f51ec163ab15beb9fdda7fa8d6

Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Seagate-bug-id: MRP-4013
Change-Id: Ie6a8abbad28a06bc1d55ff2fd042b9664a29e9e4
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: https://review.whamcloud.com/32907
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10213 lnd: calculate qp max_send_wrs properly 75/33975/2
Amir Shehata [Tue, 28 Nov 2017 01:52:20 +0000 (17:52 -0800)]
LU-10213 lnd: calculate qp max_send_wrs properly

The maximum in-flight transfers can not exceed the
negotiated queue depth. Instead of calculating the
max_send_wrs to be the negotiated number of frags *
concurrent sends, it should be the negotiated number
of frags * queue depth.

If that value is too large for successful qp creation then
we reduce the queue depth in a loop until we successfully
create the qp or the queue depth dips below 2.
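The retry loop described above can be sketched as follows. sketch_create_qp() is a hypothetical stand-in for real QP creation that simply compares against a device limit, and decrementing the queue depth by one per attempt is an assumption; the message only says the depth is reduced in a loop.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for QP creation: succeeds (returns 0) only if
 * the requested number of send work requests fits the device limit. */
int sketch_create_qp(uint32_t max_send_wr, uint32_t device_max_wr)
{
	return max_send_wr <= device_max_wr ? 0 : -1;
}

/* Returns the queue depth that worked, or -1 if no depth >= 2 did. */
int sketch_fit_queue_depth(uint32_t frags, int queue_depth,
			   uint32_t device_max_wr)
{
	assert(frags > 0);

	while (queue_depth >= 2) {
		/* max_send_wrs = negotiated frags * queue depth */
		uint32_t max_send_wr = frags * (uint32_t)queue_depth;

		if (sketch_create_qp(max_send_wr, device_max_wr) == 0)
			return queue_depth;
		queue_depth--;	/* assumed step size */
	}
	return -1;
}
```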

Due to the queue depth negotiation protocol it is guaranteed
that the queue depth on both the active and the passive
will match.

This change resolves the discrepancy created by the previous
code which reduces max_send_wr by a quarter.

That could lead to:
mlx5_ib_post_send:4184:(pid 26272): Failed to prepare WQE
When the o2iblnd transfers a message which requires more
WRs than the max that has been allocated.

Test-Parameters: trivial

Lustre-change: https://review.whamcloud.com/30310
Lustre-commit: 017d328fa832697533e4e032fe9a9213ea105320

Signed-off-by: Alexey Lyashkov <alexey.lyashkov@seagate.com>
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I88f96f950bf4c0a8efd4df812d44e5e20d5907dc
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33975
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-9810 lnd: use less CQ entries for each connection 74/33974/2
Alexey Lyashkov [Mon, 27 Nov 2017 12:44:56 +0000 (15:44 +0300)]
LU-9810 lnd: use less CQ entries for each connection

Currently we have 2 work request chains per transfer.
This means the OFED stack will generate only 2 events if a transfer
fails. Reduce the number of CQ entries to avoid extra resource
consumption.

Test-Parameters: trivial
Seagate-bug-id: MRP-4508
Lustre-change: https://review.whamcloud.com/28279
Lustre-commit: 052f76bf708414b3a127aa9602b4a69415c1cb2f

Signed-off-by: Alexey Lyashkov <alexey.lyashkov@seagate.com>
Change-Id: I0c06fef9589478f40ef7e1eeacff2aec7013e562
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Jenkins
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33974
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10461 tests: call exit in the skip routine 23/34023/2
James Nunez [Sun, 21 Jan 2018 22:31:31 +0000 (15:31 -0700)]
LU-10461 tests: call exit in the skip routine

There are many reasons to not run, or skip, a test; the test
may require a certain number of servers or a certain Lustre version.
In these cases, the skip() or skip_env() routine is called. When we
call skip, the intention is to exit the routine early. Thus, call
'exit 0' at the end of the skip() routine.

Some calls to skip() are changed to skip_env() when a test is being
skipped due to the Lustre configuration or test environment.

Backported from b2_12 branch to b2_10 (partial backport: sanity.sh
changes are not included here):
Lustre-change: https://review.whamcloud.com/30964
Lustre-commit: 33aad7829de56353cca5bbd082d95d6821a7be9c

Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I42fd9535c0a803f334dfc5685f451a6bdc85e84b
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-on: https://review.whamcloud.com/34023
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-9874 osd-ldiskfs: simplify project transfer codes 97/32297/4
Wang Shilong [Tue, 27 Jun 2017 05:51:09 +0000 (13:51 +0800)]
LU-9874 osd-ldiskfs: simplify project transfer codes

Currently, osd-ldiskfs calls __ldiskfs_ioctl_project()
to transfer project quota. This is the user ioctl path for ext4,
which starts a transaction and reserves credits, and that is not
the right logic for Lustre.

Lustre has already started a transaction handle, and credits should
be reserved during the declare phase, so calling
__ldiskfs_ioctl_project() here causes a nested handle start. That is
not a problem for JBD2, because it attaches to the current thread's
handle if a transaction has been started, but in this case it
ignores the credits reservation.

Also, Lustre doesn't need inode mutex protection for
project transfer, and we don't need to write the inode in the
transfer code; that will be done when dirty inode is called. Setting
attr has reserved enough credits for project transfer, but we need
to fix agent inode transferring.

This patch makes the code logic clear, and also fixes credits
reservation for DNE agent inode transferring.

Lustre-commit: a9d3c9ba5360f46b2eaa5732a98c0ee836a927df
Lustre-change: https://review.whamcloud.com/28510

Change-Id: I6ab3c0fdc4cf456b102e49d9326840fd0e12ade0
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: https://review.whamcloud.com/32297
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11056 lwp: fix lwp reconnection issue 77/33977/4
Hongchao Zhang [Wed, 26 Dec 2018 17:22:27 +0000 (12:22 -0500)]
LU-11056 lwp: fix lwp reconnection issue

After the OST or MDT is restarted, the lwp reconnection can fail
with -EALREADY because the connect count in the connection
request is less than the value saved in the corresponding export
at MDT0000, which could cause the system to hang.

The patch also changes lustre_lwp_connect to use OBD_CONNECT_MDS_MDS
flag only when the connection is between MDTs.
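
The failure mode described above can be sketched roughly as follows. This is a toy model, not the Lustre code: the `Export` class and `handle_connect()` function are hypothetical names, and only the comparison of connect counts and the -EALREADY result come from the commit message.

```python
import errno
from dataclasses import dataclass


@dataclass
class Export:
    """Hypothetical stand-in for the per-client export kept by the target."""
    conn_cnt: int = 0


def handle_connect(export: Export, request_conn_cnt: int) -> int:
    """Return 0 on success, or -EALREADY if the request looks stale."""
    if request_conn_cnt < export.conn_cnt:
        # The request carries a smaller count than the one already saved,
        # so it is treated as an old connection attempt and rejected.
        return -errno.EALREADY
    export.conn_cnt = request_conn_cnt
    return 0
```

In this model, a client whose counter restarted from a low value after a restart keeps being rejected until its count catches up with the saved one.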

Lustre-change: https://review.whamcloud.com/32536
Lustre-commit: 0814d5077343953115f50982a2e93cebb29bda68

Change-Id: I9ae7b4faadc65fdaa78458a06315b1739d144feb
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33977
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10985 mdt: properly handle unknown intent requests 21/32521/3
Oleg Drokin [Wed, 2 May 2018 04:48:46 +0000 (00:48 -0400)]
LU-10985 mdt: properly handle unknown intent requests

Invalid intent requests should be rejected early on,
so the later code does not make any assumptions about
validity of various structures like extended pills
and such.

Lustre-change: https://review.whamcloud.com/32237
Lustre-commit: 6a39600f641cc3e179b0149af5ff17ba44d2319f

Change-Id: I74de899ae6cea7017ae8ca9a8c488ca801a38c5d
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Minh Diep <minh.diep@intel.com>
Reviewed-on: https://review.whamcloud.com/32521
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
5 years agoLU-11567 utils: llog_reader print changelog index 18/34018/4
Olaf Faaland [Wed, 24 Oct 2018 21:23:50 +0000 (14:23 -0700)]
LU-11567 utils: llog_reader print changelog index

When processing changelog type llogs, print the changelog index number
with each changelog record.  This allows one to compare the records on
disk with the output of lfs changelog or relatives.

Lustre-change: https://review.whamcloud.com/#/c/33473/
Lustre-commit: 78b0c3ce5a45b4919d53ac6cc886d6a9e713e05d

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@ddn.com>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: I0059cc34b39161462b3eadbb2512dc811c38705a
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34018
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11658 lov: cl_cache could miss initialize 80/33980/4
Yang Sheng [Tue, 13 Nov 2018 20:17:09 +0000 (04:17 +0800)]
LU-11658 lov: cl_cache could miss initialize

The cl_cache may be left uninitialized when a client is mounted
with a deactivated osc that is activated later.

Lustre-change: https://review.whamcloud.com/33650
Lustre-commit: 42e83c44eb5a22cbacf1ed4c6d4d6b588e07faa9

Lustre-change: https://review.whamcloud.com/33983
Lustre-commit: 2c836398f6817064d18719021e4222cc652432b0

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I92cd44375d70624fb55ef7a0218e7178211a8687
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33980
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11186 ofd: fix for a final oid at sequence 11/33111/5
Alexander Boyko [Fri, 27 Jul 2018 13:10:23 +0000 (09:10 -0400)]
LU-11186 ofd: fix for a final oid at sequence

There was an error at the end of the sequence range: the last oid,
0xffffffff, could not be created. 0xffffffff is a valid oid, and the
sequence update happens only if it is created.

LustreError: 11756:0:(ofd_objects.c:217:ofd_precreate_objects())
lustre-OST0000:0xfffffffe:10737419264 hit the OBIF_MAX_OID (1<<32)!
LustreError: 11756:0:(ofd_dev.c:1764:ofd_create_hdl())
lustre-OST0000: unable to precreate: rc = -28

The patch fixes this error.

conf-sanity test 122 is added to check the sequence update.
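
As an illustration of this kind of boundary error, here is a minimal sketch. It is not the actual patch: both helper functions are hypothetical, and only the limit value (1 << 32) and the fact that 0xffffffff is a valid oid come from the message above.

```python
OBIF_MAX_OID = 1 << 32  # limit named in the log message above


def oids_left_buggy(last_oid: int) -> int:
    """Hypothetical off-by-one: treats 0xffffffff as already out of range."""
    return max(OBIF_MAX_OID - 2 - last_oid, 0)


def oids_left_fixed(last_oid: int) -> int:
    """Counts every oid up to and including 0xffffffff as creatable."""
    return max(OBIF_MAX_OID - 1 - last_oid, 0)
```

With `last_oid = 0xfffffffe`, as in the log above, the buggy variant reports no room left, while the fixed variant still allows the final object, after which a sequence update can happen.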

Lustre-change: https://review.whamcloud.com/32891
Lustre-commit: b724079edc5b66e1046b5760a6bad3045e9a9260

Signed-off-by: Alexander Boyko <c17825@cray.com>
Change-Id: I39ad66c05e8358591ca05fadabb2b46bee638070
Cray-bug-id: LUS-6222
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33111
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-8999 quota: fix issue of multiple call of seq start 67/32567/3
Hongchao Zhang [Thu, 24 May 2018 20:33:55 +0000 (16:33 -0400)]
LU-8999 quota: fix issue of multiple call of seq start

Multiple calls to lprocfs_quota_seq_start() could change the block
order in the lower levels of the quota tree, which causes
quota entries to be skipped.

This patch also fixes a problem in walk_tree_dqentry(), where some
entries could be skipped because the "index" could be incremented
even if a valid quota entry had been found.

Lustre-commit: 07fdba2244e748e1721232abc52617209a544b87
Lustre-change: http://review.whamcloud.com/31721

Change-Id: I21f31647ebc15d34dc2021a2d8c5ad40fe128535
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/32567
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11068 build: remove invalid kernel srpm location 14/33314/3
Minh Diep [Fri, 1 Jun 2018 17:56:56 +0000 (10:56 -0700)]
LU-11068 build: remove invalid kernel srpm location

The location has never existed.

Lustre-change: https://review.whamcloud.com/32606
Lustre-commit: 8d150d1ca0db1e584511b795c03b5da7a2787b4d

Change-Id: I8958bbdb5c61284c55d6cc337ac92832f91ee08b
Signed-off-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33314
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10629 lod: Clear OST pool with setstripe 42/33742/2
Ben Evans [Wed, 28 Nov 2018 17:51:17 +0000 (09:51 -0800)]
LU-10629 lod: Clear OST pool with setstripe

When setstripe -d is run on a directory, we should
clear the OST pool along with all the other settings.
Currently there is no way to clear an OST pool,
only to change it.

This patch is back-ported from the following one:
Lustre-commit: 37f6357a5c9f4ad0e2269529e7001f5ab63689a5
Lustre-change: https://review.whamcloud.com/31364

Signed-off-by: Ben Evans <bevans@cray.com>
Cray-bug-id: LUS-5696
Change-Id: I50426ce79ab153a715d29cc5d54b0ce70726da41
Reviewed-on: https://review.whamcloud.com/33742
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years agoLU-11536 ofd: ofd_create_hdl may return 0 in case of ENOSPC 78/33978/2
Sergey Cheremencev [Mon, 25 Jun 2018 15:52:11 +0000 (18:52 +0300)]
LU-11536 ofd: ofd_create_hdl may return 0 in case of ENOSPC

ostid_set_id() overwrites the ofd_precreate_objects() result after
"LU-6401 uapi: fix up lustre_ostid.h and lustre_fid.h".
This breaks the logic of osp_precreate_reserve(), causing
osp_precreate_send() to return ESTALE instead of ENOSPC
when the OST can't precreate objects.
osp_precreate_send() returns ESTALE because the result of
create is 0 while the last created fid on the OST is still the
same as the local last_id:

fs1-OST0001-osc-MDT0000: precreate fid [0x100010000:0x571607f:0x0] <
local used fid [0x100010000:0x571607f:0x0]: rc = -116
fs1-OST0001-osc-MDT0000: precreate failed opd_pre_status -116
fs1-OST0001-osc-MDT0000: cannot precreate objects: rc = -116
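
The overwritten return code can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the OFD code: all four functions and the `state` dictionary are hypothetical, and only the pattern (a later helper's 0 replacing an earlier -ENOSPC) comes from the description above.

```python
import errno


def precreate_objects(free_slots: int, wanted: int) -> int:
    """Hypothetical precreate: number of objects created, or -ENOSPC."""
    if free_slots == 0:
        return -errno.ENOSPC
    return min(free_slots, wanted)


def set_last_id(state: dict, oid: int) -> int:
    """Hypothetical helper that records the last id and returns 0."""
    state["last_id"] = oid
    return 0


def create_buggy(state: dict, free_slots: int, wanted: int) -> int:
    rc = precreate_objects(free_slots, wanted)
    created = max(rc, 0)
    # The helper's return value replaces rc, so -ENOSPC is lost.
    rc = set_last_id(state, state["last_id"] + created)
    return rc


def create_fixed(state: dict, free_slots: int, wanted: int) -> int:
    rc = precreate_objects(free_slots, wanted)
    created = max(rc, 0)
    rc2 = set_last_id(state, state["last_id"] + created)
    # Keep the precreate result unless the helper itself failed.
    return rc2 if rc2 < 0 else rc
```

In the buggy variant the caller sees 0 while the last id has not moved, which matches the "precreate fid < local used fid" symptom in the log lines above.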

Lustre-change: https://review.whamcloud.com/33390
Lustre-commit: 1f97bb8e7236971aaf3029fe3699db9baf721da1

Change-Id: I4dc057c201253cab14e63c1f06bd5b0d56b5ad2d
Signed-off-by: Sergey Cheremencev <c17829@cray.com>
Fixes: 34acfbc2bfe502d18c12ba35771bde7c4a0f7906
Reviewed-on: https://es-gerrit.dev.cray.com/153462
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Artem Blagodarenko <c17828@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33978
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11195 lod: Mark comps cached on replay of layout change 10/33110/3
Ann Koehler [Mon, 30 Jul 2018 21:02:59 +0000 (16:02 -0500)]
LU-11195 lod: Mark comps cached on replay of layout change

Replay of a layout change request on a PFL file leaves the object
in an unexpected state: Some components can have llc_stripe set
but ldo_comp_cached is not set in the object. The next layout
change request on the same object will LBUG when it tries to free
the comp entries.

The fix is to set ldo_comp_cached on replay so subsequent layout
change requests will use the in memory components rather than
fetching them from disk.

Lustre-change: https://review.whamcloud.com/32904
Lustre-commit: e021026d0c37d8806d16dbaad6a9d4f47844c999

Signed-off-by: Ann Koehler <amk@cray.com>
Change-Id: I8eaee5614c7f2f6e6a3f2c51de93a65422a3122b
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33110
Reviewed-by: Patrick Farrell <paf@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-10612 tests: reply_single.sh,test_48: No space left 64/33764/3
Elena Gryaznova [Tue, 6 Feb 2018 14:20:53 +0000 (17:20 +0300)]
LU-10612 tests: reply_single.sh,test_48: No space left

The MDS needs time to discover the OST state, attempt to
recover, fail, and recover again.

Backport from b2_11 branch to b2_10:
Lustre-change: https://review.whamcloud.com/#/c/31182/
Lustre-commit: ee9d75f41743874cc6aebcfd4daa3c3c71e003cf

Author: gaurav mahajan <gaurav.mahajan@seagate.com>

Signed-off-by: gaurav mahajan <gaurav.mahajan@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: trivial testlist=replay-single envdefinitions="ONLY=48"
Cray-bug-id: LUS-4384
Seagate-bug-id: MRP-2616
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Rahul Deshmukh <rahul.deshmukh@seagate.com>
Change-Id: I2b3cca70872b7c9f13c64b50e1b4373096fbc147
Reviewed-on: https://review.whamcloud.com/31182
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
(cherry picked from commit ee9d75f41743874cc6aebcfd4daa3c3c71e003cf)
Reviewed-on: https://review.whamcloud.com/33764
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years agoLU-10818 obdecho: don't set ma_need in echo_attr_get_complex() 76/33976/2
Nikitas Angelinas [Fri, 31 Aug 2018 08:04:18 +0000 (11:04 +0300)]
LU-10818 obdecho: don't set ma_need in echo_attr_get_complex()

echo_attr_get_complex() copies ma_need to a local variable, masks
MA_* values other than MA_INODE if MA_INODE is set in ma_need,
and restores the saved value of ma_need before the function exits.
This does not seem to be useful, and triggers an assertion in
echo_big_lmm_get() when MA_LOV and/or MA_LMV is set in ma_need.

Signed-off-by: Nikitas Angelinas <nangelinas@cray.com>
Cray-bug-id: LUS-6252

Lustre-change: https://review.whamcloud.com/33097
Lustre-commit: 40f70cd4cb1bb33c754607862dece7c6c1c30d38

Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Change-Id: I3f5a01b57bdd83937f19fd1fa392b53f7b316455
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33976
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-10055 mdt: use max_mdsize in reply for layout intent 33/33133/4
Mikhal Pershin [Mon, 30 Oct 2017 16:45:42 +0000 (19:45 +0300)]
LU-10055 mdt: use max_mdsize in reply for layout intent

The LAYOUT intent reply LVB buffer is sized to the current
file layout, which does not work when the layout is changed;
mdt_max_mdsize is a better choice for the reply buffer size.
The buffer is shrunk to the new layout size afterwards anyway.

Without this change, the new layout may be larger than the buffer,
so the layout is not returned, causing an extra RPC from the client.
mdt_lvbo_fill() is also changed to update mdt_max_mdsize if a
larger layout is found. The related message level is lowered
from D_ERROR to D_INFO.

Lustre-change: https://review.whamcloud.com/30004
Lustre-commit: 4f27911cadf10d0b2fd6451569e688233eaf50d1

Signed-off-by: Mikhal Pershin <mike.pershin@intel.com>
Change-Id: Iaac5dcb8b4c5aa2c050dddb5b3fb2662c59f133b
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33133
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years agoLU-11187 ldiskfs: don't mark mmp buffer head dirty 36/33336/2
Li Dongyang [Tue, 21 Aug 2018 00:10:45 +0000 (10:10 +1000)]
LU-11187 ldiskfs: don't mark mmp buffer head dirty

Marking the mmp bh dirty before writing it makes writeback pick up
the mmp block later and submit a write. We don't want the duplicate
write, as the kmmpd thread should have full control of reading and
writing the mmp block.
Another reason is that we will also see random I/O errors on the
writeback request when blk integrity is enabled, because kmmpd
could modify the content of the mmp block (e.g. setting a new seq
and time) while the mmp block is under I/O requested by writeback.

Linux-commit: fe18d649891d813964d3aaeebad873f281627fbc

Lustre-change: https://review.whamcloud.com/33038
Lustre-commit: dd02d32c978ad95c9e2a3703ad6be7511c257a4d

Test-Parameters: testgroup=review-ldiskfs testlist=mmp
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I5aa9fd384a4ea25ee52f1198528fae4ecc9c28c7
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33336
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-7631 tests: wait_osts_up waits for MDS precreates 90/33690/2
Andreas Dilger [Mon, 5 Jun 2017 19:22:17 +0000 (13:22 -0600)]
LU-7631 tests: wait_osts_up waits for MDS precreates

Fix wait_osts_up() to wait for the MDS to finish orphan cleanup and
precreate some OST objects so that there isn't a race to get all of
the OSTs available for conf-sanity test_82a.

Backport from master (2.12) branch to b2_10:
Lustre-change: https://review.whamcloud.com/#/c/27441/
Lustre-commit: edb0fb241bb5e0cc95c240ed977abf7f234ee045

Test-Parameters: trivial testlist=replay-single,replay-single
Test-Parameters: testlist=conf-sanity,conf-sanity,conf-sanity
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I534f0a1f36c3d00f702684041bfa991a4a3ebbe5
Reviewed-on: https://review.whamcloud.com/27441
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
(cherry picked from commit edb0fb241bb5e0cc95c240ed977abf7f234ee045)
Reviewed-on: https://review.whamcloud.com/33690
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-9230 ldlm: speed up preparation for list of lock cancel 30/33130/3
Yang Sheng [Mon, 10 Sep 2018 12:37:10 +0000 (20:37 +0800)]
LU-9230 ldlm: speed up preparation for list of lock cancel

Keeping the skipped locks in the lru list causes serious
contention for ns_lock, since we have to traverse them
every time in ldlm_prepare_lru_list(). So use a cursor
to record the position of the last accessed lock in the
lru list.
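
The cursor idea can be sketched with a toy list. This is only an illustration, not the ldlm implementation: the `LruScanner` class, its `visited` counter, and the `can_cancel` callback are hypothetical, and a real implementation would also need rules for when to reset the cursor.

```python
class LruScanner:
    """Toy LRU list that remembers where the previous scan stopped."""

    def __init__(self, locks):
        self.locks = list(locks)  # oldest first
        self.cursor = 0           # index of the first lock not yet examined
        self.visited = 0          # entries examined so far, for comparison

    def prepare_cancel_list(self, can_cancel, limit):
        """Collect up to `limit` cancellable locks, resuming at the cursor."""
        picked = []
        i = self.cursor
        while i < len(self.locks) and len(picked) < limit:
            self.visited += 1
            lock = self.locks[i]
            if can_cancel(lock):
                picked.append(lock)
                del self.locks[i]  # later entries shift left, so i stays put
            else:
                i += 1             # skipped lock stays in the list
        self.cursor = i
        return picked
```

Because each call resumes at the cursor, locks that were skipped earlier are not walked again, which is what reduces the time spent holding the list lock in this model.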

Change-Id: Ibda36a90e54076cb785a65910b34300639b3e140
Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Signed-off-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: https://review.whamcloud.com/33130
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11461 scripts: Support symlink target 97/33597/3
Nathaniel Clark [Thu, 1 Nov 2018 15:13:26 +0000 (11:13 -0400)]
LU-11461 scripts: Support symlink target

Support a configured target that is a symlink to the real device,
for instance /dev/disk/by-id/scsi-WWID.  Also check against the bare
target for ZPOOL/DEVICE, which returns an empty string when passed
to realpath.
Also fix the usage function, so it prints usage and doesn't just
error out.
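
The comparison described above might look like the following sketch. It is not the script from the patch (which is shell): `same_target()` is a hypothetical helper, and the fallback to a plain string comparison is an assumption standing in for the "bare target" check for names that are not paths on disk.

```python
import os


def same_target(configured: str, actual: str) -> bool:
    """Compare a configured target with a device name, following symlinks.

    Names that do not exist as paths (for example a pool/dataset name)
    are compared as plain strings instead of being resolved.
    """
    if os.path.exists(configured):
        return os.path.realpath(configured) == os.path.realpath(actual)
    return configured == actual
```

A link such as /dev/disk/by-id/scsi-WWID would then match the device node it points to, while a ZPOOL/DEVICE style name is matched literally.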

Lustre-change: https://review.whamcloud.com/33277
Lustre-commit: c36d70272541d3ba3dd9051e6f50cf89eaba639f

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I699b1fd36c1e53e99a8d0e6b691374eca42fccc9
Reviewed-by: Joe Grund <jgrund@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33597
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11582 llite: protect reading inode->i_data.nrpages 81/33681/4
Bobi Jam [Sun, 11 Nov 2018 08:41:21 +0000 (16:41 +0800)]
LU-11582 llite: protect reading inode->i_data.nrpages

truncate_inode_pages() looks up pages in the radix tree without
taking the lock, and could miss pages that are being removed from
the radix tree by __remove_mapping(). So after calling
truncate_inode_pages() we need to read the nrpages of inode->i_data
under the protection of tree_lock.

Otherwise the read could still fall in the race window of
__remove_mapping()->__delete_from_page_cache()->
page_cache_tree_delete(), before nrpages has been decreased.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I44ba6bea3dec4f0a110d1ae2a749514ec7dd0d12
Reviewed-on: https://review.whamcloud.com/33681
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
5 years agoLU-10527 obdclass: don't recycle loghandle upon ENOSPC 50/33850/3
Bruno Faccini [Wed, 17 Jan 2018 15:22:58 +0000 (16:22 +0100)]
LU-10527 obdclass: don't recycle loghandle upon ENOSPC

In llog_cat_add_rec(), when llog_cat_new_log() returns -ENOSPC,
don't reset "cathandle->u.chd.chd_current_log" to NULL.
This avoids having llog_cat_declare_add_rec() repeatedly and
unnecessarily create new, partially initialized LLOGs/llog_handles
and assign them to "cathandle->u.chd.chd_current_log" without
llog_init_handle() ever being called to initialize
"loghandle->lgh_hdr".

Also, an unnecessary LASSERT(llh) has been removed from
llog_cat_current_log(), as it prevented this case from being handled
gracefully by simply returning the loghandle.
Thanks to S. Cheremencev (Cray) for reporting this.

Both fixes have been kept in the patch, as the first part gives
better performance in terms of the number of FS operations done
under a permanent changelog ENOSPC condition, even if this covers
a somewhat unlikely situation.

Lustre-commit: 5761b9576d39c44c02455b86eb86ce1276930e60
Lustre-change: https://review.whamcloud.com/30897

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I526f788dc283fa7136ba518179d9337e1d5e3714
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33850
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoNew release 2.10.6 2.10.6 v2_10_6
Oleg Drokin [Wed, 12 Dec 2018 05:34:13 +0000 (00:34 -0500)]
New release 2.10.6

Signed-off-by: Oleg Drokin <green@whamcloud.com>
5 years agoNew tag 2.10.6-RC3. 2.10.6-RC3 v2_10_6-RC3 v2_6_10_6-RC3
Oleg Drokin [Fri, 30 Nov 2018 18:33:10 +0000 (13:33 -0500)]
New tag 2.10.6-RC3.

Third release candidate for 2.10.6 release.

Signed-off-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11663 osd-zfs: write partial pages with correct offset 48/33748/3
Alex Zhuravlev [Tue, 27 Nov 2018 06:47:50 +0000 (09:47 +0300)]
LU-11663 osd-zfs: write partial pages with correct offset

otherwise non-aligned writes send wrong data to ZFS.

Lustre-change: https://review.whamcloud.com/33726
Lustre-commit: 6f9a0292eacb0d603b14cc03290a574cb7f0c846
Change-Id: I1ae1f361981d548307d74344a5694f3ef39c0609
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33748
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years agoNew tag 2.10.6-RC2. 2.10.6-RC2 v2_10_6-RC2
Oleg Drokin [Thu, 22 Nov 2018 04:46:55 +0000 (23:46 -0500)]
New tag 2.10.6-RC2.

Second release candidate for 2.10.6 release.

Change-Id: I5ffa43c138c1d021c9ced6c0a1bb3c7abc34cf8d
Signed-off-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11424 lnet: copy the correct amount of cpts to lnet_cpts 12/33312/2
James Simmons [Tue, 25 Sep 2018 03:28:57 +0000 (23:28 -0400)]
LU-11424 lnet: copy the correct amount of cpts to lnet_cpts

An incorrect size was used in the memory copy of the requested
cpts to net->lnet_cpts. This led to the following in testing:
RIP: 0010:lnet_match2mt.isra.8+0x2b/0x40 [lnet]

lnet_mt_of_attach+0x72/0x1b0 [lnet]
LNetMEAttach+0x60/0x1f0 [lnet]
ptl_send_rpc+0x26f/0xbb0 [ptlrpc]
libcfs_debug_msg+0x57/0x80 [libcfs]
ptlrpc_send_new_req+0x4c9/0x860 [ptlrpc]
ptlrpc_check_set.part.21+0x855/0x18b0 [ptlrpc]
? try_to_del_timer_sync+0x4d/0x80
? del_timer_sync+0x35/0x40
ptlrpcd_check+0x3ae/0x3f0 [ptlrpc]
ptlrpcd+0x2be/0x320 [ptlrpc]
? wait_woken+0x80/0x80

Change the size from ncpts to ncpts * sizeof(*net->net_cpts).
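
The effect of passing an element count where a byte count is expected can be reproduced with a small ctypes sketch. This is an illustration only: the helper `copy_cpts()` is hypothetical, and the 32-bit element type is an assumption made for the example.

```python
import ctypes


def copy_cpts(src, nbytes):
    """Copy nbytes from src into a zeroed array of the same length."""
    dst = (ctypes.c_uint32 * len(src))()
    ctypes.memmove(dst, src, nbytes)
    return list(dst)


ncpts = 4
requested = (ctypes.c_uint32 * ncpts)(1, 2, 3, 4)
elem_size = ctypes.sizeof(ctypes.c_uint32)

# Element count used as the byte count: only the first entry is copied.
short = copy_cpts(requested, ncpts)              # [1, 0, 0, 0]

# Byte count scaled by the element size: the whole array is copied.
full = copy_cpts(requested, ncpts * elem_size)   # [1, 2, 3, 4]
```

The entries left at zero in the short copy correspond to the partially copied CPT list that later code would then index into.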

Lustre-change: https://review.whamcloud.com/33229
Lustre-commit: 5afe99cac9a7be5bf776d68c65b2fe51b31591ae

Change-Id: I10832430e53ccc5b40ebce3ddfd2cf9ea330b0df
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33312
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10584 tests: sanityn test 51d cancel layout lock 93/33393/2
James Nunez [Wed, 17 Oct 2018 22:55:16 +0000 (16:55 -0600)]
LU-10584 tests: sanityn test 51d cancel layout lock

Modify sanityn test 51d to cancel layout locks directly
using cancel_lru_locks.

Author: Bobi Jam <bobijam@whamcloud.com>

Test-Parameters: trivial testlist=sanityn
Test-Parameters: mdsjob=lustre-master ossjob=lustre-master serverbuildno=3807 testlist=sanityn
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I603e99ad847d2726ff18c742db1f2d0b9f20e98b
Reviewed-on: https://review.whamcloud.com/33393
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoRevert "LU-5152 quota: enforce block quota for chgrp" 82/33682/2
Andreas Dilger [Fri, 16 Nov 2018 20:38:25 +0000 (13:38 -0700)]
Revert "LU-5152 quota: enforce block quota for chgrp"

This reverts commit 07412234ec60de20cb8d8e45d755297fe6da2d61.

This can cause deadlocks between the MDT and OST as seen in
LU-11465 and other related problems.  Reverting this patch
means that file group changes will not be updated in group quota.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I5ce48424f4d2011ce62e69047ace7f0b7c11a93b
Reviewed-on: https://review.whamcloud.com/33682
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11563 build: Only add l_tunedisk udev rule to server 88/33688/2
Nathaniel Clark [Wed, 24 Oct 2018 19:49:39 +0000 (15:49 -0400)]
LU-11563 build: Only add l_tunedisk udev rule to server

Split LU-9551 patch off into server only udev rules.
It just spits errors on the client since l_tunedisk is a server-side
only tool.

Lustre-change: https://review.whamcloud.com/33466
Lustre-commit: 0d11a314787bc795797a016262e9bcfe86e2193e

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: Iee426588bcce611dc913cf89a4bcb733c364482b
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Jay J Lan <jay.j.lan@nasa.gov>
Reviewed-on: https://review.whamcloud.com/33688
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11525 kernel: new kernel [RHEL7.6 3.10.0-957.el7] 21/33621/3
Jian Yu [Mon, 12 Nov 2018 09:58:13 +0000 (01:58 -0800)]
LU-11525 kernel: new kernel [RHEL7.6 3.10.0-957.el7]

This patch makes changes to support new RHEL 7.6 release.

Test-Parameters: clientdistro=el7.6 ossdistro=el7.6 mdsdistro=el7.6

Change-Id: I1c6f1d065404fdfecf9ea978a2b161b4f90d3ab2
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33621
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
5 years agoLU-11448 kernel: kernel update RHEL7.5 [3.10.0-862.14.4.el7] 96/33496/4
Jian Yu [Mon, 12 Nov 2018 08:00:18 +0000 (00:00 -0800)]
LU-11448 kernel: kernel update RHEL7.5 [3.10.0-862.14.4.el7]

Update RHEL7.5 kernel to 3.10.0-862.14.4.el7.

This patch is back-ported from the following one:
Lustre-commit: e734d9e803443139ccb160bbd4eae05ecc9627e5
Lustre-change: https://review.whamcloud.com/33254

Change-Id: I4901102347a14d23645547efc84857868acec0f7
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33496
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
5 years agoLU-11215 tests: replace "large_xattr" with "ea_inode" 03/33503/2
Li Dongyang [Sun, 28 Oct 2018 18:14:05 +0000 (11:14 -0700)]
LU-11215 tests: replace "large_xattr" with "ea_inode"

Change the test scripts over to using the "ea_inode" name, since
this is what the upstream e2fsprogs is using.  The "large_xattr"
feature name was only ever used in the Lustre-patched e2fsprogs.

Don't try to turn off "ea_inode" feature on the targets anymore,
it's not supported by upstream e2fsprogs.

e2fsprogs commit: 5b72578279fe2470e682692a15d70a43d9289e0f

This patch is back-ported from the following one:
Lustre-commit: 900a1b6e4271a6c1903d6723082a01c98defe7b2
Lustre-change: https://review.whamcloud.com/33012

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I83bd303827fa28050d1d6d2416b2d630dc94ec12
Reviewed-on: https://review.whamcloud.com/33503
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
5 years agoLU-11412 kernel: kernel update [SLES12 SP3 4.4.155-94.50] 13/33313/4
Jian Yu [Thu, 27 Sep 2018 17:07:21 +0000 (10:07 -0700)]
LU-11412 kernel: kernel update [SLES12 SP3 4.4.155-94.50]

Update SLES12 SP3 kernel to 4.4.155-94.50.

Test-Parameters: envdefinitions=CONF_SANITY_EXCEPT=103 \
mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs \
clientdistro=sles12sp3 ossdistro=sles12sp3 mdsdistro=sles12sp3 \
testlist=sanity,conf-sanity,sanity-sec

Lustre-change: https://review.whamcloud.com/33236
Lustre-commit: 937d72acf48f9d7cc2e382ac98c1d37c1f5fb1df

Change-Id: I47b72420664fe614a2da7c864c401a9729d96c55
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33313
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
5 years agoLU-11500 kernel: kernel update RHEL6.10 [2.6.32-754.6.3.el6] 48/33348/2
Jian Yu [Wed, 10 Oct 2018 18:04:54 +0000 (11:04 -0700)]
LU-11500 kernel: kernel update RHEL6.10 [2.6.32-754.6.3.el6]

Update RHEL6.10 kernel to 2.6.32-754.6.3.el6 for Lustre client.

Test-Parameters: clientdistro=el6.10

Change-Id: I65f5c77f2fb9c086ce756ff4895fda71772128ba
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33348
Tested-by: Jenkins
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
5 years agoLU-10579 mdd: emit changelogs for security and trusted xattrs 48/33048/3
John L. Hammond [Tue, 21 Aug 2018 21:14:36 +0000 (16:14 -0500)]
LU-10579 mdd: emit changelogs for security and trusted xattrs

In mdd_xattr_changelog_type() include security.* and trusted.* among
the xattr names for which we emit a changelog record. This was done in
2.11 by https://review.whamcloud.com/28251 LU-9727 lustre: add
CL_GETXATTR for Changelogs.

Test-Parameters: trivial testlist=lustre-rsync-test
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I08cf6a293ac33dfb4b15a04000e301c516d7bd95
Reviewed-on: https://review.whamcloud.com/33048
Tested-by: Jenkins
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
5 years agoNew tag 2.10.6-RC1. 2.10.6-RC1 v2_10_6-RC1
John L. Hammond [Fri, 5 Oct 2018 16:58:58 +0000 (11:58 -0500)]
New tag 2.10.6-RC1.

First release candidate for 2.10.6 release.

Change-Id: I60b19bda5c102d980638555eacd725cb3c712538
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
5 years agoLU-11160 build: Fix uuid / blkid dependency 47/33147/2
Nathaniel Clark [Thu, 19 Jul 2018 19:26:27 +0000 (15:26 -0400)]
LU-11160 build: Fix uuid / blkid dependency

UUID dependency stems from libblkid, so only link with uuid if blkid
is present.

Lustre-change: https://review.whamcloud.com/32842
Lustre-commit: b6c2ffa661f5aceb50fe03fbc8cfd7cdf7966ae2

Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: If1cc293cc48210a065f8910ea655615b11268b5c
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33147
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>