Whamcloud - gitweb
fs/lustre-release.git
5 years ago LU-11999 dne: performance improvement for file creation 05/34505/2
Jinshan Xiong [Sun, 24 Feb 2019 22:32:41 +0000 (14:32 -0800)]
LU-11999 dne: performance improvement for file creation

This removes obsolete code that causes drastic performance
degradation. The code was written before the PERM lock was
introduced; it requests an UPDATE lock during path walk for a
remote directory, and that lock is then cancelled by the later
file creation.

Test results before and after this patch is applied:

Test case:
rm -rf /mnt/lustre_purple/testdir
lfs mkdir -i 0 /mnt/lustre_purple/testdir
lfs mkdir -i 2 /mnt/lustre_purple/testdir/dir2
./lustre-release/lustre/tests/createmany -o \
/mnt/lustre_purple/testdir/dir2/f 10000

Before the patch is applied:
total: 10000 open/close in 12.82 seconds: 780.22 ops/second

After the patch is applied:
total: 10000 open/close in 4.89 seconds: 2044.75 ops/second
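
The two createmany results can be compared directly. The following is a minimal sketch in Python, using only the figures quoted above; the function names are hypothetical and exist only for this illustration.

```python
# Figures copied from the createmany output quoted in this commit message.
BEFORE_SECONDS = 12.82
AFTER_SECONDS = 4.89
BEFORE_OPS_PER_SEC = 780.22
AFTER_OPS_PER_SEC = 2044.75


def speedup(before_rate: float, after_rate: float) -> float:
    """Return how many times higher the 'after' rate is than the 'before' rate."""
    return after_rate / before_rate


def time_saved_fraction(before_s: float, after_s: float) -> float:
    """Return the fraction of wall-clock time removed by the patch."""
    return (before_s - after_s) / before_s


rate_gain = speedup(BEFORE_OPS_PER_SEC, AFTER_OPS_PER_SEC)
time_gain = time_saved_fraction(BEFORE_SECONDS, AFTER_SECONDS)
```

With these numbers the create rate is roughly 2.6 times higher and the run takes about 62% less time.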

Lustre-change: https://review.whamcloud.com/34291
Lustre-commit: bfbd062e6b177cf934b75d6be2db695b9fe1648b

Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Change-Id: Ib474dc28d6edc7d15801b6821edc0e1d108bb4b6
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34505
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11913 utils: allow "mq-deadline" as scheduler 26/34426/2
Andreas Dilger [Fri, 1 Feb 2019 20:10:40 +0000 (13:10 -0700)]
LU-11913 utils: allow "mq-deadline" as scheduler

Allow the "mq-deadline" scheduler for multi-queue block devices, in
addition to just "noop" and "deadline".  Explicitly add "deadline"
as a valid option, in case the default scheduler is changed.
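
The acceptance rule described here can be modelled as follows. This is only a sketch in Python, not the code from the Lustre utilities: it assumes the usual Linux sysfs convention where the active scheduler is shown in square brackets (for example in /sys/block/<dev>/queue/scheduler), and the names `ALLOWED_SCHEDULERS`, `active_scheduler` and `scheduler_is_acceptable` are hypothetical.

```python
# The three names the commit message lists as valid.
ALLOWED_SCHEDULERS = frozenset({"noop", "deadline", "mq-deadline"})


def active_scheduler(sysfs_line: str):
    """Return the scheduler shown in [brackets], or None if none is marked."""
    for token in sysfs_line.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def scheduler_is_acceptable(sysfs_line: str) -> bool:
    """True when the active scheduler is one of the allowed names."""
    return active_scheduler(sysfs_line) in ALLOWED_SCHEDULERS
```

For a multi-queue device reporting "[mq-deadline] kyber none" the check passes, while "noop deadline [cfq]" does not.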

Lustre-change: https://review.whamcloud.com/34163
Lustre-commit: 4326ab53b7142e474c75b46da2361d148a2f7ce8

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I2cb0878188aea43f88c503ea70a699be083ebbe5
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/34426
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11720 spec: srpm should be free of kernel requirements 10/34310/2
Nathaniel Clark [Mon, 3 Dec 2018 19:04:37 +0000 (14:04 -0500)]
LU-11720 spec: srpm should be free of kernel requirements

This moves the fix for LU-9731 into spec file and out of lbuild.
This lets "make rpms" benefit from the fix.
This also prevents the srpm from being incorrectly locked to the
kernel present when lbuild was used to create it (via
kmp-lustre.preamble).

Lustre-change: https://review.whamcloud.com/33771
Lustre-commit: 3c280a95736a884bc2f36dad674505f1d5b00982

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I15f61c0e37182c0efbea3566d43b1e89f180d3e5
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34310
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-8384 scripts: Add scripts to systemd for EL7 03/34503/2
Dmitry Eremin [Fri, 8 Jul 2016 21:15:37 +0000 (00:15 +0300)]
LU-8384 scripts: Add scripts to systemd for EL7

When rebooting a Lustre client where a Lustre filesystem is still
mounted, the shutdown hangs. This patch creates a systemd service
that unmounts the Lustre filesystems and unloads the Lustre modules
when the system is shut down.

Test-Parameters: trivial

Lustre-change: https://review.whamcloud.com/21457
Lustre-commit: 495deddfbb43f247b2fa9dd2da5743abc89cd862

Change-Id: I1cfe84684e23b8861743241dfbc4d6e320ace4a6
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Signed-off-by: Gregoire Pichon <gregoire.pichon@atos.net>
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34503
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11849 utils: fix to make exclude projid work 07/34407/2
Wang Shilong [Thu, 10 Jan 2019 15:32:14 +0000 (23:32 +0800)]
LU-11849 utils: fix to make exclude projid work

We intended to use projid, not uid, here; fix it.
Also add a test of the "! --projid" option to cover this.

Lustre-change: https://review.whamcloud.com/34005
Lustre-commit: db9965ce33365c2645827b06af21f8f5918ea2bb

Test-Parameters: trivial testlist=sanity-quota
Change-Id: I64c3f1c68885947d0e91626525ee037756e1d7d8
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-on: https://review.whamcloud.com/34407
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11964 mdc: prevent glimpse lock count grow 04/34504/2
Mikhail Pershin [Thu, 14 Feb 2019 21:51:00 +0000 (00:51 +0300)]
LU-11964 mdc: prevent glimpse lock count grow

DOM lock matching tries to ignore locks with the
LDLM_FL_KMS_IGNORE flag during ldlm_lock_match(), but it
checks the flag only after the ldlm_lock_match() call. Therefore, if
any lock with that flag is in the queue, all other
locks after it are ignored and a new lock is created, causing
a large number of locks on a single resource in some access
patterns.
The patch extends the lock_matches() function to check flags to
exclude and adds ldlm_lock_match_with_skip() to use that
when needed.
A corresponding test was added in sanity-dom.sh.
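
The difference between checking the flag after the match and skipping flagged locks during the match can be shown with a small model. This is a sketch in Python under simplifying assumptions, not the LDLM code: `Lock`, `KMS_IGNORE`, `match_then_check` and `match_with_skip` are hypothetical stand-ins, and a lock is reduced to a mode plus a flag word.

```python
from dataclasses import dataclass

KMS_IGNORE = 0x1  # hypothetical stand-in for LDLM_FL_KMS_IGNORE


@dataclass
class Lock:
    mode: str
    flags: int = 0


def match_then_check(queue, mode, skip_flags):
    """Old behaviour: take the first lock with the right mode, test flags afterwards."""
    for lock in queue:
        if lock.mode == mode:
            # A flagged first match hides every usable lock behind it.
            return None if lock.flags & skip_flags else lock
    return None


def match_with_skip(queue, mode, skip_flags):
    """New behaviour: flagged locks are skipped while the queue is scanned."""
    for lock in queue:
        if lock.mode == mode and not (lock.flags & skip_flags):
            return lock
    return None


queue = [Lock("PR", KMS_IGNORE), Lock("PR")]
old_result = match_then_check(queue, "PR", KMS_IGNORE)
new_result = match_with_skip(queue, "PR", KMS_IGNORE)
```

In this model the old path finds nothing, so the caller would create another lock, while the new path reuses the second lock in the queue.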

Test-Parameters: testlist=sanity-dom

Lustre-change: https://review.whamcloud.com/34261
Lustre-commit: b915221b6d0f3457fd9dd202a9d14c5f8385bf47

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ic45ca10f0e603e79a3a00e4fde13a5fae15ea5fc
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34504
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11418 mdd: delete name if orphan doesn't exist 26/34326/3
Lai Siyao [Tue, 23 Oct 2018 11:17:20 +0000 (19:17 +0800)]
LU-11418 mdd: delete name if orphan doesn't exist

mdd_orphan_destroy() should delete the name if the orphan object doesn't
exist; otherwise the orphan cleanup thread will try to destroy this
orphan in an endless loop.

Add sanity test_811.

Lustre-change: https://review.whamcloud.com/33661
Lustre-commit: fffef5c29e3bdf0f96168abc3d0488bad06f33bb

Fixes: 5d89450b462f ("LU-11516 mdd: do not assert on missing orphan")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Id22b2fab0ac87dfb81ca9f01d8ed0338f1b12120
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34326
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-11827 llog: protect cathandle in llog_cat_declare_add_rec 55/34455/3
Vladimir Saveliev [Sat, 22 Dec 2018 00:31:45 +0000 (03:31 +0300)]
LU-11827 llog: protect cathandle in llog_cat_declare_add_rec

llog_cat_declare_add_rec() calls llog_cat_prep_log() passing
&cathandle->u.chd.chd_current_log and
&cathandle->u.chd.chd_next_log. Then it has to protect cathandle in
order to avoid race with llog_cat_current_log() when it decides to
change cathandle->u.chd.chd_current_log and
cathandle->u.chd.chd_next_log.

Lustre-change: https://review.whamcloud.com/33914
Lustre-commit: 59a62ada2e18174e5611730e8bcf5ba3165ca2b9

Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Cray-bug-id: LUS-6804
Change-Id: I689efb40452af180f137aff35ccabe132a24180a
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34455
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11206 tests: Use import_ready to check IDLE 71/34471/3
Patrick Farrell [Mon, 11 Feb 2019 17:47:09 +0000 (12:47 -0500)]
LU-11206 tests: Use import_ready to check IDLE

When checking if a client/OST import is up, we have to
check for IDLE as well as FULL.

wait_osc_import_ready is provided for this, but a few spots
don't use it, so they occasionally fail.
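
The rule behind the fix is that an import counts as up in either of two states. A minimal sketch of that rule, written in Python for illustration only: the real helper, wait_osc_import_ready, is part of the shell test framework, and the names `READY_STATES` and `import_is_ready` are hypothetical.

```python
# States in which a client/OST import is considered up, per the commit message.
READY_STATES = frozenset({"FULL", "IDLE"})


def import_is_ready(state: str) -> bool:
    """True when the reported import state is FULL or IDLE."""
    return state.strip().upper() in READY_STATES
```

A check that compares only against FULL would report a healthy but idle import as down, which matches the occasional failures described above.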

Lustre-change: https://review.whamcloud.com/34225
Lustre-commit: 3ed6b8c2ea27b1a3a9fa073e19d77d7c317ae69f

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I826659a7f5953dee4e4551c1177479ef742b5589
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34471
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-11330 osd-zfs: hash for ./.. must be 0 51/34451/4
Alex Zhuravlev [Thu, 24 Jan 2019 05:04:09 +0000 (08:04 +0300)]
LU-11330 osd-zfs: hash for ./.. must be 0

Do not use the current iterator position as the hash source for dot and
dotdot. Instead, just return 0 as the hash for these entries.

Lustre-commit: fb75af7d45d1217c877f75c4296f9df0cc731604
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I5ee439b237e8ed98d295f5672b1d0e8a6b48a55b
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Lustre-change: https://review.whamcloud.com/34098
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34451
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-9706 dt: remove dt_txn_hook_commit() 96/34496/2
Alex Zhuravlev [Thu, 7 Feb 2019 09:33:12 +0000 (12:33 +0300)]
LU-9706 dt: remove dt_txn_hook_commit()

It's not used, and it's not safe, as dt_txn_callback_del()
and dt_txn_callback_add() can race with commit callbacks.

Lustre-change: https://review.whamcloud.com/34212
Lustre-commit: e763467ebe00913e8d03f855dc4b918b95099931

Change-Id: Ib80b0f69be008b4f895586dde35d1a5833a1a861
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34496
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-11243 lod: fix assertion and hang upon lod_add_device failure 50/34450/3
Wang Shilong [Mon, 10 Dec 2018 05:45:33 +0000 (13:45 +0800)]
LU-11243 lod: fix assertion and hang upon lod_add_device failure

There are two problems.

The first is the following assertion:

    lod_add_device() lustre-OSTe42a-osc-MDT0000:
                     can't set up pool, failed with -12
    osp_disconnect() ASSERTION( imp != ((void *)0) ) failed:
    osp_disconnect() LBUG
    CPU: 1 PID: 10059 Comm: llog_process_th

The problem is that obd_disconnect() cleans up @imp and sets it to NULL:
 ->osp_obd_disconnect
    ->class_manual_cleanup
       ->class_process_config
          ->class_cleanup
             ->obd_precleanup
                ->osp_device_fini
                   ->client_obd_cleanup

Meanwhile, ldo_process_config() tries to access @imp again:
 ->ldo_process_config
    ->osp_shutdown
       ->osp_disconnect
          ->LASSERT(imp != NULL)

The other problem is that if we fail before obd_connect(),
the mount hangs:
 ->ldo_process_config
    ->osp_shutdown
       ->osp_disconnect
          ->ptlrpc_disconnect_import
             ->rc = l_wait_event(imp->imp_recovery_waitq,
                                 !ptlrpc_import_in_recovery(imp), &lwi);

Since connect was never called, the imp state stays LUSTRE_IMP_NEW.
Fix this by checking properly whether we are in recovery: only consider
the import to be in recovery if it is in one of the following states:

 LUSTRE_IMP_CONNECTING = 4,
 LUSTRE_IMP_REPLAY     = 5,
 LUSTRE_IMP_REPLAY_LOCKS = 6,
 LUSTRE_IMP_REPLAY_WAIT  = 7,
 LUSTRE_IMP_RECOVER    = 8,
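
The state test described above can be sketched as a simple set-membership check. This is a Python model, not the ptlrpc code: the five recovery states and their values are taken from the list above, while the value used for NEW and the names `ImpState`, `RECOVERY_STATES` and `import_in_recovery` are assumptions made for the illustration.

```python
from enum import IntEnum


class ImpState(IntEnum):
    NEW = 2            # assumed value; not given in this commit message
    CONNECTING = 4
    REPLAY = 5
    REPLAY_LOCKS = 6
    REPLAY_WAIT = 7
    RECOVER = 8


# Only these states count as "in recovery" after the fix.
RECOVERY_STATES = frozenset({
    ImpState.CONNECTING,
    ImpState.REPLAY,
    ImpState.REPLAY_LOCKS,
    ImpState.REPLAY_WAIT,
    ImpState.RECOVER,
})


def import_in_recovery(state: ImpState) -> bool:
    """True only for the explicit recovery states listed in the commit message."""
    return state in RECOVERY_STATES
```

With this rule an import that never connected (state NEW) is not treated as recovering, so a disconnect has nothing to wait for.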

Lustre-change: https://review.whamcloud.com/32994
Lustre-commit: f28353b3d810cfbec018a263556ceac84ab9413e

Change-Id: I2113b95a421bae7117f3057d5f0fdf78db95caa3
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34450
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11555 utils: ZFS check multihost enabled in read_ldd() 00/34300/5
Nathaniel Clark [Sat, 3 Nov 2018 04:03:43 +0000 (00:03 -0400)]
LU-11555 utils: ZFS check multihost enabled in read_ldd()

For ZFS check that multihost is enabled if failover host is defined.
Print a warning if it's not.

Lustre-change: https://review.whamcloud.com/33491
Lustre-commit: 5e62552e7fc6e9da4068bb29f62eb2cf7a42970e

Test-Parameters: trivial mdtfilesystemtype=zfs ostfilesystemtype=zfs
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: Iddb5871afc6fb6808a25921c8d3e8516d675f15c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34300
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-11752 osc: pass client page size during reconnect too 85/34485/2
Mikhail Pershin [Thu, 13 Dec 2018 10:11:05 +0000 (13:11 +0300)]
LU-11752 osc: pass client page size during reconnect too

The client page size is reported to the server in ocd_grant_blkbits,
and the server returns the device block size. During reconnect,
ocd_grant_blkbits contains the server device block size, which the
server then wrongly uses as the client page size.

The patch sets ocd_grant_blkbits to the client page size again during
reconnect so that the server gets the expected information.
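
A small model makes the stale value visible. This is a sketch in Python under stated assumptions: the connect data is reduced to a dictionary holding only ocd_grant_blkbits, the sizes 4096 and 131072 are example values, and the helpers `size_to_bits`, `build_connect_data` and `apply_server_reply` are hypothetical.

```python
def size_to_bits(size: int) -> int:
    """Return log2(size) for a positive power-of-two size."""
    if size <= 0 or size & (size - 1):
        raise ValueError("size must be a positive power of two")
    return size.bit_length() - 1


def build_connect_data(ocd: dict, client_page_size: int) -> dict:
    """Patched behaviour: always send the client page size, on connect and reconnect."""
    ocd["ocd_grant_blkbits"] = size_to_bits(client_page_size)
    return ocd


def apply_server_reply(ocd: dict, server_block_size: int) -> dict:
    """The reply overwrites the field with the server device block size."""
    ocd["ocd_grant_blkbits"] = size_to_bits(server_block_size)
    return ocd


ocd = {}
build_connect_data(ocd, 4096)        # initial connect, example 4 KiB pages
apply_server_reply(ocd, 131072)      # example 128 KiB server block size
stale_bits = ocd["ocd_grant_blkbits"]    # what an unpatched reconnect would resend
build_connect_data(ocd, 4096)        # patched reconnect resets the field
fresh_bits = ocd["ocd_grant_blkbits"]
```

In this example the unpatched reconnect would report 17 bits as the client page size, while the patched path reports 12.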

Lustre-change: https://review.whamcloud.com/33847
Lustre-commit: 5bec8f95cc1028d207e55e659a27d80081864a83

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I14bba1d025e4e9fb99fd4bae4002463439ac265c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34485
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-12018 quota: do not start a thread under memory pressure 64/34464/4
Alex Zhuravlev [Tue, 26 Feb 2019 07:31:53 +0000 (10:31 +0300)]
LU-12018 quota: do not start a thread under memory pressure

This leads to a deadlock, as kthreadd, which creates new threads,
can get stuck waiting for memory as well:

PID: 2 TASK: ffff88015d1e0fb0 CPU: 3 COMMAND: "kthreadd"

Lustre-change: https://review.whamcloud.com/34328
Lustre-commit: 94b11d5a7c55f4f6aff918c3b565b74cb18d04fb

Change-Id: I88f14da24ea64dcc02a9fd1f4a9c03f5771f8fda
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34464
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11773 utils: add PFL flags support to YAML API 54/34454/3
Patrick Farrell [Thu, 13 Dec 2018 20:21:31 +0000 (14:21 -0600)]
LU-11773 utils: add PFL flags support to YAML API

The setstripe YAML interface currently ignores the
lcme_flags field. This means it doesn't work correctly with
some FLR layouts.

Fixing this is a trivial matter of making the YAML layout
generator read & use the lcme_flags field.

Lustre-change: https://review.whamcloud.com/33852
Lustre-commit: b71766311daa0faf3560a2435778f7b2de1e3ad6

Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: If15999aa58ac3e31da677bd5d1ef8b063b46b1e5
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34454
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-6142 lod: Fix style issues for lod_dev.c 52/34452/3
Arshad Hussain [Fri, 16 Nov 2018 23:48:41 +0000 (05:18 +0530)]
LU-6142 lod: Fix style issues for lod_dev.c

This patch fixes issues reported by checkpatch for file
lustre/lod/lod_dev.c

Lustre-change: https://review.whamcloud.com/33594
Lustre-commit: 263401f804eb108da2b09ba95bbd441857281c95

Change-Id: I72eaa79a12769e61889e567e5f28fdf3e8045c94
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34452
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11974 llapi: improve llapi_layout_get_by_xattr(3) API 63/34463/3
Li Xi [Tue, 19 Feb 2019 02:33:20 +0000 (10:33 +0800)]
LU-11974 llapi: improve llapi_layout_get_by_xattr(3) API

llapi_layout_get_by_xattr() assumes that the lum has already
been properly swapped by llapi_layout_swab_lov_user_md().
However, the llapi_layout_swab_lov_user_md() function is not
exported, so external tools cannot use it.

Instead of exporting a lot of APIs, this patch includes the
swab functions in llapi_layout_get_by_xattr() and adds a
flags argument to the API.

Lustre-change: https://review.whamcloud.com/34276
Lustre-commit: 89e43812da871bb560f3c50a0c36713ea7788e0a

Change-Id: I9fbf0f0ba66660d2f382fb20b03f069c1a7afad5
Signed-off-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34463
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-11894 lnet: check for asymmetrical route messages 57/34457/2
Sebastien Buisson [Mon, 28 Jan 2019 15:16:42 +0000 (00:16 +0900)]
LU-11894 lnet: check for asymmetrical route messages

Asymmetrical routes can be an issue when debugging network,
and allowing them also opens the door to attacks where hostile
clients inject data to the servers.

In order to prevent asymmetrical routes, add a new lnet kernel
module option named 'lnet_drop_asym_route'. When set to non-zero,
lnet_parse() will check if the message received from a remote peer
is coming through a router that would normally be used by this node
to reach the remote peer. If it is not the case, then it means we
are dealing with an asymmetrical route message, and the message will
be dropped.

The check for asymmetrical routes can also be switched on or off with
the command 'lnetctl set drop_asym_route 0|1'. This parameter is also
exported/imported in YAML.
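
The decision made for each routed message can be sketched as below. This is a Python model of the rule as stated in this message, not the lnet_parse() code: routers are represented as plain strings, the function name `should_drop` is hypothetical, and treating a message with no router as never asymmetrical is an assumption.

```python
def should_drop(drop_asym_route: bool, via_router, routers_to_peer) -> bool:
    """Decide whether a received message is dropped as asymmetrically routed.

    drop_asym_route: value of the module option (non-zero means enabled).
    via_router:      router the message arrived through, or None if direct.
    routers_to_peer: routers this node would normally use to reach the peer.
    """
    if not drop_asym_route:
        return False          # check disabled, accept everything
    if via_router is None:
        return False          # assumption: direct messages are not affected
    return via_router not in routers_to_peer
```

For example, with the option enabled, a message arriving through "gw2" is dropped when the local routes to that peer use only "gw1".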

Lustre-change: https://review.whamcloud.com/34119
Lustre-commit: 4932febc121349d855ac9934c538ce688c140afa

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I06fb23d9e46984d79c14fa9b53b2fa04ce3c50c5
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34457
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-12020 llite: make sure name pack atomic 65/34465/2
Wang Shilong [Tue, 26 Feb 2019 14:38:29 +0000 (22:38 +0800)]
LU-12020 llite: make sure name pack atomic

We access the dentry name directly and pass it
down without holding @d_lock. This is racy and can
trigger assertions such as:

(mdc_lib.c:137:mdc_pack_name()) ASSERTION( lu_name_is_valid_2(buf, cpy_len) ) failed:

Fix the problem by allocating memory and copying the name with @d_lock
held.
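
The copy-under-lock pattern can be illustrated in user space. This is only a sketch in Python with a threading lock standing in for @d_lock; the `Dentry` class, its `rename` method and the `copy_name` helper are hypothetical and do not mirror the kernel structures.

```python
import threading


class Dentry:
    """Toy dentry: a mutable name protected by a lock."""

    def __init__(self, name: bytes):
        self.d_lock = threading.Lock()
        self.d_name = bytearray(name)

    def rename(self, new_name: bytes) -> None:
        # Writers change the name only while holding the lock.
        with self.d_lock:
            self.d_name[:] = new_name


def copy_name(dentry: Dentry) -> bytes:
    """Take a private, immutable snapshot of the name while holding the lock."""
    with dentry.d_lock:
        return bytes(dentry.d_name)


dentry = Dentry(b"old-name")
packed = copy_name(dentry)   # safe to pass down; later renames cannot change it
dentry.rename(b"new")
```

The snapshot keeps its original contents and length after the rename, so code that validates the name later sees a consistent buffer.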

Lustre-change: https://review.whamcloud.com/34330
Lustre-commit: f575b6551b2b8690894baeab95d6fe35e57e9418

Change-Id: Iae0066661f42e8fca9358cbedd9cb21828779bbb
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34465
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11658 lov: cl_cache could miss initialize 06/34306/3
Yang Sheng [Tue, 13 Nov 2018 20:17:09 +0000 (04:17 +0800)]
LU-11658 lov: cl_cache could miss initialize

The cl_cache may be left uninitialized when we mount
a client with a deactivated OSC and then activate it.

Lustre-change: https://review.whamcloud.com/33650
Lustre-commit: 42e83c44eb5a22cbacf1ed4c6d4d6b588e07faa9

And followup patch:
Lustre-change: https://review.whamcloud.com/33983
Lustre-commit: c69e34ce0ed5759fbef20f5aae7f47ead5598094

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I92cd44375d70624fb55ef7a0218e7178211a8687
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34306
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-11689 lfs: make sure project proceed all dirs 07/34307/3
Wang Shilong [Thu, 22 Nov 2018 01:23:36 +0000 (09:23 +0800)]
LU-11689 lfs: make sure project proceed all dirs

Leftover fix from "LU-10986 lfs: make lfs project tolerant errors".
We should proceed to the other dirs if we hit errors; otherwise,
a directory tree like the following will fail if aaaa does not exist.

testdir/
├── subdir
│   └── 1
├── bbbb -> aaaa
└── cccc

Also remove the extra error output, since every action function
already prints its own failure messages.
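
The intended traversal behaviour, continuing past a failing entry while still reporting an error at the end, can be sketched as follows. This is a Python model and not the lfs code: `process_entries` and the toy `action` are hypothetical, and returning the first non-zero code is an assumption about how the final status is chosen.

```python
def process_entries(entries, action):
    """Run action on every entry; remember the first error but keep going."""
    first_error = 0
    for entry in entries:
        rc = action(entry)
        if rc != 0 and first_error == 0:
            first_error = rc
    return first_error


visited = []


def action(entry: str) -> int:
    """Toy action: the dangling symlink 'bbbb' fails, everything else succeeds."""
    visited.append(entry)
    return -2 if entry == "bbbb" else 0


status = process_entries(["subdir", "bbbb", "cccc"], action)
```

Here "cccc" is still visited even though "bbbb" failed, and the overall status remains non-zero.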

Lustre-change: https://review.whamcloud.com/33707
Lustre-commit: e022922fb4a2429d0c2488a13ad8127c068aa2b8

Fixes: d189024bd306 ("LU-10986 lfs: make lfs project tolerant errors")

Test-Parameters: trivial testlist=sanity-quota,sanity-quota
Change-Id: I0062dbc3f4d1925c9e9e1a509ee35ac569bd9b74
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34307
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-11401 tests: add version check sanity-flr tests 98/34298/4
James Nunez [Tue, 15 Jan 2019 00:48:17 +0000 (17:48 -0700)]
LU-11401 tests: add version check sanity-flr tests

sanity-flr tests 48 and 203 were added in Lustre tag 2.11.55.
Thus, we need to check that the server version is 2.11.55 or
later before running tests 48 and 203.

sanity-flr test 0h checks for a file inheriting the directory
layout. sanity-flr test 37 exercises the ‘lfs mirror write’ functionality.
Inheritance was fixed and ‘lfs mirror write’ was added in Lustre
tag 2.11.57. Thus, we need to check that the server version is
2.11.57 or later before running tests 0h and 37.
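
The version gates listed above can be sketched as a table lookup plus a numeric comparison. This is an illustration in Python, not the shell logic used by the test suite: the names `MIN_SERVER_VERSION`, `parse_version` and `should_run` are hypothetical, and only the test numbers and tags come from this message.

```python
# Minimum server version per sanity-flr test, as stated in the commit message.
MIN_SERVER_VERSION = {
    "48": (2, 11, 55),
    "203": (2, 11, 55),
    "0h": (2, 11, 57),
    "37": (2, 11, 57),
}


def parse_version(version: str) -> tuple:
    """Turn 'major.minor.patch' into a tuple of integers for comparison."""
    return tuple(int(part) for part in version.split(".")[:3])


def should_run(test: str, server_version: str) -> bool:
    """True when the server is at least the version the test needs."""
    required = MIN_SERVER_VERSION.get(test, (0, 0, 0))
    return parse_version(server_version) >= required
```

Comparing integer tuples rather than strings matters here: as strings, "2.9.0" sorts after "2.11.57", which would wrongly enable the tests on an older server.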

Lustre-change: https://review.whamcloud.com/33955
Lustre-commit: fa3b858d5c5b9124591e05a5dcdd98a3ee3619c6

Test-Parameters: trivial testlist=sanity-flr
Test-Parameters: serverjob=lustre-b2_11 serverdistro=el7 serverbuildno=2 testlist=sanity-flr
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I94c68e900d60e2b97d7f74c6629ee54bcb3a5480
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34298
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years ago LU-11402 tests: add version check sanity-quota 60 94/34094/3
James Nunez [Wed, 2 Jan 2019 22:38:58 +0000 (15:38 -0700)]
LU-11402 tests: add version check sanity-quota 60

sanity-quota test 60 was added to Lustre tag 2.11.53. Thus,
we need to check that the server is 2.11.53 or later before
running test 60.

This patch is a backport of:
Lustre-commit: d187a78afc960849c3eb86a1a0559c9ba00e8cdf
Lustre-change: https://review.whamcloud.com/33418

Test-Parameters: trivial serverjob=lustre-b2_10 serverbuildno=152 testlist=sanity-quota
Test-Parameters: testlist=sanity-quota

Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I5db738f0776b5615165977df1708f197f215b994
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34094
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-11607 tests: create routine to get Lustre env 19/34319/8
James Nunez [Fri, 28 Dec 2018 16:23:32 +0000 (09:23 -0700)]
LU-11607 tests: create routine to get Lustre env

The Lustre tests in the test suites make repeated calls
to a small number of functions that relate to the Lustre
environment. Some examples are the version of the Lustre
server or client and the file system type of the server.

Collect these calls into a routine called get_lustre_env()
in the test-framework.sh library and replace calls in
sanity with the global variables.

Lustre-change: https://review.whamcloud.com/33938
Lustre-commit: 4eb4479b0ea050d99033a9bac9994d2f1509200c

This patch corrected a bracket issue with sanity test 156,
but made the test fail for ZFS testing. Thus, we are
backporting a second patch into this one so this patch
passes testing.

Lustre-change: https://review.whamcloud.com/34114
Lustre-commit: 42c4dab3c817f9f03efe457fd33e946ed68fab14

Test-Parameters: trivial

Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I01dd00dd50cca39c964c5fd8abc3f51ab3c8e6b8
Reviewed-on: https://review.whamcloud.com/33938
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
(cherry picked from commit 4eb4479b0ea050d99033a9bac9994d2f1509200c)
Reviewed-on: https://review.whamcloud.com/34319
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
5 years ago LU-12065 lnd: increase CQ entries 74/34474/3
Amir Shehata [Wed, 20 Mar 2019 18:10:34 +0000 (11:10 -0700)]
LU-12065 lnd: increase CQ entries

Several sites have reported RDMA timeouts. Most of the timeouts
are occurring for transmits on the active_tx queue. Transmits are
placed on the active_tx queue until a completion is received. If
there aren't enough CQ entries available, completion events can be
delayed, causing these timeouts.

Lustre-change: https://review.whamcloud.com/34473
Lustre-commit: bf3fc7f1a7bf82c02181c64810bc3f93b3303703

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I9edad734b5860ce20af4977b4c1cdc07f25f078e
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/34474
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years ago LU-10070 tests: Fix replay-single test_85b 70/34470/2
Patrick Farrell [Tue, 4 Dec 2018 21:16:50 +0000 (15:16 -0600)]
LU-10070 tests: Fix replay-single test_85b

test_85b of replay-single sets a default striping on $DIR
and does not remove it.  This makes it impossible to
correctly test self-extending layouts, so fix this first.

This patch is a backport of:
Lustre-commit: 0b9fb772e68db7cbf0c8a755092c1d8b5de6b83d
Lustre-change: https://review.whamcloud.com/33777

Cray-bug-id: LUS-2528
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I0057c8403e3dae2437cf0c8810af8086e2971c35
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-on: https://review.whamcloud.com/34406
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34470
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-11770 osc: allow build without blk_integrity or crc-t10pi 56/34156/4
Andreas Dilger [Wed, 26 Dec 2018 09:05:37 +0000 (02:05 -0700)]
LU-11770 osc: allow build without blk_integrity or crc-t10pi

Allow the client to build if blk_integrity or crc-t10pi is not
enabled in the kernel.

Lustre-change: https://review.whamcloud.com/33923
Lustre-commit: e0fb3133372e5bff434ac7a467304d9ba954bac6

Fixes: ccf3674c9ca ("LU-10472 osd-ldiskfs: T10PI between RPC and BIO")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I97c4e75ad084e99927bcb41cf0df8a680525a5b1
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34156
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-11790 ldiskfs: add terminating u32 when expanding inodes 14/34314/2
Li Dongyang [Wed, 19 Dec 2018 03:03:14 +0000 (14:03 +1100)]
LU-11790 ldiskfs: add terminating u32 when expanding inodes

In ext4_expand_extra_isize_ea(), we calculate the total size of the
xattr header, plus the xattr entries so we know how much of the
beginning part of the xattrs to move when expanding the inode extra
size.  We need to include the terminating u32 at the end of the xattr
entries, or else if there are uninitialized, non-zero bytes after the
xattr entries and before the xattr values, the list of xattr entries
won't be properly terminated.
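
The size calculation at issue can be sketched with plain arithmetic. This is a Python illustration, not the ext4 code: the header and entry sizes passed in are arbitrary example values, and the names `TERMINATOR_SIZE` and `bytes_to_move` are hypothetical. Only the 4-byte terminating u32 comes from the description above.

```python
TERMINATOR_SIZE = 4  # the terminating u32 after the last xattr entry


def bytes_to_move(header_size: int, entry_sizes, include_terminator: bool = True) -> int:
    """Size of the leading xattr region that must move when the inode expands."""
    total = header_size + sum(entry_sizes)
    if include_terminator:
        total += TERMINATOR_SIZE
    return total


# Example values only: a 4-byte header followed by two entries.
with_fix = bytes_to_move(4, [20, 24])
without_fix = bytes_to_move(4, [20, 24], include_terminator=False)
```

The two results differ by exactly the terminator, which is the part that would otherwise be left behind and replaced by whatever bytes already sit at the destination.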

Lustre-change: https://review.whamcloud.com/33893
Lustre-commit: 7c800e460661972925a7acab51f023d0b38161b5

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I247b935b3cf315481dc4658133a7eee02b6350e9
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34314
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-11647 ptlrpc: always unregister bulk 05/34305/2
Hongchao Zhang [Thu, 15 Nov 2018 16:21:15 +0000 (11:21 -0500)]
LU-11647 ptlrpc: always unregister bulk

In ptlrpc_check_set, the bulk should be unregistered before
ptl_send_rpc in any case.

Lustre-change: https://review.whamcloud.com/22378
Lustre-commit: 21c53b18a1bc0e36d2ecd1fb731f0dc6403902ee

Change-Id: Icf963002f934b43ccbb9d6ef02ba7f9d11f297f8
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34305
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-8130 libcfs: don't include rhashtable if unavailable 20/34020/2
Andreas Dilger [Fri, 14 Dec 2018 22:43:41 +0000 (15:43 -0700)]
LU-8130 libcfs: don't include rhashtable if unavailable

Don't include <linux/rhashtable.h> if it is not available.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I80b2ee63fb2a438399359f8052a5063429dd6506
Reviewed-on: https://review.whamcloud.com/34020
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11947 scripts: handle ZFS targets in Lustre RA 16/34316/2
Nathaniel Clark [Fri, 8 Feb 2019 18:02:28 +0000 (13:02 -0500)]
LU-11947 scripts: handle ZFS targets in Lustre RA

Fixes a regression introduced in LU-11461.
This handles the case where the realpath of the target is an empty string.

Fixes: c36d70272541 ("LU-11461 scripts: Support symlink target")

Lustre-change: https://review.whamcloud.com/34217
Lustre-commit: aabcfb0af701d641bbe18336b22c7288c96c7115

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I1bcb85908019e968ac0d69e437db217594a6565e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34316
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11579 llite: remove cl_file_inode_init() LASSERT 02/34302/2
Andreas Dilger [Mon, 29 Oct 2018 06:42:46 +0000 (00:42 -0600)]
LU-11579 llite: remove cl_file_inode_init() LASSERT

If there is some corruption or other reason that the file layout
cannot be used, the first call to cl_file_inode_init() will fail.
If it is called a second time on the same file then it will hit
an LASSERT() since I_NEW is no longer set on the inode.

It would be good to handle the error in lov_init_raid0() better,
but we still want to avoid this LASSERT() if there is an error.

Convert the LASSERT() in cl_file_inode_init() into a CERROR() and
error return.  This is being triggered due to corruption on the
server, but that shouldn't cause the client to assert.

    lov_dump_lmm_common() oid 0xdf4e:311367, magic 0x0bd10bd0
    lov_dump_lmm_common() stripe_size 1048576, stripe_count 4
    lov_dump_lmm_objects() stripe 0 idx 10 subobj 0x0:151194471
    lov_dump_lmm_objects() stripe 1 idx 12 subobj 0x0:152477530
    lov_dump_lmm_objects() stripe 2 idx 25 subobj 0x0:151589797
    lov_dump_lmm_objects() stripe 3 idx 2 subobj 0x0:150332564
    lov_init_raid0() fsname-clilov: OST0019 is not initialized
    cl_file_inode_init() Failure to initialize cl object
        [0x20004c047:0xdf4e:0x0]: -5

    cl_file_inode_init() ASSERTION(inode->i_state & (1 << 3) ) failed
    cl_file_inode_init() LBUG
    Pid: 37233, comm: ll_sa_4709 3.10.0-862.14.4.el7.x86_64 #1 SMP
    Call Trace:
    libcfs_call_trace+0x8c/0xc0 [libcfs]
    lbug_with_loc+0x4c/0xa0 [libcfs]
    cl_file_inode_init+0x2ac/0x300 [lustre]
    ll_update_inode+0x315/0x600 [lustre]
    ll_iget+0x163/0x350 [lustre]
    ll_prep_inode+0x232/0xc80 [lustre]
    sa_handle_callback+0x3a4/0xf70 [lustre]
    ll_statahead_thread+0x40e/0x2080 [lustre]

Return an I/O error instead of killing the client.
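
As a rough userspace sketch of the idea (not the actual llite code), the check that used to assert can log the problem and return an error instead. The names demo_inode, DEMO_I_NEW and demo_file_inode_init() are hypothetical stand-ins, and fprintf() stands in for CERROR():

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for the inode state bit tested in the trace (1 << 3). */
#define DEMO_I_NEW (1UL << 3)

/* Hypothetical, heavily reduced inode: only the state word matters here. */
struct demo_inode {
	unsigned long i_state;
};

/*
 * Return 0 when the inode is still marked new, or -EIO after logging
 * when it is not, instead of asserting and taking the client down.
 */
static int demo_file_inode_init(const struct demo_inode *inode)
{
	if (!(inode->i_state & DEMO_I_NEW)) {
		fprintf(stderr, "demo: inode is not new, i_state=%#lx\n",
			inode->i_state);
		return -EIO;
	}
	return 0;
}
```

A caller can then propagate the negative return value up to the application as an ordinary I/O error.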

Lustre-change: https://review.whamcloud.com/33505
Lustre-commit: 0baa3eb1a4abe6e1e882cf03b0edfabda20142b7

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If8a6eb24df09e7e158b61f02e2517132893ebbe5
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34302
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11736 utils: don't set max_sectors_kb on MDT/MGT 11/34311/2
Andreas Dilger [Thu, 6 Dec 2018 00:15:05 +0000 (17:15 -0700)]
LU-11736 utils: don't set max_sectors_kb on MDT/MGT

The max_sectors_kb tunable should not be applied to MDT and MGT
devices. This tuning is needed for efficiency of large IOs for
spinning disks, but is not needed for SSDs or regular IO. It can
cause problems with DM Multipath configurations for minimal
benefits, so should be limited to OST devices.

This only applies to ldiskfs backend filesystems, no such tuning
is currently done for any ZFS devices.

Lustre-change: https://review.whamcloud.com/33796
Lustre-commit: 2f8d7b4679de3fa467040aa61733f262714e39c9

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I496603da13aae042f63cc37c0dea221a393ebbe5
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34311
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-7631 tests: add debug info to conf-sanity 82a 94/34294/2
James Nunez [Tue, 20 Nov 2018 15:38:26 +0000 (08:38 -0700)]
LU-7631 tests: add debug info to conf-sanity 82a

In the routine check_stripe_count, the different error
messages need to be modified so when an error occurs,
a user can tell what error was hit. Also, print precreated
object information at the beginning of the test and on
error.

Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=conf-sanity envdefinitions=ONLY=82a
Test-Parameters: mdscount=1 mdtcount=1 testlist=conf-sanity envdefinitions=ONLY=82a

Lustre-change: https://review.whamcloud.com/33689
Lustre-commit: e76683a5bd540cacd2271a969aa9acd9bf790ccf

Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ifc75d52d38d9cb401118ef7baa4014bddf6298f2
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34294
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11757 lod: use calculated stripe count 13/34313/2
Andriy Skulysh [Mon, 3 Dec 2018 14:45:18 +0000 (16:45 +0200)]
LU-11757 lod: use calculated stripe count

lod_prep_md_striped_create() tries to allocate a big
chunk of memory because
lum->lum_stripe_count == -1 is converted to __u32.

ldo_dir_stripe_count was already calculated in lod_ah_init(),
so use it instead.
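
The size blow-up can be illustrated with a small sketch. The struct demo_stripe and both helper functions are hypothetical and only model how an allocation size would be derived from a stripe count:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-stripe record, used only to size the allocation. */
struct demo_stripe {
	uint64_t fid[2];
};

/* Bytes requested when the raw stripe count is trusted as-is. */
static size_t demo_alloc_size_raw(int32_t lum_stripe_count)
{
	/* -1 becomes 4294967295 once converted to a 32-bit unsigned value. */
	uint32_t count = (uint32_t)lum_stripe_count;

	return (size_t)count * sizeof(struct demo_stripe);
}

/* Bytes requested when an already-calculated stripe count is used. */
static size_t demo_alloc_size_calculated(uint32_t calculated_count)
{
	return (size_t)calculated_count * sizeof(struct demo_stripe);
}
```

With a raw count of -1 the first helper asks for billions of records, while the second stays proportional to the real number of stripes.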

Lustre-change: https://review.whamcloud.com/33829
Lustre-commit: 622a94d5e27ed3e596918863c08b304a6be9a646

Change-Id: Id99d9e024638dfb1b34262840d2e543c808a9cdc
Cray-bug-id: LUS-6694
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34313
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11737 lfsck: do not ignore dryrun 12/34312/2
Alex Zhuravlev [Tue, 11 Dec 2018 11:26:13 +0000 (14:26 +0300)]
LU-11737 lfsck: do not ignore dryrun

lfsck_layout_recreate_lovea() shouldn't ignore dryrun.

Lustre-change: https://review.whamcloud.com/33826
Lustre-commit: 875f3fc03aa15049892fe19d6a4fc1132848fced

Change-Id: Ia8bafc13f148b03573dee5db26b6aff9386b5b5f
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34312
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11625 ofd: handle upgraded filter_fid properly 04/34304/2
Andreas Dilger [Wed, 7 Nov 2018 02:40:18 +0000 (19:40 -0700)]
LU-11625 ofd: handle upgraded filter_fid properly

Since there have been several iterations of struct filter_fid stored
on disk, the current code wasn't checking for all of the possible
cases when trying to decide what action to take when accessing and
upgrading the xattr for new capabilities.

Properly check for the various different struct filter_fid sizes and
handle them appropriately.  Add a more verbose description of the
various cases so that this is more clear to others in the future.

Add decoding of filter_fid fields added for FLR in 2.11.

We should already be testing for upgrading the filter_fid xattr
from different OST versions in conf-sanity test_32d.

Lustre-change: https://review.whamcloud.com/33627
Lustre-commit: 381a8cdd527ce4deccfc3f7eb461892f6f2f3fff

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ifef2292296236cb06ff7e8cd50caff4b133ebbe5
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34304
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11620 lfsck: change llsd_rb_lock to rwsemaphore 03/34303/2
Lai Siyao [Sat, 20 Oct 2018 20:50:49 +0000 (04:50 +0800)]
LU-11620 lfsck: change llsd_rb_lock to rwsemaphore

llsd_rb_lock is taken in ->init and released in ->fini, but during
this period getxattr may be called, which can sleep. Change the lock
to a rwsemaphore.

Lustre-change: https://review.whamcloud.com/33603
Lustre-commit: 925ce153979d6ac793a65e193181ec14a8281640

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Idc68eb886e60dc45ccfb7ac9bf5bf06db42d690d
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34303
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11712 osd-ldiskfs: Wrap blk integrity with config check 09/34309/2
Chris Horn [Wed, 28 Nov 2018 21:00:25 +0000 (15:00 -0600)]
LU-11712 osd-ldiskfs: Wrap blk integrity with config check

Build is currently broken for kernels without
CONFIG_BLK_DEV_INTEGRITY. Build failure introduced by LU-11096
commit c8505c2e70d03ba20edf9fcbf431888e87a21147
https://review.whamcloud.com/#/c/32725/
Use of blk integrity should be wrapped in the config check for
HAVE_BLK_INTEGRITY_ENABLED

Lustre-change: https://review.whamcloud.com/33745
Lustre-commit: fd193758bb95e3fbb4cd04e88f0d964f9cb510cf

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Iac9e4a2572024c026132c87c11042cf353b14d48
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34309
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11878 tests: don't fork-bomb sanity test_103b 02/34202/2
Andreas Dilger [Tue, 22 Jan 2019 07:53:44 +0000 (00:53 -0700)]
LU-11878 tests: don't fork-bomb sanity test_103b

Running sanity test_103b may start up to 512 parallel threads for
running the test, each of which starts two bash processes and lfs
or rm processes.

For the VMs running in our testbed (esp. ARM with 64KB PAGE_SIZE)
this can trigger the OOM killer and cause the test to fail if bash
is killed.  Limit the number of started bash processes to avoid this.

Lustre-commit: 42c5c9c2ca3e44cb1c3e8ecb144bdd20fb35cddb
Lustre-change: https://review.whamcloud.com/34082

Fixes: 543f1fbe260 ("LU-10830 utils: fix create mode for lfs")
Test-Parameters: trivial clientarch=aarch64
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I82c322013da91d717924e2c664fa57ad4e3ebbe5
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34202
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
5 years agoLU-11834 llite: fix temporary instance buffer size 15/34315/2
Andreas Dilger [Wed, 2 Jan 2019 22:12:11 +0000 (15:12 -0700)]
LU-11834 llite: fix temporary instance buffer size

The formatting of the cfg_instance variable was changed in LU-11809
to always use a fixed "%016llu" format, but the temporary buffer
allocations for the instance string were not changed to match the
printed value.

This results in string truncation in some situations.  Change the temp
buffer size to always have 16 bytes for the instance instead of using
sizeof(cfg_instance).
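
A minimal sketch of the truncation, assuming the instance is a 64-bit value printed with the format quoted above. DEMO_INSTANCE_LEN and demo_format_instance() are hypothetical names, not Lustre code:

```c
#include <stddef.h>
#include <stdio.h>

/* Printed width of the instance with the fixed "%016llu" format. */
#define DEMO_INSTANCE_LEN 16

/*
 * Format the instance into buf and return the number of characters
 * snprintf() wanted to write, excluding the terminating NUL. A return
 * value >= size means the output was truncated.
 */
static int demo_format_instance(char *buf, size_t size,
				unsigned long long instance)
{
	return snprintf(buf, size, "%016llu", instance);
}
```

A buffer sized from sizeof(instance) holds only 8 characters plus the NUL, so the 16-character string is cut in half; sizing it as DEMO_INSTANCE_LEN + 1 holds the whole value.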

Fixes: cd294a1255 ("LU-11809 llite: don't use %p for cfg_instance")

Lustre-change: https://review.whamcloud.com/33951
Lustre-commit: 1db90b29ad676c2cf1888ef5a7c623161ff23bf9

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I5eca99afa2787cc57e739489b252b12af68cab07
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34315
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-10143 osd-zfs: allocate sequence in advance 95/34295/2
Alex Zhuravlev [Sun, 20 Jan 2019 05:38:26 +0000 (08:38 +0300)]
LU-10143 osd-zfs: allocate sequence in advance

on the controller, so that we have it ready before any potential
read-only makeup. this is what osd-ldiskfs is doing already.

Lustre-change: https://review.whamcloud.com/34069
Lustre-commit: 51c449b73994f2bba98ee27ac77f90c9aa846e88

Change-Id: I3d27f112b0d013ac923c5d250b296b5528b8112d
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34295
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11721 tests: wait for statfs to update on DNE 21/34321/2
Andreas Dilger [Fri, 1 Feb 2019 21:07:01 +0000 (14:07 -0700)]
LU-11721 tests: wait for statfs to update on DNE

Wait for the statfs to update properly when there are multiple
MDTs so that the test doesn't gratuitously fail.

Fixes: 757403191c3 ("LU-11721 utils: print used inodes ratio ...")

Lustre-commit: 263e80f4572b49044407b09f8a3e393677eafb5d
Lustre-change: https://review.whamcloud.com/34164

Test-Parameters: trivial testlist=sanity mdscount=2 mdtcount=4 ostcount=7
Test-Parameters: testlist=sanity fstype=zfs mdscount=2 mdtcount=4
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia75f7bd4d3027c91f10ce990730b2bd7123ebbe5
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34321
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11721 utils: print used inodes ratio when using "lfs df -i" 36/34136/2
Nikitas Angelinas [Thu, 29 Nov 2018 23:23:05 +0000 (15:23 -0800)]
LU-11721 utils: print used inodes ratio when using "lfs df -i"

"lfs df -i" prints the used blocks percentage, instead of the used
inodes percentage. Fix this by allowing obd_statfs_ratio() to
distinguish when "-i" is used.

Round up the ratio returned from obd_statfs_ratio() in a ceiling
manner, to match the output of df(1). Add a sanity test to check
that the outputs from df(1) and lfs df match.
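
Ceiling rounding of a usage percentage can be sketched as follows. demo_ratio_ceil() is a hypothetical helper, not the actual obd_statfs_ratio() implementation, and it assumes used * 100 does not overflow 64 bits:

```c
#include <stdint.h>

/*
 * Percentage of `used` out of `total`, rounded up so that any non-zero
 * usage shows as at least 1%. Returns 0 when total is 0.
 */
static unsigned int demo_ratio_ceil(uint64_t used, uint64_t total)
{
	if (total == 0)
		return 0;
	/* Adding total - 1 before dividing turns truncation into ceiling. */
	return (unsigned int)((used * 100 + total - 1) / total);
}
```

The same helper works for blocks or inodes; the caller only has to pass the matching pair of counters.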

Lustre-commit: 757403191c37db75ed35b02c971846dced5d5119
Lustre-change: https://review.whamcloud.com/33758

Signed-off-by: Nikitas Angelinas <nangelinas@cray.com>
Cray-bug-id: LUS-6748
Test-Parameters: trivial
Change-Id: I0b31ecb7371875c93bc07dda1f1c89e04d5b4576
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34136
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11696 utils: "lfs getsom" returns "24" to userspace 08/34308/2
Qian Yingjin [Mon, 26 Nov 2018 02:11:06 +0000 (10:11 +0800)]
LU-11696 utils: "lfs getsom" returns "24" to userspace

The "lfs getsom" command always returns "24" to userspace because
"rc = 24" (sizeof(struct lustre_som_attrs)) after fetching the
xattr from the kernel.
In this patch, rc is set to 0 if the lfs_getsom()->lgetxattr()
call returns a positive value.
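
The return-code handling can be sketched like this. demo_xattr_result_to_rc() is a hypothetical helper that takes the value an lgetxattr()-style call returned, rather than the real lfs_getsom() code:

```c
#include <sys/types.h>

/*
 * Map an lgetxattr()-style result to a command return code. A positive
 * value is the number of bytes read, which means success, so it must
 * not leak out as the exit status. Other values pass through unchanged.
 */
static int demo_xattr_result_to_rc(ssize_t xattr_ret)
{
	if (xattr_ret > 0)
		return 0;
	return (int)xattr_ret;
}
```

With this mapping a successful read of a 24-byte attribute yields 0 instead of 24.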

Lustre-change: https://review.whamcloud.com/33714
Lustre-commit: 9209e0e9428b6671c6bab9f901e04fdf5b29abc5

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ie3151f67b5ce2b5b2bc35a4b6528ba9a20a5db9f
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34308
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-10384 mgs: replace_nids large string and failover support 96/34296/2
Artem Blagodarenko [Mon, 18 Dec 2017 17:09:15 +0000 (20:09 +0300)]
LU-10384 mgs: replace_nids large string and failover support

replace_nids uses the nids list as the new UUID. The UUID string
length is limited to 38 symbols, so the new nids list needs
to be less than 38 symbols.

With this patch, the string representation of the first nid in the
list is used for the UUID, as is done for failover nids.

replace_nids finds the records for the given device and regenerates
the lines that contain old nids. The add_uuid and add_conn lines for
failover used to be deleted during replace_nids, which broke the
failover configuration.

This patch adds failover support to the replace_nids command.
For example:

lctl replace_nids lustre-MDT0000 nid1,nid2:nid3,nid4:nid5,nid6

nid3,nid4 - nids from first failover node
nid5,nid6 - nids from second failover node
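
To make the argument layout concrete, here is a small parsing sketch. It assumes only what the example shows: ':' separates nodes and ',' separates the nids of one node. demo_count_nid_groups() is a hypothetical helper, not the MGS code:

```c
#include <stddef.h>
#include <string.h>

/*
 * Count the ':'-separated node groups in a replace_nids argument and
 * store the length of the first nid (the text before the first ',' or
 * ':') in *first_nid_len. Returns 0 for an empty or missing string.
 */
static size_t demo_count_nid_groups(const char *nids, size_t *first_nid_len)
{
	size_t groups = 1;
	const char *p;

	if (nids == NULL || *nids == '\0') {
		*first_nid_len = 0;
		return 0;
	}

	*first_nid_len = strcspn(nids, ",:");
	for (p = nids; *p != '\0'; p++)
		if (*p == ':')
			groups++;
	return groups;
}
```

For the command line above this reports three groups, and a first nid of four characters, which is the part that would serve as the UUID.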

Lustre-change: https://review.whamcloud.com/30624
Lustre-commit: 09da1564d3794ca7b82e1c1791da253bee6178d4

Signed-off-by: Artem Blagodarenko <c17828@cray.com>
Cray-bug-id: MRP-4505
Change-Id: I4e9a35e8fa8781909ecbaa74785700f4ca04cf92
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34296
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11739 lod: don't inherit default layout from root directory 35/34135/4
Jian Yu [Thu, 21 Feb 2019 20:15:12 +0000 (12:15 -0800)]
LU-11739 lod: don't inherit default layout from root directory

There is no need to inherit the default directory layout from
the root directory when subdirectories are created therein.
This consumes xattr space on the subdirectories, and makes it
more complex to change the filesystem default layout in the future.

This patch fixes the above issue in lod_ah_init() to check if
the parent directory is the root directory and not copy
the default layout xattr to the new subdirectory.

Lustre-change: https://review.whamcloud.com/33956
Lustre-commit: 0a988cae95f99fee1a9c0d489ce00d0954d2a68e

Lustre-change: https://review.whamcloud.com/34175
Lustre-commit: ad1a74527f0ec59510bfa124b8280617a2b93840

Change-Id: Ie0d286785bdbcd73e2ae60b429e66d5d54b44eef
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34135
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11161 tests: start running sanity 160g again 97/34297/2
James Nunez [Mon, 7 Jan 2019 18:06:36 +0000 (11:06 -0700)]
LU-11161 tests: start running sanity 160g again

sanity test 160g was failing when run in a DNE configuration,
so we stopped running it by adding it to the
ALWAYS_EXCEPT list. The problem is that the test did not
write enough files to exceed the changelog idle index threshold
for deregistering users.

Start running sanity test 160g with DNE testing again.

Lustre-change: https://review.whamcloud.com/33994
Lustre-commit: 22676740969314b1b08a31c24e5ebc4c403e08f2

Test-Parameters: trivial
Test-Parameters: ostfilesystemtype=ldiskfs mdtfilesystemtype=ldiskfs clientcount=2 mdscount=2 mdtcount=4 osscount=1 ostcount=8 testlist=sanity
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I286ef8eb7c4638ff8f357db54c4926d5a2f20ac4
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34297
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-8346 tests: remove spaces around fail_val 26/34226/2
James Nunez [Fri, 1 Feb 2019 03:40:59 +0000 (20:40 -0700)]
LU-8346 tests: remove spaces around fail_val

conf-sanity test 93 tries to set fail_loc and fail_val
with the command 'lctl set_param fail_val = 10 fail_loc...'.
fail_val should have no spaces before and after the
equals sign.

This patch is a backport from the master branch:
Lustre-commit: Iaa2bff1750a2afa96a73a452a0c098ae92f7616c
Lustre-change: https://review.whamcloud.com/34155

Test-Parameters: trivial mdscount=2 mdtcount=4 envdefinitions=ONLY=93 testlist=conf-sanity
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Iaa2bff1750a2afa96a73a452a0c098ae92f7616c
Reviewed-on: https://review.whamcloud.com/34226
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years agoLU-11605 osp: max_create_count and create_count changes 62/34162/2
Sergey Cheremencev [Wed, 29 Aug 2018 19:20:36 +0000 (22:20 +0300)]
LU-11605 osp: max_create_count and create_count changes

Setting max_create_count to 0 also sets create_count
to 0. Set create_count to OST_MIN_PRECREATE when
max_create_count is set back to a non-zero value.
Without the patch, create_count remains 0 even after
max_create_count is changed to something != 0.
This causes creates to get stuck in osp_precreate_reserve()
because osp_precreate_send() doesn't send a new request to the OST:
to decide how many objects to precreate (grow), it
uses opd_pre_create_count, which is 0.
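
The intended behaviour can be sketched with a tiny state model. struct demo_osp, DEMO_MIN_PRECREATE and demo_set_max_create_count() are hypothetical; the value 32 is a placeholder rather than the real OST_MIN_PRECREATE, and the clamp to the new maximum is an assumption of this sketch:

```c
/* Placeholder minimum; the real OST_MIN_PRECREATE value is not given here. */
#define DEMO_MIN_PRECREATE 32U

/* Hypothetical reduction of the precreate tunables. */
struct demo_osp {
	unsigned int create_count;
	unsigned int max_create_count;
};

/* Update max_create_count and keep create_count usable. */
static void demo_set_max_create_count(struct demo_osp *osp, unsigned int val)
{
	if (val == 0) {
		/* Disabling precreation also zeroes create_count. */
		osp->create_count = 0;
	} else {
		/* Re-enabling: restart from the minimum instead of staying at 0. */
		if (osp->create_count == 0)
			osp->create_count = DEMO_MIN_PRECREATE;
		/* Assumption: create_count never exceeds the maximum. */
		if (osp->create_count > val)
			osp->create_count = val;
	}
	osp->max_create_count = val;
}
```

Setting the maximum to 0 and then back to a non-zero value leaves create_count at the minimum, so precreate requests can be sent again.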

Lustre-commit: a531ab5f38a6da1de7948df979ae839aa847a370
Lustre-change: https://review.whamcloud.com/33559

Cray-bug-id: LUS-6435
Change-Id: I940c48f91e9c7d49b766bd85ea271ce229424c7f
Signed-off-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Artem Blagodarenko <c17828@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34162
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11750 krb5: krb5int_derive_key has 'hash' extra parameter 61/33961/4
Sebastien Buisson [Mon, 10 Dec 2018 16:57:55 +0000 (01:57 +0900)]
LU-11750 krb5: krb5int_derive_key has 'hash' extra parameter

Since Kerberos 5 release 1.15, which introduced support for
aes-sha2, krb5int_derive_key() takes an additional 'hash' parameter.

Lustre-change: https://review.whamcloud.com/33817
Lustre-commit: 4d1d6ed7849b0532e44f2fd742d4e07b649d6f66

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I7c6ea5ac2d6844371b254b7361d28c462afe5afa
Reviewed-on: https://review.whamcloud.com/33961
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11568 ldlm: Remove use of SLAB_DESTROY_BY_RCU for ldlm lock slab 34/34434/2
Oleg Drokin [Thu, 31 Jan 2019 18:42:43 +0000 (13:42 -0500)]
LU-11568 ldlm: Remove use of SLAB_DESTROY_BY_RCU for ldlm lock slab

Whatever it was doing does not appear to be necessary anymore,
as evidenced by newer kernels, where the define was removed
and Lustre simply disabled its use.
Another important reason to remove it is that rhel7.3+ seems
to have broken this RCU functionality, which leads to frequent
use-after-free errors.

This patch is a backport of:
Lustre-commit: 82d014e71e14671e876055851a0d37e98b4cc079
Lustre-change: https://review.whamcloud.com/34147

Change-Id: I50991b9daf4ef06b24cb65d7a04a5e9b86706d36
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/34434
Tested-by: Jenkins
5 years agoNew release 2.12.0 2.12.0 v2_12_0
Oleg Drokin [Fri, 21 Dec 2018 21:29:22 +0000 (16:29 -0500)]
New release 2.12.0

Change-Id: Icc5da4f5d1d032982a144ee5d13f214b04389d76
Signed-off-by: Oleg Drokin <green@whamcloud.com>
5 years agoNew release candidate 2.12.0-RC4 2.12.0-RC4 v2_12_0-RC4
Oleg Drokin [Fri, 21 Dec 2018 19:43:57 +0000 (14:43 -0500)]
New release candidate 2.12.0-RC4

Change-Id: I5aa92f8a4232f5293e3253223d9447d23b3b0337
Signed-off-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11809 llite: don't use %p to generate cfg_instance 00/33900/4
Andreas Dilger [Thu, 20 Dec 2018 00:48:54 +0000 (17:48 -0700)]
LU-11809 llite: don't use %p to generate cfg_instance

In kernel 4.15 and later, using "%p" in a string format to
print a kernel pointer will result in the pointer being
hashed with a random value, and the high bytes will be masked
on 64-bit CPUs to prevent leaking kernel address-space info
to userspace to defeat ASLR.  In early boot, the "%p" pointer
may resolve to "        (ptrval)", if there is not enough
entropy in the system to generate a random hash value.

The superblock pointer is used on the client to uniquely
identify all of the OBD devices connected to it, and to
find the configuration llog that was used to mount the
filesystem, so that it can also be used at unmount time.
The sb pointer is also used in the OBD device names, and
the "        (ptrval)" expansion breaks /sys filenames,
and also breaks the uniqueness of the config instance.

On the server, there is also a pointer value used for the
FLDB SEQ servers of the OSTs.

For the short term, bypass the "%p" hashing, so that mount
continues to work properly, and this can be resolved in a
later patch to change ll_get_cfg_instance() to provide a
unique value that is not directly a kernel pointer.

In llapi_getname() don't depend on the cfg_instance being
exactly 16 characters long, if this changes in the future.

Test-Parameters: clientdistro=ubuntu1804 testlist=sanity
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I166de0248af8fe57535628a64bb770a4e03ebbe5
Reviewed-on: https://review.whamcloud.com/33900
Tested-by: Jenkins
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11783 build: remove lustre_user.h deprecation warning 72/33872/3
Andreas Dilger [Fri, 14 Dec 2018 20:37:55 +0000 (13:37 -0700)]
LU-11783 build: remove lustre_user.h deprecation warning

The "lustre/lustre_user.h" header has been in use for many years.
The patch https://review.whamcloud.com/25246 "LU-6401 uapi: migrate
remaining uapi headers to uapi directory" moved the header to
"linux/lustre/lustre_user.h" and left a stub "lustre/lustre_user.h"
behind that generates a compiler warning that this header is
deprecated.

However, no window was given between the introduction of the new
header and the deprecation of the old header, which makes it harder
for applications to smoothly transition to the new header location.
Also, installing Lustre headers into the "linux/" directory before
Lustre is actually part of the kernel may potentially cause problems.

Disable the deprecation warning in the old header for several
releases, until the new header location has been available for a good
time and it is safe for applications to assume that it is available.

Test-Parameters: trivial
Fixes: 6712478e79588e73e28c7ccac3afc7ac2368a4f3
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If5a62587e2d3627178a0f7a09c3a4c10801cab07
Reviewed-on: https://review.whamcloud.com/33872
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoNew release candidate 2.12.0-RC3 2.12.0-RC3 v2_12_0-RC3
Oleg Drokin [Mon, 17 Dec 2018 19:44:17 +0000 (14:44 -0500)]
New release candidate 2.12.0-RC3

Change-Id: I841a274fd9ab3b35c8c3a5b93e33469a9e230af5
Signed-off-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11753 obdclass: lu_dirent record length missing '0' 65/33865/3
Lai Siyao [Sun, 9 Dec 2018 12:21:27 +0000 (20:21 +0800)]
LU-11753 obdclass: lu_dirent record length missing '0'

In lu_dirent packing, a '0' is appended after the name, but it's not
counted in the size calculation, which may cause a crash.
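
A sketch of the size calculation, using a hypothetical header layout and an assumed 8-byte record alignment (neither is taken from the real struct lu_dirent):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size header that precedes the name in a packed entry. */
struct demo_dirent_hdr {
	uint64_t hash;
	uint16_t reclen;
	uint16_t namelen;
};

/*
 * Record size: header + name + the terminating NUL, rounded up to a
 * multiple of 8 bytes (the alignment is an assumption of this sketch).
 */
static size_t demo_dirent_size(size_t namelen)
{
	size_t size = sizeof(struct demo_dirent_hdr) + namelen + 1;

	return (size + 7) & ~(size_t)7;
}
```

Leaving out the "+ 1" only bites when header plus name already ends on an alignment boundary: the NUL is then written one byte past the record.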

Add sanity test_230l.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Iab4947dea8e26ea798d5f64e218268200a5fabe8
Reviewed-on: https://review.whamcloud.com/33865
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11753 obdclass: index_page support variable length rec 37/33837/5
Lai Siyao [Sat, 8 Dec 2018 22:08:14 +0000 (06:08 +0800)]
LU-11753 obdclass: index_page support variable length rec

mdd_dir_is_empty() may readdir from another MDT if the directory
is striped or remote. In this case, it issues an OBD_IDX_READ
RPC to fetch the dir page, and on the remote MDT dt_index_page_build()
is called to build the page. This function doesn't support
variable-length records, so it may miscalculate the offset when
reading, which may cause a crash.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ia25a0aca52fb1323ea64a7ff72bf6022754af32c
Reviewed-on: https://review.whamcloud.com/33837
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11729 tests: skip sanity test 810 for ARM 64/33864/2
Andreas Dilger [Fri, 14 Dec 2018 06:48:40 +0000 (23:48 -0700)]
LU-11729 tests: skip sanity test 810 for ARM

Skip sanity.sh test_810 for ARM clients as it always fails.

Test-Parameters: trivial clientarch=aarch64 testlist=sanity
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I84117aebb277d4ddcb7787b715587e330f3ebbe5
Reviewed-on: https://review.whamcloud.com/33864
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11770 osd-ldiskfs: preserve bio_integrity API 40/33840/5
Andreas Dilger [Wed, 12 Dec 2018 23:08:18 +0000 (16:08 -0700)]
LU-11770 osd-ldiskfs: preserve bio_integrity API

Preserve the existing kernel API for bio_integrity when the T10-PI
patches are applied, so that any other code that may be using this
interface does not break.

In particular, keep the EXPORT_SYMBOL(bio_integrity_alloc) and
EXPORT_SYMBOL(bio_integrity_prep) in place to avoid module breakage.

In struct bio_integrity_payload put the *bip_generate_fn and
*bip_verify_fn pointers after *bip_vec, since bip_vec is the last
field directly accessed by callers.

In struct blk_integrity_exchg the bi_idx field only needs to be an
unsigned short since the bio->bi_idx and bio->bi_vcnt values used
with it are also unsigned short.  This saves 8 bytes of padding in
the struct and puts the added bi_bio field at the end to preserve
the structure field alignment for external callers.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9e29723d5d581a65b1c2ca2611d012c05b953514
Reviewed-on: https://review.whamcloud.com/33840
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11753 utils: print out DNE2 directory hash flags 43/33843/2
Andreas Dilger [Thu, 13 Dec 2018 00:37:50 +0000 (17:37 -0700)]
LU-11753 utils: print out DNE2 directory hash flags

There may be flags stored in the lmv_hash_type field, such as
"LMV_HASH_FLAG_MIGRATION" that is set while the directory is
being migrated.  Print out the flag from "lfs getdirstripe".

This is still missing support for "lfs find" to find directories
that have incomplete migration.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib6b362f9eb993b5fa0562b3a51b54eaee1ccab07
Reviewed-on: https://review.whamcloud.com/33843
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11783 utils: fix warnings when lustre_user.h included 76/33876/4
Andreas Dilger [Fri, 14 Dec 2018 22:53:25 +0000 (15:53 -0700)]
LU-11783 utils: fix warnings when lustre_user.h included

Checking for lustre/lustre_user.h in a configure script
generates a warning because of the included <sys/quota.h>

  checking lustre/lustre_user.h usability... no
  checking lustre/lustre_user.h presence... yes
  WARNING: present but cannot be compiled
  WARNING: check for missing prerequisite headers?
  WARNING: see the Autoconf documentation
  WARNING: section "Present But Cannot Be Compiled"
  WARNING: proceeding with the preprocessor's result
  WARNING: in the future, the compiler will take precedence

Looking into config.log it shows:

  In file included from /usr/include/lustre/lustre_user.h:59,
                   from conftest.c:91:
  /usr/include/sys/quota.h:221: error: expected declaration
    specifiers or '...' before 'caddr_t'

Since we don't really need much from the <sys/quota.h> header,
add conditional #defines for the few needed fields.

The FASYNC constant is not declared everywhere in userspace,
provide a compat declaration if unavailable.

Fix an unused variable warning in ll_dir_ioctl().

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9cd2b0fcbaf16fe8a5a4a7a0309aada3a72cab07
Reviewed-on: https://review.whamcloud.com/33876
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoNew release candidate 2.12.0-RC2 2.12.0-RC2 v2_12_0-RC2
Oleg Drokin [Sat, 8 Dec 2018 05:41:28 +0000 (00:41 -0500)]
New release candidate 2.12.0-RC2

Change-Id: I84a1bcb460331928bd4987f33232c22a40b3d58c
Signed-off-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11740 kernel: new kernel [RHEL7.6 4.14.0-115.2.2.el7a] 04/33804/2
Minh Diep [Fri, 7 Dec 2018 15:28:32 +0000 (07:28 -0800)]
LU-11740 kernel: new kernel [RHEL7.6 4.14.0-115.2.2.el7a]

This patch supports the new RHEL 7.6 release on ARM.

Test-Parameters: trivial

Change-Id: I79424f0759b79e0a2f45ea5337c3577f832dccb1
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33804
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11684 config: fix conf-sanity test_123 20/33720/3
Ben Evans [Tue, 20 Nov 2018 20:40:20 +0000 (15:40 -0500)]
LU-11684 config: fix conf-sanity test_123

conf_param parameters go into FSNAME-MDT/OST/client files
set_param -P parameters go into "params"

Change the test to set the conf_param parameter files first
followed by the set_param -P parameters, since there may be
overlap.  In this case, the test infrastructure is using
conf_param to set jobid_var, and test_123 is using set_param.
This causes collision and occasional failure.

Test-Parameters: trivial testlist=conf-sanity

Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: I5cbdbaf6cc0c1c55a870bd587e89b2cbdaf77c29
Reviewed-on: https://review.whamcloud.com/33720
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years agoRevert "LU-11152 lnd: test fpo_fmr_poool pointer instead of special bool" 02/33802/3
Amir Shehata [Thu, 6 Dec 2018 20:52:22 +0000 (20:52 +0000)]
Revert "LU-11152 lnd: test fpo_fmr_poool pointer instead of special bool"

This reverts commit 9b790ba0f5606c0a91563828fa43f5e4ae210425.

Change-Id: Ibca8e813ec7372510709578e33309140e8fc7b5f
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33802
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11734 lnet: handle multi-md usage 94/33794/2
Amir Shehata [Wed, 5 Dec 2018 21:57:11 +0000 (13:57 -0800)]
LU-11734 lnet: handle multi-md usage

The MD can be used multiple times. The response tracker needs to have
the same lifespan as the MD. If we re-use the MD and a response
tracker has already been attached to it, then we'll update the
deadline for the response tracker. This means the deadline on the MD
is for its last user.
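
The behaviour described above can be sketched roughly as follows. This
is a simplified model, not the LNet code: the class and function names
(MemoryDescriptor, ResponseTracker, attach_tracker) are hypothetical,
and time is passed in explicitly to keep the example deterministic.

```python
class ResponseTracker:
    """Hypothetical tracker holding the deadline for a pending response."""

    def __init__(self, deadline):
        self.deadline = deadline


class MemoryDescriptor:
    """Hypothetical MD; the tracker lives exactly as long as the MD."""

    def __init__(self):
        self.tracker = None


def attach_tracker(md, timeout, now):
    """Attach a tracker to md, or refresh the deadline if one exists."""
    deadline = now + timeout
    if md.tracker is None:
        # First use of this MD: create the tracker.
        md.tracker = ResponseTracker(deadline)
    else:
        # MD is being re-used: keep the tracker, move the deadline so
        # that it reflects the most recent user.
        md.tracker.deadline = deadline
    return md.tracker
```

With this model, a second call on the same MD returns the same tracker
object with a later deadline, which matches the "deadline is for its
last user" rule.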

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I681630c3d599f66c007926525708e3004b343455
Reviewed-on: https://review.whamcloud.com/33794
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11440 misc: require ldiskfsprogs-1.44.3.wc1 or later 66/33766/4
Andreas Dilger [Sat, 1 Dec 2018 18:07:30 +0000 (11:07 -0700)]
LU-11440 misc: require ldiskfsprogs-1.44.3.wc1 or later

Require a current version of ldiskfsprogs to include support for
project quotas and large_dir.  Upstream now also includes ea_inode
support and many bug fixes in xattr verification and repair.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I24eeb1c30d5c7b1daa1ad7d5d2f603d273054035
Reviewed-on: https://review.whamcloud.com/33766
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoNew Release candidate 2.12.0-RC1 2.12.0-RC1 v2_12_0-RC1
Oleg Drokin [Wed, 5 Dec 2018 03:01:23 +0000 (22:01 -0500)]
New Release candidate 2.12.0-RC1

Change-Id: I5464c8f19a06e3dd35b7704bf4359726514e62ad
Signed-off-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11389 lnet: increase lnet transaction timeout 31/33231/8
Sonia Sharma [Mon, 17 Sep 2018 17:50:42 +0000 (13:50 -0400)]
LU-11389 lnet: increase lnet transaction timeout

Increase the new LNet Health transaction timeout to the original
50s value, to avoid spurious lnet-selftest failures and expected
false timeouts under load.

Fix the lnet_transaction_timeout module parameter description.

Test-Parameters: trivial clientarch=aarch64 testlist=lnet-selftest,lnet-selftest,lnet-selftest,lnet-selftest
Test-Parameters: clientarch=aarch64 testlist=lnet-selftest,lnet-selftest,lnet-selftest,lnet-selftest

Signed-off-by: Sonia Sharma <sharmaso@whamcloud.com>
Change-Id: Ic9df69ef8c7a4085815b54dc6741f37a73d36a75
Reviewed-on: https://review.whamcloud.com/33231
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
5 years agoLU-10576 tests: allow log files to be created/removed 77/33677/4
Andreas Dilger [Fri, 16 Nov 2018 21:41:28 +0000 (14:41 -0700)]
LU-10576 tests: allow log files to be created/removed

Allow an llog file to be created or removed during the course of
the test, as this can happen due to internal housekeeping activity.

Also ensure that background cleanup has finished with ZFS before
fetching the number of objects from the MDT.

Test-Parameters: trivial
Test-Parameters: testlist=sanity mdscount=2 mdtcount=4 mdtfilesystemtype=zfs ostfilesystemtype=zfs
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0a0968cfcd90c7493c67b54ba8a7f326163ebbe5
Reviewed-on: https://review.whamcloud.com/33677
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11697 ost: do not reuse T10PI guards of unaligned page write 52/33752/6
Li Xi [Thu, 29 Nov 2018 14:51:44 +0000 (09:51 -0500)]
LU-11697 ost: do not reuse T10PI guards of unaligned page write

If the write is partial page, the guards of RPC checksum should not
be reused for bio submission since the data might not be full-sector.
The bio guards will be generated later based on the full sectors. If
the sector size is 512B rather than 4 KB, or the page size on OST is
larger than 4KB, this might drop some useful guards for partial page
write, but it will only add minimal extra time of checksum calculation.
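
A minimal sketch of the decision described above, assuming a 4 KB page
and a hypothetical helper name (can_reuse_rpc_guards); the actual check
in the OST code may be structured differently.

```python
PAGE_SIZE = 4096  # assumed page size for the illustration


def can_reuse_rpc_guards(offset_in_page, length, page_size=PAGE_SIZE):
    """Return True only when the write covers one whole page.

    For a partial-page write the RPC guards are not reused; the bio
    guards are generated later from the full sectors instead.
    """
    return offset_in_page == 0 and length == page_size
```

The check is deliberately conservative: it may discard guards that
would still be valid for some sector sizes, trading a small amount of
checksum recalculation for simplicity.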

Change-Id: I868342df87c28ea91f5f8364fe377277595ecf6d
Signed-off-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/33752
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11697 osc: wrong page offset for T10PI checksum 27/33727/5
Li Xi [Tue, 27 Nov 2018 07:20:31 +0000 (02:20 -0500)]
LU-11697 osc: wrong page offset for T10PI checksum

The page offset might be a non-zero value. Thus, when calculating
the T10PI checksum, the correct offset should be used.

Change-Id: Ib32584eb47ea55ec3804e531ac02ffd252411886
Signed-off-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/33727
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11663 osd-zfs: write partial pages with correct offset 26/33726/4
Alex Zhuravlev [Tue, 27 Nov 2018 06:47:50 +0000 (09:47 +0300)]
LU-11663 osd-zfs: write partial pages with correct offset

Write partial pages at the correct offset; otherwise non-aligned
writes send the wrong data to ZFS.

Change-Id: I1ae1f361981d548307d74344a5694f3ef39c0609
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33726
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11652 kernel: kernel update [SLES12 SP3 4.4.162-94.69] 37/33637/3
Jian Yu [Wed, 21 Nov 2018 08:41:58 +0000 (00:41 -0800)]
LU-11652 kernel: kernel update [SLES12 SP3 4.4.162-94.69]

Update SLES12 SP3 kernel to 4.4.162-94.69.

Test-Parameters: clientdistro=sles12sp3 ossdistro=sles12sp3 mdsdistro=sles12sp3

Change-Id: Iea1ec6def609059d67053d25360e8e986f2adbd9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33637
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11668 mdt: check parent type in rename/migrate 09/33709/2
Lai Siyao [Thu, 25 Oct 2018 21:58:49 +0000 (05:58 +0800)]
LU-11668 mdt: check parent type in rename/migrate

Check parent existence and type in rename and migrate to avoid
potential race.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I583e2f5a6f47073601e36c06890a6b22dfc734ad
Reviewed-on: https://review.whamcloud.com/33709
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11668 debug: print object type in mdd_parent_fid 00/33700/4
Lai Siyao [Thu, 25 Oct 2018 16:10:34 +0000 (00:10 +0800)]
LU-11668 debug: print object type in mdd_parent_fid

mdd_parent_fid() gets the parent FID of a directory, but racer shows
that the passed-in object is not a directory. Print its type to help
debugging.

Test-Parameters: trivial
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Peter Jones <pjones@whamcloud.com>
Change-Id: I4b3eedb159fc0efccc15e35cb59010fe02fa9e01
Reviewed-on: https://review.whamcloud.com/33700
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
5 years agoLU-10114 hsm: increase upper limit of maximum HSM backends registered with MDT 97/32197/32
Teddy Zheng [Tue, 13 Nov 2018 12:42:34 +0000 (07:42 -0500)]
LU-10114 hsm: increase upper limit of maximum HSM backends registered with MDT

Lustre supports at most 32 HSM backends, which limits the use of HSM by
other features such as LPCC. This patch removes the limitation by allowing
the system to take any integer as a valid archive-id.

Test-Parameters: clientjob=lustre-b2_10 clientbuildno=136 testlist=sanity-hsm
Test-Parameters: mdsjob=lustre-b2_10 ossjob=lustre-b2_10 serverbuildno=136 testlist=sanity-hsm

Change-Id: I9523b92500b962db3e45a2bd6a67dba54eef5335
Signed-off-by: Teddy Zheng <teddy@ddn.com>
Signed-off-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/32197
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-5152 quota: disable sync chgrp to OSTs 05/33705/2
Hongchao Zhang [Wed, 14 Nov 2018 23:10:46 +0000 (18:10 -0500)]
LU-5152 quota: disable sync chgrp to OSTs

The synchronous chgrp to OSTs introduced by the previous patch
(commit 8a71fd5061bd073e055e6cbba1d238305e6827bb) causes a
deadlock between the MDT and OST. This patch disables it for now and
leaves the fix to an updated patch.

Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: I5ce48424f4d2011ce62e69047ace7f0b7c3ebbe5
Reviewed-on: https://review.whamcloud.com/33705
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11653 hsm: copytool registration wakes the coordinator 49/33649/8
Quentin Bouget [Mon, 12 Nov 2018 19:50:20 +0000 (20:50 +0100)]
LU-11653 hsm: copytool registration wakes the coordinator

When a copytool registers to the MDS, it is possible there are
pending requests in the coordinator's llog that previously could not
be sent (either because there were not any copytools, or not any
compatible copytools).

With this patch, the coordinator will process those requests on its
next wake up (which happens every second).

Test-Parameters: trivial
Test-Parameters: mdscount=2 mdtcount=4 mdtfilesystemtype=zfs testlist=sanity-hsm
Test-Parameters: mdscount=2 mdtcount=4 mdtfilesystemtype=ldiskfs testlist=sanity-hsm
Test-Parameters: mdscount=2 mdtcount=4 mdtfilesystemtype=zfs testlist=sanity-hsm
Test-Parameters: mdscount=2 mdtcount=4 mdtfilesystemtype=ldiskfs testlist=sanity-hsm

Signed-off-by: Quentin Bouget <quentin.bouget@cea.fr>
Change-Id: Ie49b40d312f2f3e0d9c85dee27bb8813dc4dde40
Reviewed-on: https://review.whamcloud.com/33649
Tested-by: Jenkins
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11541 build: Use correct kernel version for DKMS MLNX OFED. 02/33702/3
Ake Sandgren [Wed, 21 Nov 2018 18:00:53 +0000 (19:00 +0100)]
LU-11541 build: Use correct kernel version for DKMS MLNX OFED.

Check for $O2IBPATHS/${LINUXRELEASE} before $O2IBPATHS/default.
The "default" link is only created on the first dkms build of OFED.
So if doing dkms build for another kernel it may not be pointing
to the correct kernel.
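
The lookup order above can be sketched as follows. The function name
and the injectable exists predicate are assumptions made for the
example; the real check lives in the build's configure logic.

```python
import os


def pick_ofed_dir(o2ib_path, linux_release, exists=os.path.isdir):
    """Prefer the directory for the kernel being built over "default".

    The "default" link may point at whichever kernel OFED was first
    built for, so it is only used as a fallback.
    """
    for name in (linux_release, "default"):
        candidate = os.path.join(o2ib_path, name)
        if exists(candidate):
            return candidate
    return None
```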

Test-Parameters: trivial

Signed-off-by: Ake Sandgren <ake.sandgren@hpc2n.umu.se>
Change-Id: If8054ee64cf8795ed0e3ee50a8ef9ced067059d7
Reviewed-on: https://review.whamcloud.com/33702
Tested-by: Jenkins
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11567 utils: llog_reader print changelog index 73/33473/8
Olaf Faaland [Wed, 24 Oct 2018 21:23:50 +0000 (14:23 -0700)]
LU-11567 utils: llog_reader print changelog index

When processing changelog type llogs, print the changelog index number
with each changelog record.  This allows one to compare the records on
disk with the output of lfs changelog or relatives.

Test-Parameters: trivial
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: I0059cc34b39161462b3eadbb2512dc811c38705a
Reviewed-on: https://review.whamcloud.com/33473
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11329 utils: Add maintainer entries 84/33684/5
Patrick Farrell [Mon, 19 Nov 2018 15:32:12 +0000 (09:32 -0600)]
LU-11329 utils: Add maintainer entries

Add clio reviewer entry.

Add Grant subsystem section.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I461b39fe0b65f8d48b3912086927c945b2da3db7
Reviewed-on: https://review.whamcloud.com/33684
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
5 years agoLU-11645 tests: fix sanity-sec test 31 22/33622/2
Sebastien Buisson [Thu, 8 Nov 2018 07:55:01 +0000 (16:55 +0900)]
LU-11645 tests: fix sanity-sec test 31

In sanity-sec test 31, command to add new LNet network ${NETTYPE}999
may fail if servers have interface names different from the one used
on the client.
So fix the command so that it is run directly on each node.

Test-Parameters: trivial envdefinitions=ONLY=31 testlist=sanity-sec
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ibf9101524d94188b3beae3debe45e2ba151999ca
Reviewed-on: https://review.whamcloud.com/33622
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11572 tests: make sanity-hsm test_260c reliable 78/33478/5
Quentin Bouget [Tue, 13 Nov 2018 21:52:24 +0000 (16:52 -0500)]
LU-11572 tests: make sanity-hsm test_260c reliable

When the coordinator restarts, its first run does housekeeping.
For the following `loop_period' seconds, it will not run in
housekeeping mode.

This patch uses this to improve the reliability of sanity-hsm
test_260c.

Test-Parameters: trivial testlist=sanity-hsm,sanity-hsm,sanity-hsm
Signed-off-by: Quentin Bouget <quentin.bouget@cea.fr>
Change-Id: I8eeb8b6856d91b69495d592cdd1cb5f091b1cc2b
Reviewed-on: https://review.whamcloud.com/33478
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11519 hsm: improve the testing of hsm.max_requests 90/33590/5
Quentin Bouget [Tue, 6 Nov 2018 12:48:33 +0000 (12:48 +0000)]
LU-11519 hsm: improve the testing of hsm.max_requests

Modify sanity-hsm::test_250() to send more diverse HSM requests.

This brings better code coverage (it would have caught the bug
reported in LU-11519).

Test-Parameters: trivial testlist=sanity-hsm,sanity-hsm,sanity-hsm
Signed-off-by: Quentin Bouget <quentin.bouget@cea.fr>
Change-Id: I0142b46dc804f649c33deb81efea7f68d1e29afa
Reviewed-on: https://review.whamcloud.com/33590
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11662 llite: handle -ENODATA in ll_layout_fetch() 65/33665/3
John L. Hammond [Thu, 15 Nov 2018 17:08:57 +0000 (11:08 -0600)]
LU-11662 llite: handle -ENODATA in ll_layout_fetch()

In ll_layout_fetch() handle -ENODATA returns from mdc_getxattr(). This
is needed for interop and restores the behavior from before commit
0f42b388432c4b898857660197ef13a40a82cd9d (LU-11380 mdc: move empty
xattr to mdc layer) landed.

Test-Parameters: clientjob=lustre-b2_10 clientbuildno=136 testlist=sanity-hsm
Test-Parameters: mdsjob=lustre-b2_10 ossjob=lustre-b2_10 serverbuildno=136 testlist=sanity-hsm
Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I1fb85faf35c7d2303a1f61a5a9c5922988739817
Reviewed-on: https://review.whamcloud.com/33665
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11595 mdt: fix read-on-open for big PAGE_SIZE 06/33606/7
Mikhail Pershin [Wed, 7 Nov 2018 13:31:57 +0000 (16:31 +0300)]
LU-11595 mdt: fix read-on-open for big PAGE_SIZE

The client PAGE_SIZE can be larger than the server's, so data returned
from the server along with OPEN can be misaligned on the client.

Patch replaces assertion on client with check and graceful exit,
changes MDC_DOM_DEF_INLINE_REPSIZE to be PAGE_SIZE at least and
updates mdt_dom_read_on_open() to return file tail for maximum
possible page size that can fit into reply.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ic2c54b95c814d3b6df3b527527cac08488060651
Reviewed-on: https://review.whamcloud.com/33606
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11582 llite: protect reading inode->i_data.nrpages 39/33639/3
Bobi Jam [Sun, 11 Nov 2018 08:41:21 +0000 (16:41 +0800)]
LU-11582 llite: protect reading inode->i_data.nrpages

truncate_inode_pages() looks up pages in the radix tree without
holding a lock, and could miss pages removed from the radix tree
by __remove_mapping(). After calling truncate_inode_pages() we
therefore need to read the nrpages of inode->i_data under the
protection of tree_lock.

Otherwise the read could still fall within the race window of
__remove_mapping()->__delete_from_page_cache()->
page_cache_tree_delete(), before nrpages has been decreased.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I44ba6bea3dec4f0a110d1ae2a749514ec7dd0d12
Reviewed-on: https://review.whamcloud.com/33639
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11466 mdt: Skip SOM xattr update for DoM-only files 31/33331/11
Qian Yingjin [Wed, 10 Oct 2018 07:55:54 +0000 (15:55 +0800)]
LU-11466 mdt: Skip SOM xattr update for DoM-only files

When scanning the MDT image, DoM-only files can be handled specially:
the size and blocks can be obtained directly from the inode on the MDT,
without needing the SOM xattr.
Thus, there is no need to store the SOM xattr for DoM-only files.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I0f871cde38fc846460dd1b6f92509dee9ea90bfc
Reviewed-on: https://review.whamcloud.com/33331
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-9793 ptlrpc: Do not map unrecognized ELDLM errnos to EIO 71/33471/3
Ann Koehler [Tue, 13 Nov 2018 16:47:20 +0000 (11:47 -0500)]
LU-9793 ptlrpc: Do not map unrecognized ELDLM errnos to EIO

The lustre_errno_hton and lustre_errno_ntoh functions map between
host and network error numbers before they are sent over the network.
If an errno is unrecognized then it is mapped to EIO.

However an optimization for x86 and i386 architectures replaced the
functions with macros that simply return the original errno. The
result is that x86 and i386 return the original values for ELDLM
errnos and all other architectures return EIO. This difference is
known to break glimpse lock callback handling which depends on clients
responding with ELDLM_NO_LOCK_DATA. The difference in errnos may
result in other as yet unidentified bugs.

The fix defines mappings for the ELDLM errors that leaves the values
unchanged. Error numbers not found in the mapping tables are still
mapped to EIO.
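
The mapping rule can be illustrated with a small table-driven sketch.
It is not the ptlrpc implementation: the function name, the table
contents, and the numeric value assigned to ELDLM_NO_LOCK_DATA are
assumptions made for the example.

```python
import errno

ELDLM_NO_LOCK_DATA = 303  # assumed value, for illustration only

# Known errnos map to their wire value; ELDLM codes map to themselves.
HTON_TABLE = {
    errno.ENOENT: errno.ENOENT,
    errno.EAGAIN: errno.EAGAIN,
    ELDLM_NO_LOCK_DATA: ELDLM_NO_LOCK_DATA,
}


def errno_hton_sketch(host_errno):
    """Map a host errno to its wire value; unknown values become EIO."""
    if host_errno == 0:
        return 0
    return HTON_TABLE.get(host_errno, errno.EIO)
```

With an identity entry in the table, a glimpse callback reply carrying
ELDLM_NO_LOCK_DATA survives the conversion on every architecture,
while errnos missing from the table still collapse to EIO.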

Cray-bug-id: LUS-6057
Signed-off-by: Ann Koehler <amk@cray.com>
Change-Id: I0b4e1e0dc6de065729e18f2381ec9cfc58fe31db
Reviewed-on: https://review.whamcloud.com/33471
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11519 hsm: handle hsd_request_count == 0 properly 80/33580/6
John L. Hammond [Mon, 5 Nov 2018 17:48:55 +0000 (11:48 -0600)]
LU-11519 hsm: handle hsd_request_count == 0 properly

In mdt_cdt_waiting_cb() it may be that the coordinator has already
reached the limit of active requests and hsd contains no requests to
be started. Handle this properly when trying to prioritize a restore.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ic843b7672ae6a4509ac127c2d2f90bf3681f84fc
Reviewed-on: https://review.whamcloud.com/33580
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Ben Evans <bevans@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11071 build: use --with-linux-obj for ubuntu 45/33145/10
Li Dongyang [Sat, 17 Nov 2018 03:09:17 +0000 (22:09 -0500)]
LU-11071 build: use --with-linux-obj for ubuntu

We can use --with-linux-obj instead of --with-linux-config, since
the config will be found directly under LINUX_OBJ.
Otherwise configure could fail if LINUX and LINUX_OBJ
are different paths, e.g. on ubuntu18.

Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Ibaa159e5611c4054b3987da0ce3a8bca2992e057
Reviewed-on: https://review.whamcloud.com/33145
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11642 lmv: allocate fid on parent MDT in migrate 41/33641/3
Lai Siyao [Sun, 21 Oct 2018 22:48:17 +0000 (06:48 +0800)]
LU-11642 lmv: allocate fid on parent MDT in migrate

During directory migration, if the migrated file is not a directory,
the target should be allocated on its parent MDT, not the user-specified
MDT. If its parent is striped, this file should be migrated
to the MDT selected by its name hash, not the starting MDT of its parent.
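
A rough sketch of the placement rule, assuming a stand-in hash:
zlib.crc32 is used here only for illustration and is not the hash
Lustre applies to striped directories; the function name is likewise
hypothetical.

```python
import zlib


def target_mdt_for_file(name, parent_stripe_mdts):
    """Pick the MDT for a non-directory entry under its parent.

    parent_stripe_mdts lists the MDT index of each stripe of the
    parent directory, starting with the parent's first (master) MDT.
    """
    if len(parent_stripe_mdts) == 1:
        # Unstriped parent: the file lives on the parent's MDT.
        return parent_stripe_mdts[0]
    # Striped parent: the stripe is chosen by hashing the file name
    # (crc32 is a placeholder for the real directory hash).
    stripe = zlib.crc32(name.encode("utf-8")) % len(parent_stripe_mdts)
    return parent_stripe_mdts[stripe]
```

Note that the user-specified MDT does not appear in the function at
all: for non-directory entries only the parent's layout and the file
name decide the target.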

Add sanity 230k to check file data not changed after migration.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ic7d3de8ea982b7cf4da758e4d3ab8d8ee15ecfdb
Reviewed-on: https://review.whamcloud.com/33641
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11642 mdt: revoke remote LOOKUP lock in dir layout shrink 40/33640/2
Lai Siyao [Sun, 21 Oct 2018 22:44:21 +0000 (06:44 +0800)]
LU-11642 mdt: revoke remote LOOKUP lock in dir layout shrink

mdt_dir_layout_shrink() should revoke remote LOOKUP lock if parent
is remote, because it will alter dir layout, which is refreshed
upon lookup.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I26ae1af5da6142b44005e5d9ea11293af65ed7b5
Reviewed-on: https://review.whamcloud.com/33640
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11497 tests: improve ha.sh to set striped dirs 40/33340/3
Elena Gryaznova [Wed, 10 Oct 2018 16:10:34 +0000 (19:10 +0300)]
LU-11497 tests: improve ha.sh to set striped dirs

For DNE II testing we need to be able to create
striped directories, as is done in t-f:test_mkdir().
This patch covers the following settings:
-c stripe_count -i mdt_index
-c stripe_count -i <random mdt_index>,
which allow the clients' work directories to have
the same stripe_count but different MDT indexes.

Test-Parameters:trivial
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-5974
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Change-Id: I6117601e741a95059750149d1c38b402fccb29b7
Reviewed-on: https://review.whamcloud.com/33340
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11494 tests: sanity-quota/22 syntax error fix 37/33337/2
Elena Gryaznova [Wed, 10 Oct 2018 14:57:10 +0000 (17:57 +0300)]
LU-11494 tests: sanity-quota/22 syntax error fix

Patch fixes test_22() trivial syntax error.

Test-Parameters:trivial testlist=sanity-quota envdefinitions=ONLY=22
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-6011
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Change-Id: Icb3474c7166d6b74b3092617bb314a561687df06
Reviewed-on: https://review.whamcloud.com/33337
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
5 years agoLU-11492 tests: fix thread_sanity() defect 35/33335/2
Elena Gryaznova [Wed, 10 Oct 2018 14:20:10 +0000 (17:20 +0300)]
LU-11492 tests: fix thread_sanity() defect

tmin, tmax, tstarted are not set if do_facet fails for some reason.
This leads to the following failure:
Assertion 28 failed: (($tstarted >= $tmin && $tstarted <= $tmax ))
   (expanded: ((16 >=  && 16 <= 16 )))
The patch sets the variables to 0 when do_facet fails, as is
done for other similar cases.

Test-Parameters:trivial testlist=sanity envdefinitions=ONLY="53a 53b"
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-5638
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Change-Id: If713ba20a0f71cb17208f776a9a4edd359c07c43
Reviewed-on: https://review.whamcloud.com/33335
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>