Whamcloud - gitweb
fs/lustre-release.git
5 years ago LU-11062 libcfs: use save_stack_trace for stack dump 52/32952/3
Yang Sheng [Tue, 7 Aug 2018 16:24:19 +0000 (00:24 +0800)]
LU-11062 libcfs: use save_stack_trace for stack dump

The stacktrace_ops structure has been removed recently, so we
have to use save_stack_trace_tsk for the stack trace
dump.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Icb3d0dbd62c35fdd9b8de925aec9358a2208814f
Reviewed-on: https://review.whamcloud.com/32952
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11176 systemd: use univeral path for modprobe 44/32944/3
James Simmons [Mon, 6 Aug 2018 20:00:58 +0000 (16:00 -0400)]
LU-11176 systemd: use univeral path for modprobe

The program modprobe is not located in the same place on all
platforms. On RHEL systems it is located in /usr/sbin. On
Ubuntu/Debian, which is busybox based, /sbin/modprobe is a symlink
to /bin/kmod. To keep some sort of standard, a symlink for modprobe
exists in /sbin on all platforms. Update the lnet.service script to
use the hard path /sbin/modprobe.

Test-Parameters: trivial

Change-Id: I54342971a6ee1aa4ce86a9fae0ac4dcb167b1510
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32944
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-10940 tests: skip sanity test 802 when quota enabled 00/32900/2
James Nunez [Mon, 30 Jul 2018 16:02:17 +0000 (10:02 -0600)]
LU-10940 tests: skip sanity test 802 when quota enabled

If ENABLE_QUOTA is set, sanity test 802 will try to set
the quota type on read-only targets. Setting quota requires
changes to the targets and, thus, does not make sense for
this test. sanity test 802 should be skipped if ENABLE_QUOTA
is set.

Test-Parameters: trivial envdefinitions=ENABLE_QUOTA=yes,ONLY=802 testlist=sanity
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ic9c245045961867b7dc93be9268e6f4a4631c1dc
Reviewed-on: https://review.whamcloud.com/32900
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11171 tests: set parameters for racer_on_nfs 80/32880/3
James Nunez [Wed, 25 Jul 2018 16:59:33 +0000 (10:59 -0600)]
LU-11171 tests: set parameters for racer_on_nfs

The parallel-scale-nfs script calls the racer test without
specifying a directory in which to create files, directories,
etc. In addition, racer needs a few other global
parameters to work properly, including the number of OSTs and
MDTs and which LFS to use.

Test-Parameters: trivial testlist=parallel-scale-nfsv3,parallel-scale-nfsv4
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ic4f5f08ddec7a8df5cb818b434aa3473f6cd72cb
Reviewed-on: https://review.whamcloud.com/32880
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-9007 lod: get rid of comp ost in use array 13/32813/3
Bobi Jam [Thu, 12 Jul 2018 22:09:56 +0000 (16:09 -0600)]
LU-9007 lod: get rid of comp ost in use array

Use lod_layout_component::llc_ost_indices to serve the same purpose.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I66c89fe6349b48b89593e34e9e985ec6ea5a1758
Reviewed-on: https://review.whamcloud.com/32813
Reviewed-by: Patrick Farrell <paf@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11109 mdt: handle zero length xattr values correctly 55/32755/3
John L. Hammond [Mon, 2 Jul 2018 19:52:01 +0000 (14:52 -0500)]
LU-11109 mdt: handle zero length xattr values correctly

In mdt_getxattr(), set OBD_MD_FLXATTR in mbo_valid of the reply's MDT
body so that the client can distinguish between nonexistent extended
attributes and zero length values. In ll_xattr_list() and
ll_getxattr_common() test for OBD_MD_FLXATTR and return 0 rather than
-ENODATA in the appropriate cases. Add sanity test_102t() to test that
zero length values are handled correctly.
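
The client-side distinction described above can be sketched as follows.
This is a minimal Python model, not the llite code: the function name
getxattr_result() is hypothetical, and the numeric value used for
OBD_MD_FLXATTR is a placeholder chosen only for illustration.

```python
import errno

# Placeholder bit for illustration only; the real OBD_MD_FLXATTR value
# is defined in the Lustre headers.
OBD_MD_FLXATTR = 1 << 0


def getxattr_result(mbo_valid: int, value: bytes) -> int:
    """Return the xattr length, or -ENODATA if the xattr does not exist.

    Hypothetical helper modelling the check described in the commit
    message: an empty value is only "zero length" when the server set
    OBD_MD_FLXATTR in mbo_valid; otherwise the attribute is missing.
    """
    if value:
        return len(value)
    if mbo_valid & OBD_MD_FLXATTR:
        return 0                 # attribute exists with a zero length value
    return -errno.ENODATA        # attribute does not exist
```

Without the flag, both cases would look identical to the client, since
an empty reply buffer carries no information by itself.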

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I15649581c26dc52e83ca714b44f8372f29954ed5
Reviewed-on: https://review.whamcloud.com/32755
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
5 years ago LU-10114 hsm: add OBD_CONNECT2_ARCHIVE_ID_ARRAY to pass archive_id lists in array 06/32806/5
Teddy Zheng [Fri, 27 Jul 2018 05:37:18 +0000 (13:37 +0800)]
LU-10114 hsm: add OBD_CONNECT2_ARCHIVE_ID_ARRAY to pass archive_id lists in array

Clients registered with the MDS with OBD_CONNECT2_ARCHIVE_ID_ARRAY will
use an array to pass archive IDs, while clients without it still
use a bitmap. This flag allows old clients to connect to new MDSs.
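
The difference between the two encodings can be sketched as follows.
This is a hedged Python model: pack_archive_ids() is a hypothetical
name, and the 32-bit bitmap width and the "bit (id - 1)" numbering are
assumptions made for illustration, not taken from this patch.

```python
BITMAP_BITS = 32  # assumed bitmap width, for illustration only


def pack_archive_ids(archive_ids, has_array_flag):
    """Encode archive IDs as an array (new clients) or a bitmap (old clients)."""
    if has_array_flag:
        # With the connect flag, IDs are passed as a plain list, so the
        # bitmap width no longer limits which IDs can be expressed.
        return ("array", sorted(archive_ids))

    bitmap = 0
    for archive_id in archive_ids:
        if not 1 <= archive_id <= BITMAP_BITS:
            raise ValueError(f"archive id {archive_id} does not fit in the bitmap")
        bitmap |= 1 << (archive_id - 1)
    return ("bitmap", bitmap)
```

In this model an ID such as 40 can only be expressed by a client that
negotiated the array format.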

Test-Parameters: trivial
Change-Id: I61a691fc262fdc921d5ff4aa88c1fd623f09d565
Signed-off-by: Teddy Zheng <teddy@ddn.com>
Signed-off-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/32806
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
5 years ago LU-9538 utils: Tool for syncing file LSOM xattr 24/30124/21
Qian Yingjin [Thu, 16 Nov 2017 01:42:57 +0000 (09:42 +0800)]
LU-9538 utils: Tool for syncing file LSOM xattr

Add a helper tool for syncing file LSOM xattr.
Firstly, register a new changelog user:
lctl --device lustre-MDT0000 changelog_register

After performing some file operations on the Lustre file system, run
this tool to sync the file LSOM xattr:
llsom_sync -u cl1 -m lustre-MDT0000 /mnt/lustre

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ia2878b48f7f665b01b230585921c78ae41846171
Reviewed-on: https://review.whamcloud.com/30124
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-9087 build: add support for DKMS debs 28/25328/8
Michael Kuhn [Tue, 31 Jul 2018 02:12:05 +0000 (22:12 -0400)]
LU-9087 build: add support for DKMS debs

This introduces a new package lustre-client-modules-dkms that uses DKMS
to automatically recompile the client kernel modules on kernel upgrades.
The package is only created if the dkms-debs target is used, otherwise
the traditional kernel-specific package is created.

Test-Parameters: trivial
Change-Id: Ie9aeee29f7fd73938b148299d246c663a783ccd3
Signed-off-by: Michael Kuhn <michael.kuhn@informatik.uni-hamburg.de>
Reviewed-on: https://review.whamcloud.com/25328
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-7372 tests: stop running replay-dual test 26 02/32902/2
James Nunez [Mon, 30 Jul 2018 17:34:36 +0000 (11:34 -0600)]
LU-7372 tests: stop running replay-dual test 26

replay-dual test 26 fails frequently. We need to add
this test to the ALWAYS_EXCEPT list and, thus, stop
running the test until we fix the issue.

Test-Parameters: trivial testlist=replay-dual
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ida58ecc4933dae33d396c258fee64f6d3dbd4978
Reviewed-on: https://review.whamcloud.com/32902
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years ago LU-4684 migrate: pack lmv ea in migrate rpc 24/31424/14
Lai Siyao [Sat, 20 Jan 2018 07:51:32 +0000 (15:51 +0800)]
LU-4684 migrate: pack lmv ea in migrate rpc

To support striped directory migration, pack lmv_user_md in the migrate
RPC. Add the 'mdt-count' and 'mdt-hash' arguments to 'lfs migrate'.

Disable directory migration related tests temporarily; we'll
enable them later in the last patch of this set.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I914a9205a1a558da8c4231e7c83334621b5c92c0
Reviewed-on: https://review.whamcloud.com/31424
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-10181 mdt: read on open for DoM files 11/23011/53
Mikhail Pershin [Thu, 18 Aug 2016 06:26:06 +0000 (09:26 +0300)]
LU-10181 mdt: read on open for DoM files

Read file data upon open and return it in the reply. This works
only for files with a Data-on-MDT layout and no OST components
initialized. Three cases may occur:
1) File data fits in the already allocated reply buffer (~9K)
   and is returned in that buffer in the OPEN reply.
2) File fits in the maximum reply buffer (128K), and the reply is
   returned to the client with a larger size, causing a resend
   with a re-allocated buffer.
3) File doesn't fit in the reply buffer, but its tail partially
   fills a page; then that tail is returned. This can be useful
   for the append case.
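
The three cases above can be sketched as a small decision function.
This is an illustrative Python model only: read_on_open_plan() is a
hypothetical name, the exact initial buffer size ("~9K") and the 4096
byte page size are assumptions, and whether the size comparisons are
inclusive is also assumed.

```python
INITIAL_REPLY_BUF = 9 * 1024   # "~9K" in the text above; exact value assumed
MAX_REPLY_BUF = 128 * 1024     # maximum reply buffer (128K)
PAGE_SIZE = 4096               # assumed page size


def read_on_open_plan(file_size):
    """Return (case, bytes_returned) for a DoM file of the given size."""
    if file_size <= INITIAL_REPLY_BUF:
        # Case 1: data fits in the already allocated reply buffer.
        return ("inline", file_size)
    if file_size <= MAX_REPLY_BUF:
        # Case 2: reply is returned with a larger size, client resends.
        return ("resend", file_size)
    tail = file_size % PAGE_SIZE
    if tail:
        # Case 3: only the partially filled last page is returned.
        return ("tail", tail)
    # File is too large and ends on a page boundary: nothing to return.
    return ("none", 0)
```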

Test-Parameters: mdssizegb=20 testlist=sanity-dom,dom-performance,racer
Change-Id: I5574ce5f74017fc654715e212b71fc3b905bdcae
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Reviewed-on: https://review.whamcloud.com/23011
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11186 ofd: fix for a final oid at sequence 91/32891/5
Alexander Boyko [Fri, 27 Jul 2018 13:10:23 +0000 (09:10 -0400)]
LU-11186 ofd: fix for a final oid at sequence

There was an error at the end of the sequence range: the last oid,
0xffffffff, could not be created. 0xffffffff is a valid oid, and
the sequence update happens only if it is created.

LustreError: 11756:0:(ofd_objects.c:217:ofd_precreate_objects())
lustre-OST0000:0xfffffffe:10737419264 hit the OBIF_MAX_OID (1<<32)!
LustreError: 11756:0:(ofd_dev.c:1764:ofd_create_hdl())
lustre-OST0000: unable to precreate: rc = -28

The patch fixes this error.

conf-sanity test 122 is added to check the sequence update.
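
The boundary condition can be illustrated with a small sketch. This is
a Python model under stated assumptions, not the ofd_precreate_objects()
code: can_precreate() is a hypothetical helper, and only the limit
OBIF_MAX_OID (1<<32) is taken from the log message above.

```python
OBIF_MAX_OID = 1 << 32   # limit quoted in the LustreError message


def can_precreate(last_oid, count=1):
    """True if OIDs last_oid + 1 .. last_oid + count are all valid.

    The highest valid OID is 0xffffffff (OBIF_MAX_OID - 1), so the
    comparison must still allow that value to be created.
    """
    return last_oid + count < OBIF_MAX_OID
```

With last_oid = 0xfffffffe, creating one more object yields 0xffffffff,
which this check accepts; a check that rejected it would leave the final
OID of the sequence unusable.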

Signed-off-by: Alexander Boyko <c17825@cray.com>
Change-Id: I39ad66c05e8358591ca05fadabb2b46bee638070
Cray-bug-id: LUS-6222
Reviewed-on: https://review.whamcloud.com/32891
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11156 scrub: skip project quota inode 29/32829/7
Alexander Boyko [Wed, 18 Jul 2018 14:17:16 +0000 (10:17 -0400)]
LU-11156 scrub: skip project quota inode

An error happened when scrub tried to process the project quota inode.
Scrub thinks that it is an IGIF because it has no LMA FID, so it starts
to create O/inum/{LAST_ID,d0-d31} and fails with not enough credits.
The project quota inode s_prj_quota_inum should be skipped
during scrub iteration.

Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: LUS-6197
Change-Id: I38c347377a1c648ac3dd3e3ff4c4d65ee34cde39
Reviewed-on: https://review.whamcloud.com/32829
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11102 test: test fewer files on ZFS system 33/32933/2
Lai Siyao [Sun, 22 Jul 2018 01:44:02 +0000 (09:44 +0800)]
LU-11102 test: test fewer files on ZFS system

sanity test_415 may be slow on ZFS systems, so test with fewer files.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ie21e9e146508b395c8196adac1f6ba3e6854a1ef
Reviewed-on: https://review.whamcloud.com/32933
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years ago LU-11127 test: sanity-flr OST not recovery fast enough 22/32922/3
Bobi Jam [Thu, 2 Aug 2018 04:06:21 +0000 (12:06 +0800)]
LU-11127 test: sanity-flr OST not recovery fast enough

Use wait_recovery_complete() rather than wait_osc_import_state() to be
more patient with OST recovery.

Test-Parameters: trivial mdtcount=2 mdscount=2 testlist=sanity-flr,sanity-flr,sanity-flr,sanity-flr

Test-Parameters: mdtcount=2 mdscount=2 testlist=sanity-flr,sanity-flr,sanity-flr,sanity-flr

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I2d652d09b0575a720e5ef9701fb7067cbf454079
Reviewed-on: https://review.whamcloud.com/32922
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-9538 utils: fix lfs xattr.h header usage 18/32918/2
Andreas Dilger [Wed, 1 Aug 2018 19:46:35 +0000 (13:46 -0600)]
LU-9538 utils: fix lfs xattr.h header usage

The lfs_getsom() code added the use of lgetxattr() to lfs.c, but
included the <attr/xattr.h> header instead of <sys/xattr.h> as
is used by other code in the tree.  That adds a dependency on
libattr-devel that we don't really need.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8cccccbdc7186d0ed1bfb1c12d911da763a44bf5
Reviewed-on: https://review.whamcloud.com/32918
Tested-by: Jenkins
Reviewed-by: Patrick Farrell <paf@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11159 kernel: kernel update RHEL7.5 [3.10.0-862.9.1.el7] 45/32845/2
Jian Yu [Fri, 20 Jul 2018 07:16:12 +0000 (00:16 -0700)]
LU-11159 kernel: kernel update RHEL7.5 [3.10.0-862.9.1.el7]

Update RHEL7.5 kernel to 3.10.0-862.9.1.el7.

Change-Id: I2bb3462efbbdd8ed17803209b9508176ab04be96
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/32845
Tested-by: Jenkins
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11166 tests: remove use of /proc/fs/jbd2/*/history file 58/32858/2
James Nunez [Mon, 23 Jul 2018 20:19:11 +0000 (14:19 -0600)]
LU-11166 tests: remove use of /proc/fs/jbd2/*/history file

The /proc/fs/jbd2/*/history file was removed several years
ago with a patch from Theodore Ts’o; commit bf6993276f. We
need to remove all uses of /proc/fs/jbd*/*/history from our
tests and utilities.

In particular, obdfilter-survey.sh and iokit-lstat rely on
/proc/fs/jbd2/*/history to collect data and must be modified.

Test-Parameters: trivial testlist=obdfilter-survey
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ib25dd28a496840199de1e84f597748905bda80d2
Reviewed-on: https://review.whamcloud.com/32858
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
5 years ago LU-11160 build: Fix uuid / blkid dependency 42/32842/4
Nathaniel Clark [Thu, 19 Jul 2018 19:26:27 +0000 (15:26 -0400)]
LU-11160 build: Fix uuid / blkid dependency

UUID dependency stems from libblkid, so only link with uuid if blkid
is present.

Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: If1cc293cc48210a065f8910ea655615b11268b5c
Reviewed-on: https://review.whamcloud.com/32842
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-10627 tests: don't use libtool wrapper for applications 35/32835/11
James Simmons [Fri, 27 Jul 2018 23:55:49 +0000 (19:55 -0400)]
LU-10627 tests: don't use libtool wrapper for applications

It is a common practice of Lustre developers to test within the
Lustre tree without actually installing Lustre onto the local
node. In order for this to work, the test suite needs to use
the binary executables instead of the libtool executable wrappers.
Add the libtool LDFLAG that prevents the creation of the wrappers
for the Lustre utils. Additionally, properly set LD_LIBRARY_PATH
to where libtool caches the dynamic libraries.

Change-Id: I9570fcb65b927463076f28c47ecec924602bef4e
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32835
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11153 quota: initialize ver for default quota 27/32827/3
Hongchao Zhang [Wed, 18 Jul 2018 04:02:42 +0000 (00:02 -0400)]
LU-11153 quota: initialize ver for default quota

In qmt_set_with_lqe, the variable "ver" is not initialized
if the lqe using the default quota is being updated to use
a new default quota setting.

Change-Id: I578543fc69009ef85c667092a66947d3c98a6a7d
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/32827
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-10916 lfs: improve lfs mirror resync 08/32808/4
Bobi Jam [Wed, 11 Jul 2018 16:24:27 +0000 (10:24 -0600)]
LU-10916 lfs: improve lfs mirror resync

Make mirror resync use read+write+write+... mode instead of doing the
resync on each stale mirror of a file separately (read+write,
read+write, ...).
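
The read+write+write+... pattern can be sketched as follows. This is a
minimal Python illustration using generic file-like objects, not the
lfs implementation; resync_mirrors() and its chunk_size parameter are
hypothetical names.

```python
def resync_mirrors(source, stale_mirrors, chunk_size=1 << 20):
    """Copy source to every stale mirror, reading each chunk only once.

    Returns the number of read calls that returned data, to show that
    the read cost does not grow with the number of stale mirrors.
    """
    reads = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        reads += 1
        # One read is fanned out to all stale mirrors.
        for mirror in stale_mirrors:
            mirror.write(chunk)
    return reads
```

Resyncing each mirror separately would instead re-read the whole source
once per stale mirror.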

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I627fa53fcfde4811b2cd9c84c8545defe151206c
Reviewed-on: https://review.whamcloud.com/32808
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-10947 build: test if xfsprogs are installed 39/32539/4
James Simmons [Fri, 27 Jul 2018 23:59:32 +0000 (19:59 -0400)]
LU-10947 build: test if xfsprogs are installed

We need xfsprogs because conf-sanity test_116 uses mkfs.xfs. There is
no need to install xfsprogs for a single test, so just skip the
test if mkfs.xfs is not available. Set $tmpmnt to $TMP/$tdir
since /mnt is read-only in my diskless setup, while $TMP is
not.
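
The "skip if the tool is missing" check can be sketched as follows. The
real test is a shell function in conf-sanity; this Python version is
only an illustration, and should_skip() with its injectable lookup
parameter is a hypothetical helper.

```python
import shutil


def should_skip(required_tool, which=shutil.which):
    """Return True when the tool a test depends on is not in PATH.

    The lookup function is injectable so the check can be exercised
    without depending on what is installed on the local node.
    """
    return which(required_tool) is None
```

A caller would evaluate should_skip("mkfs.xfs") before running the test
body and report a skip instead of a failure.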

Test-Parameters: trivial testlist=conf-sanity mdsdistro=sles12sp3 ossdistro=sles12sp3

Change-Id: I1db88afe7e382e1032ed7e2844a1dec1c032530e
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32539
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
5 years ago LU-8066 llite: move /proc/fs/lustre/llite/fstype to sysfs 96/32896/2
James Simmons [Sun, 29 Jul 2018 14:28:33 +0000 (10:28 -0400)]
LU-8066 llite: move /proc/fs/lustre/llite/fstype to sysfs

Move fstype file from /proc/fs/lustre/llite/*
to /sys/fs/lustre/llite/*/

This is a modified version of

Linux-commit: 0cee667682b55d7c389d77877adbd63360415baa

due to the large amount of changes to the OpenSFS/Intel branch.

Change-Id: Ic06988c1f9ccfa6a32f99f5ea8ddcf4820a62a8e
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32896
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-8066 llite: move /proc/fs/lustre/llite/client_type to sysfs 99/32499/5
James Simmons [Sun, 29 Jul 2018 02:13:43 +0000 (22:13 -0400)]
LU-8066 llite: move /proc/fs/lustre/llite/client_type to sysfs

Move client_type file from /proc/fs/lustre/llite/*
to /sys/fs/lustre/llite/*/

This is a modified version of

Linux-commit: 95e1b6b0cff09292158ecc0701f721315167b64e

due to the large amount of changes to the OpenSFS/Intel branch.

Change-Id: I056b0b3693b0c747d5a45fb6485cb5c4975acb1b
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32499
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-8066 llite: move /proc/fs/lustre/llite/files* to sysfs 98/32498/6
James Simmons [Sat, 28 Jul 2018 16:02:35 +0000 (12:02 -0400)]
LU-8066 llite: move /proc/fs/lustre/llite/files* to sysfs

Move filestotal and filesfree files from /proc/fs/lustre/llite/*
to /sys/fs/lustre/llite/*/

This is a modified version of

Linux-commit: 7267ec0d8726c214aaf24ca9e8baebb443b0da75

due to the large amount of changes to the OpenSFS/Intel branch.

Change-Id: I84b6d6a0868058d60a83a0700f0389d3ba685ddb
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32498
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-8066 llite: move /proc/fs/lustre/llite/kbytes* to sysfs 97/32497/7
James Simmons [Fri, 27 Jul 2018 22:13:42 +0000 (18:13 -0400)]
LU-8066 llite: move /proc/fs/lustre/llite/kbytes* to sysfs

Move kbytestotal, kbytesavail and kbytesfree files from
/proc/fs/lustre/llite/* to /sys/fs/lustre/llite/*/

This is a modified version of

Linux-commit: 5804b11e1487558c6740282a01a08bb4ba0c6d06

due to the large amount of changes to the OpenSFS/Intel branch.

Change-Id: Ifb43c01bb0055051cecb01ed6a183d1797d3870e
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32497
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-8066 llite: move /proc/fs/lustre/llite/blocksize to sysfs 96/32496/9
James Simmons [Thu, 19 Jul 2018 15:57:30 +0000 (11:57 -0400)]
LU-8066 llite: move /proc/fs/lustre/llite/blocksize to sysfs

Move blocksize file from /proc/fs/lustre/llite/*/ to
/sys/fs/lustre/llite/*/blocksize

This is a heavily modified version of

Linux-commit: 364bcfc8634d5625dbb41683b061bddf307a70e8

due to the large amount of changes to the OpenSFS/Intel branch.

Change-Id: I0b54890e6c5d5f172c2cc3d081c38ea2307b0f88
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32496
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
5 years ago LU-9474 tests: replace import_file with copytool import 98/30098/21
Quentin Bouget [Wed, 15 Nov 2017 10:05:15 +0000 (10:05 +0000)]
LU-9474 tests: replace import_file with copytool import

In sanity-hsm, replace every call to import_file() using the newer
copytool() interface (copytool import).

The appropriate modifications to any function that internally uses
the variable HSM_ARCHIVE are made.

From now on, tests in sanity-hsm that need to launch a copytool,
import a file, or rebind archived data should do so using:
 - copytool setup
 - copytool import
 - copytool rebind

With this patch, sanity-hsm also completes the transition from
trap() to stack_trap().

Test-Parameters: trivial clientcount=3 mdscount=2 testlist=sanity-hsm,sanity-hsm
Change-Id: I911964a4bafd4d879e08f506cfe33e3db29cff42
Signed-off-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-on: https://review.whamcloud.com/30098
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-6142 obdclass: Fix style issues for obdo_server.c 97/32897/2
Arshad Hussain [Sun, 22 Jul 2018 16:13:25 +0000 (21:43 +0530)]
LU-6142 obdclass: Fix style issues for obdo_server.c

This patch fixes issues reported by checkpatch
for file lustre/obdclass/obdo_server.c

Change-Id: If2c46841c39258937a0f64ef9e6d589c6ea41809
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/32897
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
5 years ago LU-6142 ko2iblnd: remove typedefs from ko2iblnd 02/32802/3
James Simmons [Thu, 19 Jul 2018 19:32:35 +0000 (15:32 -0400)]
LU-6142 ko2iblnd: remove typedefs from ko2iblnd

Change the typedefs in the ko2iblnd LND to proper structures.
Also make several other style changes to fix checkpatch issues in
code impacted by the typedef change.

Test-Parameters: trivial

Change-Id: I55e9c91e392dee804802153bd609afc858a3591b
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32802
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-10181 mds: init cpt params for mdt IO service 50/31750/12
Mikhail Pershin [Fri, 23 Mar 2018 14:18:07 +0000 (17:18 +0300)]
LU-10181 mds: init cpt params for mdt IO service

Initialize CPT values for MDS IO service similar to
OST's values.

Test-Parameters: mdtcount=2 mdscount=2 mdssizegb=20 testlist=dom-performance
Signed-off-by: Mike Pershin <mpershin@whamcloud.com>
Change-Id: I96b5f78c7212d31d43ea1b7abd75000fb19beee9
Reviewed-on: https://review.whamcloud.com/31750
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11148 ldlm: enable trybits for PDO lock 20/32820/4
Lai Siyao [Sat, 23 Jun 2018 09:06:15 +0000 (17:06 +0800)]
LU-11148 ldlm: enable trybits for PDO lock

When trybits was added (in LU-9148), it was not enabled for the
PDO lock in mdt_object_local_lock(), which may cause a deadlock in
try_lock.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: Icfca639cfbd84e1a3bc25d91de0460d2951c2c2b
Reviewed-on: https://review.whamcloud.com/32820
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-10288 lfsck: layout LFSCK for mirrored file 05/32705/5
Fan Yong [Sat, 14 Jul 2018 21:15:21 +0000 (05:15 +0800)]
LU-10288 lfsck: layout LFSCK for mirrored file

This patch makes the layout LFSCK support mirrored files
as follows:

1. Verify the mirrored file's LOV EA and PFID EA, including all
   kinds of inconsistencies that a non-mirrored file may hit.

2. Rebuild the mirrored file's LOV EA from orphan OST-objects, and
   recover the component's status/flags from before the crash:
   init, stale, and so on.

3. For a mirrored file with a dangling reference (OST object),
   it does NOT rebuild the lost OST-object from another replica.
   Instead, it either reports the corruption or re-creates an empty
   OST-object, following the same rules as the non-mirrored case.

Also includes some code cleanup and new test cases for LFSCK against
mirrored files.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I560746fc2aae40101dcb0e8513b6c7ed54902ec6
Reviewed-on: https://review.whamcloud.com/32705
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11104 mdt: rename may cause deadlock 01/32701/9
Lai Siyao [Tue, 12 Jun 2018 04:11:36 +0000 (12:11 +0800)]
LU-11104 mdt: rename may cause deadlock

In rename locking, there are two situations in which we need to lock
the target parent before the source parent:
1. the source parent is a subdir of the target parent.
2. the source and target parents are both stripes of the same directory,
   and the stripe index of the source parent is after that of the target
   parent.

But the check for the second situation is missing, which may cause a
deadlock if another thread is taking stripe locks of their parent.

Clean up mdd_is_subdir().

Add sanityn.sh test_81b.
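
The two ordering rules described above can be sketched as a predicate.
This is a simplified Python model, not the MDT code: the Parent class,
its fields, and lock_target_first() are hypothetical, and the subdir
check is modelled on path components rather than on FIDs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Parent:
    path: tuple                          # path components, e.g. ("a", "b")
    striped_dir: Optional[str] = None    # id of the striped dir, if a stripe
    stripe_index: int = 0


def lock_target_first(src: Parent, tgt: Parent) -> bool:
    """True if the target parent must be locked before the source parent."""
    # Situation 1: source parent is a subdirectory of target parent.
    is_subdir = (len(src.path) > len(tgt.path)
                 and src.path[:len(tgt.path)] == tgt.path)
    # Situation 2: both are stripes of the same directory and the source
    # stripe index comes after the target stripe index.
    same_dir = (src.striped_dir is not None
                and src.striped_dir == tgt.striped_dir)
    return is_subdir or (same_dir and src.stripe_index > tgt.stripe_index)
```

Applying one consistent order across all threads is what prevents two
renames from each holding the lock the other one needs.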

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: Ib96fb7b286e7dfdea868ef2fa4919f8d3f1567f9
Reviewed-on: https://review.whamcloud.com/32701
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11102 ldlm: run local lock bl_ast only when necessary 38/32738/9
Lai Siyao [Wed, 20 Jun 2018 18:30:18 +0000 (02:30 +0800)]
LU-11102 ldlm: run local lock bl_ast only when necessary

An LDLM local lock will be canceled after use, and it should only
run bl_ast if it needs to trigger Commit-on-Sharing. Otherwise,
if this bl_ast does nothing, it will prevent subsequent
operations from running bl_ast again, and therefore Commit-on-Sharing
can't be triggered.

For example, consider a concurrent setattr on a striped directory and a
rename under this directory:
1. setattr takes the UPDATE lock of the directory, but does not unlock it
yet (i.e., this lock is not downgraded to a COS lock).
2. A concurrent 'mv' under this directory will first getattr the file by
name. This getattr will take the UPDATE lock of this directory, which
races with setattr, but this getattr is not a distributed operation,
and the lock still has a writer (setattr), so bl_ast does nothing.
3. setattr unlocks this UPDATE lock.
4. rename tries to take the UPDATE lock of this directory, but bl_ast
was already run on this lock (though it did nothing), so it won't run
again, and rename will wait until the setattr transaction commits.

To fix this, run local lock bl_ast only when it will trigger
Commit-on-Sharing.

Add sanity.sh test_415 to verify this.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: Idae241076e7cae8fe06ae6a34481fe19c7dfd2f3
Reviewed-on: https://review.whamcloud.com/32738
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11103 lod: add lock for lod_object layout 89/32589/8
Lai Siyao [Thu, 7 Jun 2018 11:53:14 +0000 (19:53 +0800)]
LU-11103 lod: add lock for lod_object layout

The lod_object layout is loaded on demand, and it may be updated
by layout split/merge. To avoid races, add ldo_layout_mutex to
serialize layout load/free/reload.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: I43c15a3b07254eadef95a14b288267904a1cd621
Reviewed-on: https://review.whamcloud.com/32589
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11141 tests: put sanity-quota 61 on slow list 03/32903/3
James Nunez [Wed, 25 Jul 2018 18:41:44 +0000 (12:41 -0600)]
LU-11141 tests: put sanity-quota 61 on slow list

Since the patch for LU-11141, commit 6316b42a73f8,
landed, sanity-quota test 61 takes between 20 and 50
minutes to run. Test 61 needs to be added to the slow
list and, thus, will not be run unless the SLOW
variable is true.

Test-Parameters: trivial testlist=sanity-quota
Test-Parameters: envdefinitions="SLOW=yes" testlist=sanity-quota

Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I2b6c21996ef2db9472da8838d3f41fed60ba5102
Reviewed-on: https://review.whamcloud.com/32903
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
5 years agoLU-11071 build: Add server build support for Ubuntu 18.04 13/32613/10
Li Dongyang [Thu, 19 Jul 2018 16:24:36 +0000 (12:24 -0400)]
LU-11071 build: Add server build support for Ubuntu 18.04

This enables server builds for Ubuntu 18.04 LTS. The ldiskfs
patches are based on Gael's 4.12 support and
apply to kernel versions 4.15.0-20.21 to 4.15.0-23.25.

There's also a small fix to make dpkg happy when installing
lustre packages which require lustre-client-utils.

Test-Parameters: clientdistro=ubuntu1604 trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Signed-off-by: Gael Delbary <gael.delbary@cea.fr>
Change-Id: I65e1a5ee0d17115f23ba071ff1ab23b4fb22e78f
Reviewed-on: https://review.whamcloud.com/32613
Tested-by: Jenkins
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-9538 mdt: Lazy size on MDT 60/29960/43
Qian Yingjin [Tue, 7 Nov 2017 08:27:07 +0000 (16:27 +0800)]
LU-9538 mdt: Lazy size on MDT

The design of Lazy Size on MDT (LSOM) does not guarantee
accuracy. A file that is held open for a long time might
have an inaccurate LSOM for just as long. Eviction or a
crash of a client might also leave the close of a file incomplete,
and thus leave the LSOM inaccurate. A precise LSOM can only be read
from the MDT when 1) all corruption and inconsistency caused
by client eviction or client/server crash have been fixed by
LFSCK and 2) the file is not open for write.
In the first step of implementing LSOM, LSOM will not be accessible
from the client. Instead, LSOM values can only be accessed on the MDT.
Thus, no interface or logic will be added on the client side to enable
access to LSOM from the client.
The LSOM will be saved as an EA value on the MDT.
LSOM includes both the apparent size and the disk usage of
the file.
Whenever a file is truncated, the LSOM of the file on the MDT
will be updated.
Whenever a client closes a file, ll_prepare_close() will send
the size and blocks to the MDS. The MDS will update the LSOM of
the file if the file size or the blocks value has increased.
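
The update rules in the last two paragraphs can be sketched as follows.
This is an illustrative model only: the Lsom record and the on_close()
and on_truncate() helpers are hypothetical names, and treating size and
blocks independently on close is an assumption, since the message does
not say whether they are updated together.

```python
from dataclasses import dataclass


@dataclass
class Lsom:
    """Toy LSOM record: apparent size and block count (hypothetical layout)."""
    size: int = 0
    blocks: int = 0


def on_close(lsom, size, blocks):
    """On close, only ever grow the stored values (assumed per-field)."""
    lsom.size = max(lsom.size, size)
    lsom.blocks = max(lsom.blocks, blocks)


def on_truncate(lsom, size, blocks):
    """On truncate, always overwrite, since the file may have shrunk."""
    lsom.size = size
    lsom.blocks = blocks
```

A close reporting smaller values than the stored ones leaves the record
untouched in this model, which is one way a stale client view cannot
shrink the LSOM.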

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: If4032a55f448a65235a6b3db58f857c74222faa3
Reviewed-on: https://review.whamcloud.com/29960
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10928 tests: sanity/133b should wait a bit 69/32069/3
Alex Zhuravlev [Thu, 19 Apr 2018 10:40:57 +0000 (13:40 +0300)]
LU-10928 tests: sanity/133b should wait a bit

to invalidate cache in obd_statfs()

Test-Parameters: trivial

Change-Id: I08283542962e4b88ca4b5dcde4dfcc58316c1bba
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/32069
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years agoLU-11149 build: enable KMP for Mellanox build 33/32833/9
Minh Diep [Thu, 19 Jul 2018 14:27:55 +0000 (07:27 -0700)]
LU-11149 build: enable KMP for Mellanox build

* We need to build Mellanox KMP to avoid error
in symbol dependency when installing lustre
* Remove all Mellanox config parameters and use
default

Test-Parameters: trivial

Change-Id: I4676d01bd5f788581e1be6df98d2d787a5419c07
Signed-off-by: Minh Diep <minh.diep@intel.com>
Reviewed-on: https://review.whamcloud.com/32833
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11083 tests: automatically load external modules 90/32790/4
John L. Hammond [Thu, 5 Jul 2018 16:02:25 +0000 (11:02 -0500)]
LU-11083 tests: automatically load external modules

In the test-framework function load_module(), try to load (using
modprobe) any not yet loaded modules (which are assumed to be
external) that the current module depends on.

Test-Parameters: trivial

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Id1d10519b00854600d095b861670e96f906298fc
Reviewed-on: https://review.whamcloud.com/32790
Tested-by: Jenkins
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11010 tests: remove calls to return after skip() 31/32731/3
James Nunez [Tue, 26 Jun 2018 16:46:14 +0000 (10:46 -0600)]
LU-11010 tests: remove calls to return after skip()

The skip() routine now contains a call to exit. All calls
to skip() and skip_env() should be reviewed and calls to
return that followed skip() should be removed.

This is the second patch in a series that removes calls
to return after skip() in the Lustre test suites.

Calls to return after skip() are removed for:
dne_sanity
insanity
obdfilter-survey
sgpdd-survey

Test-Parameters: trivial testlist=dne-sanity,insanity,obdfilter-survey,sgpdd-survey
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I4b9aeaeddd673dcba371b8340dd635ddeed2b6be
Reviewed-on: https://review.whamcloud.com/32731
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11068 build: remove invalid kernel srpm location 06/32606/2
Minh Diep [Fri, 1 Jun 2018 17:56:56 +0000 (10:56 -0700)]
LU-11068 build: remove invalid kernel srpm location

The location has never existed.

Change-Id: I8958bbdb5c61284c55d6cc337ac92832f91ee08b
Signed-off-by: Minh Diep <minh.diep@intel.com>
Reviewed-on: https://review.whamcloud.com/32606
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10254 tests: fix racer version checks 07/30307/9
Andreas Dilger [Fri, 13 Jul 2018 20:44:39 +0000 (14:44 -0600)]
LU-10254 tests: fix racer version checks

Fix the checks for enabling DOM, PFL, and FLR tests in file_create.sh.
The $LCTL variable was unset in the test script, so the version check
was failing.

Instead of doing the version check inside file_create.sh do it in the
Lustre-specific racer.sh test script, where other version checks live.
This enables PFL and FLR testing by default, but leaves DOM tests off.

Author: Andreas Dilger <adilger@whamcloud.com>

Test-Parameters: trivial testlist=racer envdefinitions=SLOW=yes
Test-Parameters: testlist=racer mdtfilesystemtype=zfs ostfilesystemtype=zfs
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I2aeab0911f19f9741212925cf9b4aeb70e3ebbe5
Reviewed-on: https://review.whamcloud.com/30307
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11115 lod: skip max_create_count=0 OST in QoS and RR algorithms 23/32823/2
Jian Yu [Tue, 17 Jul 2018 00:09:15 +0000 (17:09 -0700)]
LU-11115 lod: skip max_create_count=0 OST in QoS and RR algorithms

When choosing an OST on which to create an object, both lod_alloc_qos()
and lod_alloc_rr() use the lod_statfs_and_check() function
to check whether the OST is available for new OST objects.
However, an OST with max_create_count=0 is not checked in that
function and is simply returned as an available OST.

This patch fixes the above issue by detecting an OST with
max_create_count=0 in lod_statfs_and_check() and skipping it.
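
As a rough illustration of the check described above (a sketch, not the
lod code), the availability test shared by both allocators could look
like this. The Ost class, its field names, and usable_for_create() are
hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Ost:
    """Toy OST descriptor; field names are hypothetical."""
    index: int
    active: bool
    max_create_count: int


def usable_for_create(ost):
    """Availability check shared by both allocators in this model."""
    if not ost.active:
        return False
    # The case added by the patch: object creation is disabled on this OST.
    if ost.max_create_count == 0:
        return False
    return True


def candidates(osts):
    """Return indices of OSTs that an allocator may pick from."""
    return [o.index for o in osts if usable_for_create(o)]
```

Putting the test in the shared helper means neither the QoS nor the
round-robin path needs its own special case.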

Change-Id: I04476a4b369e99133bd89c00155fd9f51bf0c930
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/32823
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11174 quota: use sync io to test quota 74/32874/3
Hongchao Zhang [Fri, 20 Jul 2018 19:45:56 +0000 (15:45 -0400)]
LU-11174 quota: use sync io to test quota

In test_61 of sanity-quota, the client cache (grant) could affect
the quota behavior. Use sync IO to avoid this effect.

Test-Parameters: trivial testlist=sanity-quota,sanity-quota,sanity-quota,sanity-quota,sanity-quota,sanity-quota,sanity-quota,sanity-quota,sanity-quota,sanity-quota,sanity-quota,sanity-quota

Change-Id: I08bc19c5e7ac4f9cb679f96a2299c0be772f0330
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/32874
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years agoLU-9007 lod: improve obj alloc for FLR file 04/32404/10
Bobi Jam [Mon, 14 May 2018 11:10:24 +0000 (19:10 +0800)]
LU-9007 lod: improve obj alloc for FLR file

* add lod_layout_component::llc_ost_indices to track the map
  of dt_object to its OST index.
* add lod_device::lod_avoid to collect information on objects on other
  mirrors that overlap the target component
* lod_should_avoid_ost() uses the avoid guidance information to avoid
  allocating objects on the same OST for different mirrors.
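
The avoidance idea in the list above can be sketched as a simple
selection function. This is a toy model under stated assumptions:
pick_osts() and its arguments are hypothetical, and falling back to
avoided OSTs when too few others remain is an assumption that the
message does not confirm.

```python
def pick_osts(available, avoid, count):
    """Pick `count` OST indices, preferring ones not used by other mirrors.

    `available` is an ordered list of OST indices; `avoid` is the set of
    indices already holding overlapping components of other mirrors.
    """
    preferred = [i for i in available if i not in avoid]
    fallback = [i for i in available if i in avoid]
    # Assumed policy: use avoided OSTs only when there are not enough others.
    return (preferred + fallback)[:count]
```

For example, with OSTs 0 and 1 already used by another mirror, the
function picks from the remaining OSTs first.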

Change-Id: Ib7e155e4b02c2e25d3955aa9a4acff7569ab7d8f
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/32404
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11141 quota: reset adjust schedule when updating lqe 19/32819/5
Hongchao Zhang [Sat, 7 Jul 2018 17:55:21 +0000 (13:55 -0400)]
LU-11141 quota: reset adjust schedule when updating lqe

The scheduled adjust for some lquota_entry should be reset when its
limits (hard or soft) are updated by the glimpse callback from QMT.

Test-Parameters: mdtfilesystemtype=zfs ostfilesystemtype=zfs \
testlist=sanity-quota,sanity-quota,sanity-quota,sanity-quota,sanity-quota,sanity-quota,sanity-quota,sanity-quota,sanity-quota,sanity-quota,sanity-quota

Change-Id: Ia16cd90adfa15b92577841259f91f2b275fc7e82
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/32819
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11147 llite: add newline to llite.*.offset_stats 17/32817/3
Andreas Dilger [Sat, 14 Jul 2018 09:29:48 +0000 (03:29 -0600)]
LU-11147 llite: add newline to llite.*.offset_stats

The llite.*.offset_stats file is missing a newline in the output.

Fixes: 49577875

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ieade87f500c4fa24a0a7b8bd35d65f18dd5681ba
Reviewed-on: https://review.whamcloud.com/32817
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11138 lfs: getstripe display certain mirror(s) 04/32804/5
Bobi Jam [Tue, 10 Jul 2018 18:07:01 +0000 (12:07 -0600)]
LU-11138 lfs: getstripe display certain mirror(s)

Add [!] --mirror-index=[+-]<index> | [!] --mirror-id=[+-]<id>
option for lfs getstripe to print the components of mirror(s)
relative to <index>-th mirror or components of mirror(s) relative
to the one with mirror ID of <id>.

Change-Id: I9ab8fd5faaea07b7567f88665e06ca71157cca67
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/32804
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11129 kernel: kernel update RHEL7.5 [3.10.0-862.6.3.el7] 94/32794/5
Yang Sheng [Sun, 8 Jul 2018 15:49:24 +0000 (11:49 -0400)]
LU-11129 kernel: kernel update RHEL7.5 [3.10.0-862.6.3.el7]

Update RHEL7.5 kernel to 3.10.0-862.6.3.el7

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I59b362135b5c235ac76848afb2d48014b7a4e928
Reviewed-on: https://review.whamcloud.com/32794
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Joseph Gmitter <jgmitter@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11107 mdt: handle nonexistent xattrs correctly 53/32753/3
John L. Hammond [Mon, 2 Jul 2018 15:07:51 +0000 (10:07 -0500)]
LU-11107 mdt: handle nonexistent xattrs correctly

In mdt_getxattr_pack_reply() propagate -ENODATA returns from
mo_xattr_list() to mdt_getxattr(). Add sanity test_102s() to ensure
that getting a nonexistent xattr will fail.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ic7a01feb3fcac66d39f84b4ebdfc86025c3e2779
Reviewed-on: https://review.whamcloud.com/32753
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
5 years agoLU-11108 mdt: propagate errors in mdt_getxattr() 43/32743/2
John L. Hammond [Fri, 29 Jun 2018 21:11:05 +0000 (16:11 -0500)]
LU-11108 mdt: propagate errors in mdt_getxattr()

In mdt_getxattr(), if mo_xattr_get() fails then return that error
value rather than letting mdt_nodemap_map_acl() mangle it.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I967bcc5ad6edf30b43f373e85f22fc922647c435
Reviewed-on: https://review.whamcloud.com/32743
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11094 osd-ldiskfs: Fix style issues for osd_quota.c 24/32724/6
Arshad Hussain [Sun, 24 Jun 2018 04:30:57 +0000 (10:00 +0530)]
LU-11094 osd-ldiskfs: Fix style issues for osd_quota.c

This patch fixes issues reported by checkpatch
for file lustre/osd-ldiskfs/osd_quota.c

Change-Id: I1a01c3e6327ec56a1ffcf85c5d06934a5f8e8c54
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/32724
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
5 years agoLU-11087 osd-ldiskfs: Fix style issues for osd_compat.c 09/32709/5
Arshad Hussain [Tue, 12 Jun 2018 15:51:11 +0000 (21:21 +0530)]
LU-11087 osd-ldiskfs: Fix style issues for osd_compat.c

This patch fixes issues reported by checkpatch for
file lustre/osd-ldiskfs/osd_compat.c

Test-Parameters: trivial
Change-Id: Ifa5ea5563fc7e5b5e94ea992e602979dea20eb9f
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/32709
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years agoLU-11032 hsm: memory leak in mdt_hsm_cdt_cleanup 56/32456/3
Qian Yingjin [Fri, 18 May 2018 08:55:32 +0000 (16:55 +0800)]
LU-11032 hsm: memory leak in mdt_hsm_cdt_cleanup

Release the memory allocated for the archive ID in mdt_hsm_cdt_cleanup
when freeing the hsm_agent data structure, to avoid a memory leak.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I40e5fd289419d7c18d5f2c3ebe0d3955229f5517
Reviewed-on: https://review.whamcloud.com/32456
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11022 lfs: accept specifying comp_id in mirror split 55/32455/2
Bobi Jam [Fri, 18 May 2018 05:15:11 +0000 (13:15 +0800)]
LU-11022 lfs: accept specifying comp_id in mirror split

This patch enables "lfs mirror split" to accept --component-id
specifying a mirror containing the designated component in mirror
splitting.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I02bf4d75013341d99d95852cb7fb0fbbb41c7a4d
Reviewed-on: https://review.whamcloud.com/32455
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11027 doc: Add lockahead to llapi_ladvise man 37/32437/4
Patrick Farrell [Thu, 17 May 2018 10:31:47 +0000 (05:31 -0500)]
LU-11027 doc: Add lockahead to llapi_ladvise man

Document lockahead in the llapi_ladvise man page.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: Ia709611bb2751a408e3525c538daa824b365b09c
Reviewed-on: https://review.whamcloud.com/32437
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10970 tests: make sure write is complete 03/32203/5
Patrick Farrell [Mon, 30 Apr 2018 12:10:38 +0000 (07:10 -0500)]
LU-10970 tests: make sure write is complete

The current test does not guarantee the write has arrived
on the server before dropping caches and checking memory
usage.  If the write is still in progress, the baseline
memory used value will be incorrect.

Sync on the client to force the write out.

Test-Parameters: trivial

Cray-bug-id: LUS-5923
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: Ic0379ffdfd14ff630d65a0197a99fba929868e9c
Reviewed-on: https://review.whamcloud.com/32203
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
5 years agoLU-11120 test: add compilebench and DNE tests 49/31749/12
Mikhail Pershin [Fri, 23 Mar 2018 09:35:10 +0000 (12:35 +0300)]
LU-11120 test: add compilebench and DNE tests

Add more tests in dom-performance.sh
- add compilebench run
- add default DOM+DNE run

Test-Parameters: trivial mdtcount=2 mdscount=2 mdssizegb=20 testlist=dom-performance
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: Id93c17157dba4887d250cd933d7a1fae5906af1b
Reviewed-on: https://review.whamcloud.com/31749
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-8066 osp: migrate from proc to sysfs 77/32377/10
James Simmons [Wed, 11 Jul 2018 17:27:11 +0000 (13:27 -0400)]
LU-8066 osp: migrate from proc to sysfs

Move the osp module from using proc for most single value files
to sysfs. Create the default attrs for dt_devices which can be
used for other server side devices.

Change-Id: I51fef51287585b38a1aff80d8edf986583c54a14
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32377
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10683 osd_zfs: set offset in page correctly 88/32788/2
Hongchao Zhang [Thu, 5 Jul 2018 11:44:38 +0000 (07:44 -0400)]
LU-10683 osd_zfs: set offset in page correctly

In osd_bufs_get_write, the offset in the first page should
be calculated on the offset parameter instead of zero.
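
The arithmetic behind this fix can be shown with a short sketch. It is
not the osd-zfs code: split_into_pages() is a hypothetical helper, and
the 4096-byte page size is assumed for the example.

```python
PAGE_SIZE = 4096  # assumed page size for the example


def split_into_pages(offset, length):
    """Split a byte range into (page_index, offset_in_page, nbytes) chunks."""
    chunks = []
    while length > 0:
        in_page = offset % PAGE_SIZE  # non-zero only for an unaligned start
        nbytes = min(PAGE_SIZE - in_page, length)
        chunks.append((offset // PAGE_SIZE, in_page, nbytes))
        offset += nbytes
        length -= nbytes
    return chunks
```

Only the first chunk can have a non-zero offset within its page. Using
zero there would place an unaligned write at the start of the page
instead of at the requested position.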

Change-Id: I6592d8b5b0162b92953d59e2662a4381ba3e89ba
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/32788
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-5638 tests: resume running sanity-quota tests 94/32694/2
James Nunez [Mon, 11 Jun 2018 15:58:31 +0000 (09:58 -0600)]
LU-5638 tests: resume running sanity-quota tests

sanity-quota tests 11 and 33 were not run due to the
issues documented in LU-5638. A patch, commit ID
a046e879fcadd601c9a19fd906f82ecbd2d4efd5, landed to fix
this issue. We should resume running sanity-quota
tests 11 and 33 for ZFS servers.

Test-Parameters: trivial clientcount=2 mdscount=2 mdtcount=4 osscount=1 ostcount=8 mdtfilesystemtype=zfs ostfilesystemtype=zfs testlist=sanity-quota
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Iadb1356a0a6b4f5a8b5f54275db794f0ddbb5af6
Reviewed-on: https://review.whamcloud.com/32694
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-8708 osc: depart grant shrinking from pinger 02/23202/12
Bobi Jam [Mon, 17 Oct 2016 06:36:31 +0000 (14:36 +0800)]
LU-8708 osc: depart grant shrinking from pinger

* Move the grant shrinking code out of pinger and use a workqueue
  to handle the grant shrinking timer.
* Enable OSC grant shrinking by default.

bugzilla: 19507

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ifb03c907ad285a307d37d707193cfc32998ba2b2
Reviewed-on: https://review.whamcloud.com/23202
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
5 years agoNew tag 2.11.53 2.11.53 v2_11_53 v2_11_53_0
Oleg Drokin [Tue, 24 Jul 2018 03:58:13 +0000 (23:58 -0400)]
New tag 2.11.53

Change-Id: I02c52e58bd01f54d55a9083a2d1a12f6e811eaf1

5 years agoLU-11132 compile: fix LC_BI_BDEV for old kernels 99/32799/2
Vladimir Saveliev [Thu, 12 Jul 2018 19:45:11 +0000 (22:45 +0300)]
LU-11132 compile: fix LC_BI_BDEV for old kernels

struct bio is located in linux/bio.h in the 2.6 kernel series. LC_BI_BDEV
uses linux/blk_types.h. That makes the configuration check fail for
those kernels and breaks compilation.

Use linux/bio.h in LC_BI_BDEV so that it works for both new and old
kernels.

Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Change-Id: Iaeefea9ba96ebe4dad30acedb5fa7551c4516241
Reviewed-on: https://review.whamcloud.com/32799
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
5 years agoLU-11161 tests: stop running sanity test 160g 44/32844/2
James Nunez [Thu, 19 Jul 2018 22:36:27 +0000 (16:36 -0600)]
LU-11161 tests: stop running sanity test 160g

When run with two or more MDSs, sanity test 160g will fail
due to expecting a changelog user being deregistered on
all MDSs.

In order to stop sanity 160g from failing, add it to the
ALWAYS_EXCEPT list when running in a DNE environment which
results in the test not being executed.

Test-Parameters: trivial
Test-Parameters: testlist=sanity mdtcount=2 mdscount=2
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I091f148a3da820cad0103aead559a96c54c9fe8b
Reviewed-on: https://review.whamcloud.com/32844
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11157 obd: keep dirty_max_pages a round number of MB 31/32831/4
John L. Hammond [Wed, 18 Jul 2018 20:47:25 +0000 (15:47 -0500)]
LU-11157 obd: keep dirty_max_pages a round number of MB

In client_adjust_max_dirty() ensure that the dirty pages limit is
always divisible by 256 so that it may faithfully be represented in MB
as is the case when the max_dirty_mb parameters are used.
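
Why 256: with 4 KiB pages, 256 pages make exactly 1 MiB. The sketch
below illustrates the rounding. It is not client_adjust_max_dirty()
itself: the helper names are hypothetical, and both the 4 KiB page size
and the choice to round down with a 1 MiB floor are assumptions.

```python
PAGE_SIZE = 4096                       # assumed; 256 such pages make 1 MiB
PAGES_PER_MB = (1 << 20) // PAGE_SIZE  # 256


def round_dirty_max_pages(pages):
    """Round a page limit down to a whole number of MiB, with a 1 MiB floor."""
    rounded = (pages // PAGES_PER_MB) * PAGES_PER_MB
    return max(rounded, PAGES_PER_MB)


def pages_to_mb(pages):
    """Convert a page count to MiB; exact only for multiples of 256."""
    return pages // PAGES_PER_MB
```

Once the limit is a multiple of 256, converting it to MB and back gives
the same page count, so the value shown in MB matches the limit in use.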

Test-Parameters: trivial

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I8e2fbdd4bf253a46e2951e7840484ab6a617fbe2
Reviewed-on: https://review.whamcloud.com/32831
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
5 years agoLU-11074 mdc: set correct body eadatasize for getxattr() 39/32739/3
John L. Hammond [Fri, 29 Jun 2018 21:11:45 +0000 (16:11 -0500)]
LU-11074 mdc: set correct body eadatasize for getxattr()

In mdc_intent_getxattr_pack() set mbo_eadatasize to the size of the
xattr values buffer rather than the size of the xattr names buffer.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ibbed6aba6718f50eed1a08d506d526b1e0e042c8
Reviewed-on: https://review.whamcloud.com/32739
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11097 utils: add libuuid for llverdev 26/32726/2
Alex Zhuravlev [Sun, 24 Jun 2018 19:00:21 +0000 (22:00 +0300)]
LU-11097 utils: add libuuid for llverdev

this is explicitly required on my setup

Change-Id: I2b518c922d1857411bac74f68223259bb255e0e4
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/32726
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
5 years agoLU-11131 target: keep reply data bit set on failover 98/32798/2
Vladimir Saveliev [Thu, 12 Jul 2018 20:27:24 +0000 (23:27 +0300)]
LU-11131 target: keep reply data bit set on failover

The following scenario leads to failure of a recent reint rpc:

1. mdt server has a number of rpcs being handled, rpc 1 from client A
and rpc 2 from client B.

2. shutdown of the server starts

3. rpc 1 is processed, reply data is added, but client A gets ENODEV
in reply (ptlrpc_send_reply()) as shutdown is running

4. shutdown reaches class_disconnect_exports() and links export A
to the list of zombie exports

5. obd_zombid thread wakes up and destroys export A, which includes
freeing the reply data list and clearing bits in
lut->lut_reply_bitmap (tgt_free_reply_data())

6. export B is still processing rpc 2 and looks for a free bit in
lut->lut_reply_bitmap to store reply data
(tgt_add_reply_data()). If it finds a bit which has just been freed by
the obd_zombid thread, then reply data from export A will get overwritten
in the reply_data file with reply data from export B

7. after failover, reply data gets restored with
tgt_reply_data_init(). The reply data of client A is missing

8. client A reconnects and resends its rpc 1. Server does not find
reply data and processes the rpc as if it has not been seen yet. In
case of unlink, the directory entry no longer exists so rpc 1
fails

The fix is to not free bits in lut->lut_reply_bitmap in case of
failover.
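
The overwrite and the fix can be reproduced with a toy model of the slot
bitmap. This is a sketch under stated assumptions, not the target code:
ReplyTable, its methods, and the run() driver are hypothetical, and the
on_disk list merely stands in for the reply_data file.

```python
class ReplyTable:
    """Toy model of a slot bitmap plus an on-disk reply_data file."""

    def __init__(self, nslots):
        self.bitmap = [False] * nslots  # True means the slot is in use
        self.on_disk = [None] * nslots  # slot -> (client, reply) record

    def add_reply(self, client, reply):
        slot = self.bitmap.index(False)  # first free bit
        self.bitmap[slot] = True
        self.on_disk[slot] = (client, reply)
        return slot

    def free_reply(self, slot, failing_over, fixed):
        # The fix: during failover keep the bit set so the slot is not reused.
        if fixed and failing_over:
            return
        self.bitmap[slot] = False


def run(fixed):
    """Replay the scenario and return the clients found after recovery."""
    table = ReplyTable(nslots=4)
    slot_a = table.add_reply("A", "rpc 1 reply")
    # Export A is destroyed while the server shuts down.
    table.free_reply(slot_a, failing_over=True, fixed=fixed)
    table.add_reply("B", "rpc 2 reply")
    # What recovery would read back from the reply_data file.
    return [rec[0] for rec in table.on_disk if rec is not None]
```

Without the fix, client B reuses the slot and client A's record is lost.
With the fix, both records survive to be restored after failover.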

Test illustrating the issue is added.

Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Cray-bug-id: LUS-6004
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Change-Id: I6db3728f3271ce2751fbe08dadca365eb2ffe727
Reviewed-on: https://review.whamcloud.com/32798
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11099 doc: include "-N" option to lfs_setstripe.1 34/32734/2
Emoly Liu [Wed, 27 Jun 2018 04:18:57 +0000 (12:18 +0800)]
LU-11099 doc: include "-N" option to lfs_setstripe.1

This patch includes mirror option "-N[mirror_count]" to
lfs_setstripe.1 man page so that the user can follow the manual
to create a mirrored file or set a default mirror layout on a
directory correctly.
The command format is like:
$lfs setstripe -N[mirror_count] [STRIPE_OPTIONS] <dir|filename>

Test-Parameters: trivial

Change-Id: If0fabd79d218e5582f9c64336f60466f35dbd968
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/32734
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
5 years agoLU-11098 ptlrpc: ASSERTION(!list_empty(imp->imp_replay_cursor)) 27/32727/2
Andriy Skulysh [Mon, 4 Jun 2018 16:08:29 +0000 (19:08 +0300)]
LU-11098 ptlrpc: ASSERTION(!list_empty(imp->imp_replay_cursor))

It's ptlrpc_replay_next() vs close race.
ll_close_inode_openhandle() calls
mdc_free_open()->ptlrpc_request_committed->ptlrpc_free_request

Need to reset imp_replay_cursor while dropping a request from
replay list.

Change-Id: Ia0ce327a729f8cf554b008ab6d32323b5dd26ee7
Cray-bug-id: LUS-2455
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/32727
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-8066 llite: replace ll_process_config with class_modify_config 22/32722/4
James Simmons [Tue, 3 Jul 2018 00:35:05 +0000 (20:35 -0400)]
LU-8066 llite: replace ll_process_config with class_modify_config

The current method of handling tunables with ll_process_config can
not work with sysfs. So replace ll_process_config handling with
class_modify_config() which can handle sysfs, debugfs and procfs.

Change-Id: I7ef5a4b1ee47827711a9d6654fda279abde06268
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32722
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-8066 osc: fix idle_timeout handling 19/32719/4
James Simmons [Thu, 14 Jun 2018 16:53:29 +0000 (12:53 -0400)]
LU-8066 osc: fix idle_timeout handling

The patch that landed for LU-7236 introduced new sysfs entries
which were done wrong.

1) For idle_timeout it returns -ERANGE for
   any value passed in except when setting idle_timeout to zero. This
   does not match what the commit message said for LU-7236. So
   I changed lprocfs_str_with_units_to_s64() into kstrtouint()
   since a signed 64 bit timeout is not needed. Using kstrtouint()
   ensures that negative values are not possible. Also cap the
   value to CONNECTION_SWITCH_MAX since the max of 4 billion
   seconds is overkill.

2) The next procfs file, idle_connect, is really a write-only file
   but it was treated as both read and write. There is no need for
   the osc_idle_connect_seq_show() function.

3) Lastly no more stuffing new entries into proc or debugfs. For
   this patch convert these new proc entries to sysfs. It seems
   to be a common occurrence so add LPROC_SEQ_* to spelling.txt
   so checkpatch will complain about using LPROC_SEQ_* which will
   go away.
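
The parsing change in item 1 can be modeled in user space roughly as
follows. This is a Python sketch, not the kernel code:
parse_idle_timeout() is a hypothetical helper and the value given to
CONNECTION_SWITCH_MAX is an assumed placeholder. Only the "unsigned
parse, then cap" behaviour comes from the description above.

```python
import re

# Assumed placeholder; the real constant is defined in the Lustre sources.
CONNECTION_SWITCH_MAX = 50
UINT_MAX = 2**32 - 1


def parse_idle_timeout(text: str) -> int:
    """Parse an unsigned decimal timeout and cap it (hypothetical helper).

    A leading '-' or any non-digit is rejected, values that do not fit in
    32 bits raise OverflowError, and anything above the cap is clamped
    rather than refused.
    """
    s = text.strip()
    if not re.fullmatch(r"\+?[0-9]+", s):
        raise ValueError(f"not an unsigned integer: {text!r}")
    value = int(s)
    if value > UINT_MAX:
        raise OverflowError("value does not fit in an unsigned int")
    return min(value, CONNECTION_SWITCH_MAX)
```

With this model, "0" is accepted, "-1" is rejected outright, and a large
value such as "4000000000" is clamped to the cap instead of failing.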

Change-Id: I1c992b2db47aade6a887919824d869e8d5354c71
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32719
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10855 ptlrpc: remove obsolete LLOG_ORIGIN_* RPCs 54/32654/2
Andreas Dilger [Wed, 6 Jun 2018 22:21:51 +0000 (16:21 -0600)]
LU-10855 ptlrpc: remove obsolete LLOG_ORIGIN_* RPCs

Remove the obsolete RPC opcodes LLOG_ORIGIN_HANDLE_WRITE_REC,
LLOG_ORIGIN_HANDLE_CLOSE, LLOG_ORIGIN_CONNECT, LLOG_CATINFO
along with their unused OBD_FAIL counterparts.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I5a2a15bc0dc9e09d0081b6c3aa291fc7713ebbe5
Reviewed-on: https://review.whamcloud.com/32654
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10855 ptlrpc: assign specific values to MGS opcodes 53/32653/2
Andreas Dilger [Wed, 6 Jun 2018 22:18:03 +0000 (16:18 -0600)]
LU-10855 ptlrpc: assign specific values to MGS opcodes

Assign specific values to all of the MGS opcodes in enum mgs_cmd
so that these values do not change if a new item is added or one
is removed in the future.  These opcodes are part of the wire
protocol and need to remain constant.
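
Why explicit values matter can be shown with a small Python model. The
opcode names and numbers below are hypothetical and are not taken from
enum mgs_cmd; implicit_opcodes() only imitates how a C enum without
initializers numbers its members by position.

```python
def implicit_opcodes(names, first=250):
    """Number opcodes by position, as a C enum without initializers does."""
    return {name: first + i for i, name in enumerate(names)}


# Hypothetical opcode names and values, for illustration only.
NAMES = ["CMD_CONNECT", "CMD_DISCONNECT", "CMD_EXCEPTION", "CMD_SET_INFO"]
EXPLICIT = {
    "CMD_CONNECT": 250,
    "CMD_DISCONNECT": 251,
    "CMD_EXCEPTION": 252,
    "CMD_SET_INFO": 253,
}

# Dropping one opcode renumbers everything after it when values are implicit.
implicit_before = implicit_opcodes(NAMES)
implicit_after = implicit_opcodes([n for n in NAMES if n != "CMD_EXCEPTION"])

# With explicit values the surviving opcodes keep their wire numbers.
explicit_after = {n: v for n, v in EXPLICIT.items() if n != "CMD_EXCEPTION"}
```

In the implicit case CMD_SET_INFO moves from 253 to 252 once
CMD_EXCEPTION is removed, so old and new peers would disagree on the
wire; in the explicit case it stays at 253.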

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I8132ca01916cd657933d0c8864e4e78f8b3ebbe5
Reviewed-on: https://review.whamcloud.com/32653
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10855 ptlrpc: remove obsolete OBD RPC opcodes 51/32651/3
Andreas Dilger [Wed, 6 Jun 2018 20:41:13 +0000 (14:41 -0600)]
LU-10855 ptlrpc: remove obsolete OBD RPC opcodes

Remove the obsolete OBD_LOG_CANCEL (since Lustre 1.5) and
OBD_QC_CALLBACK (since Lustre 2.4) RPC opcodes.

Assign OBD_IDX_READ an explicit opcode (as should be done with all
enums in lustre_idl.h) so that the value does not change if some
prior field is removed.

Also remove the OBD_FAIL checks that were used to test them.
The setting in conf_sanity.sh test_58 was unused for many years.

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ie68c6be0da1c114fc981cb4b1afdcdb7c13ebbe5
Reviewed-on: https://review.whamcloud.com/32651
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11052 obd: remove OBD ops based stats 02/32602/2
John L. Hammond [Fri, 25 May 2018 14:40:04 +0000 (09:40 -0500)]
LU-11052 obd: remove OBD ops based stats

Stats maintained via the OBD operations wrappers (obd_setup(),
obd_cleanup(), ...) are less and less interesting to the point that we
should remove them. The only stats files affected by this are
obdfilter.*.stats, obdfilter.*.exports.*.stats and
obdecho.*.stats. For obdfilter here is a comparison for two racer
runs. With the current OBD ops based stats:

obdfilter.lustre-OST0000.stats=
snapshot_time             1527267354.328068245 secs.nsecs
read_bytes                610 samples [bytes] 4096 4194304 800043008
write_bytes               2196 samples [bytes] 5 4194304 3410224606
setattr                   13545 samples [reqs]
punch                     7682 samples [reqs]
destroy                   2281 samples [reqs]
create                    74 samples [reqs]
statfs                    234 samples [reqs]
get_info                  1 samples [reqs]
connect                   3 samples [reqs]
disconnect                1 samples [reqs]
preprw                    2806 samples [reqs]
commitrw                  2806 samples [reqs]
ping                      422 samples [reqs]

And after the OBD ops based stats have been removed:

obdfilter.lustre-OST0000.stats=
snapshot_time             1527168813.867472974 secs.nsecs
read_bytes                200 samples [bytes] 4096 4194304 231366656
write_bytes               1703 samples [bytes] 5 4194304 1220864892
getattr                   337 samples [reqs]
setattr                   6358 samples [reqs]
punch                     2880 samples [reqs]
destroy                   2000 samples [reqs]
create                    71 samples [reqs]
statfs                    2148 samples [reqs]
get_info                  4 samples [reqs]

Changes to obdfilter.lustre-OST0000.exports.*.stats are similar.
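
The two samples above can be compared mechanically. The following
Python sketch is only an illustration: counter_names() and only_in()
are hypothetical helpers, and the sample text is copied from this
message. It lists the counter names that appear in just one of the two
runs.

```python
BEFORE = """\
obdfilter.lustre-OST0000.stats=
snapshot_time             1527267354.328068245 secs.nsecs
read_bytes                610 samples [bytes] 4096 4194304 800043008
write_bytes               2196 samples [bytes] 5 4194304 3410224606
setattr                   13545 samples [reqs]
punch                     7682 samples [reqs]
destroy                   2281 samples [reqs]
create                    74 samples [reqs]
statfs                    234 samples [reqs]
get_info                  1 samples [reqs]
connect                   3 samples [reqs]
disconnect                1 samples [reqs]
preprw                    2806 samples [reqs]
commitrw                  2806 samples [reqs]
ping                      422 samples [reqs]
"""

AFTER = """\
obdfilter.lustre-OST0000.stats=
snapshot_time             1527168813.867472974 secs.nsecs
read_bytes                200 samples [bytes] 4096 4194304 231366656
write_bytes               1703 samples [bytes] 5 4194304 1220864892
getattr                   337 samples [reqs]
setattr                   6358 samples [reqs]
punch                     2880 samples [reqs]
destroy                   2000 samples [reqs]
create                    71 samples [reqs]
statfs                    2148 samples [reqs]
get_info                  4 samples [reqs]
"""


def counter_names(stats_text):
    """Return the counter names found in one stats dump."""
    names = []
    for line in stats_text.splitlines():
        fields = line.split()
        # Skip the "name=" header line and the snapshot timestamp.
        if len(fields) < 2 or fields[0] == "snapshot_time":
            continue
        names.append(fields[0])
    return names


def only_in(first, second):
    """Names present in first but absent from second, sorted."""
    return sorted(set(first) - set(second))
```

Here only_in(counter_names(BEFORE), counter_names(AFTER)) yields
connect, disconnect, preprw, commitrw and ping (in sorted order), which
are the OBD ops based entries.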

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: If4fb7022a3de0aa61905212eaab07b94c1687c68
Reviewed-on: https://review.whamcloud.com/32602
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Jesse Hanley <hanleyja@ornl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-9325 llog: replace simple_strtol with kstrtol 98/32598/3
James Simmons [Thu, 7 Jun 2018 00:52:17 +0000 (20:52 -0400)]
LU-9325 llog: replace simple_strtol with kstrtol

Eventually simple_strtol will be removed so replace its use in
the llog_ioctl code with kstrtoxxx() functions.

Change-Id: I55a4e97837a1d9e0134dde92f0c2380f07691ab9
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32598
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
5 years agoLU-11045 test: use provided directory in racer/racer.sh 14/32514/2
John L. Hammond [Wed, 23 May 2018 15:03:46 +0000 (10:03 -0500)]
LU-11045 test: use provided directory in racer/racer.sh

In racer/racer.sh use the directory provided by the parent script
rather than the environmental variable $DIR.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Iab753c34752462a30e7263b7c304e1626e5cc343
Reviewed-on: https://review.whamcloud.com/32514
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11044 osd-ldiskfs: ext4_dir_operations uses iterate_shared 86/32486/2
Chris Horn [Tue, 22 May 2018 14:39:14 +0000 (09:39 -0500)]
LU-11044 osd-ldiskfs: ext4_dir_operations uses iterate_shared

Linux 4.7 commit ae05327a00fd47c34dfe25294b359a3f3fef96e8 replaces
ext4_dir_operations iterate with iterate_shared. dir_relaxed_shared()
was also added in that commit, so we can use that function to verify
that the ext4_dir_operations is using iterate_shared.

Cray-bug-id: LUS-6008
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I67ff714296cab96408cb74fba62855c0e12cdf43
Reviewed-on: https://review.whamcloud.com/32486
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11034 build: update changelog for Ubuntu 18.04 59/32459/3
Minh Diep [Fri, 18 May 2018 17:46:56 +0000 (10:46 -0700)]
LU-11034 build: update changelog for Ubuntu 18.04

Record the version that we are building

Test-Parameters: trivial

Change-Id: I78c4aa6ad9b1a85cd498709b76ec3111e9572b84
Signed-off-by: Minh Diep <minh.diep@intel.com>
Reviewed-on: https://review.whamcloud.com/32459
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years agoLU-11014 mdt: remove enum mdt_it_code 58/32358/3
John L. Hammond [Fri, 11 May 2018 14:52:45 +0000 (09:52 -0500)]
LU-11014 mdt: remove enum mdt_it_code

Remove enum mdt_it_code, struct mdt_it_flavor and the mdt_it_flavor
array. In mdt_intent_opc, collapse the switch statement followed by
array lookup into a single switch statement that assigns the intent
format, handler, and handler flags. Simplify the subsequent logic in
mdt_intent_opc() accordingly.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Id56fe5fa1bd4d4c03a8de2db9d39f571bed06b2f
Reviewed-on: https://review.whamcloud.com/32358
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10990 osc: increase default max_dirty_mb to 2G 88/32288/5
Oleg Drokin [Fri, 4 May 2018 03:08:35 +0000 (23:08 -0400)]
LU-10990 osc: increase default max_dirty_mb to 2G

While ideally we want to move away from the max_dirty_mb setting
completely and let the grant code take over most of it,
Andreas raises a somewhat valid point that for certain
system configurations with high-latency links, system
administrators might want the ability to limit the
amount of dirty pages just for those OSCs, to limit the amount
of time it might take to flush that dirty data.

So a good compromise is to lift the max_dirty_mb default
value first while we work out the current grant code
deficiencies.

Change-Id: I4de407088af70e0f98f0563160217ba70a635dfb
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: https://review.whamcloud.com/32288
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years agoLU-10986 lfs: make lfs project tolerant errors 43/32243/16
Wang Shilong [Wed, 2 May 2018 08:54:15 +0000 (16:54 +0800)]
LU-10986 lfs: make lfs project tolerant errors

This patch tries to fix the following problems:
1)The command hangs on a pipe file, reproduced by the following steps:
 $ mkfifo tmp/pipe
 $ lfs project -srp 500 tmp -->this will never finish.

The problem is that opening a pipe file blocks by default
without the O_NONBLOCK or O_NDELAY flag.

2)If a symbolic link with a missing target exists, the command
returns an error and does not process the remaining entries.

We should fix this problem by allowing the command to continue
processing even if it hits some errors.

3)fix a wrong check for MAX_PATH.
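
Items 1 and 2 can be sketched in user space as follows. This is a Python
illustration under stated assumptions, not the lfs code: visit_tree() is
a hypothetical helper, and the real command does more than open each
entry. It shows a non-blocking open that returns immediately on a FIFO,
and an error path that records the failure and keeps going.

```python
import os
import stat


def visit_tree(root):
    """Walk root, opening every entry without blocking; keep going on errors.

    Returns (visited, errors): visited holds (path, is_fifo) pairs and
    errors holds (path, errno) pairs for entries that could not be opened.
    """
    visited, errors = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                # O_NONBLOCK makes open() on a FIFO return at once instead
                # of waiting for a writer to appear.
                fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
            except OSError as exc:
                # For example ENOENT for a symlink whose target is missing:
                # record it and continue with the remaining entries.
                errors.append((path, exc.errno))
                continue
            try:
                visited.append((path, stat.S_ISFIFO(os.fstat(fd).st_mode)))
            finally:
                os.close(fd)
    return visited, errors
```

A caller would typically report the collected errors at the end and
return a non-zero status if the list is not empty, rather than stopping
at the first failure.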

Test-Parameters: trivial testlist=sanity-quota,sanity-quota
Change-Id: I7d08a7547e6b1351a1eff23063da6cd9c4cdc5e3
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/32243
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11086 test: reset quota setting properly 07/32707/3
Wang Shilong [Wed, 13 Jun 2018 14:12:16 +0000 (22:12 +0800)]
LU-11086 test: reset quota setting properly

Some test cases don't reset the quota settings properly, which
makes repeated runs of sanity-quota.sh fail. This patch
tries to improve the situation by:

1)resetting the quota settings before check_runas_id_ret, as it will
touch a file, which might hit EDQUOT if the quota settings were not
cleaned up properly after the last run.

2)fixing the quota reset for test cases 55 and 60.

3)resetting the quota settings again after all tests have finished,
because tests that run after sanity-quota.sh might be affected if the
quota settings are not reset properly for some reason.

Test-Parameters: trivial testlist=sanity-quota,sanity-quota
Change-Id: I2983102ea379e64173ef8c54b149ba3b5fbfebe9
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/32707
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10734 tests: ensure current GC interval is over 04/31604/9
Bruno Faccini [Fri, 9 Mar 2018 01:59:51 +0000 (02:59 +0100)]
LU-10734 tests: ensure current GC interval is over

In sanity/test_160g, ensure that the currently configured
"changelog_min_gc_interval=2" has elapsed so that the
GC thread can actually be started.

Also enable Changelog GC in the sanity/test_160g sub-test,
as it is no longer the default, and remove the test from
ALWAYS_EXCEPT to re-enable it, leaving 160f excluded
because of LU-10680.

sanity/test_160g has also been reworked to become
fully DNE aware.

Test-Parameters: trivial
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I8a079ba2ba1822b488f65ad9703204d6296fada0
Reviewed-on: https://review.whamcloud.com/31604
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11079 llite: control concurrent statahead instances 90/32690/7
Fan Yong [Wed, 13 Jun 2018 14:33:55 +0000 (22:33 +0800)]
LU-11079 llite: control concurrent statahead instances

It was found that if there are too many concurrent statahead
instances, the related statahead RPCs may accumulate on the
client import (for MDT) RPC lists
(imp_sending_list/imp_delayed_list/imp_unreplied_list), which
seriously affects the efficiency of spin_lock when the MDT
is overloaded or in recovery. As a temporary solution,
restrict the number of concurrent statahead instances.

To support more concurrent statahead instances, please consider
decentralizing the RPC lists attached to the related import.
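
The idea of restricting concurrent instances can be pictured as a
bounded counter. The Python sketch below is an assumption-laden
illustration, not the llite implementation: StataheadLimiter, its
methods, and the limit passed to it are all hypothetical.

```python
import threading


class StataheadLimiter:
    """Admit at most max_running concurrent instances (hypothetical)."""

    def __init__(self, max_running):
        self._max = max_running
        self._running = 0
        self._lock = threading.Lock()

    def try_start(self):
        """Reserve a slot; return False when the limit is already reached."""
        with self._lock:
            if self._running >= self._max:
                return False
            self._running += 1
            return True

    def finish(self):
        """Release a slot previously obtained with try_start()."""
        with self._lock:
            if self._running == 0:
                raise RuntimeError("finish() without matching try_start()")
            self._running -= 1
```

A caller that fails to get a slot would simply proceed without
statahead, so the limit bounds how many RPC streams pile up on the
shared import lists.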

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I7251cc536f11d184f768e3d3704ba6717644541e
Reviewed-on: https://review.whamcloud.com/32690
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10893 tests: allow to disable dm-flakey layer 58/32658/4
Alexander Boyko [Thu, 7 Jun 2018 13:54:41 +0000 (09:54 -0400)]
LU-10893 tests: allow to disable dm-flakey layer

Patch 54b9e3f789358bd9dfb94b77fe33a4faa1e28ab2 added the dm-flakey layer
to the test framework, but it also introduced a regression: you can't
run tests separately from a setup. Before dm-flakey, it was easy to
create a configuration at ncli, set up a cluster, and start a test. Now
it is impossible. For example:
sudo MDSDEV=/dev/sdb MDSDEV1=/dev/sdb sh lustre/tests/llmount.sh
sudo MDSDEV=/dev/sdb MDSDEV1=/dev/sdb ONLY=0 sh lustre/tests/conf-sanity.sh
Format mds1: /dev/sdb
mkfs.lustre FATAL: Unable to build fs /dev/sdb (256)
mkfs.lustre FATAL: mkfs failed 256

The fix allows disabling the dm-flakey layer with the option FLAKEY=false.

Test-Parameters: envdefinitions=FLAKEY=false
Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: LUS-5851
Change-Id: I248be2307cff5fe6b4b2524478ca8e4cd96a77d2
Reviewed-on: https://review.whamcloud.com/32658
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11064 lnd: determine gaps correctly 86/32586/4
Amir Shehata [Wed, 30 May 2018 20:22:11 +0000 (13:22 -0700)]
LU-11064 lnd: determine gaps correctly

We're allowed to start at a non-aligned page offset in the first
fragment and end at a non-aligned page offset in the last fragment.

When checking the iovec exclude both of the first and last fragments
from the tx_gaps check.
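
One way to read this rule is sketched below in Python. It is an
interpretation, not the LND code: has_gaps() is a hypothetical function,
fragments are modeled as (page_offset, length) pairs, and PAGE_SIZE is an
assumed 4096. The start of the first fragment and the end of the last
fragment are exempt; any other misalignment counts as a gap.

```python
PAGE_SIZE = 4096  # assumed page size for the example


def has_gaps(fragments, page_size=PAGE_SIZE):
    """Return True if the fragment list leaves a hole between pages.

    Each fragment is (page_offset, length) within one page. Only the first
    fragment may start at a non-zero offset, and only the last may stop
    short of the end of its page.
    """
    last = len(fragments) - 1
    for i, (offset, length) in enumerate(fragments):
        if i > 0 and offset != 0:
            return True   # a later fragment starts mid-page
        if i < last and offset + length != page_size:
            return True   # an earlier fragment ends mid-page
    return False
```

For example, [(100, 3996), (0, 4096), (0, 200)] has no gaps, while
[(0, 4096), (0, 100), (0, 4096)] does, because the middle fragment ends
before its page boundary.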

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I8a9231db7db404a5d5a6294ff263c1bd2ac28e6c
Reviewed-on: https://review.whamcloud.com/32586
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11117 ptlrpc: don't zero request handle 81/32781/4
Alexander Boyko [Fri, 15 Jun 2018 09:02:36 +0000 (05:02 -0400)]
LU-11117 ptlrpc: don't zero request handle

LNet can retransmit a request at any time if it has not been replied to.
ptlrpc_resend_req() zeroes the request handle and ptlrpc_send_rpc()
sets it. If a retransmission happens with a zeroed handle, the client
can't find a valid export by handle, sets rq_export to NULL, and
replies with ENOTCONN. The server evicts the client on this error:

client (nid x.x.x.x@tcp) returned error from blocking AST
(req status -107 rc -107), evict it

Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: LUS-6037
Change-Id: I198666d386fea99b46994f965c1519acb5743d75
Reviewed-on: https://review.whamcloud.com/32781
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-7816 quota: add default quota setting support 06/32306/16
Hongchao Zhang [Tue, 5 Jun 2018 22:23:42 +0000 (18:23 -0400)]
LU-7816 quota: add default quota setting support

This function is motivated by a similar one in GPFS and is a friendly
feature for cluster administrators managing quotas.

Lazy default quota setting support. Here is the basic idea:

The default quota setting is a global quota setting for user, group,
and project quotas. If a default quota is set for one quota type,
newly created users/groups/projects will inherit this setting
automatically. Since Lustre itself has no idea when new users are
created, it can only learn of them when these users try to
acquire space from Lustre.

So we implement lazy inheritance of the quota setting. The slave first
checks whether a default quota setting exists; if it does, the slave
is forced to acquire quota from the master. The master detects
whether a default quota is set, then sets this quota and
returns the proper granted space to the slave.

To implement this and reuse the existing quota APIs, we manage
the default quota in the quota record of ID 0, and enforce the
quota check when reading the quota record from disk.

In the current Lustre implementation, the grace time is either
a duration or the timestamp used after a quota ID exceeds its
soft limit, so 48 bits should be enough for it. Its high 16 bits
can be used as quota flags, and this patch uses one of
them as the default quota flag.
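
The bit layout described here can be sketched as follows. This Python
model only illustrates the 48/16 split: the helper names are
hypothetical, and the position chosen for FLAG_DEFAULT (the lowest of
the 16 high bits) is an assumption, not the value used by the patch.

```python
GRACE_TIME_BITS = 48
GRACE_TIME_MASK = (1 << GRACE_TIME_BITS) - 1
U64_MASK = (1 << 64) - 1

# Hypothetical flag position: the lowest of the 16 high bits.
FLAG_DEFAULT = 1 << GRACE_TIME_BITS


def pack_grace(time_value, flags=0):
    """Combine a 48-bit grace time with flags kept in the high 16 bits."""
    if time_value < 0 or time_value > GRACE_TIME_MASK:
        raise ValueError("grace time does not fit in 48 bits")
    if flags & GRACE_TIME_MASK or flags & ~U64_MASK:
        raise ValueError("flags must live in bits 48..63")
    return flags | time_value


def grace_time(packed):
    """Extract the 48-bit grace time, ignoring the flag bits."""
    return packed & GRACE_TIME_MASK


def is_default(packed):
    """Report whether the (assumed) default-quota flag is set."""
    return bool(packed & FLAG_DEFAULT)
```

A one-week grace time (604800 seconds) packed with FLAG_DEFAULT still
reads back as 604800 through grace_time(), while is_default() reports
the flag.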

The global quota record used by default quota will set its soft
and hard limit as zero, its grace time will contain the default flag.

Use lfs setquota -U/-G/-P <mnt> to set default quota.
Use lfs setquota -u/-g/-p foo -d <mnt> to set foo to use default quota
Use lfs quota -U/-G/-P <mnt> to show default quota.

Test-Parameters: envdefinitions=DEBUG_SIZE=64

Change-Id: Ib23007360921832b3c7d5710ab50324bc5067286
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/32306
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11003 ldlm: don't add canceling lock back to LRU 92/32692/2
Mikhail Pershin [Mon, 11 Jun 2018 06:44:01 +0000 (09:44 +0300)]
LU-11003 ldlm: don't add canceling lock back to LRU

When a lock is converted, check that it is not being canceled
before adding it back to the LRU.

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: I278389f2a23b304d812f82ffb2dcee2ca70f5b21
Reviewed-on: https://review.whamcloud.com/32692
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11004 ptlrpc: Serialize procfs access to scp_hist_reqs using mutex 07/32307/2
Andriy Skulysh [Thu, 12 Apr 2018 13:12:05 +0000 (16:12 +0300)]
LU-11004 ptlrpc: Serialize procfs access to scp_hist_reqs using mutex

The scp_hist_reqs list can be quite long, so many userland
processes can waste CPU time spinning on the spinlock.

Change-Id: Ic0fa7338569f9a19213a1dc31f5479c96a76d23a
Cray-bug-id: LUS-5833
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-on: https://review.whamcloud.com/32307
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10527 obdclass: don't recycle loghandle upon ENOSPC 97/30897/4
Bruno Faccini [Wed, 17 Jan 2018 15:22:58 +0000 (16:22 +0100)]
LU-10527 obdclass: don't recycle loghandle upon ENOSPC

In llog_cat_add_rec(), upon -ENOSPC error being returned from
llog_cat_new_log(), don't reset "cathandle->u.chd.chd_current_log"
to NULL.
This avoids having llog_cat_declare_add_rec() repeatedly and
unnecessarily create new, partially initialized LLOGs/llog_handles
and assign them to "cathandle->u.chd.chd_current_log" without
llog_init_handle() ever being called to initialize
"loghandle->lgh_hdr".

Also, an unnecessary LASSERT(llh) has been removed from
llog_cat_current_log(), as it prevented this case from being
handled gracefully by simply returning the loghandle.
Thanks to S. Cheremencev (Cray) for reporting this.

Both fixes have been kept in the patch, as the first one reduces
the number of FS operations performed under a permanent changelog
ENOSPC condition, even if this covers a somewhat unlikely
situation.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I526f788dc283fa7136ba518179d9337e1d5e3714
Reviewed-on: https://review.whamcloud.com/30897
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10175 ldlm: handle lock converts in cancel handler 14/32314/5
Mikhail Pershin [Mon, 7 May 2018 20:36:55 +0000 (23:36 +0300)]
LU-10175 ldlm: handle lock converts in cancel handler

- Use cancel portals and high-priority handling for lock
  converts. Update ldlm_cancel_handler to understand
  LDLM_CONVERT RPC for that.
- Use ns_dirty_age_limit for lock convert - don't convert locks
  that are too old.
- Check for empty converts and skip them.

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: I767626acd974ad88bbbf0bb3b0a46744c45b7897
Reviewed-on: https://review.whamcloud.com/32314
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>