Whamcloud - gitweb
fs/lustre-release.git
7 years ago LU-8615 kernel: kernel update RHEL7.2 [3.10.0-327.36.1.el7] 08/22608/2
Bob Glossman [Wed, 14 Sep 2016 17:25:14 +0000 (10:25 -0700)]
LU-8615 kernel: kernel update RHEL7.2 [3.10.0-327.36.1.el7]

Update RHEL7.2 kernel to 3.10.0-327.36.1.el7

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Iaa93f7b110569dfb86cee502680cc786f7e688cd
Reviewed-on: http://review.whamcloud.com/22608
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8547 test: Skip ost-pools test_24 for older MDS version 33/22533/6
Saurabh Tandan [Thu, 15 Sep 2016 19:24:38 +0000 (12:24 -0700)]
LU-8547 test: Skip ost-pools test_24 for older MDS version

Skip ost-pools test_24 if the server version is less than 2.8.56.
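
The version gate described here can be sketched in C as below. This is only an illustration: the real check lives in the test scripts, the helper names version_code() and should_skip_test_24() are hypothetical, and packing one byte per version component is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: pack major.minor.patch into one comparable integer.
 * Assumes minor and patch each fit in one byte. */
static uint32_t version_code(uint32_t major, uint32_t minor, uint32_t patch)
{
    assert(minor < 256 && patch < 256);
    return (major << 16) | (minor << 8) | patch;
}

/* The test is skipped when the server is older than 2.8.56. */
static bool should_skip_test_24(uint32_t server_version)
{
    return server_version < version_code(2, 8, 56);
}
```

Comparing packed integers avoids comparing the components one by one, as long as each component stays within its byte.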

Test-Parameters: trivial testlist=ost-pools,ost-pools,ost-pools,ost-pools
Signed-off-by: Saurabh Tandan <saurabh.tandan@intel.com>
Change-Id: Id93490606c50a0528d4576f64c808e11255ff01d
Reviewed-on: http://review.whamcloud.com/22533
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-4315 doc: split lctl-network.8 man page from lctl.8 24/22324/2
Andreas Dilger [Tue, 6 Sep 2016 08:42:11 +0000 (02:42 -0600)]
LU-4315 doc: split lctl-network.8 man page from lctl.8

Split the lctl-network.8 man page from lctl.8 so that it can have a
more complete description and usage messages.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I5c3b94fca277c53b249d09ac9e65266a8b8ad322
Reviewed-on: http://review.whamcloud.com/22324
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8535 lod: RR policy should not allocate on same ost 90/22090/3
Rahul Deshmukh [Wed, 24 Aug 2016 05:28:21 +0000 (10:58 +0530)]
LU-8535 lod: RR policy should not allocate on same ost

Problem: With the Round Robin (RR) policy we should not be allowed
to create objects on the same OST, but currently it is possible.

Solution: lod_check_and_reserve_ost() skips the check for an
already used OST when speed=0, i.e. in the first round of
object allocation. Enable the check unconditionally to
fix the above-mentioned problem.

This patch contains both a reproducer and the fix.
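
The idea of checking for an already used OST on every pass can be sketched as follows. This is not the code of lod_check_and_reserve_ost(); the function names, the candidate array, and its possible duplicate entries are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if ost_idx is already among the first `used` chosen entries. */
static bool ost_already_used(const int *chosen, size_t used, int ost_idx)
{
    for (size_t k = 0; k < used; k++) {
        if (chosen[k] == ost_idx)
            return true;
    }
    return false;
}

/* Hypothetical round-robin picker: walk `candidates` once, starting at
 * `start`, and collect up to `want` distinct OST indices into `chosen`.
 * The duplicate check runs unconditionally, including on the first pass.
 * Returns the number of OSTs chosen. */
static size_t rr_pick_osts(const int *candidates, size_t n, size_t start,
                           size_t want, int *chosen)
{
    size_t used = 0;

    assert(candidates != NULL && chosen != NULL);

    for (size_t step = 0; step < n && used < want; step++) {
        int idx = candidates[(start + step) % n];

        if (ost_already_used(chosen, used, idx))
            continue;
        chosen[used++] = idx;
    }
    return used;
}
```

If fewer distinct OSTs exist than requested, the sketch returns a short count instead of reusing an OST.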

Signed-off-by: Rahul Deshmukh <rahul.deshmukh@seagate.com>
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Seagate-bug-id: MRP-3480
Change-Id: I80895f8d7cc0a146a098869842bbc256152e6c2e
Reviewed-on: http://review.whamcloud.com/22090
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-6638 test: wait for grace delay in sanity-hsm test_37() 84/21284/5
John L. Hammond [Wed, 13 Jul 2016 16:31:42 +0000 (11:31 -0500)]
LU-6638 test: wait for grace delay in sanity-hsm test_37()

In sanity-hsm test_37(), allow the previous archive request to expire
from the actions log by calling wait_for_grace_delay().

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I42fe4eec44734a9afd59f67d4532c1fe23402269
Reviewed-on: http://review.whamcloud.com/21284
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8625 tests: replay-single test_87 renamed to test_87a 91/22591/3
Chennaiah Palla [Mon, 19 Sep 2016 11:59:56 +0000 (17:29 +0530)]
LU-8625 tests: replay-single test_87 renamed to test_87a

Rename test_87() to test_87a() so that it can be run separately from test_87b.

Seagate-bug-id: MRP-3820
Signed-off-by: Chennaiah Palla <chennaiah.palla@seagate.com>
Change-Id: I22db42eef396f9021b974f4fdb5c76614d4405f5
Reviewed-on: http://review.whamcloud.com/22591
Tested-by: Jenkins
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
7 years ago LU-8454 mountconf: delay FS default stripe config setting 80/22580/2
Lai Siyao [Sun, 18 Sep 2016 07:36:05 +0000 (15:36 +0800)]
LU-8454 mountconf: delay FS default stripe config setting

Previously, http://review.whamcloud.com/21612 disabled the filesystem
default stripe config setting after 2.10. This is too aggressive,
so delay this change until > 2.13.53.

Also #ifdef out more unused functions in this feature.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: Ic4d8c9abc837f4d9c9c7992b1f297dbb361beb47
Reviewed-on: http://review.whamcloud.com/22580
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
7 years ago LU-8619 osd: define DN_MAX_BONUSLEN 23/22523/4
Alex Zhuravlev [Thu, 15 Sep 2016 17:05:45 +0000 (20:05 +0300)]
LU-8619 osd: define DN_MAX_BONUSLEN

ZFS master doesn't define DN_MAX_BONUSLEN, which is still used
by osd-zfs, so define it as DN_OLD_MAX_BONUSLEN.
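
One common way to provide such a fallback is a guarded define, sketched below. Whether the patch uses exactly this form is not stated in the message; the numeric value and the max_bonus_len() helper are placeholders for the sketch, not taken from ZFS.

```c
#include <assert.h>

/* Placeholder value for this sketch only; in a real build the ZFS headers
 * would supply DN_OLD_MAX_BONUSLEN. */
#define DN_OLD_MAX_BONUSLEN 320

/* Fall back to the old name only when the headers do not define the
 * macro that the caller still uses. */
#ifndef DN_MAX_BONUSLEN
#define DN_MAX_BONUSLEN DN_OLD_MAX_BONUSLEN
#endif

/* Hypothetical accessor so the effective value can be inspected. */
static int max_bonus_len(void)
{
    assert(DN_MAX_BONUSLEN > 0);
    return DN_MAX_BONUSLEN;
}
```

The guard keeps the code building against both header versions: older headers define the macro themselves, and newer ones get the fallback.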

Change-Id: Ieec310ee4d368cae7c5c93808caddb0db0b08fc7
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/22523
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8018 lov: ld_target could be NULL 11/21411/7
Bobi Jam [Tue, 19 Jul 2016 00:25:01 +0000 (08:25 +0800)]
LU-8018 lov: ld_target could be NULL

lov_device::ld_target[ost_idx] could be NULL if the OST target is
not filled in lov_device::ld_lov::lov_tgt_desc[ost_idx] yet.
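
A minimal sketch of guarding against an unfilled target slot is shown below. The struct and function names are hypothetical stand-ins and do not mirror the real lov_device layout; the error codes are chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for a device with an array of target slots. */
struct target {
    int t_active;
};

struct device {
    struct target **d_targets; /* slots may be NULL until filled in */
    size_t d_count;
};

/* Return the target's active flag, or a negative errno when the index is
 * out of range or the slot has not been populated yet. */
static int target_is_active(const struct device *dev, size_t idx)
{
    assert(dev != NULL);

    if (idx >= dev->d_count)
        return -EINVAL;
    if (dev->d_targets[idx] == NULL)
        return -ENODEV;
    return dev->d_targets[idx]->t_active;
}
```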

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I1eddb49b3c3518828c531af568b851465ccdffa3
Reviewed-on: http://review.whamcloud.com/21411
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-6245 server: remove types abstraction from quota/target/nodemap code 87/21187/12
James Simmons [Sat, 10 Sep 2016 15:01:31 +0000 (11:01 -0400)]
LU-6245 server: remove types abstraction from quota/target/nodemap code

Originally, when Lustre code was built for userland, we needed
a proper way to handle 32-bit and 64-bit platforms when
reporting unsigned longs. Now that this code is only built
for kernel space and the kernel has its own special string
handling functions, we don't need this abstraction anymore.
Remove this abstraction from the quota/target/nodemap code.
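
The general idea of printing a 64-bit value with a plain format specifier, instead of a width-specific wrapper, can be sketched in userland as below. format_u64() is a hypothetical helper, and snprintf() stands in here for the kernel's own string functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 64-bit unsigned value the same way on 32-bit and 64-bit
 * platforms: cast to unsigned long long and use %llu. Returns the number
 * of characters that the full output requires, as snprintf() does. */
static int format_u64(char *buf, size_t len, uint64_t val)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%llu", (unsigned long long)val);
}
```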

Change-Id: Ie3d4fa79dc687fc85296c3a4b21655001d0c7081
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/21187
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-7428 test: remove test 84 from ALWAYS_EXCEPT 94/20194/8
Hongchao Zhang [Wed, 3 Feb 2016 14:34:39 +0000 (22:34 +0800)]
LU-7428 test: remove test 84 from ALWAYS_EXCEPT

Debug patch to remove the test 84 from ALWAYS_EXCEPT in conf-sanity
temporarily to verify the problem is really fixed.

Test-Parameters: trivial testlist=conf-sanity,conf-sanity,conf-sanity,conf-sanity
Test-Parameters: testlist=conf-sanity,conf-sanity,conf-sanity,conf-sanity
Change-Id: I1d2dd16f33f0713a5e11d243708865b45ec283cb
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: http://review.whamcloud.com/20194
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-6245 libcfs: remove byteorder.h 16/16916/8
James Simmons [Fri, 16 Sep 2016 15:02:15 +0000 (11:02 -0400)]
LU-6245 libcfs: remove byteorder.h

With the cleanup of userland with libcfs we no longer
need the special byte ordering macros. Kernel space
can just use what is provided by the kernel already.

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Change-Id: Ic154005a24ffae6d6773d4664a6e75d3ead346af
Reviewed-on: http://review.whamcloud.com/16916
Reviewed-by: Frank Zago <fzago@cray.com>
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
7 years ago LU-8472 scrub: try to avoid recovery during OI scrub 18/21918/4
Fan Yong [Tue, 21 Jun 2016 19:12:26 +0000 (03:12 +0800)]
LU-8472 scrub: try to avoid recovery during OI scrub

It is a known issue that FID-based operations will hit -EINPROGRESS
or -EREMCHG failures if the related OI mapping is invalid (in most
cases because of file-level backup/restore).

On the other hand, recovery of cross-MDT modifications will
trigger FID-based operation(s) before OI scrub rebuilds the related
OI mappings.

So during sanity-scrub tests, the scripts should avoid cross-MDT
recovery by syncing all transactions before the file-level backup.

Print more warning messages about the recovery failure if it is
caused by bad OI mappings.

Another fix sets the LOC_F_NEW flag for the object to be
created via out_create().

Test-Parameters: mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs mdscount=2 mdtcount=4 testlist=sanity-scrub,sanity-scrub,sanity-scrub
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I6e8bc9c5d587be72ecd7e33fa7e9959fe5b34006
Reviewed-on: http://review.whamcloud.com/21918
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8301 lfsck: handle ROOT fid properly 69/20869/4
Fan Yong [Fri, 13 May 2016 15:51:33 +0000 (23:51 +0800)]
LU-8301 lfsck: handle ROOT fid properly

It was found that lfsck_find_mdt_idx_by_fid() returns
failure for the "ROOT" object. That is incorrect: "ROOT"
is always on MDT0.

Test-Parameters: trivial testlist=sanity-scrub,sanity-lfsck

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I4a1b796ffc574f8f11611d1b329ce79ad2135eb7
Reviewed-on: http://review.whamcloud.com/20869
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8599 utils: restore lshowmount utility 93/17593/13
Andreas Dilger [Wed, 14 Sep 2016 17:49:55 +0000 (01:49 +0800)]
LU-8599 utils: restore lshowmount utility

The lshowmount utility was removed in commit b5a7260ae8f as being
obsolete, but it was not as unused as previously thought.

Restore lshowmount.c, lshowmount.8, nidlist.c, and nidlist.h from
history, with minor updates to the Makefiles and .gitignore to avoid
conflicts in context.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: Ibf6f72684d4dcaa95e0366b4fde74386893ebbe5
Reviewed-on: http://review.whamcloud.com/17593
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Thomas Stibor <t.stibor@gsi.de>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8629 obdclass: Fix error exit loop in cl_env_percpu_init 35/22635/4
Sandhya Bankar [Tue, 20 Sep 2016 16:03:42 +0000 (12:03 -0400)]
LU-8629 obdclass: Fix error exit loop in cl_env_percpu_init

Clearly we should be using the loop counter j, not loop bound i
in the cleanup loop there.
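
The class of bug described here can be illustrated with a small userland sketch. This is not the code of cl_env_percpu_init(); slots_init(), NSLOTS, and the fail_at parameter are hypothetical and exist only to show a cleanup loop that indexes with its own counter.

```c
#include <assert.h>
#include <stdlib.h>

#define NSLOTS 4 /* hypothetical number of slots */

/* Allocate NSLOTS buffers. `fail_at` simulates an allocation failure at
 * that index (pass -1 for no simulated failure). On failure, everything
 * allocated so far is released and -1 is returned. */
static int slots_init(void *slots[NSLOTS], int fail_at)
{
    int i, j;

    assert(slots != NULL);

    for (i = 0; i < NSLOTS; i++) {
        slots[i] = (i == fail_at) ? NULL : malloc(16);
        if (slots[i] == NULL) {
            /* Undo slots 0..i-1. The body must index with the loop
             * counter j; indexing with the bound i would hit the same
             * failed slot every time and leak the earlier buffers. */
            for (j = 0; j < i; j++) {
                free(slots[j]);
                slots[j] = NULL;
            }
            return -1;
        }
    }
    return 0;
}
```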

Change-Id: I84a682846e7bdfc93786ab47ea3c0603f2849860
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/22635
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
7 years ago LU-8499 wireshark: fix packet-lustre so it compiles 74/22074/3
Kit Westneat [Tue, 23 Aug 2016 16:40:28 +0000 (12:40 -0400)]
LU-8499 wireshark: fix packet-lustre so it compiles

This patch removes a duplicate function definition for
lustre_dissect_struct_capa, and adds a definition for
ett_lustre_ladvise, which is referenced in
lustre_dissect_struct_lu_ladvise, but wasn't defined previously.

Test-Parameters: trivial

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: I700e2388175130d7274f4e3fac332cf069afbdcb
Reviewed-on: http://review.whamcloud.com/22074
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
7 years ago LU-8300 lfsck: pass LOC_F_NEW when create new object 68/20868/3
Fan Yong [Fri, 13 May 2016 15:44:35 +0000 (23:44 +0800)]
LU-8300 lfsck: pass LOC_F_NEW when create new object

In the lfsck_create_lpf() case, we know that the target object of
'.lustre/lost+found/MDTxxxx' does not exist, so we need to pass the
lu_object_conf parameter 'LOC_F_NEW' to the lower layer. The OSD
will then not return -115 to LFSCK if the OI files are lost.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I0c348dea99e7ff98f432e09a9664c6ba46567f11
Reviewed-on: http://review.whamcloud.com/20868
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8218 osd: handle stale OI mapping for non-restore case 59/20659/8
Fan Yong [Fri, 13 May 2016 15:44:20 +0000 (23:44 +0800)]
LU-8218 osd: handle stale OI mapping for non-restore case

Sometimes the user may remove the MDT-object directly under ldiskfs
mode but leave the related OI mapping in place. This can also
happen if the MDT-object is lost because of disk corruption. In
such cases, the ldiskfs OSD should be able to distinguish this
from the case of MDT file-level backup/restore; otherwise, the
upper-layer user will get -EREMCHG (78) when locating such an
MDT-object by FID.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Iede2542968c21755158637089d20a694f12b309e
Reviewed-on: http://review.whamcloud.com/20659
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-6245 server: remove types abstraction from MDS/MGS code 71/22371/4
James Simmons [Fri, 9 Sep 2016 16:19:43 +0000 (12:19 -0400)]
LU-6245 server: remove types abstraction from MDS/MGS code

Originally, when Lustre code was built for userland, we needed
a proper way to handle 32-bit and 64-bit platforms when
reporting unsigned longs. Now that this code is only built
for kernel space and the kernel has its own special string
handling functions, we don't need this abstraction anymore.
Remove this abstraction from the MGS/MDS server side code.

Change-Id: I963ab240abbc650289040ee14f267f344c9f4124
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/22371
Reviewed-by: Frank Zago <fzago@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-6245 server: remove types abstraction from lfsck and OSS code 70/22370/4
James Simmons [Sat, 10 Sep 2016 15:06:26 +0000 (11:06 -0400)]
LU-6245 server: remove types abstraction from lfsck and OSS code

Originally, when Lustre code was built for userland, we needed
a proper way to handle 32-bit and 64-bit platforms when
reporting unsigned longs. Now that this code is only built
for kernel space and the kernel has its own special string
handling functions, we don't need this abstraction anymore.
Remove this abstraction from the lfsck and OSS related code.

Change-Id: I7663c953d47866fe75644676f581c5074c775fdc
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/22370
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8021 tests: Add leading $ to "DEBUGSAVE_SERVER" 86/22586/3
Emoly Liu [Mon, 19 Sep 2016 06:13:16 +0000 (14:13 +0800)]
LU-8021 tests: Add leading $ to "DEBUGSAVE_SERVER"

Add a leading $ to "DEBUGSAVE_SERVER"; otherwise the check is always
true, the code tries to restore an empty string, and this produces
a lot of spurious error messages in the test logs.

Test-Parameters: trivial
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I4e9206bc312af22fea22bf6ab634469439008003
Reviewed-on: http://review.whamcloud.com/22586
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Parinay Kondekar <parinay.kondekar@seagate.com>
7 years ago LU-5050 libcfs: default CPT matches NUMA topology 77/22377/3
Dmitry Eremin [Thu, 8 Sep 2016 06:50:59 +0000 (09:50 +0300)]
LU-5050 libcfs: default CPT matches NUMA topology

Change default value of CPT pattern and make it match NUMA topology

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: Iea76deec2face42a01e4aeda690e277be31325a9
Reviewed-on: http://review.whamcloud.com/22377
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-7949 osd: Move assignment below LASSERT() 28/22428/2
Arshad Hussain [Fri, 2 Sep 2016 23:12:18 +0000 (04:42 +0530)]
LU-7949 osd: Move assignment below LASSERT()

This patch moves osd_dt_dev() call and assignment of
qsd_instance below LASSERT() under function osd_declare_qid().
This avoids a case of dereferencing osd_thandle parameter
when passed as NULL. Although osd_dt_dev() does its own
checking it is better to move it below LASSERT(). Patch
also adds LASSERT() after osd_dt_dev() call.
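
The ordering described here, checking first and dereferencing afterwards, can be sketched as below. The standard assert() stands in for LASSERT(), and the struct and function names are hypothetical; this is not the body of osd_declare_qid().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handle type carrying a device identifier. */
struct handle {
    int h_dev_id;
};

/* Validate the pointer before any assignment that dereferences it. */
static int handle_dev_id(const struct handle *h)
{
    int dev_id;

    assert(h != NULL);    /* check first ... */
    dev_id = h->h_dev_id; /* ... then dereference */
    return dev_id;
}
```

Placing the assignment in the declaration, above the assertion, would dereference the pointer before the check had a chance to fire.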

Signed-off-by: Arshad Hussain <arshad.hussain@seagate.com>
Change-Id: I80922d372ee768c42d5d34be8222fd5e089bbda5
Reviewed-on: http://review.whamcloud.com/22428
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8135 osc: limits the number of chunks in write RPC 69/22369/7
Jinshan Xiong [Mon, 12 Sep 2016 18:17:10 +0000 (11:17 -0700)]
LU-8135 osc: limits the number of chunks in write RPC

OSC has to make sure that it won't issue write RPCs with too many
chunks; otherwise it will cause ZFS to create transactions much
bigger than DMU_MAX_ACCESS in size, which will end in write
failure.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Ib68b09afca35c253ef0a6b569f64f555e08bd11b
Reviewed-on: http://review.whamcloud.com/22369
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8549 test: optimize restore_and_check_size() in sanity-hsm 45/22145/3
Li Xi [Mon, 22 Aug 2016 07:40:56 +0000 (15:40 +0800)]
LU-8549 test: optimize restore_and_check_size() in sanity-hsm

Optimize restore_and_check_size() in sanity-hsm so that time won't
be wasted when waiting.

Change-Id: Ie683953670c618a1790b71b8ab55f24f61198073
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Reviewed-on: http://review.whamcloud.com/22145
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8551 test: Use mds1 rather than mds to operate on MDT0000 47/22147/2
Li Xi [Mon, 22 Aug 2016 10:17:18 +0000 (18:17 +0800)]
LU-8551 test: Use mds1 rather than mds to operate on MDT0000

Use mds1 rather than mds to operate on MDT0000, fixed following cases:
conf-sanity.sh: test_43, test_58, test_62
sanityn.sh: test_54, test_55.

Test-Parameters: testlist=conf-sanity
Test-Parameters: testlist=sanityn

Change-Id: Ibe0b419f837e4478610d61b8efb54af93662a45e
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Reviewed-on: http://review.whamcloud.com/22147
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8529 config: Print clear message when ldiskfs series not found 84/22084/3
Christopher J. Morrone [Tue, 23 Aug 2016 22:25:01 +0000 (15:25 -0700)]
LU-8529 config: Print clear message when ldiskfs series not found

Add a clear AC_MSG_RESULT failure message when an ldiskfs series
is not successfully identified in the LDISKFS_LINUX_SERIES macro.

Change-Id: If4a0350e67cd625a5b2a8fc549e341ca4c339ff9
Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-on: http://review.whamcloud.com/22084
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-4444 test: Re-enable conf-sanity/69 for ZFS 43/21543/5
Nathaniel Clark [Wed, 27 Jul 2016 14:46:08 +0000 (10:46 -0400)]
LU-4444 test: Re-enable conf-sanity/69 for ZFS

Enable conf-sanity/69 for ZFS now that ZFS is faster.
Fix zpool import for conf-sanity/50i, should be mds2 not ost1.

Test-Parameters: trivial testlist=conf-sanity mdsfilesystemtype=zfs mdtfilesystemtype=zfs ostfilesystemtype=zfs envdefinitions=SLOW=yes
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I4a367e217ec05ab4f678f58f0ffae95d84e33cc6
Reviewed-on: http://review.whamcloud.com/21543
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-5473 tests: sanity/51b Account for ZFS inode size 21/21821/6
Nathaniel Clark [Mon, 8 Aug 2016 20:35:00 +0000 (16:35 -0400)]
LU-5473 tests: sanity/51b Account for ZFS inode size

Account for an inode size of 11KB for ZFS (and 4KB for LDISKFS) when
checking space on MDC during sanity test_51b.
Do lfs df and lfs df -i on completion of sanity/test_51b. This helps
determine the correct inode size accounting.

Test-Parameters: trivial mdsfilesystemtype=zfs mdtfilesystemtype=zfs ostfilesystemtype=zfs testlist=sanity envdefinitions=SLOW=yes mdscount=1
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Icd8e0d55a89d8e3d22bbb1b2ff206e238a7262ac
Reviewed-on: http://review.whamcloud.com/21821
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8100 tests: Test the correctness of target_obd 46/20546/15
Giuseppe Di Natale [Wed, 1 Jun 2016 15:34:47 +0000 (08:34 -0700)]
LU-8100 tests: Test the correctness of target_obd

The target_obd proc file is what is used by lfs to list mdts
of a lustre file system. Added a conf-sanity test to ensure
correctness of lfs mdts output.

Test-Parameters: trivial mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs mdtcount=1 testlist=conf-sanity
Test-Parameters: trivial mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs mdtcount=2 mdtcount=4 testlist=conf-sanity
Test-Parameters: trivial mdtfilesystemtype=zfs mdsfilesystemtype=zfs ostfilesystemtype=zfs mdtcount=1 testlist=conf-sanity

Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Change-Id: Ia876ed456b95570b90bc90f05d8b7b97e1aa71af
Reviewed-on: http://review.whamcloud.com/20546
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
7 years ago LU-6142 lnet: replace white spaces with tabs for LNet core 75/19975/10
James Simmons [Tue, 13 Sep 2016 17:34:51 +0000 (13:34 -0400)]
LU-6142 lnet: replace white spaces with tabs for LNet core

This work converts all the remaining white spaces left
in the LNet layer to the proper tab format. Fixed all
space prohibited warnings reported by checkpatch. Any
other space issues reported by checkpatch are also
addressed for the code that was retabbed.

Test-Parameters: trivial

Change-Id: I12589439c9532d1d3989deee00aa68c29f84db85
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/19975
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago New tag 2.8.58 2.8.58 v2_8_58 v2_8_58_0
Oleg Drokin [Tue, 20 Sep 2016 16:00:57 +0000 (12:00 -0400)]
New tag 2.8.58

Change-Id: I7891ad025b041e318a96e35a5adaa13dc324fae4
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8508 nodemap: improve object handling in cache saving 04/22004/7
Kit Westneat [Thu, 18 Aug 2016 15:00:35 +0000 (11:00 -0400)]
LU-8508 nodemap: improve object handling in cache saving

Saving cache files requires that the old cache be removed. This means
that the config object reference needs to change to point to the new
file. Previously this was done in a number of different places and
was more opaque. This patch hopefully makes it more transparent.

This patch also fixes a problem on MDTs/OSTs when creating a new
config. Previously all initial config creates would assume an MGS
context. A side-effect of the improved object handling is that a
target context is handled correctly.

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: Iee33a423b76f30eba849288c746e6528ecefa7c6
Reviewed-on: http://review.whamcloud.com/22004
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-4931 ladvise: Add dontneed advice support for ladvise 03/20203/8
Li Xi [Tue, 14 Jun 2016 05:15:04 +0000 (13:15 +0800)]
LU-4931 ladvise: Add dontneed advice support for ladvise

This patch adds DONTNEED advice to the ladvise framework. The OSS
will clean up the page cache of the file when this hint is provided.

Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Change-Id: If5cf7f3193924ca7cccb96d8d841c0d889469393
Reviewed-on: http://review.whamcloud.com/20203
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-930 misc: update README path to git hooks 28/13428/4
Olaf Faaland [Tue, 25 Aug 2015 00:41:26 +0000 (17:41 -0700)]
LU-930 misc: update README path to git hooks

Update the README file so that the paths given to the git hooks stored
in contrib are current.

Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: If864751d88ad54de629a1cda9d1ba6f2089ecd69
Reviewed-on: http://review.whamcloud.com/13428
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
7 years ago LU-8340 nodemap: debug patch 07/21907/5
Kit Westneat [Fri, 12 Aug 2016 15:50:09 +0000 (11:50 -0400)]
LU-8340 nodemap: debug patch

Add some debugging info to the sanity-sec test to figure out why
setquota is failing.

Run the 'full' test group and sanity-sec multiple times.

Test-Parameters: trivial envdefinitions=SLOW=yes testlist=sanity-sec,sanity-sec,sanity-sec,sanity-sec,sanity-sec,sanity-sec,sanity-sec

Test-Parameters: trivial envdefinitions=SLOW=yes testgroup=full

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: I2202b9acc006f8f35c65b8c56e6c3c5c6a3852f0
Reviewed-on: http://review.whamcloud.com/21907
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Andreas Dilger <andreas.dilger@intel.com>
7 years ago LU-8537 kernel: kernel update RHEL6.8 [2.6.32-642.4.2.el6] 94/22194/4
Bob Glossman [Sat, 27 Aug 2016 22:06:46 +0000 (15:06 -0700)]
LU-8537 kernel: kernel update RHEL6.8 [2.6.32-642.4.2.el6]

Update RHEL6.8 kernel to 2.6.32-642.4.2.el6

Test-Parameters: clientdistro=el6.8 mdsdistro=el6.8 ossdistro=el6.8 \
  mdsfilesystemtype=ldiskfs mdtfilesystemtype=ldiskfs \
  ostfilesystemtype=ldiskfs testgroup=review-ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Idce534b52b69fab8564fbfb6de57cd30d74c84eb
Reviewed-on: http://review.whamcloud.com/22194
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-7310 clio: sync write should update mtime 63/21063/6
Niu Yawei [Wed, 29 Jun 2016 12:57:22 +0000 (08:57 -0400)]
LU-7310 clio: sync write should update mtime

Sync write should update m/ctime promptly, otherwise, stale m/ctime
could be updated on the OST object by the sync write RPC.

Added sanityn test_39d to verify this.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I9c7e1d75f610a3104c163df9d68c33442d8fe3f4
Reviewed-on: http://review.whamcloud.com/21063
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-7950 osp: Remove assigned but not used variable 24/21024/4
Arshad Hussain [Fri, 24 Jun 2016 10:15:08 +0000 (15:45 +0530)]
LU-7950 osp: Remove assigned but not used variable

This patch removes the import variables that were assigned but
never used in the following functions:
1. lwp_device_fini() in lustre/osp/lwp_dev.c
2. osp_device_fini() in lustre/osp/osp_dev.c

Signed-off-by: Arshad Hussain <arshad.hussain@seagate.com>
Change-Id: If37bc78060aaf97127802c0fcbf50bc5483977fc
Reviewed-on: http://review.whamcloud.com/21024
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-6441 ptlrpc: fix sanity 224c for different RPC sizes 03/22403/3
Jinshan Xiong [Fri, 9 Sep 2016 04:31:34 +0000 (21:31 -0700)]
LU-6441 ptlrpc: fix sanity 224c for different RPC sizes

- fail_loc should be set on the OST side;
- an RPC can have 16 bulk descriptors at most, make the test case
  usable even with smaller RPC size.

Patch http://review.whamcloud.com/14399 added sanity.sh test_224c
to verify correct handling of failures with bulk transfers over 1MB,
but did not correctly handle the different transfer sizes possible.

Test-Parameters: trivial
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I9d0bc0c523cb71d95c6165066e666878c2a380cc
Reviewed-on: http://review.whamcloud.com/22403
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8560 libcfs: handle page_cache_*() removal in newer kernels 85/22385/3
James Simmons [Thu, 8 Sep 2016 16:39:14 +0000 (12:39 -0400)]
LU-8560 libcfs: handle page_cache_*() removal in newer kernels

Since page cache pages were never handled differently from
normal pages, the page_cache_*() macros were removed
starting with Linux kernel 4.6. put_page() and get_page()
must now be used instead.

The second change is that get_user_pages() dropped its first two
arguments in Linux kernel 4.6. We handle this change
as well in libcfs.

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Change-Id: I84c347a55c45e0794b913134f1abdd45926c24e8
Reviewed-on: http://review.whamcloud.com/22385
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
7 years ago LU-8284 osd-ldiskfs: need lock around i_size update 03/22103/3
Bobi Jam [Wed, 24 Aug 2016 15:33:07 +0000 (09:33 -0600)]
LU-8284 osd-ldiskfs: need lock around i_size update

In the OSD layer, i_size_write() needs to be protected by a lock.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ia893679129cb1335cdf612ec7f38492435d19db4
Reviewed-on: http://review.whamcloud.com/22103
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-6808 ptlrpc: properly set "rq_xid" for 4MB IO 73/22373/5
Fan Yong [Mon, 4 Jul 2016 20:41:59 +0000 (04:41 +0800)]
LU-6808 ptlrpc: properly set "rq_xid" for 4MB IO

Commit d099fdd6 replaced "rq_xid" with "rq_mbits" as the match
bits for bulk data transfers. To stay interoperable with old
servers, it introduced a new connection flag,
OBD_CONNECT_BULK_MBITS. If the server does not support this
feature, "rq_xid" is set to the same value as "rq_mbits".
Unfortunately, that commit did not handle multiple bulk
operations, for example 4MB IO. If a new client issues a 4MB IO
to an old server, it may send too small an "rq_xid", so the old
server treats the request as a 1MB or 2MB IO. The data transfer
then never completes because only part of the data is
transferred, and the client times out and retries again and
again.
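
The arithmetic behind this failure can be sketched as follows. This is
a simplified model, not the ptlrpc code: it assumes that xids are
allocated in aligned steps of 16, that each bulk covers 1MB, and that
an old server derives the number of bulks from the low bits of
"rq_xid". All function names and constants are illustrative
assumptions.

```python
# Simplified model of multi-bulk match bits. The constants and function
# names are illustrative assumptions, not taken from the Lustre source.
BULK_OPS_COUNT = 16                 # assumed maximum number of bulks per RPC
BULK_OPS_MASK = BULK_OPS_COUNT - 1
MB = 1 << 20                        # assumed size of one bulk transfer


def last_mbits(base_xid, io_bytes):
    """Return the match bits of the final bulk for an I/O of io_bytes."""
    assert base_xid & BULK_OPS_MASK == 0, "base xid must be aligned"
    assert io_bytes > 0
    nbulks = (io_bytes + MB - 1) // MB
    assert nbulks <= BULK_OPS_COUNT
    return base_xid + nbulks - 1


def bulks_seen_by_old_server(rq_xid):
    """Model an old server inferring the bulk count from rq_xid."""
    return (rq_xid & BULK_OPS_MASK) + 1


def xid_for_old_server(base_xid, io_bytes):
    """Value to send as rq_xid when the server lacks
    OBD_CONNECT_BULK_MBITS: the match bits of the final bulk."""
    return last_mbits(base_xid, io_bytes)
```

In this model, a 4MB IO with base xid 0x1000 must be sent with
rq_xid 0x1003; sending 0x1000 would make the old server expect a
single 1MB bulk.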

Test-Parameters: alwaysuploadlogs testlist=sanity envdefinitions=ONLY=224c ossjob=lustre-b2_7_fe mdsjob=lustre-b2_7_fe ossbuildno=95 mdsbuildno=95 mdsdistro=el6.7 ossdistro=el6.7
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I9b1c0de13674f16443bef2b454c491e6c72b8ab3
Reviewed-on: http://review.whamcloud.com/22373
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-7809 lod: stop recovery before destroy dtrq list 51/18651/8
Di Wang [Thu, 16 Jun 2016 23:23:21 +0000 (19:23 -0400)]
LU-7809 lod: stop recovery before destroy dtrq list

Stop the recovery thread before destroying the
update recovery list; otherwise a race may occur,
especially when doing umount during recovery.

Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: I96fd2dd09caadb458723001f535d53f1d468394b
Reviewed-on: http://review.whamcloud.com/18651
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8600 tests: ignore error if running in vm for sanity 399 27/22427/2
Jinshan Xiong [Sat, 10 Sep 2016 01:50:37 +0000 (18:50 -0700)]
LU-8600 tests: ignore error if running in vm for sanity 399

Performance in a VM is not reliable. Define a new function
error_not_in_vm() to handle this common case.

Test-Parameters: trivial
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I792ec7531564cbc2d80504e77fb3273b79c7ab96
Reviewed-on: http://review.whamcloud.com/22427
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
7 years ago LU-6284 ptlrpc: comment for FLD_QUERY RPC reply swab 09/22309/3
Fan Yong [Sat, 2 Jul 2016 06:30:08 +0000 (14:30 +0800)]
LU-6284 ptlrpc: comment for FLD_QUERY RPC reply swab

The 'fld_read_server' uses 'RMF_GENERIC_DATA' to hold the 'FLD_QUERY'
RPC reply, which is composed of 'struct lu_seq_range_array'. But no
swabber function is registered for 'RMF_GENERIC_DATA', so the RPC
peers need to handle the RPC reply in a fixed little-endian format.

In theory, we could define a new structure with a registered swabber
to handle the 'FLD_QUERY' RPC reply automatically. In practice, that
is not easy to do within the current 'struct req_msg_field'
framework, because the sequence range array in the RPC reply is not
of fixed length. Its length depends on the 'lu_seq_range' count,
which is unknown when the RPC buffer is prepared. Generally, for
such variable-length RPC usage, a field in the RPC layout indicates
the data length. But for the 'FLD_READ' RPC, we have no way to do
that unless we add a new length field, which would break the
on-wire RPC protocol and cause interoperability trouble with old
peers.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I466a7e229e4ecbb062e6d0f8eea3c6f053ef5e75
Reviewed-on: http://review.whamcloud.com/22309
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
7 years ago LU-7898 osd: remove unnecessary declarations 96/22296/2
Alex Zhuravlev [Wed, 23 Mar 2016 18:42:54 +0000 (21:42 +0300)]
LU-7898 osd: remove unnecessary declarations

Refactor the code a bit to remove unnecessary declarations
(which are very expensive in ZFS). The patch also makes
initial preparations to support large dnodes: it tracks
all EAs declared at object creation, and the tracked number can
be used to request a dnode of the appropriate size.

With this patch + LU-7918 disk/memory space reserved for a
single-stripe creation goes down from ~33MB to 4.6MB.

Performance improvements from this patch are also significant.
Running mdtest create performance on a test node (ramdisk):

    Threads    0.6.5   0.6.5+patch
        1       9933       14279
        2      12870       20469
        4      16405       26407
        8      19320       28254
       16      15648       26620
       32      14107       26483
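
The relative gain per thread count can be worked out from the table
above. The sketch below only restates that arithmetic; the table does
not give units, so the values are treated as plain rates, and the
names RATES and speedup are illustrative.

```python
# mdtest create rates copied from the table above:
# threads -> (0.6.5, 0.6.5+patch). Units are not stated in the source.
RATES = {
    1: (9933, 14279),
    2: (12870, 20469),
    4: (16405, 26407),
    8: (19320, 28254),
    16: (15648, 26620),
    32: (14107, 26483),
}


def speedup(threads):
    """Ratio of the patched rate to the baseline rate."""
    base, patched = RATES[threads]
    return patched / base


if __name__ == "__main__":
    for threads in sorted(RATES):
        print("%2d threads: x%.2f" % (threads, speedup(threads)))
```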

Change-Id: I2c25542e51a320b1b48b4782b5f0b43799de5fe9
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/19101
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/22296

7 years ago LU-8524 tests: Awk command re-structured to pass correct value 70/22070/3
Saurabh Tandan [Tue, 23 Aug 2016 01:12:21 +0000 (18:12 -0700)]
LU-8524 tests: Awk command re-structured to pass correct value

The variable "selinux_policy" was assigned an incorrect value
because a couple of commands were structured incorrectly.
Both commands have been restructured so that the correct value
is passed to "selinux_policy".
The test was run successfully with the modified code,
and the results can be found in the comments section of the
ticket.

Test-Parameters: trivial testlist=sanity-selinux,sanity-selinux,sanity-selinux
Signed-off-by: Saurabh Tandan <saurabh.tandan@intel.com>
Change-Id: I3c6c86d607edeadd03ab694435fb201c08c23654
Reviewed-on: http://review.whamcloud.com/22070
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8165 target: detect race by checking last_rcvd slot index 28/20328/2
Bruno Faccini [Thu, 19 May 2016 15:04:39 +0000 (17:04 +0200)]
LU-8165 target: detect race by checking last_rcvd slot index

A race can occur on the server between a client connection and a
concurrent eviction, when the client's last_rcvd slot index has
not yet been assigned (-1).
This patch adds a check to handle that condition.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ifead82719a0dc9411f1b79d6c8c59eb9ef339fa5
Reviewed-on: http://review.whamcloud.com/20328
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Grégoire Pichon <gregoire.pichon@bull.net>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8081 osd-ldiskfs: improve transaction debug message 65/19865/5
Andreas Dilger [Fri, 1 Jul 2016 23:53:42 +0000 (07:53 +0800)]
LU-8081 osd-ldiskfs: improve transaction debug message

Print the actual credit limits that were exceeded when complaining
on the console about problems with transaction credit accounting.

Ensure all transaction credit messages include the device name.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I17125bb39ecaf699a722ac77bf29060cde3ebbe5
Reviewed-on: http://review.whamcloud.com/19865
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-7650 o2iblnd: Put back work queue check previously removed 81/22281/3
Doug Oucharek [Fri, 2 Sep 2016 01:15:09 +0000 (18:15 -0700)]
LU-7650 o2iblnd: Put back work queue check previously removed

The previous patch, http://review.whamcloud.com/21304/, removed
a check needed until LU-5718 is properly addressed.  With
the check, LU-5718 results in an error message and a lost
RDMA operation.  Without it, we have memory corruption and
a crash (much harder to debug).

Putting the check back in case LU-5718 is not fixed soon.

Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Change-Id: I2efcc4e60a80794b38174da707d2a7fc27f81b6a
Reviewed-on: http://review.whamcloud.com/22281
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8410 ldiskfs: new FIEMAP API 03/21603/7
Alexey Lyashkov [Fri, 29 Jul 2016 16:12:57 +0000 (19:12 +0300)]
LU-8410 ldiskfs: new FIEMAP API

With RH 6.5 the old API was deprecated and then removed.
Backport the new API from upstream ext4 instead of copy-pasting
the older, buggy code that the current FIEMAP code uses.

Kernel upstream commit is 91dd8c114499e9818f2d5919ef0b9eee61810220
ext4: prevent race while walking extent tree for fiemap.

Signed-off-by: Alexey Lyashkov <alexey.lyashkov@seagate.com>
Change-Id: I7790c9e1a9cbfbd2cc429292aa764250e0525e21
Reviewed-on: http://review.whamcloud.com/21603
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8348 recovery: don't send last_committed after panic 60/21060/2
Alexey Lyashkov [Wed, 29 Jun 2016 11:36:31 +0000 (14:36 +0300)]
LU-8348 recovery: don't send last_committed after panic

Do not update last_committed if we are not sure the
commit was successful.

Signed-off-by: Alexey Lyashkov <alexey.lyashkov@seagate.com>
Seagate-bug-id: MRP-3013
Change-Id: I176b86a01cac46bd7d6af85843135a57a3df0e87
Reviewed-on: http://review.whamcloud.com/21060
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-7782 scrub: handle slave obj of striped directory 06/21506/5
Fan Yong [Fri, 17 Jun 2016 22:50:02 +0000 (06:50 +0800)]
LU-7782 scrub: handle slave obj of striped directory

When looking up an item under a striped directory, we first need
to locate the master MDT-object of the striped directory. The
client then sends a lookup (getattr_by_name) RPC to the MDT with
some slave MDT-object's FID and the item's name. If the system
was restored from an MDT file-level backup, then until the OI
scrub has completely rebuilt the OI files, the OI mappings of the
master MDT-object and slave MDT-object may be invalid. Usually,
this is not a problem for the master MDT-object: locating it
involves a name-based lookup (for the striped directory itself)
first, and during that process we can set up the correct OI
mapping for the master MDT-object. But it is a problem for the
slave MDT-object, because the client does not trigger a
name-based lookup on the MDT to locate the slave MDT-object
before locating an item under the striped directory.
osd_fid_lookup() then finds that the OI mapping for the slave
MDT-object is invalid and does not know what the right OI mapping
is, so the MDT has to return -EINPROGRESS to tell the client
that the OI scrub is rebuilding the OI file and the related OI
mapping is not yet known. The client then retries the RPC again
and again until the related OI mapping has been updated. That is
quite inefficient.

To resolve this, we handle the following two cases:

1) The slave MDT-object and the master MDT-object are on
   different MDTs. This case is relatively easy. As a remote
   MDT-object, the slave MDT-object is linked under
   /REMOTE_PARENT_DIR with its FID string as the name. We can
   locate the slave MDT-object by looking it up in
   /REMOTE_PARENT_DIR directly.

2) The slave MDT-object and the master MDT-object reside on the
   same MDT. In this case, while looking up the master
   MDT-object, we look up the slave MDT-object by reading the
   master MDT-object's directory entries (readdir), because the
   slave MDT-objects are stored as sub-directories named
   "${FID}:${index}". When the local slave MDT-object is found,
   its OI mapping is recorded, so that a subsequent
   osd_fid_lookup() knows the correct OI mapping for the slave
   MDT-object.
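
The two cases above can be sketched as a small decision function.
This is an illustrative model only: slave_lookup_plan() and its
arguments are hypothetical, the FID strings are placeholders, and the
real logic lives in the OSD code rather than in anything like this.

```python
# Illustrative model only: the helper and its arguments are hypothetical.
REMOTE_PARENT_DIR = "/REMOTE_PARENT_DIR"
MASTER_OBJECT = "<master MDT-object>"   # placeholder for the master directory


def slave_lookup_plan(slave_fid, stripe_index, slave_mdt, master_mdt):
    """Return (directory to search, entry name) for finding a slave
    MDT-object by name rather than through its OI mapping."""
    if slave_mdt != master_mdt:
        # Case 1: a remote object is linked under /REMOTE_PARENT_DIR
        # with its FID string as the name.
        return (REMOTE_PARENT_DIR, slave_fid)
    # Case 2: a local slave is a sub-directory of the master object
    # named "${FID}:${index}", found by readdir on the master.
    return (MASTER_OBJECT, "%s:%d" % (slave_fid, stripe_index))
```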

The patch also fixes a race between osd_fid_lookup() and the OI
scrub: the OI scrub thread removes the osd_inconsistent_item from
the global list before updating the related OI mapping, and if
someone calls osd_fid_lookup() for that OI mapping during this
interval, the lookup fails and wrongly triggers a full-mode OI
scrub.

The patch also enhances sanity-scrub to avoid running DNE tests
on a single MDT.

Test-Parameters: mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs mdscount=2 mdtcount=4 testlist=sanity-scrub,sanity-scrub,sanity-scrub
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I6b449ef86221410dfc16005a586ed140b9a48b38
Reviewed-on: http://review.whamcloud.com/21506
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-7433 ldlm: xattr locks are lost on mdt 20/17220/10
Vitaly Fertman [Thu, 28 Jul 2016 21:44:14 +0000 (00:44 +0300)]
LU-7433 ldlm: xattr locks are lost on mdt

mdt_intent_getxattr() can return EFAULT if a buffer cannot be found.
The error is returned after lock_replace, where a new lock is
installed into lockp. The error forces ldlm_lock_enqueue() to destroy
the original lock, but ldlm_handle_enqueue0() drops the reference on
the new lock. The xattr client code assumed that an intent error is
returned under a lock, which is immediately cancelled. Check whether
a lock was obtained and cancel it properly in error cases. Note: we
should support both cases for interop needs, an intent error under a
lock and an intent error with a lock abort. Keep returning a lock
with an intent error for interop purposes for now, to be dropped
later when clients get old enough.
Make all intent ops work through md_intent_lock: getxattr
and layout, which should extract the intent error.

Signed-off-by: Vitaly Fertman <vitaly.fertman@seagate.com>
Change-Id: I7b628b50448c4bdb26a3a8758fc16a44212ad9ac
Seagate-bug-id: MRP-3072 MRP-3137
Reviewed-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Tested-by: Elena V. Gryaznova <elena.gryaznova@seagate.com>
Reviewed-on: http://review.whamcloud.com/17220
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8565 test: change sanity 255a to not fail for performance when running in VM 75/22375/3
Gu Zheng [Thu, 8 Sep 2016 02:57:25 +0000 (10:57 +0800)]
LU-8565 test: change sanity 255a to not fail for performance when running in VM

We may run our tests in VMs alongside other parallel workloads,
and our VMs are also short on memory. The completion time of an I/O
task is therefore unreliable and depends on the workload of the host
machine while the task is running.
So, as Andreas suggested, change sanity 255a to not fail even if
the performance isn't as expected when running in a VM, as was done
for sanity 248.

Test-Parameters: trivial

Change-Id: If2a76c64f053dc6c7dc8acf4afd5a68ea3a757b6
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Reviewed-on: http://review.whamcloud.com/22375
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
7 years ago LU-8518 kernel: kernel update [SLES12 SP1 3.12.62-60.62] 45/22045/3
Bob Glossman [Fri, 19 Aug 2016 15:45:55 +0000 (08:45 -0700)]
LU-8518 kernel: kernel update [SLES12 SP1 3.12.62-60.62]

Update target and kernel_config files for new version

Test-Parameters: clientdistro=sles12 testgroup=review-ldiskfs \
  mdsdistro=sles12 ossdistro=sles12 mdsfilesystemtype=ldiskfs \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I3f15d7910e4d356ee696696c3c9af9d9b9d589f2
Reviewed-on: http://review.whamcloud.com/22045
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8513 kernel: kernel update RHEL7.2 [3.10.0-327.28.3.el7] 49/22049/2
Bob Glossman [Thu, 18 Aug 2016 16:32:27 +0000 (09:32 -0700)]
LU-8513 kernel: kernel update RHEL7.2 [3.10.0-327.28.3.el7]

Update RHEL7.2 kernel to 3.10.0-327.28.3.el7

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I39b888ff6bcb905dd5f5b58c3a014734e4144742
Reviewed-on: http://review.whamcloud.com/22049
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8510 dne: set osd_obj_ea_ops::dt_invalidate 17/22017/2
Bobi Jam [Fri, 19 Aug 2016 03:07:45 +0000 (11:07 +0800)]
LU-8510 dne: set osd_obj_ea_ops::dt_invalidate

git commit 226fd401f9d8bfcd1a71bf264d9baef1e0842441 omitted setting
the dt_invalidate operation for osd_obj_ea_ops.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I33ae8b7239e056b3fb6981c9bc2dc0ec3c530e15
Reviewed-on: http://review.whamcloud.com/22017
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-522 lod: do not ignore degraded flag of ost. 47/20747/5
Jadhav Vikram [Wed, 2 Mar 2016 01:48:20 +0000 (07:18 +0530)]
LU-522 lod: do not ignore degraded flag of ost.

The QoS allocation algorithm ignores the degraded flag of OSTs.
Add a check for the degraded OST flag in lod_alloc_qos().

Seagate-bug-id: MRP-2820

Signed-off-by: Jadhav Vikram <jadhav.vikram@seagate.com>
Change-Id: Ib2390518afff7b9bd459ce64bf609af99071e46d
Reviewed-on: http://es-gerrit.xyus.xyratex.com:8080/9966
Tested-by: Jenkins
Reviewed-by: Ujjwal Lanjewar <ujjwal.lanjewar@seagate.com>
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@seagate.com>
Tested-by: Parinay Vijayprakash Kondekar <parinay.kondekar@seagate.com>
Reviewed-on: http://review.whamcloud.com/20747
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-7655 tests: ost fake write for performance testing 64/5164/22
Jinshan Xiong [Mon, 29 Aug 2016 05:45:33 +0000 (22:45 -0700)]
LU-7655 tests: ost fake write for performance testing

Just drop the pages in ofd_commitrw_write(), but we need to maintain
correct file size and always create a transaction so client can pin
those pages in memory until transaction commits.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Ia9a2af0a159c8969479656d3a7016db3cda71a91
Reviewed-on: http://review.whamcloud.com/5164
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8560 llite: handle is_compat_task() rename 08/22208/3
James Simmons [Mon, 29 Aug 2016 23:19:48 +0000 (19:19 -0400)]
LU-8560 llite: handle is_compat_task() rename

The linux kernel 4.6 renamed is_compat_task() to
in_compat_syscall().

Change-Id: I2d3733a1ec03873d000b9f25aa8a98c3b02be410
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/22208
Reviewed-by: Frank Zago <fzago@cray.com>
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8560 libcfs: handle stacktrace function address() change 07/22207/2
James Simmons [Sun, 28 Aug 2016 20:16:56 +0000 (16:16 -0400)]
LU-8560 libcfs: handle stacktrace function address() change

Starting in Linux kernel 4.6, the address() function
from struct stacktrace returns an int. Update
Lustre to handle this change.

Change-Id: I7d14c9134de3ae5642e2cad7d1d3829eb4ee9c50
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/22207
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8560 libcfs: handle PAGE_CACHE_* removal in newer kernels 06/22206/4
James Simmons [Sun, 28 Aug 2016 23:52:15 +0000 (19:52 -0400)]
LU-8560 libcfs: handle PAGE_CACHE_* removal in newer kernels

Starting with linux kernel 4.6 all the PAGE_CACHE_* defines
have been removed. Now it is required to use PAGE_* instead.
This is a simple blanket change since PAGE_CACHE_* was always
the same as PAGE_*.

Change-Id: I3ba8954d44969e2473afa939bbb8b8b5b1345446
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/22206
Tested-by: Maloo <hpdd-maloo@intel.com>
Tested-by: Jenkins
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8560 libcfs: add autoconf test for crypto changes 05/22205/3
James Simmons [Sun, 28 Aug 2016 18:59:46 +0000 (14:59 -0400)]
LU-8560 libcfs: add autoconf test for crypto changes

For linux 4.5 kernels the simple ifdef test in
linux-crypto.c worked but with linux 4.6+ kernels
we need to add a proper crypto api test for the
new inline functions crypto_ahash_alg_name() and
crypto_ahash_driver_name().

Test-Parameters: trivial

Change-Id: Ic18808b622d374cf6dc2417220ed83adc43ea692
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/22205
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8560 lustre: remove unused crypto handlers in lustre_compat.h 04/22204/4
James Simmons [Sun, 28 Aug 2016 19:58:17 +0000 (15:58 -0400)]
LU-8560 lustre: remove unused crypto handlers in lustre_compat.h

The unused crypto code in lustre_compat.h doesn't
build with Linux kernel version 4.6+. Since it's
not used, just delete it.

Test-Parameters: trivial

Change-Id: If7634428357837372f4756b0ace3af9c2cd77366
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/22204
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8407 recovery: more clear message about recovery failure 59/21759/4
Fan Yong [Fri, 24 Jun 2016 04:21:07 +0000 (12:21 +0800)]
LU-8407 recovery: more clear message about recovery failure

Currently, DNE recovery depends on the update logs on the MDTs.
If the update logs cannot be fetched from some MDT(s), the recovery
cannot proceed. Unlike a client-side recovery failure, a
cross-MDT recovery failure may cause namespace inconsistency.
Because we do not want to export an inconsistent namespace to
clients, we make the recovery wait (rather than abort on timeout)
until the related update logs are available.

So if some MDT is not up or not mounted, the recovery on the other
MDTs will hang. As time goes on, client (re)connections
trigger warning messages on the MDTs about the hung recovery.
But such messages do not clearly describe what happened.

This patch adds a callback interface to target_distribute_txn_data,
called 'tdtd_show_update_logs_retrievers'. It lets users
check which MDTs are still fetching update logs, so the admin
can examine those MDTs in detail when recovery runs into trouble.

This patch also introduces a new recovery status, "WAITING", for the
case where the update logs are not ready for some MDT(s). In that
case, the indices of the non-ready MDTs and the time waited are
shown.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: If5ed4487fe1e6d94f02479d83f6a187d6427b3a7
Reviewed-on: http://review.whamcloud.com/21759
Tested-by: Jenkins
Reviewed-by: wangdi <di.wang@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8361 lfsck: detect Lustre device automatically 96/21596/4
Fan Yong [Tue, 21 Jun 2016 23:52:26 +0000 (07:52 +0800)]
LU-8361 lfsck: detect Lustre device automatically

Originally, when starting, stopping, or querying LFSCK, the user
had to specify the Lustre device explicitly via the "-M" option.
Even if there was only a single Lustre device on the current server,
or the user wanted to start LFSCK on all devices with the
"-A" option, the "-M" option was still required.
That requirement is inconvenient. This patch enhances the
LFSCK user interface to allow the user to run LFSCK
commands without "-M". Instead, the available Lustre device on
the current server is selected automatically.
The "-M" option is still required in the following cases:
if there are multiple devices on the current server
that belong to different Lustre filesystems, or if the "-A"
option is not specified and there are multiple devices on the
current server.
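
The selection rules described above can be modelled roughly as
follows. This is a sketch under stated assumptions, not the lctl
implementation: pick_lfsck_device(), its arguments, and the choice of
the first device when "-A" is given are all hypothetical.

```python
def pick_lfsck_device(devices, all_devices=False, explicit=None):
    """Model of the device selection rules (hypothetical helper).

    devices:     list of (device_name, fsname) found on this server
    all_devices: True when "-A" was given
    explicit:    device name passed via "-M", or None
    Raises ValueError when "-M" would still be required.
    """
    if explicit is not None:
        return explicit
    if not devices:
        raise ValueError("no Lustre device found on this server")
    fsnames = {fsname for _, fsname in devices}
    if len(fsnames) > 1:
        raise ValueError("-M required: devices belong to different filesystems")
    if len(devices) > 1 and not all_devices:
        raise ValueError("-M required: multiple devices and -A not specified")
    # Assumption: any device of the single filesystem may serve as the
    # entry point; this model simply takes the first one.
    return devices[0][0]
```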

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I291b958440b2409c93cdc8ef3a5e3fbe14885141
Reviewed-on: http://review.whamcloud.com/21596
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-1482 mdd: Setting xattr are properly checked with and without ACLs 96/21496/4
Dmitry Eremin [Mon, 25 Jul 2016 14:04:12 +0000 (17:04 +0300)]
LU-1482 mdd: Setting xattr are properly checked with and without ACLs

Permissions for setting extended attributes are now properly checked
with and without ACLs. In the user.* namespace, only regular files
and directories can have extended attributes. For sticky directories,
only the owner and privileged users can write attributes.
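
The user.* rules stated above can be sketched as a small predicate.
This is a simplified model and not the mdd code: the function
may_set_user_xattr() and its parameters are hypothetical, and the
ordinary write-permission and ACL checks are deliberately left out.

```python
import stat


def may_set_user_xattr(mode, owner_uid, caller_uid, privileged=False):
    """Model of the user.* namespace rules only (hypothetical helper).

    Regular write permission and ACL checks are assumed to happen
    elsewhere and are not modelled here.
    """
    # Only regular files and directories can carry user.* attributes.
    if not (stat.S_ISREG(mode) or stat.S_ISDIR(mode)):
        return False
    # On a sticky directory, only the owner or a privileged user may write.
    if stat.S_ISDIR(mode) and (mode & stat.S_ISVTX):
        return privileged or caller_uid == owner_uid
    return True
```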

Intel-bug-id: LDEV-40
Intel-change: http://review.whamcloud.com/15848

Change-Id: Ibd79dcc15e61839d878f4847f7836f29d823be61
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: http://review.whamcloud.com/21496
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8498 nodemap: new zfs index files not properly initialized 39/21939/8
Kit Westneat [Tue, 16 Aug 2016 03:50:07 +0000 (23:50 -0400)]
LU-8498 nodemap: new zfs index files not properly initialized

Calling index ->next on a new ZFS index file returns a non-zero RC,
but ldiskfs indexes start with a blank record. This change modifies
the config load code to always write the default nodemap to an empty
index file.

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: I30a365f65463979889f09f7ad5ffcdacc83fa868
Reviewed-on: http://review.whamcloud.com/21939
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-8333 test: make sure COS is cleared 24/21924/3
Hongchao Zhang [Mon, 27 Jun 2016 12:24:56 +0000 (20:24 +0800)]
LU-8333 test: make sure COS is cleared

In subtest 21b of replay-dual, COS could be set after the MDT
has failed over, and the test fails in that case.

Change-Id: I9401b905593c76f8fddfab19ab9eb6c0fe886e41
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: http://review.whamcloud.com/21924
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-7903 mdt: dump exports information on console 99/21599/6
Niu Yawei [Fri, 29 Jul 2016 06:54:16 +0000 (02:54 -0400)]
LU-7903 mdt: dump exports information on console

To avoid being truncated in debug log, obd_exports_barrier() should
dump the exports information on console along with the "Is it stuck?"
warning message.

Test-Parameters: testlist=recovery-small,recovery-small,recovery-small
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I9dbaa7ed1d590db89ad6f42b66ec883dfb8b7ce1
Reviewed-on: http://review.whamcloud.com/21599
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-3815 tests: sanity-hsm - Remove tests from Always_Except 79/20079/5
Saurabh Tandan [Tue, 10 May 2016 00:02:53 +0000 (17:02 -0700)]
LU-3815 tests: sanity-hsm - Remove tests from Always_Except

Removing tests 34/35/36 from the ALWAYS_EXCEPT list

Test-Parameters: trivial \
testlist=sanity-hsm,sanity-hsm,sanity-hsm,sanity-hsm

Signed-off-by: Saurabh Tandan <saurabh.tandan@intel.com>
Change-Id: I293b45ab0f8ff27c4f35500ffa30ba348489e788
Reviewed-on: http://review.whamcloud.com/20079
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoRevert "LU-7898 osd: remove unnecessary declarations" 93/22293/2
Oleg Drokin [Fri, 2 Sep 2016 16:38:18 +0000 (16:38 +0000)]
Revert "LU-7898 osd: remove unnecessary declarations"

This patch causes build failures in master due to
reverted LU-7899 6cd79ab5860c5 patch that I failed
to catch in time due to deficiency in my build process.

This cannot be easily fixed since apparently a big
chunk of functionality was yanked from under this patch,
so I can only revert it for now.

This reverts commit ead6df2feee9c143b617cb60e50e403c955bd401.

Change-Id: I5ee89bf0c9260312f157c251b83dd417fa2cf260
Reviewed-on: http://review.whamcloud.com/22293
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-8175 ldlm: conflicting PW & PR extent locks on a client 45/20345/5
Andriy Skulysh [Thu, 14 Jul 2016 10:43:31 +0000 (13:43 +0300)]
LU-8175 ldlm: conflicting PW & PR extent locks on a client

A PW lock isn't replayed once it is marked
LDLM_FL_CANCELING, and a glimpse lock doesn't wait for
conflicting locks on the client. So the server will
grant a PR lock in response to the glimpse lock request,
which conflicts with the PW lock in LDLM_FL_CANCELING
state on the client.

A lock in LDLM_FL_CANCELING state may still have pending IO,
so it should be replayed until LDLM_FL_BL_DONE is set, to
prevent the server from granting a conflicting lock.
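
The change in the replay condition can be modelled with a minimal
Python sketch. The flag class and both helper functions are
hypothetical stand-ins that only mirror the condition described
above; the bit values are not the real LDLM_FL_* values, and the real
replay path checks more than this.

```python
from enum import Flag, auto


class LockFlag(Flag):
    # Illustrative bits only; not the real LDLM_FL_* values.
    CANCELING = auto()
    BL_DONE = auto()


def replayed_before(flags):
    """Old behaviour as described: a lock marked CANCELING is not replayed."""
    return not (flags & LockFlag.CANCELING)


def replayed_after(flags):
    """New behaviour as described: replay until BL_DONE is set."""
    return not (flags & LockFlag.BL_DONE)
```

A lock that is CANCELING but not yet BL_DONE (possibly with pending
IO) is skipped by the first predicate and replayed by the second.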

Change-Id: I99a1d81a8932ac7b7b3346558446f9d638156309
Seagate-bug-id: MRP-3311
Signed-off-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Reviewed-on: http://review.whamcloud.com/20345
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-8500 ldlm: fix export reference problem 31/22031/3
Hongchao Zhang [Wed, 24 Aug 2016 23:44:41 +0000 (19:44 -0400)]
LU-8500 ldlm: fix export reference problem

1. In client_import_del_conn, the export returned from
   class_conn2export is not released after use.

2. In ptlrpc_connect_interpret, the export is not released
   if the connect_flags aren't compatible.

Change-Id: Ie7ef9cb0de2fa1aba71d3981ce47ae87c75e82d8
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: http://review.whamcloud.com/22031
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-2547 test: re-enable 24a/b of recovery-small 20/22020/3
Niu Yawei [Fri, 19 Aug 2016 05:40:19 +0000 (01:40 -0400)]
LU-2547 test: re-enable 24a/b of recovery-small

Re-enable test_24a/b of recovery-small.

Test-Parameters: trivial testlist=recovery-small,recovery-small,recovery-small
Test-Parameters: testlist=recovery-small,recovery-small,recovery-small

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Ie3d111e36a5a3792b3c3b5a7bd7f6b9979a321d5
Reviewed-on: http://review.whamcloud.com/22020
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-8349 ldlm: ASSERTION(flock->blocking_export!=0) failed 61/21061/4
Andriy Skulysh [Wed, 29 Jun 2016 12:04:14 +0000 (15:04 +0300)]
LU-8349 ldlm: ASSERTION(flock->blocking_export!=0) failed

Hash lock protects only during .hs_put_locked.
Switch to atomic blocking_refs.

The whole policy structure was zeroed twice: once during
enqueue and a second time during resend or replay.

The policy structure should be initialized with default values
only in ldlm_lock_new().

Change-Id: Ib916f64cd03cfe812c86463b4354bf5a9bbcdd56
Seagate-bug-id: MRP-2536, MRP-2909
Signed-off-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Reviewed-by: Alexander Boyko <alexander.boyko@seagate.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@seagate.com>
Signed-off-by: Ben Evans <bevans@cray.com>
Reviewed-on: http://review.whamcloud.com/21061
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-7898 osd: remove unnecessary declarations 01/19101/12
Alex Zhuravlev [Wed, 23 Mar 2016 18:42:54 +0000 (21:42 +0300)]
LU-7898 osd: remove unnecessary declarations

Refactor the code a bit to remove unnecessary declarations
(which are very expensive in ZFS). The patch also introduces
initial preparations to support large dnodes - it tracks
all declared EAs at object creation and tracked number can
be used to request dnode of appropriate size.

With this patch + LU-7918 disk/memory space reserved for a
single-stripe creation goes down from ~33MB to 4.6MB.

Performance improvements from this patch are also significant.
Running mdtest create performance on a test node (ramdisk):

    Threads    0.6.5   0.6.5+patch
        1       9933       14279
        2      12870       20469
        4      16405       26407
        8      19320       28254
       16      15648       26620
       32      14107       26483
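
The relative gain per thread count can be worked out from the table
above. The short Python snippet below only divides the two columns;
the numbers are copied from the table and nothing else is assumed.

```python
# threads -> (0.6.5, 0.6.5+patch), copied from the table above
results = {
    1: (9933, 14279),
    2: (12870, 20469),
    4: (16405, 26407),
    8: (19320, 28254),
    16: (15648, 26620),
    32: (14107, 26483),
}

# Ratio of patched to unpatched result for each thread count.
speedup = {t: patched / base for t, (base, patched) in results.items()}

for threads, ratio in speedup.items():
    print(f"{threads:>2} threads: {ratio:.2f}x")
```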

Change-Id: I0778ad8d13ba1f7a5fa5ad5d874fbb1bd7203958
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/19101
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-7044 test: Skip sanityn test_77e/77f/77g 54/19054/6
Wei Liu [Wed, 24 Aug 2016 02:59:18 +0000 (22:59 -0400)]
LU-7044 test: Skip sanityn test_77e/77f/77g

Skip sanityn test_77e/77f/77g if server is older than 2.7.58

Test-Parameters: trivial testlist=sanityn

Change-Id: Ic2d93d74027d66f4471a4916cf35c830fd4225bb
Signed-off-by: Wei Liu <wei3.liu@intel.com>
Reviewed-on: http://review.whamcloud.com/19054
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-7813 tests: clean up ost-pools.sh 89/18889/9
James Nunez [Thu, 23 Jun 2016 22:51:56 +0000 (16:51 -0600)]
LU-7813 tests: clean up ost-pools.sh

Clean up the tests in ost-pools.sh to drop archaic use of
"lfs getstripe -v" that parses the output text in favour of
using options for "lfs getstripe -c" for OST count.

Add the check for newly-created dir/file being in the pool
into create_dir() and create_file().

Test-Parameters: trivial testlist=ost-pools

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ib2df663a62f89df48a70d07702b41f05f0194ef9
Reviewed-on: http://review.whamcloud.com/18889
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-7593 target: umount vs tgt_last_rcvd_update deadlock 04/17704/11
Andriy Skulysh [Tue, 12 Jul 2016 15:13:48 +0000 (18:13 +0300)]
LU-7593 target: umount vs tgt_last_rcvd_update deadlock

tgt_client_del() and
ofd_commitrw_write->tgt_last_rcvd_update
take transaction and ted->ted_lcd_lock
in different order:

thread1:
    osd_trans_start
    tgt_client_data_update
    tgt_client_del       <<< mutex_lock(&ted->ted_lcd_lock);
    ofd_obd_disconnect
    class_disconnect_export_list
    class_disconnect_exports
    class_cleanup
    ...
    sys_umount

thread2:
    __mutex_lock_slowpath
    mutex_lock          <<< mutex_lock(&ted->ted_lcd_lock);
    tgt_last_rcvd_update
    tgt_txn_stop_cb
    dt_txn_hook_stop
    osd_trans_stop
    ofd_trans_stop
    ofd_commitrw_write
    ...
    tgt_brw_write

Lock only around tgt_client_data_write() inside
the tgt_client_data_update()
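
The ordering problem can be illustrated outside the kernel with a
minimal Python sketch. OrderChecker and the three path functions are
hypothetical; they only record the order in which the two resources
("txn" for the transaction, "lcd_lock" for ted->ted_lcd_lock) are
taken in the traces above and report whether two paths disagree.

```python
class OrderChecker:
    """Records which resources were held when another one was acquired."""

    def __init__(self):
        self.held = []      # resources currently held, in acquisition order
        self.edges = set()  # (held_first, acquired_later) pairs seen so far

    def acquire(self, name):
        self.edges.update((h, name) for h in self.held)
        self.held.append(name)

    def release(self, name):
        self.held.remove(name)

    def has_inversion(self):
        return any((b, a) in self.edges for (a, b) in self.edges)


def write_path(chk):
    # thread2: the transaction is open, then the lock is taken at stop time
    chk.acquire("txn")
    chk.acquire("lcd_lock")
    chk.release("lcd_lock")
    chk.release("txn")


def umount_path_old(chk):
    # thread1 before the patch: lock first, then the transaction starts
    chk.acquire("lcd_lock")
    chk.acquire("txn")
    chk.release("txn")
    chk.release("lcd_lock")


def umount_path_new(chk):
    # thread1 after the patch: lock held only around the data write
    chk.acquire("txn")
    chk.acquire("lcd_lock")
    chk.release("lcd_lock")
    chk.release("txn")


def inversion(*paths):
    chk = OrderChecker()
    for path in paths:
        path(chk)
    return chk.has_inversion()
```

With these definitions inversion(write_path, umount_path_old) is True
and inversion(write_path, umount_path_new) is False, which is the
effect of narrowing the locked region.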

Change-Id: Id3f60636be2abb3b70a99ee44b735aab7dfb7657
Seagate-bug-id: MRP-3109
Signed-off-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Reviewed-on: http://review.whamcloud.com/17704
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-7149 tests: restore writethrough_cache_enable 24/16424/8
Artem Blagodarenko [Tue, 15 Sep 2015 07:55:58 +0000 (10:55 +0300)]
LU-7149 tests: restore writethrough_cache_enable

sanity.sh test_224c fails as expected when executed separately
but passes when run by the automated test system. Tests 155d, 155f,
155h and 156 do "set_cache writethrough off" and don't restore the
state. This makes subsequent tests work incorrectly.

This patch restores writethrough_cache_enable in each of the tests
above.
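
The save-and-restore pattern can be sketched in Python as follows.
The settings dictionary and the temporarily() helper are hypothetical
stand-ins for the tunable and the test helpers; the actual tests are
shell functions.

```python
from contextlib import contextmanager


@contextmanager
def temporarily(settings, key, value):
    """Set settings[key] to value and restore the old value on exit."""
    saved = settings[key]
    settings[key] = value
    try:
        yield
    finally:
        # Runs even if the test body raises, so later tests see the old state.
        settings[key] = saved


# Hypothetical in-memory stand-in for the server tunable.
settings = {"writethrough_cache_enable": 1}

with temporarily(settings, "writethrough_cache_enable", 0):
    inside = settings["writethrough_cache_enable"]
```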

Test-Parameters: trivial

Signed-off-by: Artem Blagodarenko <artem.blagodarenko@seagate.com>
Xyratex-bug-id: MRP-2590
Change-Id: I5f4f3f6c419a3aa415426607e776403da9822c2c
Reviewed-on: http://review.whamcloud.com/16424
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-6245 libcfs: move uid handling to linux directory 39/22139/2
James Simmons [Thu, 25 Aug 2016 20:16:02 +0000 (16:16 -0400)]
LU-6245 libcfs: move uid handling to linux directory

Simple patch to move the uid handling added to handle
older kernels to the linux directory. The linux
directory is where we handle APIs of newer kernels
with older distribution kernels.

Test-Parameters: trivial

Change-Id: Ie3676d33ce33ebc0f98ffa460cba37ab55928617
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/22139
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-8540 o2iblnd: Add support for 5arg ib_map_mr_sg() 26/22126/2
Christopher J. Morrone [Wed, 24 Aug 2016 23:35:44 +0000 (16:35 -0700)]
LU-8540 o2iblnd: Add support for 5arg ib_map_mr_sg()

Starting in kernel v4.7, ib_map_mr_sg() takes five arguments
rather than four.  It added an "sg_offset_p" offset pointer
argument.

RHEL7.3 also contains this change.

Change-Id: Ie63c992421bdf4ca195cf55152e6dfed9cf40e1d
Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-on: http://review.whamcloud.com/22126
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Li Dongyang <dongyang.li@anu.edu.au>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-8507 lnet: Enable setting per NI peer_credits 48/21948/3
Doug Oucharek [Mon, 29 Aug 2016 03:26:11 +0000 (23:26 -0400)]
LU-8507 lnet: Enable setting per NI peer_credits

The code to allow peer_credits to be set per NI was originally
"left inactive" because there were concerns about peer_credits
interfering with the ability for IB nodes to connect to each
other when peer_credits are not the same (peer_credits controls
the queue depth for IB). With LU-3322, the values do not have
to match so it is now safe to enable this code so peer_credits
can be set per NI.

This patch enables existing code for setting per NI peer_credits.

Second, this patch fixes a long-standing bug: the conf data
was not used to set variables in the lnet_ni structure until
after lnd_startup() was called, which meant LND drivers were
ignoring the struct lnet_ni tunable values being set. Now we change
struct lnet_ni data fields based on conf data before calling
lnd_startup().

Test-Parameters: trivial
Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Change-Id: I28ede7a139c43ca9a3d1b22255d3358694057918
Reviewed-on: http://review.whamcloud.com/21948
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-8501 lnet: Ensure routing is turned on first time 34/21934/2
Doug Oucharek [Mon, 15 Aug 2016 21:14:38 +0000 (14:14 -0700)]
LU-8501 lnet: Ensure routing is turned on first time

In lnet_rtrpools_enable(), a mistake was made and routing
was not being turned on when the rtrpools are being allocated
for the first time.

This patch fixes that routine so we remember to turn on
routing after allocating the rtrpools.

Test-Parameters: trivial
Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Change-Id: I8ef3e11bc8082cdce93e53d640f69e59ddbe9588
Reviewed-on: http://review.whamcloud.com/21934
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-7803 tests: Cleanup after sanity/78 08/21808/5
Nathaniel Clark [Mon, 8 Aug 2016 14:52:08 +0000 (10:52 -0400)]
LU-7803 tests: Cleanup after sanity/78

Remove large file created by sanity/78 regardless of failure.  If this
file is left after failure, it causes some cascading failures because
of limited space available.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ib359b9024360015ce92f209e5350f2d679071cb8
Reviewed-on: http://review.whamcloud.com/21808
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-8443 utils: exclude "resize" parameter with meta_bg option 45/21545/5
Artem Blagodarenko [Wed, 27 Jul 2016 15:05:58 +0000 (18:05 +0300)]
LU-8443 utils: exclude "resize" parameter with meta_bg option

Partitions with size > 256TB must use the meta_bg option. This option
is not compatible with the "resize_inode" option and the "resize"
extended option. For optimization reasons the "resize" option is
enabled by default. For filesystems with < 2^32 blocks this
optimization is useless.

This patch disables the resize option if meta_bg is enabled. A test
that formats a Lustre FS with "^resize_inode,meta_bg" options on the
OST is added.
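
The rule can be sketched in Python as a filter over the extended
options. The function name and the list-of-strings representation are
hypothetical and are not how the mkfs utilities store their options.

```python
def drop_resize_if_meta_bg(features, ext_opts):
    """Return ext_opts without any "resize" entry when meta_bg is requested.

    features: feature names as given to mkfs, e.g. ["^resize_inode", "meta_bg"]
    ext_opts: extended options, e.g. ["resize=4290772992", "lazy_itable_init"]
    """
    if "meta_bg" not in features:
        return list(ext_opts)
    return [
        opt for opt in ext_opts
        if opt != "resize" and not opt.startswith("resize=")
    ]
```

The option values in the docstring are made-up examples.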

Signed-off-by: Artem Blagodarenko <artem.blagodarenko@seagate.com>
Seagate-bug-id: MRP-3647
Change-Id: Ibea2d18f79498636a165a682cf6b6435f7cebfba
Reviewed-on: http://review.whamcloud.com/21545
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-8025 llite: make vvp_io_write_start lockless for newer kernels 40/19840/22
James Simmons [Wed, 24 Aug 2016 00:59:19 +0000 (20:59 -0400)]
LU-8025 llite: make vvp_io_write_start lockless for newer kernels

When support for newer kernels was backported from the
upstream kernel, it lacked the enhancements made in
newer versions of Lustre. This work makes the newer
kernel support lockless writes like the rest of the
Lustre llite code.

Change-Id: I6ea32dbb3097aea3e2031e1121e238e549bccc9b
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Ben Evans <bevans@cray.com>
Reviewed-on: http://review.whamcloud.com/19840
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-7927 llite: Deadlock between ll_setattr and write/ll_fsync 65/19165/9
Andriy Skulysh [Tue, 23 Aug 2016 21:07:37 +0000 (16:07 -0500)]
LU-7927 llite: Deadlock between ll_setattr and write/ll_fsync

The patch http://review.whamcloud.com/10013 (commit 85bd36cc695)
"LU-4840 lfs: Use file lease to implement migration" moves
lli_trunc_sem into vvp layer.  It violates lli_trunc_sem/i_mutex
locking order.  So i_mutex should be taken after lli_trunc_sem now.

Change-Id: I2ecd52b7ae6eca74c6db7d94b1de1333560bc45d
Seagate-bug-id: MRP-3372
Signed-off-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Reviewed-on: http://review.whamcloud.com/19165
Reviewed-by: Patrick Farrell <paf@cray.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ann Koehler <amk@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-6245 libcfs: cleanup list handling 00/15200/3
James Simmons [Wed, 17 Aug 2016 17:48:09 +0000 (13:48 -0400)]
LU-6245 libcfs: cleanup list handling

On the kernel space side we should use list.h directly,
except when kernel API changes impact us, in which case
we use linux-list.h, which handles those API changes.
A few of the userland utilities use a list implementation,
so we provide a separate list implementation for the
libcfs library.

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Change-Id: I1280d74a629dbaa9c11a3c506fd635fab99ce182
Reviewed-on: http://review.whamcloud.com/15200
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-8514 mdd: transaction failure should be checked 71/22071/4
Lai Siyao [Tue, 23 Aug 2016 05:30:59 +0000 (13:30 +0800)]
LU-8514 mdd: transaction failure should be checked

Transaction failure should not be silently ignored; otherwise the
MDT doesn't know whether the current operation has a transaction, and
therefore saves the lock upon transaction failure.

Add sanity.sh 407 for this.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: Ie133a77c7f1bf890319dbd3cc2b03412a23f5c82
Reviewed-on: http://review.whamcloud.com/22071
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-8408 mgc: handle config_llog_data::cld_refcount properly 16/21616/7
Fan Yong [Fri, 24 Jun 2016 04:04:01 +0000 (12:04 +0800)]
LU-8408 mgc: handle config_llog_data::cld_refcount properly

Originally, the logic for handling config_llog_data::cld_refcount
was somewhat confusing; it could cause the cld_refcount to be leaked
or trigger "LASSERT(atomic_read(&cld->cld_refcount) > 0);" when
putting the reference. This patch cleans up the related logic as
follows:

1) When the 'cld' is created, its reference count is set to 1.

2) No additional reference is needed when adding the 'cld' to the
   list 'config_llog_list'.

3) Increase 'cld_refcount' when setting lock data after mgc_enqueue()
   completes successfully in mgc_process_log().

4) When mgc_requeue_thread() traverses the 'config_llog_list',
   it needs to take an additional reference on each 'cld' to avoid
   it being freed during subsequent processing. The reference also
   prevents the 'cld' from being dropped from the 'config_llog_list',
   so mgc_requeue_thread() can safely locate the next 'cld'
   and then decrease the 'cld_refcount' of the previous one.

5) mgc_blocking_ast() will drop the 'cld_refcount' reference
   that is taken in mgc_process_log().

6) Other callers need to call config_log_find() to find the 'cld'
   if they want to access the related config log data. That increases
   the 'cld_refcount' so the 'cld' is not freed while in use. The
   caller needs to call config_log_put() after using the 'cld'.

7) Other confusing or redundant logic is dropped.

In addition, the patch strengthens the protection of the
'config_llog_data' flags, such as 'cld_stopping'/'cld_lostlock',
as follows.

a) Use 'config_list_lock' (spinlock) to handle possible
   parallel access to these flags by mgc_requeue_thread()
   and other config llog data visitors, such as mount/umount,
   blocking_ast, and so on.

b) Use 'config_llog_data::cld_lock' (mutex) to protect against other
   parallel access to these flags among blockable
   operations, such as mount, umount, and blocking ast.

The 'config_llog_data::cld_lock' is also used for protecting
the sub-cld members, such as 'cld_sptlrpc'/'cld_params', and
so on.
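
Rules 1, 2, 4 and 6 above can be sketched with a small Python model.
ConfigLogData and ConfigLogList are hypothetical classes written for
illustration; locking and the lock-data reference of rules 3 and 5
are left out, and this is not the MGC code.

```python
class ConfigLogData:
    def __init__(self, name):
        self.name = name
        self.refcount = 1      # rule 1: created with one reference
        self.freed = False

    def get(self):
        assert self.refcount > 0, "get on a freed cld"
        self.refcount += 1
        return self

    def put(self):
        assert self.refcount > 0, "unbalanced put"
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True


class ConfigLogList:
    def __init__(self):
        self.items = []

    def add(self, cld):
        self.items.append(cld)      # rule 2: no extra reference for the list

    def find(self, name):
        for cld in self.items:
            if cld.name == name:
                return cld.get()    # rule 6: caller must put() when done
        return None

    def requeue_pass(self, process):
        prev = None
        for cld in list(self.items):
            cld.get()               # rule 4: pin the current entry
            if prev is not None:
                prev.put()          # then release the previous one
            process(cld)
            prev = cld
        if prev is not None:
            prev.put()
```

Every get() in this model has a matching put(), so an entry returns
to a count of 1 after a lookup or a traversal and is freed only when
the creation reference is dropped.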

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I9fb6c3b7ae23dcea147aca7ffec240e0f33ef746
Reviewed-on: http://review.whamcloud.com/21616
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoNew tag 2.8.57 2.8.57 v2_8_57 v2_8_57_0
Oleg Drokin [Thu, 1 Sep 2016 17:31:56 +0000 (13:31 -0400)]
New tag 2.8.57

Change-Id: I00319d4310725e3ffce4bdad12ab532663b88c17
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-8523 test: sanity 311 is too strict 10/22210/3
Lai Siyao [Mon, 29 Aug 2016 04:08:55 +0000 (12:08 +0800)]
LU-8523 test: sanity 311 is too strict

sanity 311 unlinks 1000 files, but the number of objects actually
destroyed may be lower, because there is some delay between when the
files are unlinked and when the MDS destroys the objects on the OSTs.
Previously the test checked that at least 900 objects were destroyed,
but autotest found only 880 objects destroyed in some cases, so the
threshold is now reduced to 800.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: I88f45ae744475f2e2cdf8f82c1405164d6f4cd1c
Reviewed-on: http://review.whamcloud.com/22210
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
7 years agoLU-4865 zfs: grow block size by write pattern 41/18441/11
Jinshan Xiong [Fri, 12 Aug 2016 04:20:44 +0000 (21:20 -0700)]
LU-4865 zfs: grow block size by write pattern

This patch grows the block size based on write RPCs. The osd-zfs
blocksize used to be fixed at 128KB, which is too big for random
writes and too small for sequential writes.

This patch decides the block size from the first few RPCs. If the
first few RPCs are sequential, it will most likely pick the maximum
block size for the object; otherwise, a feasible block size will be
picked based on the RPC size.
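
The idea can be sketched in Python under several assumptions that are
not taken from the patch: writes are (offset, length) pairs,
"sequential" means each write starts where the previous one ended,
the non-sequential choice is the largest power of two not exceeding
the smallest RPC, and the 4KB and 1MB bounds are placeholders.

```python
def is_sequential(writes):
    """True if every write starts exactly where the previous one ended."""
    return all(
        nxt_off == off + length
        for (off, length), (nxt_off, _) in zip(writes, writes[1:])
    )


def pick_block_size(writes, min_bs=4096, max_bs=1 << 20):
    """Choose a block size from the first few (offset, length) writes."""
    if not writes:
        raise ValueError("need at least one write")
    if len(writes) > 1 and is_sequential(writes):
        return max_bs
    # Assumed policy: largest power of two that fits the smallest RPC,
    # clamped to [min_bs, max_bs].
    smallest = min(length for _, length in writes)
    bs = min_bs
    while bs * 2 <= min(smallest, max_bs):
        bs *= 2
    return bs
```

Two back-to-back 1MB writes select max_bs in this model, while two
scattered 16KB writes select 16KB.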

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I66f7cbdc2b5e0365058b152b4865b00cdabb0cf3
Reviewed-on: http://review.whamcloud.com/18441
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Don Brady <don.brady@intel.com>
7 years agoLU-8006 ptlrpc: specify ordering of TBF policy rules 76/19476/12
Li Xi [Fri, 15 Jul 2016 01:02:54 +0000 (09:02 +0800)]
LU-8006 ptlrpc: specify ordering of TBF policy rules

With this patch, when inserting a new rule, the rank of the rule
can be given with the "start" command. The rank of an existing rule
can also be changed with the "change" command.

lctl set_param ost.OSS.ost_io.nrs_tbf_rule=
"start $NAME jobid={$ID} rate=$RATE rank=$NEXT"
lctl set_param ost.OSS.ost_io.nrs_tbf_rule=
"change $NAME rate=$RATE rank=$NEXT"

$NAME is the target rule name. $NEXT is the rule name that the target
rule will be moved before.
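
The "moved before $NEXT" semantics can be modelled with a short
Python sketch. RuleList is hypothetical and tracks rule names only;
it does not reflect how the NRS TBF policy stores rules, and it makes
no assumption about where a rule goes when no rank is given.

```python
class RuleList:
    """Ordered list of rule names; earlier entries are matched first."""

    def __init__(self, names=()):
        self.names = list(names)

    def start(self, name, before):
        """Insert a new rule directly before the rule named `before`."""
        if name in self.names:
            raise ValueError(f"rule {name!r} already exists")
        self.names.insert(self.names.index(before), name)

    def change(self, name, before):
        """Move an existing rule directly before the rule named `before`."""
        if name == before:
            raise ValueError("a rule cannot be ranked before itself")
        if name not in self.names or before not in self.names:
            raise ValueError("unknown rule name")
        self.names.remove(name)
        self.names.insert(self.names.index(before), name)
```

For example, starting from ["a", "default"], start("b",
before="default") yields ["a", "b", "default"], and a following
change("b", before="a") yields ["b", "a", "default"].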

Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: I6b465342365d6c09710616cd3c9e068b66a8fc89
Reviewed-on: http://review.whamcloud.com/19476
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-7845 lnet: check if ni is in current net namespace 84/21884/7
Sebastien Buisson [Thu, 11 Aug 2016 09:36:00 +0000 (18:36 +0900)]
LU-7845 lnet: check if ni is in current net namespace

Add a new 'ni_net_ns' field to struct lnet_ni to hold a reference
to the original net namespace in which the ni is created.
In LNetDist(), check whether the ni was created in the same net
namespace as the current one. If not, assign an order above
0xffff0000 so that this ni is not given priority.
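
The effect on ordering can be sketched in Python. The function name,
the string namespace identifiers and the way the penalty is added are
assumptions for illustration; only the 0xffff0000 threshold comes
from the description above.

```python
FOREIGN_NS_BASE = 0xFFFF0000  # threshold taken from the commit message


def ni_order(local_order, ni_ns, current_ns):
    """Return the order for an NI; lower values are preferred.

    local_order is assumed to be a small positive integer.
    """
    if ni_ns == current_ns:
        return local_order
    # NI was created in another net namespace: push it to the back.
    return FOREIGN_NS_BASE + local_order
```

Any NI from a different namespace then sorts after every NI from the
current one, regardless of its original order.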

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I5abde6e325983352b42c0eafe16aef22567e3e0e
Reviewed-on: http://review.whamcloud.com/21884
Tested-by: Jenkins
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>