Whamcloud - gitweb
fs/lustre-release.git
4 years ago LU-8560 llite: handle is_compat_task() rename 08/22208/3
James Simmons [Mon, 29 Aug 2016 23:19:48 +0000 (19:19 -0400)]
LU-8560 llite: handle is_compat_task() rename

Linux kernel 4.6 renamed is_compat_task() to
in_compat_syscall().
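
A rename like this is commonly absorbed with a small compatibility shim. The following is a userspace sketch only, not the actual change: HAVE_IN_COMPAT_SYSCALL is an assumed configure-time macro, is_compat_task() is stubbed so the file builds outside the kernel, and ll_wants_32bit_api() and compat_selfcheck() are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the pre-4.6 kernel helper so this sketch builds in
 * userspace; it pretends the caller is a native 64-bit task. */
static bool is_compat_task(void)
{
    return false;
}

/* HAVE_IN_COMPAT_SYSCALL is an assumed autoconf result. When the kernel
 * does not provide in_compat_syscall(), map the new name to the old one. */
#ifndef HAVE_IN_COMPAT_SYSCALL
#define in_compat_syscall() is_compat_task()
#endif

/* Hypothetical caller: the rest of the code uses only the new name. */
static bool ll_wants_32bit_api(void)
{
    return in_compat_syscall();
}

/* Sanity check for the sketch: both names must agree on old kernels. */
static void compat_selfcheck(void)
{
    assert(in_compat_syscall() == is_compat_task());
}
```

With a shim of this shape, callers are written once against the new name and the old kernels are handled in a single place.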

Change-Id: I2d3733a1ec03873d000b9f25aa8a98c3b02be410
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/22208
Reviewed-by: Frank Zago <fzago@cray.com>
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8560 libcfs: handle stacktrace function address() change 07/22207/2
James Simmons [Sun, 28 Aug 2016 20:16:56 +0000 (16:16 -0400)]
LU-8560 libcfs: handle stacktrace function address() change

Starting in Linux kernel 4.6, the address() function
from struct stacktrace now returns an int. Update
Lustre to handle this change.

Change-Id: I7d14c9134de3ae5642e2cad7d1d3829eb4ee9c50
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/22207
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8560 libcfs: handle PAGE_CACHE_* removal in newer kernels 06/22206/4
James Simmons [Sun, 28 Aug 2016 23:52:15 +0000 (19:52 -0400)]
LU-8560 libcfs: handle PAGE_CACHE_* removal in newer kernels

Starting with linux kernel 4.6 all the PAGE_CACHE_* defines
have been removed. Now it is required to use PAGE_* instead.
This is a simple blanket change since PAGE_CACHE_* was always
the same as PAGE_*.
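
The equivalence that makes this a blanket change can be sketched in userspace as follows. The 4 KiB page size, the OLD_* alias names, and pages_for_bytes() are assumptions made for illustration; in the kernel the PAGE_* macros come from the architecture headers.

```c
#include <assert.h>

/* Userspace stand-ins; a 4 KiB page is an assumption of this sketch. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Older kernels defined the page cache aliases directly in terms of the
 * plain page macros. The OLD_ prefix is only used in this sketch. */
#define OLD_PAGE_CACHE_SHIFT PAGE_SHIFT
#define OLD_PAGE_CACHE_SIZE  PAGE_SIZE

/* Because the aliases expand to the same values, a textual replacement
 * cannot change behaviour. */
static_assert(OLD_PAGE_CACHE_SIZE == PAGE_SIZE, "aliases must match");

/* Number of pages needed to hold 'bytes', written with PAGE_* only. */
static unsigned long pages_for_bytes(unsigned long bytes)
{
    return (bytes + PAGE_SIZE - 1) >> PAGE_SHIFT;
}
```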

Change-Id: I3ba8954d44969e2473afa939bbb8b8b5b1345446
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/22206
Tested-by: Maloo <hpdd-maloo@intel.com>
Tested-by: Jenkins
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8560 libcfs: add autoconf test for crypto changes 05/22205/3
James Simmons [Sun, 28 Aug 2016 18:59:46 +0000 (14:59 -0400)]
LU-8560 libcfs: add autoconf test for crypto changes

For linux 4.5 kernels the simple ifdef test in
linux-crypto.c worked but with linux 4.6+ kernels
we need to add a proper crypto api test for the
new inline functions crypto_ahash_alg_name() and
crypto_ahash_driver_name().

Test-Parameters: trivial

Change-Id: Ic18808b622d374cf6dc2417220ed83adc43ea692
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/22205
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8560 lustre: remove unused crypto handlers in lustre_compat.h 04/22204/4
James Simmons [Sun, 28 Aug 2016 19:58:17 +0000 (15:58 -0400)]
LU-8560 lustre: remove unused crypto handlers in lustre_compat.h

The unused crypto code in lustre_compat.h doesn't
build with Linux kernel version 4.6+. Since it's
not used, just delete it.

Test-Parameters: trivial

Change-Id: If7634428357837372f4756b0ace3af9c2cd77366
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/22204
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8407 recovery: more clear message about recovery failure 59/21759/4
Fan Yong [Fri, 24 Jun 2016 04:21:07 +0000 (12:21 +0800)]
LU-8407 recovery: more clear message about recovery failure

Currently, DNE recovery depends on the update logs on the MDTs.
If the update logs cannot be fetched from some MDT(s), the recovery
cannot proceed. Unlike a client-side recovery failure, a cross-MDT
recovery failure may leave the namespace inconsistent. Because we
do not want to export an inconsistent namespace to clients, the
recovery waits (rather than aborting on timeout) until the related
update logs become available.

So if some MDT is not up or not mounted, the recovery on the other
MDTs will hang. As time passes, client (re)connections trigger
warning messages on the MDTs about the hung recovery, but those
messages do not clearly describe what happened.

This patch adds a callback interface to target_distribute_txn_data,
called 'tdtd_show_update_logs_retrievers'. It lets users check
which MDTs are still fetching update logs, so the admin can examine
those MDTs in detail when recovery runs into trouble.

This patch also introduces a new recovery status, "WAITING", for
the case where the update logs are not ready for some MDT(s). In
that case, the indices of the non-ready MDTs and the time waited
so far are shown.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: If5ed4487fe1e6d94f02479d83f6a187d6427b3a7
Reviewed-on: http://review.whamcloud.com/21759
Tested-by: Jenkins
Reviewed-by: wangdi <di.wang@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8361 lfsck: detect Lustre device automatically 96/21596/4
Fan Yong [Tue, 21 Jun 2016 23:52:26 +0000 (07:52 +0800)]
LU-8361 lfsck: detect Lustre device automatically

Originally, when starting, stopping, or querying LFSCK, the user had
to specify the Lustre device explicitly via the "-M" option. Even
if there was only a single Lustre device on the current server, or
the user wanted to start LFSCK on all devices with the "-A" option,
the "-M" option was still required. This requirement is
inconvenient. This patch enhances the LFSCK user interface so that
LFSCK commands can be run without "-M"; the available Lustre device
on the current server is selected automatically. The "-M" option is
still required in the following cases: if there are multiple devices
on the current server that belong to different Lustre filesystems,
or if "-A" is not specified and there are multiple devices on the
current server.
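
The two cases in which "-M" is still required can be written down as a small predicate. This is a sketch of the rule as stated in this message, not the code in the patch: struct lfsck_dev and lfsck_need_explicit_device() are hypothetical names, and treating "no local device" as requiring "-M" is an added assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one Lustre device on the local server. */
struct lfsck_dev {
    const char *fsname;   /* filesystem the device belongs to */
    const char *devname;  /* e.g. "lustre-MDT0000" */
};

/*
 * Return true when the user must still pass "-M" explicitly:
 *  - the local devices belong to more than one filesystem, or
 *  - "-A" was not given and there is more than one local device.
 * With no local device there is nothing to auto-select (assumption).
 */
static bool lfsck_need_explicit_device(const struct lfsck_dev *devs,
                                       size_t count, bool all_devices)
{
    size_t i;

    assert(devs != NULL || count == 0);

    if (count == 0)
        return true;

    /* Any device from a different filesystem forces an explicit choice. */
    for (i = 1; i < count; i++)
        if (strcmp(devs[i].fsname, devs[0].fsname) != 0)
            return true;

    return count > 1 && !all_devices;
}
```

A single local device never needs "-M" under this rule, and several devices of one filesystem need it only when "-A" is absent.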

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I291b958440b2409c93cdc8ef3a5e3fbe14885141
Reviewed-on: http://review.whamcloud.com/21596
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-1482 mdd: Setting xattr are properly checked with and without ACLs 96/21496/4
Dmitry Eremin [Mon, 25 Jul 2016 14:04:12 +0000 (17:04 +0300)]
LU-1482 mdd: Setting xattr are properly checked with and without ACLs

Permissions for setting extended attributes are now properly checked
with and without ACLs. In the user.* namespace, only regular files and
directories can have extended attributes. For sticky directories, only
the owner and privileged users can write attributes.
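
The namespace rules above can be sketched as a standalone check. This is an illustration under stated assumptions, not the mdd code: user_xattr_may_set() is a hypothetical helper, returning -EPERM is an assumed error convention, and the ordinary file write permission and ACL checks are deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

#ifndef S_ISVTX
#define S_ISVTX 01000   /* sticky bit, in case libc headers hide it */
#endif

/*
 * Check only the user.* namespace rules for writing an attribute:
 *  - the object must be a regular file or a directory;
 *  - on a sticky directory, only the owner or a privileged caller may
 *    write attributes.
 * Returns 0 if these rules allow the write, -EPERM otherwise.
 */
static int user_xattr_may_set(const char *name, mode_t mode,
                              uid_t owner, uid_t caller, bool privileged)
{
    assert(name != NULL);

    /* Other namespaces are outside the scope of this sketch. */
    if (strncmp(name, "user.", 5) != 0)
        return 0;

    if (!S_ISREG(mode) && !S_ISDIR(mode))
        return -EPERM;

    if (S_ISDIR(mode) && (mode & S_ISVTX) &&
        caller != owner && !privileged)
        return -EPERM;

    return 0;
}
```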

Intel-bug-id: LDEV-40
Intel-change: http://review.whamcloud.com/15848

Change-Id: Ibd79dcc15e61839d878f4847f7836f29d823be61
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: http://review.whamcloud.com/21496
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8498 nodemap: new zfs index files not properly initialized 39/21939/8
Kit Westneat [Tue, 16 Aug 2016 03:50:07 +0000 (23:50 -0400)]
LU-8498 nodemap: new zfs index files not properly initialized

Calling index ->next on a new ZFS index returns a non-zero RC, whereas
ldiskfs indexes start with a blank record. This change modifies the config
load code to always write the default nodemap to an empty index file.

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: I30a365f65463979889f09f7ad5ffcdacc83fa868
Reviewed-on: http://review.whamcloud.com/21939
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8333 test: make sure COS is cleared 24/21924/3
Hongchao Zhang [Mon, 27 Jun 2016 12:24:56 +0000 (20:24 +0800)]
LU-8333 test: make sure COS is cleared

In subtest 21b of replay-dual, COS could be set after the MDT
fails over, in which case the test fails.

Change-Id: I9401b905593c76f8fddfab19ab9eb6c0fe886e41
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: http://review.whamcloud.com/21924
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-7903 mdt: dump exports information on console 99/21599/6
Niu Yawei [Fri, 29 Jul 2016 06:54:16 +0000 (02:54 -0400)]
LU-7903 mdt: dump exports information on console

To avoid truncation in the debug log, obd_exports_barrier() should
dump the exports information on the console along with the "Is it stuck?"
warning message.

Test-Parameters: testlist=recovery-small,recovery-small,recovery-small
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I9dbaa7ed1d590db89ad6f42b66ec883dfb8b7ce1
Reviewed-on: http://review.whamcloud.com/21599
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-3815 tests: sanity-hsm - Remove tests from Always_Except 79/20079/5
Saurabh Tandan [Tue, 10 May 2016 00:02:53 +0000 (17:02 -0700)]
LU-3815 tests: sanity-hsm - Remove tests from Always_Except

Removing tests 34/35/36 from the ALWAYS_EXCEPT list

Test-Parameters: trivial \
testlist=sanity-hsm,sanity-hsm,sanity-hsm,sanity-hsm

Signed-off-by: Saurabh Tandan <saurabh.tandan@intel.com>
Change-Id: I293b45ab0f8ff27c4f35500ffa30ba348489e788
Reviewed-on: http://review.whamcloud.com/20079
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago Revert "LU-7898 osd: remove unnecessary declarations" 93/22293/2
Oleg Drokin [Fri, 2 Sep 2016 16:38:18 +0000 (16:38 +0000)]
Revert "LU-7898 osd: remove unnecessary declarations"

This patch causes build failures in master due to
reverted LU-7899 6cd79ab5860c5 patch that I failed
to catch in time due to deficiency in my build process.

This cannot be easily fixed since apparently a big
chunk of functionality was yanked from under this patch,
so I can only revert it for now.

This reverts commit ead6df2feee9c143b617cb60e50e403c955bd401.

Change-Id: I5ee89bf0c9260312f157c251b83dd417fa2cf260
Reviewed-on: http://review.whamcloud.com/22293
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8175 ldlm: conflicting PW & PR extent locks on a client 45/20345/5
Andriy Skulysh [Thu, 14 Jul 2016 10:43:31 +0000 (13:43 +0300)]
LU-8175 ldlm: conflicting PW & PR extent locks on a client

A PW lock isn't replayed once it is marked
LDLM_FL_CANCELING, and a glimpse lock doesn't wait for
conflicting locks on the client. So the server will
grant a PR lock in response to the glimpse lock request,
which conflicts with the PW lock in LDLM_FL_CANCELING
state on the client.

A lock in LDLM_FL_CANCELING state may still have pending I/O,
so it should be replayed until LDLM_FL_BL_DONE is set, to
prevent the server from granting a conflicting lock.
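
The change in the replay decision can be sketched as two predicates. This only models the rule described in this message: the SK_FL_* bit values are illustrative stand-ins for LDLM_FL_CANCELING and LDLM_FL_BL_DONE (the real constants differ), and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only; the real LDLM_FL_* constants differ. */
#define SK_FL_CANCELING (1ULL << 0)
#define SK_FL_BL_DONE   (1ULL << 1)

static_assert(SK_FL_CANCELING != SK_FL_BL_DONE, "flags must be distinct");

/* Old rule as described: a lock being cancelled is left out of replay. */
static bool replay_lock_old(uint64_t flags)
{
    return !(flags & SK_FL_CANCELING);
}

/* New rule: keep replaying until blocking-callback handling is done,
 * because a cancelling lock may still cover pending I/O. */
static bool replay_lock_new(uint64_t flags)
{
    return !(flags & SK_FL_BL_DONE);
}
```

The interesting case is a lock with only the cancelling flag set: the old predicate skips it, the new one still replays it.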

Change-Id: I99a1d81a8932ac7b7b3346558446f9d638156309
Seagate-bug-id: MRP-3311
Signed-off-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Reviewed-on: http://review.whamcloud.com/20345
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8500 ldlm: fix export reference problem 31/22031/3
Hongchao Zhang [Wed, 24 Aug 2016 23:44:41 +0000 (19:44 -0400)]
LU-8500 ldlm: fix export reference problem

1. In client_import_del_conn(), the export returned from
   class_conn2export() is not released after use.

2. In ptlrpc_connect_interpret(), the export is not released
   if the connect_flags aren't compatible.

Change-Id: Ie7ef9cb0de2fa1aba71d3981ce47ae87c75e82d8
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: http://review.whamcloud.com/22031
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-2547 test: re-enable 24a/b of recovery-small 20/22020/3
Niu Yawei [Fri, 19 Aug 2016 05:40:19 +0000 (01:40 -0400)]
LU-2547 test: re-enable 24a/b of recovery-small

Re-enable test_24a/b of recovery-small.

Test-Parameters: trivial testlist=recovery-small,recovery-small,recovery-small
Test-Parameters: testlist=recovery-small,recovery-small,recovery-small

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Ie3d111e36a5a3792b3c3b5a7bd7f6b9979a321d5
Reviewed-on: http://review.whamcloud.com/22020
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8349 ldlm: ASSERTION(flock->blocking_export!=0) failed 61/21061/4
Andriy Skulysh [Wed, 29 Jun 2016 12:04:14 +0000 (15:04 +0300)]
LU-8349 ldlm: ASSERTION(flock->blocking_export!=0) failed

The hash lock protects only during .hs_put_locked.
Switch to atomic blocking_refs.

The whole policy structure was zeroed twice:
once during enqueue and a second time during resend or replay.

The policy structure should be initialized with default values
only in ldlm_lock_new().

Change-Id: Ib916f64cd03cfe812c86463b4354bf5a9bbcdd56
Seagate-bug-id: MRP-2536, MRP-2909
Signed-off-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Reviewed-by: Alexander Boyko <alexander.boyko@seagate.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@seagate.com>
Signed-off-by: Ben Evans <bevans@cray.com>
Reviewed-on: http://review.whamcloud.com/21061
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-7898 osd: remove unnecessary declarations 01/19101/12
Alex Zhuravlev [Wed, 23 Mar 2016 18:42:54 +0000 (21:42 +0300)]
LU-7898 osd: remove unnecessary declarations

Refactor the code a bit to remove unnecessary declarations
(which are very expensive in ZFS). The patch also introduces
initial preparations to support large dnodes - it tracks
all declared EAs at object creation and tracked number can
be used to request dnode of appropriate size.

With this patch + LU-7918 disk/memory space reserved for a
single-stripe creation goes down from ~33MB to 4.6MB.

Performance improvements from this patch are also significant.
Running mdtest create performance on a test node (ramdisk):

    Threads    0.6.5   0.6.5+patch
        1       9933       14279
        2      12870       20469
        4      16405       26407
        8      19320       28254
       16      15648       26620
       32      14107       26483

Change-Id: I0778ad8d13ba1f7a5fa5ad5d874fbb1bd7203958
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/19101
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-7044 test: Skip sanityn test_77e/77f/77g 54/19054/6
Wei Liu [Wed, 24 Aug 2016 02:59:18 +0000 (22:59 -0400)]
LU-7044 test: Skip sanityn test_77e/77f/77g

Skip sanityn test_77e/77f/77g if server is older than 2.7.58

Test-Parameters: trivial testlist=sanityn

Change-Id: Ic2d93d74027d66f4471a4916cf35c830fd4225bb
Signed-off-by: Wei Liu <wei3.liu@intel.com>
Reviewed-on: http://review.whamcloud.com/19054
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-7813 tests: clean up ost-pools.sh 89/18889/9
James Nunez [Thu, 23 Jun 2016 22:51:56 +0000 (16:51 -0600)]
LU-7813 tests: clean up ost-pools.sh

Clean up the tests in ost-pools.sh to drop archaic use of
"lfs getstripe -v" that parses the output text in favour of
using options for "lfs getstripe -c" for OST count.

Add the check for newly-created dir/file being in the pool
into create_dir() and create_file().

Test-Parameters: trivial testlist=ost-pools

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ib2df663a62f89df48a70d07702b41f05f0194ef9
Reviewed-on: http://review.whamcloud.com/18889
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-7593 target: umount vs tgt_last_rcvd_update deadlock 04/17704/11
Andriy Skulysh [Tue, 12 Jul 2016 15:13:48 +0000 (18:13 +0300)]
LU-7593 target: umount vs tgt_last_rcvd_update deadlock

tgt_client_del() and
ofd_commitrw_write->tgt_last_rcvd_update
take transaction and ted->ted_lcd_lock
in different order:

thread1:
    osd_trans_start
    tgt_client_data_update
    tgt_client_del       <<< mutex_lock(&ted->ted_lcd_lock);
    ofd_obd_disconnect
    class_disconnect_export_list
    class_disconnect_exports
    class_cleanup
    ...
    sys_umount

thread2:
    __mutex_lock_slowpath
    mutex_lock          <<< mutex_lock(&ted->ted_lcd_lock);
    tgt_last_rcvd_update
    tgt_txn_stop_cb
    dt_txn_hook_stop
    osd_trans_stop
    ofd_trans_stop
    ofd_commitrw_write
    ...
    tgt_brw_write

Lock only around tgt_client_data_write() inside
tgt_client_data_update().

Change-Id: Id3f60636be2abb3b70a99ee44b735aab7dfb7657
Seagate-bug-id: MRP-3109
Signed-off-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Reviewed-on: http://review.whamcloud.com/17704
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-7149 tests: restore writethrough_cache_enable 24/16424/8
Artem Blagodarenko [Tue, 15 Sep 2015 07:55:58 +0000 (10:55 +0300)]
LU-7149 tests: restore writethrough_cache_enable

sanity.sh test_224c fails as expected when executed separately
but passes when executed by the automated test system. Tests 155d, 155f,
155h and 156 do "set_cache writethrough off" and don't restore the state.
This makes subsequent tests behave incorrectly.

This patch adds a writethrough_cache_enable restore to each of the
tests above.

Test-Parameters: trivial

Signed-off-by: Artem Blagodarenko <artem.blagodarenko@seagate.com>
Xyratex-bug-id: MRP-2590
Change-Id: I5f4f3f6c419a3aa415426607e776403da9822c2c
Reviewed-on: http://review.whamcloud.com/16424
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-6245 libcfs: move uid handling to linux directory 39/22139/2
James Simmons [Thu, 25 Aug 2016 20:16:02 +0000 (16:16 -0400)]
LU-6245 libcfs: move uid handling to linux directory

Simple patch to move the uid handling added to handle
older kernels to the linux directory. The linux
directory is where we handle APIs of newer kernels
with older distribution kernels.

Test-Parameters: trivial

Change-Id: Ie3676d33ce33ebc0f98ffa460cba37ab55928617
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/22139
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8540 o2iblnd: Add support for 5arg ib_map_mr_sg() 26/22126/2
Christopher J. Morrone [Wed, 24 Aug 2016 23:35:44 +0000 (16:35 -0700)]
LU-8540 o2iblnd: Add support for 5arg ib_map_mr_sg()

Starting in kernel v4.7, ib_map_mr_sg() takes five arguments
rather than four.  It added an "sg_offset_p" offset pointer
argument.

RHEL7.3 also contains this change.

Change-Id: Ie63c992421bdf4ca195cf55152e6dfed9cf40e1d
Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-on: http://review.whamcloud.com/22126
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Li Dongyang <dongyang.li@anu.edu.au>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8507 lnet: Enable setting per NI peer_credits 48/21948/3
Doug Oucharek [Mon, 29 Aug 2016 03:26:11 +0000 (23:26 -0400)]
LU-8507 lnet: Enable setting per NI peer_credits

The code to allow peer_credits to be set per NI was originally
"left inactive" because there were concerns about peer_credits
interfering with the ability for IB nodes to connect to each
other when peer_credits are not the same (peer_credits controls
the queue depth for IB). With LU-3322, the values do not have
to match so it is now safe to enable this code so peer_credits
can be set per NI.

This patch enables existing code for setting per NI peer_credits.

Second, this patch fixes a long-standing bug: the conf data
was not used to set variables in the lnet_ni structure until
after lnd_startup() was called, which meant LND drivers were
ignoring the struct lnet_ni tunable values being set. Now we change
struct lnet_ni data fields based on conf data before calling
lnd_startup().

Test-Parameters: trivial
Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Change-Id: I28ede7a139c43ca9a3d1b22255d3358694057918
Reviewed-on: http://review.whamcloud.com/21948
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8501 lnet: Ensure routing is turned on first time 34/21934/2
Doug Oucharek [Mon, 15 Aug 2016 21:14:38 +0000 (14:14 -0700)]
LU-8501 lnet: Ensure routing is turned on first time

In lnet_rtrpools_enable(), a mistake was made and routing
was not being turned on when the rtrpools are being allocated
for the first time.

This patch fixes that routine so we remember to turn on
routing after allocating the rtrpools.

Test-Parameters: trivial
Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Change-Id: I8ef3e11bc8082cdce93e53d640f69e59ddbe9588
Reviewed-on: http://review.whamcloud.com/21934
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-7803 tests: Cleanup after sanity/78 08/21808/5
Nathaniel Clark [Mon, 8 Aug 2016 14:52:08 +0000 (10:52 -0400)]
LU-7803 tests: Cleanup after sanity/78

Remove large file created by sanity/78 regardless of failure.  If this
file is left after failure, it causes some cascading failures because
of limited space available.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ib359b9024360015ce92f209e5350f2d679071cb8
Reviewed-on: http://review.whamcloud.com/21808
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8443 utils: exclude "resize" parameter with meta_bg option 45/21545/5
Artem Blagodarenko [Wed, 27 Jul 2016 15:05:58 +0000 (18:05 +0300)]
LU-8443 utils: exclude "resize" parameter with meta_bg option

Partitions larger than 256TB must use the meta_bg option. This option
is not compatible with the "resize_inode" option and the "resize"
extended option. For optimization reasons the "resize" option is
enabled by default. For filesystems with < 2^32 blocks this
optimization is useless.

This patch disables the resize option if meta_bg is enabled. A test
that formats a Lustre filesystem with the "^resize_inode,meta_bg"
options on an OST is added.

Signed-off-by: Artem Blagodarenko <artem.blagodarenko@seagate.com>
Seagate-bug-id: MRP-3647
Change-Id: Ibea2d18f79498636a165a682cf6b6435f7cebfba
Reviewed-on: http://review.whamcloud.com/21545
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8025 llite: make vvp_io_write_start lockless for newer kernels 40/19840/22
James Simmons [Wed, 24 Aug 2016 00:59:19 +0000 (20:59 -0400)]
LU-8025 llite: make vvp_io_write_start lockless for newer kernels

When support for newer kernels was backported from the
upstream kernel, it lacked the enhancements made
for newer versions of Lustre. This work makes the newer
kernel support lockless writes like the rest of the
Lustre llite code.

Change-Id: I6ea32dbb3097aea3e2031e1121e238e549bccc9b
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Ben Evans <bevans@cray.com>
Reviewed-on: http://review.whamcloud.com/19840
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-7927 llite: Deadlock between ll_setattr and write/ll_fsync 65/19165/9
Andriy Skulysh [Tue, 23 Aug 2016 21:07:37 +0000 (16:07 -0500)]
LU-7927 llite: Deadlock between ll_setattr and write/ll_fsync

The patch http://review.whamcloud.com/10013 (commit 85bd36cc695)
"LU-4840 lfs: Use file lease to implement migration" moves
lli_trunc_sem into vvp layer.  It violates lli_trunc_sem/i_mutex
locking order.  So i_mutex should be taken after lli_trunc_sem now.

Change-Id: I2ecd52b7ae6eca74c6db7d94b1de1333560bc45d
Seagate-bug-id: MRP-3372
Signed-off-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Reviewed-on: http://review.whamcloud.com/19165
Reviewed-by: Patrick Farrell <paf@cray.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ann Koehler <amk@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-6245 libcfs: cleanup list handling 00/15200/3
James Simmons [Wed, 17 Aug 2016 17:48:09 +0000 (13:48 -0400)]
LU-6245 libcfs: cleanup list handling

On the kernel side we should use list.h directly,
except in the case of kernel API changes that affect us;
then we use linux-list.h, which handles those API changes.
A few of the userland utilities use a list implementation,
so we provide a separate list implementation for the
libcfs library.

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Change-Id: I1280d74a629dbaa9c11a3c506fd635fab99ce182
Reviewed-on: http://review.whamcloud.com/15200
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8514 mdd: transaction failure should be checked 71/22071/4
Lai Siyao [Tue, 23 Aug 2016 05:30:59 +0000 (13:30 +0800)]
LU-8514 mdd: transaction failure should be checked

Transaction failure should not be silently ignored; otherwise the
MDT doesn't know whether the current operation has a transaction.
Therefore, save the lock upon transaction failure.

Add sanity.sh 407 for this.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: Ie133a77c7f1bf890319dbd3cc2b03412a23f5c82
Reviewed-on: http://review.whamcloud.com/22071
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8408 mgc: handle config_llog_data::cld_refcount properly 16/21616/7
Fan Yong [Fri, 24 Jun 2016 04:04:01 +0000 (12:04 +0800)]
LU-8408 mgc: handle config_llog_data::cld_refcount properly

Originally, the logic for handling config_llog_data::cld_refcount
was somewhat confusing; it could leak the cld_refcount or
trigger "LASSERT(atomic_read(&cld->cld_refcount) > 0);" when putting
the reference. This patch cleans up the related logic as follows:

1) When the 'cld' is created, its reference count is set to 1.

2) No additional reference is needed when adding the 'cld' to the
   list 'config_llog_list'.

3) Increase 'cld_refcount' when setting lock data after mgc_enqueue()
   completes successfully in mgc_process_log().

4) When mgc_requeue_thread() traverses the 'config_llog_list',
   it needs to take an additional reference on each 'cld' to keep
   it from being freed during subsequent processing. The reference
   also prevents the 'cld' from being dropped from the
   'config_llog_list', so mgc_requeue_thread() can safely locate
   the next 'cld' and then decrease the 'cld_refcount' of the
   previous one.

5) mgc_blocking_ast() drops the 'cld_refcount' reference
   that is taken in mgc_process_log().

6) Other callers need to call config_log_find() to find the 'cld'
   if they want to access the related config log data. That increases
   the 'cld_refcount' so the 'cld' is not freed while in use. The
   caller needs to call config_log_put() after using the 'cld'.

7) Other confusing or redundant logic is dropped.
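
The reference counting discipline in the list above can be sketched in userspace with C11 atomics. This is a reduced model, not the mgc code: struct cld_sketch and the cld_sketch_*() helpers are hypothetical, and the list, the locks, and the lock data are left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced stand-in for config_llog_data. */
struct cld_sketch {
    atomic_int refcount;
};

/* Rule 1: a new object starts with one reference owned by its creator. */
static struct cld_sketch *cld_sketch_new(void)
{
    struct cld_sketch *cld = malloc(sizeof(*cld));

    if (cld != NULL)
        atomic_init(&cld->refcount, 1);
    return cld;
}

/* Take an extra reference, e.g. while walking the list (rule 4)
 * or after a lookup (rule 6). */
static void cld_sketch_get(struct cld_sketch *cld)
{
    int old = atomic_fetch_add(&cld->refcount, 1);

    assert(old > 0);    /* never revive an object that already hit zero */
    (void)old;
}

/* Drop a reference; returns true if this call freed the object. */
static bool cld_sketch_put(struct cld_sketch *cld)
{
    int old = atomic_fetch_sub(&cld->refcount, 1);

    assert(old > 0);    /* same invariant as the LASSERT quoted above */
    if (old == 1) {
        free(cld);
        return true;
    }
    return false;
}
```

Every get is paired with exactly one put, and only the put that drops the last reference frees the object, which is the property the numbered rules are meant to guarantee.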

In addition, the patch strengthens the protection of the
'config_llog_data' flags, such as 'cld_stopping'/'cld_lostlock',
as follows.

a) Use 'config_list_lock' (spinlock) to handle possible
   parallel access to these flags by mgc_requeue_thread()
   and other config llog data visitors, such as mount/umount,
   blocking_ast, and so on.

b) Use 'config_llog_data::cld_lock' (mutex) to protect against
   other parallel access to these flags among blockable
   operations, such as mount, umount, and blocking ast.

The 'config_llog_data::cld_lock' is also used to protect
the sub-cld members, such as 'cld_sptlrpc'/'cld_params', and
so on.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I9fb6c3b7ae23dcea147aca7ffec240e0f33ef746
Reviewed-on: http://review.whamcloud.com/21616
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago New tag 2.8.57 2.8.57 v2_8_57 v2_8_57_0
Oleg Drokin [Thu, 1 Sep 2016 17:31:56 +0000 (13:31 -0400)]
New tag 2.8.57

Change-Id: I00319d4310725e3ffce4bdad12ab532663b88c17
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8523 test: sanity 311 is too strict 10/22210/3
Lai Siyao [Mon, 29 Aug 2016 04:08:55 +0000 (12:08 +0800)]
LU-8523 test: sanity 311 is too strict

sanity 311 unlinks 1000 files, but the number of objects actually
destroyed may be lower, because there is some delay between when the
files are unlinked and when the MDS destroys the objects on the OSTs.
Previously the test checked that at least 900 objects were destroyed,
but autotest found only 880 objects destroyed in some cases, so the
threshold is now reduced to 800.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: I88f45ae744475f2e2cdf8f82c1405164d6f4cd1c
Reviewed-on: http://review.whamcloud.com/22210
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
4 years ago LU-4865 zfs: grow block size by write pattern 41/18441/11
Jinshan Xiong [Fri, 12 Aug 2016 04:20:44 +0000 (21:20 -0700)]
LU-4865 zfs: grow block size by write pattern

This patch grows the block size by write RPC. The osd-zfs blocksize
used to be fixed at 128KB, which is too big for random writes and
too small for sequential writes.

This patch decides the block size from the first few RPCs. If the first
few RPCs are sequential, it will mostly pick the maximum block size for
the object; otherwise, a feasible block size will be picked based on
the RPC size.
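
The selection rule can be sketched as follows. This is a simplified model, not the osd-zfs implementation: the function names, the 4KB and 1MB limits, and the choice of "largest power of two that fits the smallest RPC" for the non-sequential case are all assumptions made for illustration.

```python
def is_sequential(rpcs):
    """rpcs is a list of (offset, length) pairs in arrival order."""
    return all(prev_off + prev_len == off
               for (prev_off, prev_len), (off, _) in zip(rpcs, rpcs[1:]))


def pick_block_size(rpcs, min_bs=4096, max_bs=1 << 20):
    # min_bs and max_bs are hypothetical limits chosen for this sketch.
    if is_sequential(rpcs):
        return max_bs
    # Assumed policy for random writes: the largest power-of-two multiple
    # of min_bs that does not exceed the smallest RPC seen so far.
    smallest = min(length for _, length in rpcs)
    bs = min_bs
    while bs * 2 <= smallest and bs * 2 <= max_bs:
        bs *= 2
    return bs
```

Under these assumptions, two back-to-back 1MB writes select the maximum block size, while two scattered 16KB writes select a 16KB block size.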

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I66f7cbdc2b5e0365058b152b4865b00cdabb0cf3
Reviewed-on: http://review.whamcloud.com/18441
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Don Brady <don.brady@intel.com>
4 years ago LU-8006 ptlrpc: specify ordering of TBF policy rules 76/19476/12
Li Xi [Fri, 15 Jul 2016 01:02:54 +0000 (09:02 +0800)]
LU-8006 ptlrpc: specify ordering of TBF policy rules

With this patch, when inserting a new rule, the rank of the rule
can be given with the "start" command. The rank of an existing rule
can also be changed with the "change" command.

lctl set_param ost.OSS.ost_io.nrs_tbf_rule=
"start $NAME jobid={$ID} rate=$RATE rank=$NEXT"
lctl set_param ost.OSS.ost_io.nrs_tbf_rule=
"change $NAME rate=$RATE rank=$NEXT"

$NAME is the target rule name. $NEXT is the rule name that the target
rule will be moved before.
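
The effect of rank=$NEXT on the rule order can be modelled with a plain list. This is an illustrative sketch only: place_before is a hypothetical helper, and the rule names in the usage note are made up.

```python
def place_before(rules, name, next_name):
    """Return a new rule order with `name` placed just before `next_name`."""
    # A "change" moves an existing rule, so drop any old position first.
    order = [r for r in rules if r != name]
    if next_name not in order:
        raise ValueError("no such rule: %s" % next_name)
    idx = order.index(next_name)
    return order[:idx] + [name] + order[idx:]
```

For example, place_before(["a", "b", "default"], "c", "b") yields ["a", "c", "b", "default"], which corresponds to starting rule "c" with rank=b.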

Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: I6b465342365d6c09710616cd3c9e068b66a8fc89
Reviewed-on: http://review.whamcloud.com/19476
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-7845 lnet: check if ni is in current net namespace 84/21884/7
Sebastien Buisson [Thu, 11 Aug 2016 09:36:00 +0000 (18:36 +0900)]
LU-7845 lnet: check if ni is in current net namespace

Add a new 'ni_net_ns' field to struct lnet_ni to hold a reference
to the original net namespace in which the ni was created.
In LNetDist(), check whether the ni was created in the same net
namespace as the current one. If not, assign an order above
0xffff0000, so that this ni is not given priority.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I5abde6e325983352b42c0eafe16aef22567e3e0e
Reviewed-on: http://review.whamcloud.com/21884
Tested-by: Jenkins
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-7977 lnet: Have selftest use proper units (MB/s or MiB/s) 91/20891/2
Doug Oucharek [Tue, 21 Jun 2016 01:02:26 +0000 (18:02 -0700)]
LU-7977 lnet: Have selftest use proper units (MB/s or MiB/s)

lnet-selftest currently reports bandwidth statistics as
MB/s but it is really calculated as MiB/s.

This patch corrects the output to say MiB/s and adds a
new option, "--mbs" to the "lst stat" command to change
the units to MB/s.
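
The difference between the two units is easy to see with a worked calculation. The sketch below is not the lst code; the bandwidth() helper and its use_mbs flag (mirroring the "--mbs" option) are hypothetical.

```python
MIB = 1 << 20      # 1 MiB = 1,048,576 bytes
MB = 10 ** 6       # 1 MB  = 1,000,000 bytes


def bandwidth(nbytes, seconds, use_mbs=False):
    """Format a transfer rate in MiB/s (default) or MB/s."""
    unit, label = (MB, "MB/s") if use_mbs else (MIB, "MiB/s")
    return "%.2f %s" % (nbytes / seconds / unit, label)
```

Moving 1 GiB in one second is 1024.00 MiB/s, but 1073.74 MB/s, so labelling a MiB/s figure as MB/s under-reports by almost 5%.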

Test-Parameters: trivial
Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Change-Id: Iae8f6ca92b9b0ee00e6307eaf22e5c0791ed323d
Reviewed-on: http://review.whamcloud.com/20891
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8285 test: Allow LNet logging as default in autotest 18/20818/4
Doug Oucharek [Wed, 15 Jun 2016 21:41:08 +0000 (14:41 -0700)]
LU-8285 test: Allow LNet logging as default in autotest

The default in the local.sh configuration file for autotest
is to turn off all logging from three subsystems: lnet, lnd,
and pinger.  There is no good reason to be doing this and
this could be hiding important logs highlighting bugs.

This patch makes the default to allow all subsystems to log.

Test-Parameters: trivial
Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Change-Id: I8ef88679b1aa716311a10f7be43480ee3184d1a0
Reviewed-on: http://review.whamcloud.com/20818
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8249 lnet: potential deadlock in lnet 76/20676/6
Quentin Bouget [Thu, 16 Jun 2016 21:46:42 +0000 (17:46 -0400)]
LU-8249 lnet: potential deadlock in lnet

Fixes potential deadlock in LNetMDAttach (vfree must not be called in
interrupt context in linux kernel versions prior to 3.10).

Signed-off-by: Quentin Bouget <quentin.bouget.ocre@cea.fr>
Change-Id: I1b421b470bab97d58f441040c39b9f1caf11b1fe
Reviewed-on: http://review.whamcloud.com/20676
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8480 ofd: hold obd_dev_lock across grant comparison 13/21813/2
Andreas Dilger [Mon, 8 Aug 2016 17:49:16 +0000 (11:49 -0600)]
LU-8480 ofd: hold obd_dev_lock across grant comparison

Hold obd_dev_lock until the global ofd_tot_* grant values are saved,
so that their comparison is not racy.  Otherwise it is possible to
report grant inconsistencies when multiple clients are unmounted.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I19ffd102b657df2df539d01d182a782aa17ad924
Reviewed-on: http://review.whamcloud.com/21813
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-7528 test: Deregister changelog client on test fail. 06/17506/8
Kirtankumar Krishna Shetty [Tue, 8 Dec 2015 09:32:37 +0000 (15:02 +0530)]
LU-7528 test: Deregister changelog client on test fail.

The changelog-related tests did not deregister the changelog
client when a failure was encountered, so any later changes
were also recorded, leading to huge test stdouts. This patch
adds the function changelog_cleanup, which deregisters the
changelog client created by the test before exiting. This
function is to be used in place of changelog deregister in
any test that registers a changelog client, and a trap
statement is included to execute this function on EXIT.

Test-Parameters: trivial

Seagate-bug-id: MRP-3063
Signed-off-by: Kirtankumar Krishna Shetty <kirtan.shetty@seagate.com>
Change-Id: I7f26f266ba8bda294b75ff5619d95c26704fd83f
Reviewed-on: http://review.whamcloud.com/17506
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-6844 tests: re-enable striped dir 08/21508/3
Di Wang [Wed, 3 Aug 2016 22:34:19 +0000 (18:34 -0400)]
LU-6844 tests: re-enable striped dir

Since this failure should be fixed by
http://review.whamcloud.com/21088

Let's revert http://review.whamcloud.com/20022
to re-enable striped dir in replay-single 70b.

Test-Parameters: trivial testlist=replay-single,replay-single,replay-single,replay-single,replay-single,replay-single
Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: Ie7fc18d4d57a74be6925d8a635fdb09d4917a2e7
Reviewed-on: http://review.whamcloud.com/21508
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8383 build: Spec file cleanup after LU-5614 25/22125/2
Dmitry Eremin [Mon, 11 Jul 2016 15:51:46 +0000 (18:51 +0300)]
LU-8383 build: Spec file cleanup after LU-5614

Add dependency from kmod-%{lustre_name}-tests
Fix BuildRequires: %kernel_module_package_buildreqs

Test-Parameters: trivial

Change-Id: I92325687812f10fb308971391e67bb80c08ae5db
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/22125
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8056 build: announce linux kernel 4.5.7 support 70/21970/3
James Simmons [Wed, 17 Aug 2016 16:47:53 +0000 (12:47 -0400)]
LU-8056 build: announce linux kernel 4.5.7 support

Bump kernel version in ChangeLog to latest supported
kernel which is 4.5.7

Test-Parameters: trivial

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Change-Id: I9607aa68e67174e588b284d4e3048131e2dcc2bd
Reviewed-on: http://review.whamcloud.com/21970
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8495 kernel: kernel update [SLES11 SP4 3.0.101-80] 66/21866/3
Bob Glossman [Wed, 10 Aug 2016 19:37:59 +0000 (12:37 -0700)]
LU-8495 kernel: kernel update [SLES11 SP4 3.0.101-80]

Update SLES11 SP4 kernel to 3.0.101-80

Test-Parameters: mdsdistro=sles11sp4 ossdistro=sles11sp4 \
  clientdistro=sles11sp4 mdsfilesystemtype=ldiskfs \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs \
  testgroup=review-ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I2878d388bd58905643ff73401eeae166c34aac95
Reviewed-on: http://review.whamcloud.com/21866
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8056 mem: handle GFP_IOFS removal in newer kernels 81/21781/2
James Simmons [Mon, 8 Aug 2016 01:35:13 +0000 (21:35 -0400)]
LU-8056 mem: handle GFP_IOFS removal in newer kernels

Starting with linux kernel 4.5 GFP_IOFS has been removed.
GFP_IOFS was meant to be a shorthand to clear two
GFP flags but it was never used properly. Replace it with
GFP_NOFS instead.

Change-Id: I97e045b1363ce216426ae709145b839a838e5762
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/21781
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-6401 headers: Move functions out of lustre_idl.h 84/21484/5
Ben Evans [Fri, 22 Jul 2016 16:39:47 +0000 (11:39 -0500)]
LU-6401 headers: Move functions out of lustre_idl.h

Migrate functions
lma_to_lustre_flags, lustre_to_lma_flags
set/get_mrc_cr_flags
ldlm_res_eq
ldlm_extent_overlap
ldlm_extent_contain
ldlm_request_bufsize
rec_tail
agent_req_in_final_state
lustre_print_user_md
all PTLRPC dump_* functions
lovea_slot_is_dummy

Delete unused
lmv_mds_md_stripe_count

Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: If65e9f63b727889f4952d5c326b18356cc4dae9d
Reviewed-on: http://review.whamcloud.com/21484
Reviewed-by: Frank Zago <fzago@cray.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
4 years ago LU-8371 llite: Trust creates in revalidate too. 68/21168/5
Oleg Drokin [Wed, 6 Jul 2016 04:38:19 +0000 (00:38 -0400)]
LU-8371 llite: Trust creates in revalidate too.

By forcing creates to always go via lookup we lose some
important caching benefits too.
Instead let's trust creates with positive cached entries.

Then we have 3 possible outcomes:
1. Negative dentry - we go via atomic_open and do the create
   by name there.
2. Positive dentry, no contention - we just go straight to
   ll_intent_file_open and open by fid.
3. Positive dentry, contention - by the time we reach the server,
   the inode is gone. We get ENOENT which is unacceptable to return
   from create. But since we know it's a create, we substitute it
   with ESTALE and VFS retries again with LOOKUP_REVAL set, we catch
   that in revalidate and force a lookup (same path as before this
   patch).
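
The three outcomes can be summarised as a small decision model. This is a sketch of the control flow as the message describes it, not llite code: create_route() and fixup_create_errno() are hypothetical names, and the returned strings are just labels.

```python
import errno


def create_route(dentry_positive, lookup_reval=False):
    """Pick the path a create takes, following the three cases above."""
    if lookup_reval:
        return "lookup"          # VFS retry after ESTALE: force a lookup
    if not dentry_positive:
        return "atomic_open"     # case 1: create by name
    return "open_by_fid"         # cases 2 and 3: trust the cached entry


def fixup_create_errno(rc):
    """Case 3: a create must not return ENOENT, so report ESTALE instead."""
    if rc == -errno.ENOENT:
        return -errno.ESTALE
    return rc
```

Returning ESTALE is what makes the VFS retry with LOOKUP_REVAL set, at which point create_route() takes the "lookup" branch.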

Change-Id: I7b006a50703bfb37e8747dca0f95b2c512b82429
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/21168
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
4 years ago LU-8084 lfsck: handle linkea record length properly 77/19877/5
Fan Yong [Thu, 16 Jun 2016 14:58:10 +0000 (22:58 +0800)]
LU-8084 lfsck: handle linkea record length properly

The record length in the linkea may be corrupted. If we do not handle
the invalid record length when locating the next linkea record or
deleting the current record, it may cause invalid memory accesses and
corrupt other data in RAM, which then causes all kinds of strange
memory issues.
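
The kind of bounds check this calls for can be sketched on a toy record format. The layout below is an assumption made for illustration (a 2-byte big-endian record length that covers the whole record, a 16-byte parent FID, then the name), and iter_records() is a hypothetical helper, not the lfsck code.

```python
import struct

HDR = 2     # assumed: 2-byte big-endian record length
FID = 16    # assumed: fixed-size parent FID following the length


def iter_records(buf):
    """Yield (offset, record_bytes); raise ValueError on a bad length."""
    pos = 0
    while pos < len(buf):
        if len(buf) - pos < HDR:
            raise ValueError("truncated length field at offset %d" % pos)
        (reclen,) = struct.unpack_from(">H", buf, pos)
        # A record must hold its header, the FID and at least one name
        # byte, and it must not run past the end of the buffer.
        if reclen <= HDR + FID or reclen > len(buf) - pos:
            raise ValueError("invalid record length %d at offset %d"
                             % (reclen, pos))
        yield pos, buf[pos:pos + reclen]
        pos += reclen
```

Without the second check, a corrupted length of 0 would loop forever and an oversized one would step outside the buffer, which in C means reading or overwriting unrelated memory.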

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I27d724025c8157ecf51e3269a39e2fdfbc27a27d
Reviewed-on: http://review.whamcloud.com/19877
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-7160 mgs: Skip processing .bak files on MGS 28/16428/7
Artem Blagodarenko [Tue, 15 Sep 2015 10:32:14 +0000 (13:32 +0300)]
LU-7160 mgs: Skip processing .bak files on MGS

The lctl replace_nids command saves the previous version of
the config files to a file named original_name.bak.
This file should never be processed by the MGS.

This patch adds code that skips files with the .bak extension
from the list to be processed by the MGS.
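
The filtering step amounts to a suffix check on each config file name. A minimal sketch, with a hypothetical helper name and made-up file names:

```python
def logs_to_process(names):
    """Drop the backup copies that 'lctl replace_nids' leaves behind."""
    return [n for n in names if not n.endswith(".bak")]
```

For example, given ["lustre-MDT0000", "lustre-MDT0000.bak", "lustre-client"], only the first and last names are kept.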

Signed-off-by: Artem Blagodarenko <artem.blagodarenko@seagate.com>
Xyratex-bug-id: MRP-2742
Change-Id: I5ad5cf5548d395459d2245394ef3f7764fe8f0ca
Reviewed-on: http://review.whamcloud.com/16428
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8454 llite: normal user can't set FS default stripe 12/21612/2
Lai Siyao [Wed, 27 Jul 2016 14:35:48 +0000 (22:35 +0800)]
LU-8454 llite: normal user can't set FS default stripe

Current client doesn't check permission before updating filesystem
default stripe on MGS, which isn't secure and obvious.

Since we setattr on MDS first, and then set default stripe on MGS,
we can just return error upon setattr failure.

Now filesystem default stripe is stored in ROOT in MDT, so saving
it in system config is for compatibility with old servers, this
will be removed in the future.

Add sanity 65m to verify this.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: Ia224a9211c1ceab08a3a064adc67bc945ee3fc11
Reviewed-on: http://review.whamcloud.com/21612
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8493 osp: Do not set stale for new osp obj 61/21861/2
Di Wang [Thu, 4 Aug 2016 22:34:14 +0000 (18:34 -0400)]
LU-8493 osp: Do not set stale for new osp obj

Do not set stale for the new OSP object, otherwise
it will cause ESTALE failure for the following
write operation, see osp_md_declare_write().

This problem is brought in by
http://review.whamcloud.com/19041

Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: Ib92deab4e0c900d59fbdc2bf50e17fd29fd2ecce
Reviewed-on: http://review.whamcloud.com/21861
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-7650 o2iblnd: handle mixed page size configurations. 04/21304/7
James Simmons [Fri, 12 Aug 2016 19:53:06 +0000 (15:53 -0400)]
LU-7650 o2iblnd: handle mixed page size configurations.

Currently it is not possible to send LNet traffic between
two nodes using infiniband hardware that have different
page sizes for the case when RDMA fragments are used.
When two nodes establish a connection they tell the other
node the maximum number of RDMA fragments they support.
The issue is that the units are pages, and 256 64K pages
corresponds to 16MB of data, whereas a 4K page system is
limited to messages with 1MB of data. The solution is to
report over the wire the maximum number of fragments in
4K units regardless of the native page size. The recipient
then uses its native page size to translate into the
maximum number of page-sized fragments it can send to
the other node.
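
The unit conversion can be checked with a little arithmetic. This sketch only models the numbers in the message; the helper names are hypothetical and it does not reproduce the o2iblnd wire format.

```python
KIB = 1024
WIRE_UNIT = 4 * KIB    # fragments are reported on the wire in 4K units


def frags_on_wire(native_frags, native_page_size):
    """Convert a count of native pages into 4K wire units."""
    return native_frags * native_page_size // WIRE_UNIT


def frags_from_wire(wire_frags, native_page_size):
    """Convert 4K wire units into page-sized fragments for this node."""
    return wire_frags * WIRE_UNIT // native_page_size
```

A 4K-page node that supports 256 fragments advertises 256 units (1MB). A 64K-page peer translates that into 16 of its own pages, which is again 1MB, instead of treating 256 as 256 pages of 64K (16MB).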

Change-Id: I5aa4a464a0320fbd1841f9ad3add810e7b4f124a
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/21304
Tested-by: Jenkins
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-6245 client: remove types abstraction from client code 90/20590/12
James Simmons [Thu, 11 Aug 2016 15:42:14 +0000 (11:42 -0400)]
LU-6245 client: remove types abstraction from client code

Originally when lustre code was built for userland we needed
a proper way to handle 32 bit and 64 bit platforms when
reporting unsigned longs. Now that this code is only built
for kernel space and the kernel has its own special string
handling functions we don't need this abstraction anymore.
Remove this abstraction from the client side code.

Change-Id: Ic0c55a413237bdf57d60031c12d5d9b62fa39cef
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/20590
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-4433 tests: fix mds-survey.sh to support multiple MDTs 37/19437/7
Fan Yong [Sun, 19 Jun 2016 09:16:27 +0000 (17:16 +0800)]
LU-4433 tests: fix mds-survey.sh to support multiple MDTs

This patch fixes mds-survey.sh and mds-survey to support
multiple MDTs.

Test-Parameters: mdtcount=1 testlist=mds-survey
Test-Parameters: envdefinitions=PTLDEBUG=-1,DEBUG_SIZE=150 mdscount=2 mdtcount=4 testlist=mds-survey
Signed-off-by: Fan Yong <fan.yong@intel.com>
Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I9193b7fb65ab8b5dfd0817a8c203dae463deb090
Reviewed-on: http://review.whamcloud.com/19437
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-7903 hsm: leaked export refcount 42/21942/2
Niu Yawei [Tue, 16 Aug 2016 08:59:35 +0000 (04:59 -0400)]
LU-7903 hsm: leaked export refcount

Add missed class_export_put() in mdt_hsm_agent_send().

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Ie9119c53f11901573161034a85bfa7bf83ca6ff8
Reviewed-on: http://review.whamcloud.com/21942
Tested-by: Jenkins
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8258 nodemap: fix userspace address access in proc code 57/21857/3
Kit Westneat [Wed, 10 Aug 2016 16:41:48 +0000 (12:41 -0400)]
LU-8258 nodemap: fix userspace address access in proc code

The fileset proc write handler was incorrectly passing the userspace
buffer address directly to the nodemap code. This patch copies it to
kernel space before passing it.  Because the buffer could be greater
than 2k, allocate the buffer off stack.

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: If90c1a95c80b2afd2a4cf6a70dc41d28dd157a2f
Reviewed-on: http://review.whamcloud.com/21857
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8314 utils: revert lfs_getdirstripe to non-recursive mode 16/21516/5
Lai Siyao [Tue, 26 Jul 2016 15:27:48 +0000 (23:27 +0800)]
LU-8314 utils: revert lfs_getdirstripe to non-recursive mode

Since 2.7, 'lfs getdirstripe' has enabled recursive mode by default.
Since that is not the obvious behavior, this patch reverts it
to non-recursive mode.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: I683f5dec203230b36ee3da404e7f0817e91d090f
Reviewed-on: http://review.whamcloud.com/21516
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8460 osc: max_pages_per_rpc should be chunk size aligned 25/21825/4
Bobi Jam [Mon, 8 Aug 2016 09:31:34 +0000 (17:31 +0800)]
LU-8460 osc: max_pages_per_rpc should be chunk size aligned

max_pages_per_rpc should be chunk size aligned.

obd_brw_size needs to be at least one block size.

Change the LASSERT() to an LASSERTF() that prints the related
parameters to help debug problems.
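
The alignment requirement can be illustrated with a small calculation. This is a sketch under assumptions, not the osc code: align_pages_per_rpc() is a hypothetical helper, rounding down is an assumed policy, and the 4K page size is only a default for the example.

```python
def align_pages_per_rpc(requested_pages, chunk_bytes, page_size=4096):
    """Round a pages-per-RPC value to a whole number of chunks."""
    chunk_pages = max(chunk_bytes // page_size, 1)
    # Assumed policy: round down to a chunk boundary...
    aligned = (requested_pages // chunk_pages) * chunk_pages
    # ...but never go below one full chunk.
    return max(aligned, chunk_pages)
```

With a 1MB chunk and 4K pages a chunk is 256 pages, so a request for 300 pages becomes 256, and a request for 100 pages is raised to 256.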

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: If73b8f05052f96970f3e97015a4642152ace2a38
Reviewed-on: http://review.whamcloud.com/21825
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-4931 ladvise: Add willread advice support for ladvise 58/12458/36
Li Xi [Sat, 6 Aug 2016 14:13:02 +0000 (22:13 +0800)]
LU-4931 ladvise: Add willread advice support for ladvise

This patch adds WILLREAD advice to ladvise framework. OSS will
prefetch data into memory when this hint is provided. It is not
guaranteed how long the cached pages will be kept in memory.

Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: I21394b88a22a8c46ceae7151402341364860ee88
Reviewed-on: http://review.whamcloud.com/12458
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-7646 lnet: Stop Infinite CON RACE Condition 30/19430/11
Doug Oucharek [Tue, 19 Jan 2016 01:26:08 +0000 (17:26 -0800)]
LU-7646 lnet: Stop Infinite CON RACE Condition

In current code, when a CON RACE occurs, the passive side will
let the node with the higher NID value win the race.

We have a field case where a node can have a "stuck"
connection which never goes away and is the trigger of a
never-ending loop of re-connections.

This patch introduces a counter of how many times a
connection in a connecting state has been the cause of a CON RACE
rejection. After 20 times (constant MAX_CONN_RACES_BEFORE_ABORT),
we assume the connection is stuck and let the other side (with
lower NID) win.
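
The counter logic can be sketched as a small state machine. Only the constant MAX_CONN_RACES_BEFORE_ABORT and its value of 20 come from the message; the class, the method name, the integer NIDs and the reset of the counter after an abort are assumptions made for illustration.

```python
MAX_CONN_RACES_BEFORE_ABORT = 20


class ConnectingPeer:
    """Passive-side view of a peer we are also actively connecting to."""

    def __init__(self):
        self.races = 0    # CON RACE rejections caused by our own attempt

    def handle_incoming(self, local_nid, peer_nid):
        """Return 'accept' or 'reject' for a request racing our attempt."""
        if local_nid < peer_nid:
            return "accept"      # the peer has the higher NID and wins
        if self.races >= MAX_CONN_RACES_BEFORE_ABORT:
            self.races = 0       # assumed: start counting afresh
            return "accept"      # our attempt looks stuck, let the peer win
        self.races += 1
        return "reject"          # the usual CON RACE rejection
```

In this model the higher-NID side rejects the first 20 racing requests and accepts the 21st, which breaks the endless reconnect loop.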

Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Change-Id: I32e035806e95868b13c28c42e241b969940a35c9
Reviewed-on: http://review.whamcloud.com/19430
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-7483 tests: sanity test_103a misc test in acl corrected 22/21722/4
Saurabh Tandan [Thu, 4 Aug 2016 19:22:45 +0000 (12:22 -0700)]
LU-7483 tests: sanity test_103a misc test in acl corrected

sanity test_103a was failing under SELinux enabled client
throwing misc test failed error message. With the previous
command "ls -dl d/l | awk 'sub(/\\./, "", $1); {print $1}'"
the output consisted of 2 rows which was causing the failure.
The issue is mentioned in the comments section of LU-7483.

The solution was to filter the results using
"ls -dl d/l | awk '{ sub(/\\.$/, "", $1); print $1 }'"
instead and have the desired result. Results for successfully
run sanity test_103a under SELinux enabled client environment
can be found in the ticket.
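
Why the old command printed two rows can be shown by emulating both awk programs. In awk, a pattern with no action prints the whole record when the pattern is true, so the bare sub() call printed the line whenever it removed a dot, and the separate {print $1} rule then printed a second row. The Python below is only an emulation under the assumption that awk sees the regex as \. (old) and \.$ (new); the function names and sample lines are made up.

```python
import re


def old_awk(line):
    # Emulates: awk 'sub(/\./, "", $1); {print $1}'
    fields = line.split()
    out = []
    fields[0], n = re.subn(r"\.", "", fields[0], count=1)
    if n:
        # A true pattern with no action prints the (rebuilt) record.
        out.append(" ".join(fields))
    out.append(fields[0])    # the unconditional {print $1} rule
    return out


def new_awk(line):
    # Emulates: awk '{ sub(/\.$/, "", $1); print $1 }'
    fields = line.split()
    return [re.sub(r"\.$", "", fields[0])]
```

On an SELinux client the permission string ends in a dot (for example "lrwxrwxrwx."), so old_awk() returns two rows while new_awk() returns the single expected one.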

Test-Parameters: trivial testlist=sanity,sanity
Signed-off-by: Saurabh Tandan <saurabh.tandan@intel.com>
Change-Id: I3fcd60161873040b66d7004fb1cf682b41a0b8d9
Reviewed-on: http://review.whamcloud.com/21722
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8468 kernel: kernel update RHEL7.2 [3.10.0-327.28.2.el7] 92/21692/3
Bob Glossman [Tue, 2 Aug 2016 15:10:26 +0000 (08:10 -0700)]
LU-8468 kernel: kernel update RHEL7.2 [3.10.0-327.28.2.el7]

Update RHEL7.2 kernel to 3.10.0-327.28.2.el7

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I63fe2cde33efba13be29e0bff0a4ef6b9a3306f5
Reviewed-on: http://review.whamcloud.com/21692
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8474 tests: stop MGS before setup_noconfig in conf-sanity.sh 53/21653/3
Jian Yu [Wed, 3 Aug 2016 07:20:35 +0000 (00:20 -0700)]
LU-8474 tests: stop MGS before setup_noconfig in conf-sanity.sh

In conf-sanity.sh, some sub-tests will leave MGS mounted
under separate MGT and MDT configuration, which will cause
setup_noconfig() to fail. This patch fixes the issue by stopping
MGS before running setup_noconfig().

Test-Parameters: trivial combinedmdsmgs=false envdefinitions=ONLY=55 testlist=conf-sanity

Test-Parameters: trivial envdefinitions=ONLY=55 testlist=conf-sanity

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I53f5549c392d300a2c76bbc4ce68e9a8198ba559
Reviewed-on: http://review.whamcloud.com/21653
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Parinay Kondekar <parinay.kondekar@seagate.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8473 tests: skip conf-sanity test 41a with separate MGT and MDT 51/21651/4
Jian Yu [Wed, 3 Aug 2016 01:09:50 +0000 (18:09 -0700)]
LU-8473 tests: skip conf-sanity test 41a with separate MGT and MDT

conf-sanity test 41a is to test “nosvc” and “nomgs” mount options
on a combined MGT/MDT device. It’s not applicable for separate
MGT and MDT configuration. This patch adds code to check that.

Test-Parameters: trivial combinedmdsmgs=false envdefinitions=ONLY=41a testlist=conf-sanity

Test-Parameters: trivial envdefinitions=ONLY=41a testlist=conf-sanity

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I967f63c70953c7c8e5bb296e832ee5335e56f69c
Reviewed-on: http://review.whamcloud.com/21651
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-7061 osd-ldiskfs: NULL pointer in osd_scrub_refresh_mapping 20/20620/2
Kit Westneat [Fri, 3 Jun 2016 18:22:50 +0000 (14:22 -0400)]
LU-7061 osd-ldiskfs: NULL pointer in osd_scrub_refresh_mapping

Commit c0dafc483c (change 16138) missed a spot. id can be NULL for
DTO_INDEX_DELETE operation.

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: Id73f8dfb1834ff5275da006c03f59d4c56286aa7
Reviewed-on: http://review.whamcloud.com/20620
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8085 scrub: increase iteration cursor to skip unused inodes 76/19876/3
Fan Yong [Wed, 13 Apr 2016 08:48:39 +0000 (16:48 +0800)]
LU-8085 scrub: increase iteration cursor to skip unused inodes

After the OI scrub iteration has handled the last used bits in the
inode table, it should advance the iteration position to the next
group before the jump, to avoid looping forever.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ib7a63202a134ecc82070868b9630430f054b69fa
Reviewed-on: http://review.whamcloud.com/19876
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-8083 lfsck: repair symbol file nlink properly 74/19874/3
Fan Yong [Wed, 13 Apr 2016 08:45:52 +0000 (16:45 +0800)]
LU-8083 lfsck: repair symbol file nlink properly

The symbolic link case was not checked in lfsck_namespace_repair_nlink.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I420d558803672100292990f1ff4888c03888c39a
Reviewed-on: http://review.whamcloud.com/19874
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-6688 tests: use proper nodes for NRS test 64/21764/5
James Simmons [Sat, 6 Aug 2016 23:39:30 +0000 (19:39 -0400)]
LU-6688 tests: use proper nodes for NRS test

Several of the NRS tests for sanityn are reporting
error: set_param: ost/OSS/*/nrs_policies: Found no match.
The reason for this is that the tests are attempting
to configure NRS OSS settings on the MDS servers.
Those OSS settings don't exist on MDS servers.
Change the tests to alter the OSS NRS settings on
the OSS servers instead.

Test-Parameters: trivial testlist=sanityn

Change-Id: I83600165fc1b9f0d9c6ee0d093f54604c46328b9
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/21764
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
4 years ago LU-7472 tests: Allows to specify IOR block size with suffix 54/17354/7
Aditya Pandit [Wed, 25 Nov 2015 06:16:23 +0000 (11:46 +0530)]
LU-7472 tests: Allows to specify IOR block size with suffix

Added a variable io_blockUnit which can
be set to K, M or G and adjust the IOR block sizes.
Modified to keep m as the default option. Default block
size for IOR tests would be 6MB.

Test-Parameters: trivial
Seagate-bug-id: MRP-2685
Change-Id: Ie7a11d8cb06faad902abc56bca4fc5914df8f42d
Signed-off-by: Aditya Pandit <aditya.pandit@seagate.com>
Signed-off-by: Ashish Purkar <ashish.purkar@seagate.com>
Reviewed-on: http://review.whamcloud.com/17354
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years ago LU-7225 llite: ladvise protocol changes 66/20666/20
Patrick Farrell [Mon, 1 Aug 2016 18:45:10 +0000 (13:45 -0500)]
LU-7225 llite: ladvise protocol changes

This patch makes some changes to the ladvise API and
protocol to support lock ahead and possible future users.

Primarily, it separates the userspace API arguments from
the structures which go out on the network, and adds a
number of 'value' fields without a predefined use.

The meaning of each value field can be different for
different advice types, allowing some extensibility.

Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I7ac18e546f16a20c3c6bc6849becb0d45e3d5dc9
Reviewed-on: http://review.whamcloud.com/20666
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7800 llog: check if next llog exists 42/18542/8
Di Wang [Fri, 19 Feb 2016 09:29:26 +0000 (04:29 -0500)]
LU-7800 llog: check if next llog exists

The next llog is only checked for and created in the
declare phase, and access to the catllog is not
serialized across the whole declare_add and add
process. So if multiple threads access the catllog
at the same time and the llog creation does not
succeed, next_log (in the catllog) might be NULL,
which causes a panic in llog_cat_current_log().
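
A minimal sketch of the kind of check the title refers to, using
hypothetical stand-in types rather than the real llog structures. The
struct names, the function name and the -ENOENT return value are all
assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the plain log and catalog handles. */
struct log_handle {
	int id;
};

struct cat_handle {
	struct log_handle *current_log;
	struct log_handle *next_log;	/* NULL if creation did not succeed */
};

/*
 * Switch to the next log only if it exists. Without the NULL check,
 * the caller would later dereference a NULL current_log.
 */
static int cat_switch_to_next(struct cat_handle *cat)
{
	if (cat->next_log == NULL)
		return -ENOENT;	/* assumed error code */

	cat->current_log = cat->next_log;
	cat->next_log = NULL;
	return 0;
}
```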

Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: I2343023c1f3109c077c98d78d3669377d95ed42f
Reviewed-on: http://review.whamcloud.com/18542
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7117 osp: set ptlrpc_request::rq_allow_replay properly 40/20940/15
Fan Yong [Wed, 15 Jun 2016 06:56:01 +0000 (14:56 +0800)]
LU-7117 osp: set ptlrpc_request::rq_allow_replay properly

In the ptlrpc layer, if ptlrpc_request::rq_allow_replay is set,
the RPC can be sent to the remote peer during the remote server's
recovery even if it is not a replay RPC. The flag is used for
sending RPCs when the current server and the remote server are
both in recovery.

On the other hand, abusing the flag causes trouble. For example,
consider DNE mode where MDT_m is in recovery and MDT_n is healthy.
At that time, a client can send a normal reint unlink RPC to MDT_n
to remove file_A (which resides on MDT_n) under dir_B (which
resides on MDT_m). In that case MDT_n needs to look up dir_B with
file_A's name, which means sending a lookup OUT RPC to MDT_m, but
before that it must lock dir_B with an LDLM_ENQUEUE RPC. Because
MDT_m is recovering and the LDLM_ENQUEUE RPC is not a replay, it
should be blocked until recovery is done on MDT_m. That is the
expected behavior. But if MDT_n (via OSP) sets
ptlrpc_request::rq_allow_replay improperly, the LDLM_ENQUEUE RPC
may be sent to MDT_m during its recovery and granted without
conflict. The subsequent lookup OUT RPC may then obtain stale
information from MDT_m if dir_B has NOT been recovered yet.

So ptlrpc_request::rq_allow_replay will be set during the current
MDT's recovery. On the other hand, several threads are involved in
recovery, such as target_recovery_thread and
lod_sub_recovery_thread. Because obd_device::obd_recovering is
controlled by target_recovery_thread, which is started later than
lod_sub_recovery_thread, checking only the obd_recovering flag does
not work in some cases. Other flags, obd_device::obd_replayable and
obd_device::obd_no_conn, must also be checked to distinguish
recovery-related RPCs properly.

So in the above case, the client-initiated unlink will be blocked on
MDT_n at the LDLM_ENQUEUE RPC until MDT_m's recovery is done.

Test-Parameters: mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs mdscount=2 mdtcount=4 testlist=replay-single,replay-single,replay-single
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Id9ac542751cc0042fba0a94166dfc57ace52dc69
Reviewed-on: http://review.whamcloud.com/20940
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-2805 tests: sanity -Remove test_184c from Always Except 04/20104/3
Saurabh Tandan [Wed, 8 Jun 2016 23:45:19 +0000 (16:45 -0700)]
LU-2805 tests: sanity -Remove test_184c from Always Except

Remove test_184c from the ALWAYS_EXCEPT list in
sanity.sh.

Test-Parameters: trivial mdtfilesystemtype=zfs ostfilesystemtype=zfs mdsfilesystemtype=zfs envdefinitions=SLOW=yes testlist=sanity,sanity,sanity,sanity
Signed-off-by: Saurabh Tandan <saurabh.tandan@intel.com>
Change-Id: Iba0b5b0ff2613e94c1cc90921face828605d05cb
Reviewed-on: http://review.whamcloud.com/20104
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoRevert "LU-7899 osd: batch EA updates" 78/21878/2
Oleg Drokin [Thu, 11 Aug 2016 07:13:29 +0000 (07:13 +0000)]
Revert "LU-7899 osd: batch EA updates"

Reverting this patch as it seems to be causing the OOM issues
documented in LU-8449 and there does not seem to be an
easy fix in sight.

This reverts commit 6cd79ab5860c59c2a640a9e8ca4ee86eec050b43.

Change-Id: I934af93d893b01dad7190471b6b1a7bdffb1b509
Reviewed-on: http://review.whamcloud.com/21878
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
4 years agoLU-8479 obdclass: Reserve some value for OBD_FAIL_* macros 66/21766/3
Yang Sheng [Fri, 5 Aug 2016 17:22:48 +0000 (01:22 +0800)]
LU-8479 obdclass: Reserve some value for OBD_FAIL_* macros

These values are used by other branches, so
reserve them in master for consistency.

Test-Parameters: trivial envdefinitions=ONLY=0 testlist=sanity

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: I3b262c34b3d86effeeeecb924092c2ffc8764c42
Reviewed-on: http://review.whamcloud.com/21766
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoRevert "LU-8383 build: Spec file cleanup after LU-5614" 77/21877/2
Oleg Drokin [Thu, 11 Aug 2016 06:18:57 +0000 (06:18 +0000)]
Revert "LU-8383 build: Spec file cleanup after LU-5614"

This patch appears to break SLES builds with:
error: Failed build dependencies:
kernel-syms is needed by lustre-2.8.56_23_ge8273a3-1.x86_64
make: *** [srpm] Error 1

This reverts commit 55836cd0e55eb1912911c6f195412c99852115aa.

Change-Id: I612e02431a7aafa4bb3daa7b3fb14a31e08175e3
Reviewed-on: http://review.whamcloud.com/21877
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6447 mdt: mdt_identity_upcall to not block with rwlock held 32/14432/5
Oleg Drokin [Fri, 10 Apr 2015 02:29:20 +0000 (22:29 -0400)]
LU-6447 mdt: mdt_identity_upcall to not block with rwlock held

mdt_identity_upcall is currently calling call_usermodehelper
with an rwlock held, which is a no-no since it allocates memory
and schedules. Just replace the rwlock with a rw_semaphore.

Change-Id: I7b063a4db47313fbae6241da7bcec2c397b8e8c4
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-on: http://review.whamcloud.com/14432
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
4 years agoLU-8371 llite: optimize atomic_open of negative dentry. 61/21161/4
Oleg Drokin [Wed, 6 Jul 2016 00:35:30 +0000 (20:35 -0400)]
LU-8371 llite: optimize atomic_open of negative dentry.

No point in talking to MDS in that case if we are not creating,
just return -ENOENT.
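
The shortcut above amounts to a simple predicate, sketched here with
hypothetical names (negative_dentry_open is not the llite function, and
the boolean argument stands in for the real dentry state):

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>

/*
 * Hypothetical sketch: decide whether an open of a cached negative
 * dentry can fail locally. Returns -ENOENT when no create was
 * requested, or 0 when the request still has to go to the MDS.
 */
static int negative_dentry_open(bool dentry_is_negative, int open_flags)
{
	if (dentry_is_negative && !(open_flags & O_CREAT))
		return -ENOENT;

	return 0;
}
```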

Change-Id: I15c00fdc841e5e9d4d1923b2353f7fdc5910d67b
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/21161
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
4 years agoLU-8471 obdclass: restore EXPORT_SYMBOL for lu_ref* functions 40/21640/3
Frank Zago [Tue, 2 Aug 2016 19:24:23 +0000 (14:24 -0500)]
LU-8471 obdclass: restore EXPORT_SYMBOL for lu_ref* functions

LU-5829 removed a lot of EXPORT_SYMBOL, including for the lu_ref_*
functions. As these functions are only compiled in when
--enable-lu_ref is passed to configure, the breakage was missed. This
patch restores the missing EXPORT_SYMBOLS that were present, except
for lu_ref_print and lu_ref_is_marker which are only used in the
obdclass module.

Test-Parameters: trivial
Signed-off-by: Frank Zago <fzago@cray.com>
Change-Id: I16e6065b75c568a18386c0f0a746484fdad38d6e
Reviewed-on: http://review.whamcloud.com/21640
Reviewed-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-8436 nodemap: MGS local MGC should not get_config 00/21500/3
Kit Westneat [Mon, 25 Jul 2016 19:45:25 +0000 (15:45 -0400)]
LU-8436 nodemap: MGS local MGC should not get_config

An MGC that is co-located with an MGS does not need to pull the
nodemap config as the MGS will manage it directly. Having the MGC
pull the config could lead to a race condition where new config
changes are overwritten by the old config.

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: Ief62f44dd3ef75abb704edee0e55f7f1b1334e42
Reviewed-on: http://review.whamcloud.com/21500
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-3289 gss: Fix for SK bulk HMACs 91/21491/4
Jeremy Filizetti [Tue, 19 Jul 2016 00:06:32 +0000 (20:06 -0400)]
LU-3289 gss: Fix for SK bulk HMACs

The original patches for SK failed to provide integrity for bulk RPCs.
They were missing the iov for the encrypted pages and only verified the
zeroed header portion.  In addition, bulk handling in SK needs to
account for the fact that all kiovs in the bulk descriptor are
populated, so the HMAC must limit the bytes to the number specified
in bd_nob, as encryption happens before the integrity calculation.

Signed-off-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Change-Id: I4bb2c62eff2a1c9391c0f2c8409db36257480d4e
Reviewed-on: http://review.whamcloud.com/21491
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-8415 tests: customise MPIRUN 31/21431/3
Elena Gryaznova [Tue, 19 Jul 2016 22:31:34 +0000 (01:31 +0300)]
LU-8415 tests: customise MPIRUN

Sometimes it is required to use the Hydra process
manager instead of mpirun.

Test-Parameters: trivial testlist=performance-sanity

Seagate-bug-id: MRP-3191
Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@seagate.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Change-Id: I49d6cf3e01214715f577f6364f2cf27ec1d70fd3
Reviewed-on: http://review.whamcloud.com/21431
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7732 ldlm: silence verbose "waking for gap" log messages 18/21418/3
Bob Glossman [Wed, 13 Jul 2016 16:56:50 +0000 (09:56 -0700)]
LU-7732 ldlm: silence verbose "waking for gap" log messages

Quiet down the very frequent and not very useful
"waking for gap in transno" error messages that fill up
the log under certain conditions.

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I8264a958fa030ef66a3752dbf6e58ba79130d4b7
Reviewed-on: http://review.whamcloud.com/21418
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
4 years agoLU-8387 test: fix check in sanity test_129 56/21256/6
Wang Shilong [Tue, 12 Jul 2016 07:56:03 +0000 (15:56 +0800)]
LU-8387 test: fix check in sanity test_129

[ $has_warning ] always returns 0, which makes this check
useless. Fix it.

Test-Parameters: trivial

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I74f920a0940516230c7d25f13b75ad354c3f8348
Reviewed-on: http://review.whamcloud.com/21256
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
4 years agoLU-8383 build: Spec file cleanup after LU-5614 08/21208/8
Dmitry Eremin [Mon, 11 Jul 2016 15:51:46 +0000 (18:51 +0300)]
LU-8383 build: Spec file cleanup after LU-5614

Add dependency from kmod-%{lustre_name}-tests
Fix BuildRequires: %kernel_module_package_buildreqs

Test-Parameters: trivial

Change-Id: I92325687812f10fb308971391e67bb80c08ae5da
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: http://review.whamcloud.com/21208
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6449 mdt: broadcast orphan hsm_remove requests 91/20991/6
Bruno Faccini [Mon, 27 Jun 2016 09:25:20 +0000 (11:25 +0200)]
LU-6449 mdt: broadcast orphan hsm_remove requests

If a hsm_remove request is received for an unlinked file with
no/0 archive_id specified and no Agent/CT has registered to serve
all archive_ids, broadcast the request once to each registered
archive_id.

Also created specific sanity-hsm/test_29d sub-test using
"lfs hsm_remove <fid>" capability introduced by LU-6494.

Test-Parameters: clientcount=4
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ice5acaa5116dc036d5a98d76368eba2023a29f49
Reviewed-on: http://review.whamcloud.com/20991
Reviewed-by: Frank Zago <fzago@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-8305 tests: add traces for sanity-sec 90/20990/6
Sebastien Buisson [Mon, 27 Jun 2016 08:58:09 +0000 (10:58 +0200)]
LU-8305 tests: add traces for sanity-sec

Add more traces in sanity-sec.sh to help debug test failures.
Also add traces in mdt_get_root() function.

Test-Parameters: trivial testlist=sanity-sec,sanity-sec,sanity-sec,sanity-sec
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I52291dc1b7c815eb4a845d4cd6a9c926122d107b
Reviewed-on: http://review.whamcloud.com/20990
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
4 years agoLU-5560 security: send file security context for creates 71/19971/29
John L. Hammond [Thu, 11 Feb 2016 14:36:37 +0000 (08:36 -0600)]
LU-5560 security: send file security context for creates

Send file security context to MDT along with create RPCs. This closes
the insecure window between creation and setting of the security
context that existed previously. It also avoids a potential LDLM hang
which arises from ll_create_it() when we send a MDS_SETXATTR RPC while
holding the lookup+layout lock returned from open.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I21415593b7dd362fecbb18cf90b1dc9fbf1c13db
Reviewed-on: http://review.whamcloud.com/19971
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7988 hsm: remove compound_id from mdt_hsm_add_actions proto 83/19583/14
Frank Zago [Fri, 8 Apr 2016 19:14:29 +0000 (15:14 -0400)]
LU-7988 hsm: remove compound_id from mdt_hsm_add_actions proto

mdt_hsm_request is the only caller of mdt_hsm_add_actions and it
doesn't care about compound_id, so make it a local variable, instead
of an output variable.

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: frank zago <fzago@cray.com>
Change-Id: Iabf89497aad2cdf6364c155290c6df44487d8039
Reviewed-on: http://review.whamcloud.com/19583
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Vinayak <vinayakswami.hariharmath@seagate.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7988 hsm: only browse entries of hsd.request when needed 81/19581/15
Frank Zago [Thu, 7 Apr 2016 16:59:52 +0000 (12:59 -0400)]
LU-7988 hsm: only browse entries of hsd.request when needed

In the coordinator callback, only the entries in hsd.request from 0 to
hsd.request_cnt-1 are used. So use that property instead of walking
the whole array. Consequently, the next entry available in hsd.request
is always at the hsd.request_cnt position, so use it instead of trying
to find an empty slot.

There is no need to reset hsd.request since hsd.request is already
reset in a for loop above. This saves a memset of 16 bytes per
request, each time the arbiter runs.
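
The slot handling described above can be sketched as follows. The
struct layout, the MAX_REQUESTS bound and the function names are
hypothetical; only the idea (entries 0 to request_cnt-1 are in use, so
the next free slot is at request_cnt) comes from the description.

```c
#include <stddef.h>

#define MAX_REQUESTS 8	/* hypothetical array capacity */

/* Hypothetical, simplified versions of the scan data structures. */
struct request_slot {
	size_t hal_sz;
};

struct scan_data {
	struct request_slot request[MAX_REQUESTS];
	int max_requests;
	int request_cnt;	/* entries 0..request_cnt-1 are in use */
};

/* The next free slot is always at request_cnt: no search is needed. */
static struct request_slot *next_slot(struct scan_data *hsd)
{
	if (hsd->request_cnt >= hsd->max_requests)
		return NULL;

	return &hsd->request[hsd->request_cnt++];
}

/* Walk only the used entries instead of the whole array. */
static size_t total_hal_size(const struct scan_data *hsd)
{
	size_t total = 0;

	for (int i = 0; i < hsd->request_cnt; i++)
		total += hsd->request[i].hal_sz;

	return total;
}
```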

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: frank zago <fzago@cray.com>
Change-Id: I2be0fe5ce918ded028bb260ef345a859b2cc41d4
Reviewed-on: http://review.whamcloud.com/19581
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Vinayak <vinayakswami.hariharmath@seagate.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7988 hsm: Fix possible out of bounds reference in message 79/19579/14
Frank Zago [Mon, 11 Apr 2016 19:24:02 +0000 (15:24 -0400)]
LU-7988 hsm: Fix possible out of bounds reference in message

In the "Cannot allocate memory" error message, request[i].hal_sz could
be out of bounds if the value of i was hsd->max_requests, which is
likely. The short fix would have been to use
request[empty_slot].hal_sz. Instead use a local variable for the
current request. This also reduces the size of the code.
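
A minimal sketch of the pattern, assuming a simplified request struct
and a hypothetical pick_request() helper (neither is the actual
coordinator code). After the search loop the index equals
max_requests, so the caller works with the returned pointer instead of
indexing the array again.

```c
#include <stddef.h>

/* Hypothetical, simplified request entry. */
struct hsm_request {
	unsigned long long compound_id;
	int hal_sz;
	int in_use;
};

/*
 * Return the entry already used for compound_id, else the first empty
 * entry, else NULL when the array is full.
 */
static struct hsm_request *pick_request(struct hsm_request *reqs,
					int max_requests,
					unsigned long long compound_id)
{
	struct hsm_request *empty = NULL;
	int i;

	for (i = 0; i < max_requests; i++) {
		if (reqs[i].in_use && reqs[i].compound_id == compound_id)
			return &reqs[i];
		if (!reqs[i].in_use && empty == NULL)
			empty = &reqs[i];
	}

	/*
	 * Here i == max_requests, so reqs[i] would be one past the end.
	 * Error messages should use the returned pointer, never reqs[i].
	 */
	return empty;
}
```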

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: frank zago <fzago@cray.com>
Change-Id: I324087117df284dacea25774ebd9d4ed04794cbc
Reviewed-on: http://review.whamcloud.com/19579
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Vinayak <vinayakswami.hariharmath@seagate.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7669 lmv: assume a real connection in lmv_connect() 18/18018/5
John L. Hammond [Fri, 15 Jan 2016 17:14:12 +0000 (11:14 -0600)]
LU-7669 lmv: assume a real connection in lmv_connect()

Assume a real connection in lmv_connect(). Mark OBD_CONNECT_REAL
obsolete. Remove the then unnecessary refcount and exp members of
struct lmv_obd. Remove calls to lmv_check_connect(). Disconnect the
export in the appropriate error path of lmv_connect().

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I6d4f449506020964afd7f012983bfb2339429e0f
Reviewed-on: http://review.whamcloud.com/18018
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
4 years agoLU-5092 nodemap: save nodemaps to targets for caching 03/17503/26
Kit Westneat [Mon, 11 Jul 2016 15:28:08 +0000 (11:28 -0400)]
LU-5092 nodemap: save nodemaps to targets for caching

Modify nodemap config storage to save config to targets as well as
MGSes. This allows targets to start with the last received nodemap
configuration even if the MGS is not available. The config is
replaced by the MGS' config the next time the MGS is available.

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: I1c5221815618fe0265908bfd900ba55f44d1021b
Reviewed-on: http://review.whamcloud.com/17503
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7311 osd: smp_mb__before_clear_bit deprecated since kernel 3.16 91/16891/11
frank zago [Mon, 1 Aug 2016 16:05:55 +0000 (12:05 -0400)]
LU-7311 osd: smp_mb__before_clear_bit deprecated since kernel 3.16

smp_mb__before_clear_bit() was deprecated in kernel 3.16 and removed
in kernel 3.18, and was replaced by smp_mb__before_atomic(). To fix,
use clear_bit_unlock which does the old smp_mb__before_clear_bit +
clear_bit.
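
As a userspace analogue only (this is C11 atomics, not the kernel API),
clearing a bit with release ordering looks roughly like the sketch
below. The function names are hypothetical, and C11 release ordering is
not claimed to be identical to the kernel barrier it stands in for.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Clear one bit with release ordering, so that writes made before the
 * clear are visible to a thread that later observes the bit as clear
 * with acquire ordering.
 */
static void clear_bit_release(atomic_ulong *word, unsigned int bit)
{
	atomic_fetch_and_explicit(word, ~(1UL << bit), memory_order_release);
}

/* Read one bit with acquire ordering, pairing with the release above. */
static bool test_bit_acquire(atomic_ulong *word, unsigned int bit)
{
	return (atomic_load_explicit(word, memory_order_acquire) >> bit) & 1UL;
}
```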

Signed-off-by: frank zago <fzago@cray.com>
Change-Id: I970807ee4c1d91ddda4011ffb22bbe8af0a7764b
Reviewed-on: http://review.whamcloud.com/16891
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-8248 tests: fix sanity test_248 VM checks 98/20698/4
Andreas Dilger [Wed, 8 Jun 2016 23:33:31 +0000 (17:33 -0600)]
LU-8248 tests: fix sanity test_248 VM checks

Skip test on clients that do not have fast_read support.
If running in any VM, not just kvm, ignore perf test failures.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I09fbdb58e71d042c57b6bac1f6f1ef82243ebbe5
Reviewed-on: http://review.whamcloud.com/20698
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7613 llite: changes to avoid cache corruption 32/17732/5
Lokesh Nagappa Jaliminche [Wed, 10 Feb 2016 14:00:44 +0000 (19:30 +0530)]
LU-7613 llite: changes to avoid cache corruption

ll_find_alias is responsible for finding an alias for an inode
that can be reused. Directories are assumed to have a unique
alias, whereas non-directories can have multiple aliases.
In Lustre there can be two types of aliases, i.e. discon_alias
and invalid_alias. Using discon_alias for non-directories may
corrupt the dcache and lead to a kernel crash. Change the code
to avoid using discon_alias for non-directories.

Seagate-bug-id: MRP-2739, MRP-3601
Change-Id: Ieb9dabab0784bdb3c52e2cb32be1a766ffe55313
Signed-off-by: Lokesh Nagappa Jaliminche <lokesh.jaliminche@seagate.com>
Reviewed-by: Ujjwal Lanjewar <ujjwal.lanjewar@seagate.com>
Reviewed-by: Ashish Purkar <ashish.purkar@seagate.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Tested-by: Parinay Vijayprakash Kondekar <parinay.kondekar@seagate.com>
Reviewed-on: http://review.whamcloud.com/17732
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-4039 tests: enable test_90 for replay-single 36/21736/4
Yang Sheng [Fri, 5 Aug 2016 22:54:58 +0000 (15:54 -0700)]
LU-4039 tests: enable test_90 for replay-single

Enable test_90 since the issue cannot be reproduced.

Test-Parameters: trivial envdefinitions=ONLY=90 testlist=replay-single

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: Icaddf3835373c836da6f2ae5eebfb3eb0e12540a
Reviewed-on: http://review.whamcloud.com/21736
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>