Whamcloud - gitweb
fs/lustre-release.git
3 years ago LU-11094 osd-ldiskfs: Fix style issues for osd_quota.c 24/32724/6
Arshad Hussain [Sun, 24 Jun 2018 04:30:57 +0000 (10:00 +0530)]
LU-11094 osd-ldiskfs: Fix style issues for osd_quota.c

This patch fixes issues reported by checkpatch
for file lustre/osd-ldiskfs/osd_quota.c

Change-Id: I1a01c3e6327ec56a1ffcf85c5d06934a5f8e8c54
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/32724
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
3 years ago LU-11087 osd-ldiskfs: Fix style issues for osd_compat.c 09/32709/5
Arshad Hussain [Tue, 12 Jun 2018 15:51:11 +0000 (21:21 +0530)]
LU-11087 osd-ldiskfs: Fix style issues for osd_compat.c

This patch fixes issues reported by checkpatch for
file lustre/osd-ldiskfs/osd_compat.c

Test-Parameters: trivial
Change-Id: Ifa5ea5563fc7e5b5e94ea992e602979dea20eb9f
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/32709
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years ago LU-11032 hsm: memory leak in mdt_hsm_cdt_cleanup 56/32456/3
Qian Yingjin [Fri, 18 May 2018 08:55:32 +0000 (16:55 +0800)]
LU-11032 hsm: memory leak in mdt_hsm_cdt_cleanup

Release the allocated memory for the archive id in mdt_hsm_cdt_cleanup
when freeing the hsm_agent data structure, avoiding a memory leak.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I40e5fd289419d7c18d5f2c3ebe0d3955229f5517
Reviewed-on: https://review.whamcloud.com/32456
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-11022 lfs: accept specifying comp_id in mirror split 55/32455/2
Bobi Jam [Fri, 18 May 2018 05:15:11 +0000 (13:15 +0800)]
LU-11022 lfs: accept specifying comp_id in mirror split

This patch enables "lfs mirror split" to accept --component-id
specifying a mirror containing the designated component in mirror
splitting.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I02bf4d75013341d99d95852cb7fb0fbbb41c7a4d
Reviewed-on: https://review.whamcloud.com/32455
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-11027 doc: Add lockahead to llapi_ladvise man 37/32437/4
Patrick Farrell [Thu, 17 May 2018 10:31:47 +0000 (05:31 -0500)]
LU-11027 doc: Add lockahead to llapi_ladvise man

Document lockahead in the llapi_ladvise man page.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: Ia709611bb2751a408e3525c538daa824b365b09c
Reviewed-on: https://review.whamcloud.com/32437
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-10970 tests: make sure write is complete 03/32203/5
Patrick Farrell [Mon, 30 Apr 2018 12:10:38 +0000 (07:10 -0500)]
LU-10970 tests: make sure write is complete

The current test does not guarantee the write has arrived
on the server before dropping caches and checking memory
usage.  If the write is still in progress, the baseline
memory used value will be incorrect.

Sync on the client to force the write out.

Test-Parameters: trivial

Cray-bug-id: LUS-5923
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: Ic0379ffdfd14ff630d65a0197a99fba929868e9c
Reviewed-on: https://review.whamcloud.com/32203
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
3 years ago LU-11120 test: add compilebench and DNE tests 49/31749/12
Mikhail Pershin [Fri, 23 Mar 2018 09:35:10 +0000 (12:35 +0300)]
LU-11120 test: add compilebench and DNE tests

Add more tests in dom-performance.sh
- add compilebench run
- add default DOM+DNE run

Test-Parameters: trivial mdtcount=2 mdscount=2 mdssizegb=20 testlist=dom-performance
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: Id93c17157dba4887d250cd933d7a1fae5906af1b
Reviewed-on: https://review.whamcloud.com/31749
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-8066 osp: migrate from proc to sysfs 77/32377/10
James Simmons [Wed, 11 Jul 2018 17:27:11 +0000 (13:27 -0400)]
LU-8066 osp: migrate from proc to sysfs

Move the osp module from using proc for most single value files
to sysfs. Create the default attrs for dt_devices which can be
used for other server side devices.

Change-Id: I51fef51287585b38a1aff80d8edf986583c54a14
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32377
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-10683 osd_zfs: set offset in page correctly 88/32788/2
Hongchao Zhang [Thu, 5 Jul 2018 11:44:38 +0000 (07:44 -0400)]
LU-10683 osd_zfs: set offset in page correctly

In osd_bufs_get_write, the offset in the first page should
be calculated from the offset parameter instead of being set to zero.
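
As a rough illustration of the calculation described above, the following Python sketch splits a byte range into per-page buffers. It is not the osd_bufs_get_write() code: the function name `split_into_page_bufs` and the 4 KiB `PAGE_SIZE` are assumptions made for this example. It only shows that the offset inside the first page is derived from the `offset` argument rather than being zero.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def split_into_page_bufs(offset, length):
    """Split [offset, offset + length) into (page_index, page_offset, nbytes)."""
    bufs = []
    while length > 0:
        page_offset = offset % PAGE_SIZE           # non-zero for an unaligned start
        nbytes = min(PAGE_SIZE - page_offset, length)
        bufs.append((offset // PAGE_SIZE, page_offset, nbytes))
        offset += nbytes
        length -= nbytes
    return bufs
```

For a write of 5000 bytes starting at offset 5000, the first buffer starts 904 bytes into page 1, and only the following page starts at offset 0.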

Change-Id: I6592d8b5b0162b92953d59e2662a4381ba3e89ba
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/32788
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-5638 tests: resume running sanity-quota tests 94/32694/2
James Nunez [Mon, 11 Jun 2018 15:58:31 +0000 (09:58 -0600)]
LU-5638 tests: resume running sanity-quota tests

sanity-quota tests 11 and 33 were not run due to the
issues documented in LU-5638. A patch, commit id
a046e879fcadd601c9a19fd906f82ecbd2d4efd5, landed to fix
this issue. We should resume running sanity-quota
tests 11 and 33 for ZFS servers.

Test-Parameters: trivial clientcount=2 mdscount=2 mdtcount=4 osscount=1 ostcount=8 mdtfilesystemtype=zfs ostfilesystemtype=zfs testlist=sanity-quota
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Iadb1356a0a6b4f5a8b5f54275db794f0ddbb5af6
Reviewed-on: https://review.whamcloud.com/32694
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-8708 osc: depart grant shrinking from pinger 02/23202/12
Bobi Jam [Mon, 17 Oct 2016 06:36:31 +0000 (14:36 +0800)]
LU-8708 osc: depart grant shrinking from pinger

* Move the grant shrinking code out of the pinger and use a workqueue
  to handle the grant shrinking timer.
* Enable OSC grant shrinking by default.

bugzilla: 19507

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ifb03c907ad285a307d37d707193cfc32998ba2b2
Reviewed-on: https://review.whamcloud.com/23202
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
3 years ago New tag 2.11.53 2.11.53 v2_11_53 v2_11_53_0
Oleg Drokin [Tue, 24 Jul 2018 03:58:13 +0000 (23:58 -0400)]
New tag 2.11.53

Change-Id: I02c52e58bd01f54d55a9083a2d1a12f6e811eaf1

3 years ago LU-11132 compile: fix LC_BI_BDEV for old kernels 99/32799/2
Vladimir Saveliev [Thu, 12 Jul 2018 19:45:11 +0000 (22:45 +0300)]
LU-11132 compile: fix LC_BI_BDEV for old kernels

struct bio is located in linux/bio.h in the 2.6 kernel series. LC_BI_BDEV
uses linux/blk_types.h. That makes the configuration check fail for
those kernels and breaks compiling.

Use linux/bio.h in LC_BI_BDEV so that it works for both new and old
kernels.

Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Change-Id: Iaeefea9ba96ebe4dad30acedb5fa7551c4516241
Reviewed-on: https://review.whamcloud.com/32799
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
3 years ago LU-11161 tests: stop running sanity test 160g 44/32844/2
James Nunez [Thu, 19 Jul 2018 22:36:27 +0000 (16:36 -0600)]
LU-11161 tests: stop running sanity test 160g

When run with two or more MDSs, sanity test 160g will fail
due to expecting a changelog user being deregistered on
all MDSs.

In order to stop sanity 160g from failing, add it to the
ALWAYS_EXCEPT list when running in a DNE environment which
results in the test not being executed.

Test-Parameters: trivial
Test-Parameters: testlist=sanity mdtcount=2 mdscount=2
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I091f148a3da820cad0103aead559a96c54c9fe8b
Reviewed-on: https://review.whamcloud.com/32844
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-11157 obd: keep dirty_max_pages a round number of MB 31/32831/4
John L. Hammond [Wed, 18 Jul 2018 20:47:25 +0000 (15:47 -0500)]
LU-11157 obd: keep dirty_max_pages a round number of MB

In client_adjust_max_dirty() ensure that the dirty pages limit is
always divisible by 256 so that it may faithfully be represented in MB
as is the case when the max_dirty_mb parameters are used.
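
The arithmetic behind the "divisible by 256" rule can be sketched in Python as follows. This is not the client_adjust_max_dirty() code: the helper names are hypothetical, a 4 KiB page size is assumed (so 256 pages make 1 MB), and rounding down is an assumption about the direction of the adjustment.

```python
PAGE_SIZE = 4096                        # assumed page size
PAGES_PER_MB = (1 << 20) // PAGE_SIZE   # 256 pages per MB with 4 KiB pages


def round_to_whole_mb(pages):
    """Round a page count down to a multiple of PAGES_PER_MB (assumed direction)."""
    return pages - pages % PAGES_PER_MB


def pages_to_mb(pages):
    return pages // PAGES_PER_MB


def mb_to_pages(mb):
    return mb * PAGES_PER_MB
```

A limit of 1000 pages cannot survive a round trip through MB (it comes back as 768), whereas a limit that is already a multiple of 256 does.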

Test-Parameters: trivial

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I8e2fbdd4bf253a46e2951e7840484ab6a617fbe2
Reviewed-on: https://review.whamcloud.com/32831
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
3 years ago LU-11074 mdc: set correct body eadatasize for getxattr() 39/32739/3
John L. Hammond [Fri, 29 Jun 2018 21:11:45 +0000 (16:11 -0500)]
LU-11074 mdc: set correct body eadatasize for getxattr()

In mdc_intent_getxattr_pack() set mbo_eadatasize to the size of the
xattr values buffer rather than the size of the xattr names buffer.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ibbed6aba6718f50eed1a08d506d526b1e0e042c8
Reviewed-on: https://review.whamcloud.com/32739
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-11097 utils: add libuuid for llverdev 26/32726/2
Alex Zhuravlev [Sun, 24 Jun 2018 19:00:21 +0000 (22:00 +0300)]
LU-11097 utils: add libuuid for llverdev

this is explicitly required on my setup

Change-Id: I2b518c922d1857411bac74f68223259bb255e0e4
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/32726
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
3 years ago LU-11131 target: keep reply data bit set on failover 98/32798/2
Vladimir Saveliev [Thu, 12 Jul 2018 20:27:24 +0000 (23:27 +0300)]
LU-11131 target: keep reply data bit set on failover

The following scenario leads to failure of a recent reint rpc:

1. mdt server has a number of rpcs being handled: rpc 1 from client A
and rpc 2 from client B.

2. shutdown of the server starts

3. rpc 1 is processed, reply data is added, but client A gets ENODEV
in reply (ptlrpc_send_reply()) as shutdown is running

4. shutdown reaches class_disconnect_exports() and links export A
to the list of zombie exports

5. obd_zombid thread wakes up and destroys export A, which includes
freeing the reply data list and clearing bits in
lut->lut_reply_bitmap (tgt_free_reply_data())

6. export B is still processing rpc 2 and looks for a free bit in
lut->lut_reply_bitmap to store reply data
(tgt_add_reply_data()). If it finds a bit which has just been freed by
the obd_zombid thread, then reply data from export A will get overwritten
in the reply_data file with reply data from export B

7. after failover, reply data gets restored with
tgt_reply_data_init(). The reply data of client A is missing

8. client A reconnects and resends its rpc 1. Server does not find
reply data and processes the rpc as if it has not been seen yet. In
case of unlink, the directory entry no longer exists, so rpc 1
fails

The fix is to not free bits in lut->lut_reply_bitmap in case of
failover.
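
The scenario above can be modelled with a small Python toy. It is a sketch under stated assumptions, not the target code: `ReplyTable`, `destroy_export` and the `keep_bits_on_failover` switch are hypothetical names, and the on-disk reply_data file is reduced to a dictionary keyed by bitmap slot.

```python
class ReplyTable:
    """Toy model of a reply-data bitmap plus the slots persisted on disk."""

    def __init__(self, nslots):
        self.bitmap = [False] * nslots   # True means the slot is in use
        self.on_disk = {}                # slot -> (client, reply data)

    def add_reply(self, client, data):
        slot = self.bitmap.index(False)  # first free bit
        self.bitmap[slot] = True
        self.on_disk[slot] = (client, data)
        return slot

    def destroy_export(self, client, failover, keep_bits_on_failover):
        for slot, (owner, _) in self.on_disk.items():
            if owner != client:
                continue
            if failover and keep_bits_on_failover:
                continue                 # the fix: leave the bit set
            self.bitmap[slot] = False    # slot may now be reused

    def restore_after_failover(self):
        """Return what recovery would find on disk, per client."""
        return {owner: data for owner, data in self.on_disk.values()}


def run_scenario(keep_bits_on_failover):
    table = ReplyTable(nslots=4)
    table.add_reply("A", "rpc 1")
    table.destroy_export("A", failover=True,
                         keep_bits_on_failover=keep_bits_on_failover)
    table.add_reply("B", "rpc 2")        # may reuse the slot of client A
    return table.restore_after_failover()
```

With the bits cleared, client B reuses the slot of client A and the reply for rpc 1 is gone after recovery. With the bits kept, both replies survive.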

Test illustrating the issue is added.

Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Cray-bug-id: LUS-6004
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Change-Id: I6db3728f3271ce2751fbe08dadca365eb2ffe727
Reviewed-on: https://review.whamcloud.com/32798
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-11099 doc: include "-N" option to lfs_setstripe.1 34/32734/2
Emoly Liu [Wed, 27 Jun 2018 04:18:57 +0000 (12:18 +0800)]
LU-11099 doc: include "-N" option to lfs_setstripe.1

This patch includes mirror option "-N[mirror_count]" to
lfs_setstripe.1 man page so that the user can follow the manual
to create a mirrored file or set a default mirror layout on a
directory correctly.
The command format is like:
$lfs setstripe -N[mirror_count] [STRIPE_OPTIONS] <dir|filename>

Test-Parameters: trivial

Change-Id: If0fabd79d218e5582f9c64336f60466f35dbd968
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/32734
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
3 years ago LU-11098 ptlrpc: ASSERTION(!list_empty(imp->imp_replay_cursor)) 27/32727/2
Andriy Skulysh [Mon, 4 Jun 2018 16:08:29 +0000 (19:08 +0300)]
LU-11098 ptlrpc: ASSERTION(!list_empty(imp->imp_replay_cursor))

It's ptlrpc_replay_next() vs close race.
ll_close_inode_openhandle() calls
mdc_free_open()->ptlrpc_request_committed->ptlrpc_free_request

Need to reset imp_replay_cursor while dropping a request from
replay list.

Change-Id: Ia0ce327a729f8cf554b008ab6d32323b5dd26ee7
Cray-bug-id: LUS-2455
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/32727
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-8066 llite: replace ll_process_config with class_modify_config 22/32722/4
James Simmons [Tue, 3 Jul 2018 00:35:05 +0000 (20:35 -0400)]
LU-8066 llite: replace ll_process_config with class_modify_config

The current method of handling tunables with ll_process_config cannot
work with sysfs. So replace ll_process_config handling with
class_modify_config() which can handle sysfs, debugfs and procfs.

Change-Id: I7ef5a4b1ee47827711a9d6654fda279abde06268
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32722
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-8066 osc: fix idle_timeout handling 19/32719/4
James Simmons [Thu, 14 Jun 2018 16:53:29 +0000 (12:53 -0400)]
LU-8066 osc: fix idle_timeout handling

The patch that landed for LU-7236 introduced new sysfs entries
which were implemented incorrectly.

1) For idle_timeout it returns -ERANGE for
   any value passed in except setting idle_timeout to zero. This
   does not match what the commit message said for LU-7236. So
   I changed lprocfs_str_with_units_to_s64() into kstrtouint()
   since a signed 64 bit timeout is not needed. Using kstrtouint()
   ensures that negative values are not possible. Also cap the
   value to CONNECTION_SWITCH_MAX since the max of 4 billion
   seconds is overkill.

2) The next procfs file, idle_connect, is really a write-only file
   but it was treated as both read and write. There is no need for
   the osc_idle_connect_seq_show() function.

3) Lastly, no more stuffing new entries into proc or debugfs. For
   this patch convert these new proc entries to sysfs. It seems
   to be a common occurrence, so add LPROC_SEQ_* to spelling.txt
   so checkpatch will complain about using LPROC_SEQ_*, which will
   go away.
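
The idle_timeout parsing described in item 1 can be approximated by the Python sketch below. It only mimics the intent (unsigned parse, then an upper bound); `parse_idle_timeout` and its `limit` parameter are hypothetical, the real limit is CONNECTION_SWITCH_MAX, and rejecting an over-limit value instead of clamping it is an assumption.

```python
UINT_MAX = 0xFFFFFFFF  # largest value an unsigned 32-bit parse can return


def parse_idle_timeout(text, limit):
    """Parse an unsigned decimal timeout in seconds and enforce an upper bound."""
    digits = text.strip()               # tolerate the trailing newline from a write
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("not an unsigned decimal integer")  # e.g. "-1", "", "1.5"
    value = int(digits)
    if value > UINT_MAX or value > limit:
        raise OverflowError("idle_timeout out of range")     # assumed: reject, not clamp
    return value
```

Zero stays valid under this check, which matches the message's point that zero was previously the only value that did not return -ERANGE.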

Change-Id: I1c992b2db47aade6a887919824d869e8d5354c71
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32719
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-10855 ptlrpc: remove obsolete LLOG_ORIGIN_* RPCs 54/32654/2
Andreas Dilger [Wed, 6 Jun 2018 22:21:51 +0000 (16:21 -0600)]
LU-10855 ptlrpc: remove obsolete LLOG_ORIGIN_* RPCs

Remove the obsolete RPC opcodes LLOG_ORIGIN_HANDLE_WRITE_REC,
LLOG_ORIGIN_HANDLE_CLOSE, LLOG_ORIGIN_CONNECT, LLOG_CATINFO
along with their unused OBD_FAIL counterparts.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I5a2a15bc0dc9e09d0081b6c3aa291fc7713ebbe5
Reviewed-on: https://review.whamcloud.com/32654
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-10855 ptlrpc: assign specific values to MGS opcodes 53/32653/2
Andreas Dilger [Wed, 6 Jun 2018 22:18:03 +0000 (16:18 -0600)]
LU-10855 ptlrpc: assign specific values to MGS opcodes

Assign specific values to all of the MGS opcodes in enum mgs_cmd
so that these values do not change if a new item is added or one
is removed in the future.  These opcodes are part of the wire
protocol and need to remain constant.
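
Why explicit values matter for a wire protocol can be shown with a short Python sketch. The opcode names and numbers here are invented for illustration and are not the values in enum mgs_cmd; `implicit_values` simply numbers entries by position, the way an enum without explicit initializers does.

```python
def implicit_values(names, first):
    """Number opcodes by position, as an enum without explicit values does."""
    return {name: first + index for index, name in enumerate(names)}


# Hypothetical opcode list before and after removing an obsolete entry.
before = implicit_values(["OP_CONNECT", "OP_OBSOLETE", "OP_SET_INFO"], first=100)
after = implicit_values(["OP_CONNECT", "OP_SET_INFO"], first=100)

# With explicit values, removing OP_OBSOLETE leaves the others untouched.
EXPLICIT = {"OP_CONNECT": 100, "OP_SET_INFO": 102}
```

In the positional numbering, `OP_SET_INFO` silently moves from 102 to 101 once the obsolete entry is removed, so old and new peers would disagree about its meaning. The explicit table keeps 102.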

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I8132ca01916cd657933d0c8864e4e78f8b3ebbe5
Reviewed-on: https://review.whamcloud.com/32653
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-10855 ptlrpc: remove obsolete OBD RPC opcodes 51/32651/3
Andreas Dilger [Wed, 6 Jun 2018 20:41:13 +0000 (14:41 -0600)]
LU-10855 ptlrpc: remove obsolete OBD RPC opcodes

Remove the obsolete OBD_LOG_CANCEL (since Lustre 1.5) and
OBD_QC_CALLBACK (since Lustre 2.4) RPC opcodes.

Assign OBD_IDX_READ an explicit opcode (as should be done with all
enums in lustre_idl.h) so that the value does not change if some
prior field is removed.

Also remove the OBD_FAIL checks that were used to test them.
The setting in conf_sanity.sh test_58 was unused for many years.

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ie68c6be0da1c114fc981cb4b1afdcdb7c13ebbe5
Reviewed-on: https://review.whamcloud.com/32651
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-11052 obd: remove OBD ops based stats 02/32602/2
John L. Hammond [Fri, 25 May 2018 14:40:04 +0000 (09:40 -0500)]
LU-11052 obd: remove OBD ops based stats

Stats maintained via the OBD operations wrappers (obd_setup(),
obd_cleanup(), ...) are less and less interesting to the point that we
should remove them. The only stats files affected by this are
obdfilter.*.stats, obdfilter.*.exports.*.stats and
obdecho.*.stats. For obdfilter here is a comparison for two racer
runs. With the current OBD ops based stats:

obdfilter.lustre-OST0000.stats=
snapshot_time             1527267354.328068245 secs.nsecs
read_bytes                610 samples [bytes] 4096 4194304 800043008
write_bytes               2196 samples [bytes] 5 4194304 3410224606
setattr                   13545 samples [reqs]
punch                     7682 samples [reqs]
destroy                   2281 samples [reqs]
create                    74 samples [reqs]
statfs                    234 samples [reqs]
get_info                  1 samples [reqs]
connect                   3 samples [reqs]
disconnect                1 samples [reqs]
preprw                    2806 samples [reqs]
commitrw                  2806 samples [reqs]
ping                      422 samples [reqs]

And after the OBD ops based stats have been removed:

obdfilter.lustre-OST0000.stats=
snapshot_time             1527168813.867472974 secs.nsecs
read_bytes                200 samples [bytes] 4096 4194304 231366656
write_bytes               1703 samples [bytes] 5 4194304 1220864892
getattr                   337 samples [reqs]
setattr                   6358 samples [reqs]
punch                     2880 samples [reqs]
destroy                   2000 samples [reqs]
create                    71 samples [reqs]
statfs                    2148 samples [reqs]
get_info                  4 samples [reqs]

Changes to obdfilter.lustre-OST0000.exports.*.stats are similar.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: If4fb7022a3de0aa61905212eaab07b94c1687c68
Reviewed-on: https://review.whamcloud.com/32602
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Jesse Hanley <hanleyja@ornl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-9325 llog: replace simple_strtol with kstrtol 98/32598/3
James Simmons [Thu, 7 Jun 2018 00:52:17 +0000 (20:52 -0400)]
LU-9325 llog: replace simple_strtol with kstrtol

Eventually simple_strtol will be removed so replace its use in
the llog_ioctl code with kstrtoxxx() functions.
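
Beyond the planned removal, the two families differ in strictness: simple_strtol() stops at the first invalid character without reporting an error, while the kstrtoxxx() functions reject such input. The Python sketch below imitates that difference for base 10 only; `lenient_parse` and `strict_parse` are hypothetical stand-ins, not kernel APIs.

```python
import re


def lenient_parse(text):
    """simple_strtol-like: take the leading digits, ignore whatever follows."""
    match = re.match(r"[+-]?\d+", text)
    return int(match.group()) if match else 0


def strict_parse(text):
    """kstrtol-like: the whole string must be a number (one trailing newline allowed)."""
    if text.endswith("\n"):
        text = text[:-1]
    if not re.fullmatch(r"[+-]?\d+", text):
        raise ValueError("invalid integer: %r" % text)
    return int(text)
```

An input such as "12abc" is silently read as 12 by the lenient version and rejected by the strict one, which is the behaviour a caller of the ioctl code has to be prepared for after the conversion.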

Change-Id: I55a4e97837a1d9e0134dde92f0c2380f07691ab9
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32598
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
3 years ago LU-11045 test: use provided directory in racer/racer.sh 14/32514/2
John L. Hammond [Wed, 23 May 2018 15:03:46 +0000 (10:03 -0500)]
LU-11045 test: use provided directory in racer/racer.sh

In racer/racer.sh use the directory provided by the parent script
rather than the environmental variable $DIR.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Iab753c34752462a30e7263b7c304e1626e5cc343
Reviewed-on: https://review.whamcloud.com/32514
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-11044 osd-ldiskfs: ext4_dir_operations uses iterate_shared 86/32486/2
Chris Horn [Tue, 22 May 2018 14:39:14 +0000 (09:39 -0500)]
LU-11044 osd-ldiskfs: ext4_dir_operations uses iterate_shared

Linux 4.7 commit ae05327a00fd47c34dfe25294b359a3f3fef96e8 replaces
ext4_dir_operations iterate with iterate_shared. dir_relaxed_shared()
was also added in that commit, so we can use that function to verify
that the ext4_dir_operations is using iterate_shared.

Cray-bug-id: LUS-6008
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I67ff714296cab96408cb74fba62855c0e12cdf43
Reviewed-on: https://review.whamcloud.com/32486
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-11034 build: update changelog for Ubuntu 18.04 59/32459/3
Minh Diep [Fri, 18 May 2018 17:46:56 +0000 (10:46 -0700)]
LU-11034 build: update changelog for Ubuntu 18.04

Record the version that we are building

Test-Parameters: trivial

Change-Id: I78c4aa6ad9b1a85cd498709b76ec3111e9572b84
Signed-off-by: Minh Diep <minh.diep@intel.com>
Reviewed-on: https://review.whamcloud.com/32459
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years ago LU-11014 mdt: remove enum mdt_it_code 58/32358/3
John L. Hammond [Fri, 11 May 2018 14:52:45 +0000 (09:52 -0500)]
LU-11014 mdt: remove enum mdt_it_code

Remove enum mdt_it_code, struct mdt_it_flavor and the mdt_it_flavor
array. In mdt_intent_opc, collapse the switch statement followed by
array lookup into a single switch statement that assigns the intent
format, handler, and handler flags. Simplify the subsequent logic in
mdt_intent_opc() accordingly.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Id56fe5fa1bd4d4c03a8de2db9d39f571bed06b2f
Reviewed-on: https://review.whamcloud.com/32358
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-10990 osc: increase default max_dirty_mb to 2G 88/32288/5
Oleg Drokin [Fri, 4 May 2018 03:08:35 +0000 (23:08 -0400)]
LU-10990 osc: increase default max_dirty_mb to 2G

While ideally we want to move away from the max_dirty_mb setting
completely and let the grant code take over most of it,
Andreas raises a somewhat valid point that for certain
system configurations with high-latency links, system
administrators might want the ability to limit the
amount of dirty pages just for those OSCs to limit the amount
of time it might take to flush that dirty data.

So a good compromise is to raise the max_dirty_mb default
value first while we work out the current grant code
deficiencies.

Change-Id: I4de407088af70e0f98f0563160217ba70a635dfb
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: https://review.whamcloud.com/32288
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years ago LU-10986 lfs: make lfs project tolerate errors 43/32243/16
Wang Shilong [Wed, 2 May 2018 08:54:15 +0000 (16:54 +0800)]
LU-10986 lfs: make lfs project tolerate errors

This patch tries to fix the following problems:
1) Command hangs on a pipe file, reproduced by the following steps:
 $ mkfifo tmp/pipe
 $ lfs project -srp 500 tmp -->this will never finish.

The problem is that opening a pipe file blocks by default
without the O_NONBLOCK or O_NDELAY flag.

2) If a symbolic link with a missing target exists, the command
returns an error and does not process the remaining entries.

We should fix this problem by allowing the command to continue
processing even if it hits some errors.

3) Fix a wrong check for MAX_PATH.
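
The first two problems can be reproduced and avoided in a few lines of Python on a POSIX system. This is only a sketch of the general technique, not the lfs code: `open_nonblocking`, `process_entries` and `demo` are hypothetical helpers. Opening a FIFO read-only with O_NONBLOCK returns immediately even when no writer exists, and per-entry errors are recorded instead of aborting the walk.

```python
import errno
import os
import tempfile


def open_nonblocking(path):
    """Open read-only without blocking on a FIFO that has no writer."""
    return os.open(path, os.O_RDONLY | os.O_NONBLOCK)


def process_entries(paths, action):
    """Apply action to every path; remember failures instead of stopping."""
    failures = []
    for path in paths:
        try:
            fd = open_nonblocking(path)
        except OSError as err:           # e.g. dangling symlink -> ENOENT
            failures.append((path, err.errno))
            continue
        try:
            action(fd)
        finally:
            os.close(fd)
    return failures


def demo():
    """Build a FIFO, a dangling symlink and a regular file, then walk them."""
    with tempfile.TemporaryDirectory() as top:
        fifo = os.path.join(top, "pipe")
        dangling = os.path.join(top, "dangling")
        regular = os.path.join(top, "file")
        os.mkfifo(fifo)
        os.symlink(os.path.join(top, "missing"), dangling)
        open(regular, "w").close()
        handled = []
        failures = process_entries([fifo, dangling, regular], handled.append)
        return len(handled), [(os.path.basename(p), e) for p, e in failures]
```

A caller would typically return a non-zero exit status at the end if the failure list is not empty, so that errors are still reported without cutting the run short.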

Test-Parameters: trivial testlist=sanity-quota,sanity-quota
Change-Id: I7d08a7547e6b1351a1eff23063da6cd9c4cdc5e3
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/32243
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-11086 test: reset quota setting properly 07/32707/3
Wang Shilong [Wed, 13 Jun 2018 14:12:16 +0000 (22:12 +0800)]
LU-11086 test: reset quota setting properly

Some test cases don't reset quota settings properly, which
makes running sanity-quota.sh several times fail. This patch
tries to improve this by:

1) Resetting quota settings before check_runas_id_ret, as it will
touch a file, which might hit EDQUOT if quota settings were not
cleaned up properly after the last run.

2) Fixing the quota reset for test cases 55 and 60.

3) Resetting quota settings again after all tests have finished, because
some tests after sanity-quota.sh might be affected if quota
settings are not reset properly for some reason.

Test-Parameters: trivial testlist=sanity-quota,sanity-quota
Change-Id: I2983102ea379e64173ef8c54b149ba3b5fbfebe9
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/32707
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-10734 tests: ensure current GC interval is over 04/31604/9
Bruno Faccini [Fri, 9 Mar 2018 01:59:51 +0000 (02:59 +0100)]
LU-10734 tests: ensure current GC interval is over

In sanity/test_160g, ensure the currently configured
"changelog_min_gc_interval=2" is over to allow the
GC thread to actually be started.

Also, enable Changelog GC, as it is no longer the
default, in the sanity/test_160g sub-test and remove
it from ALWAYS_EXCEPT to reenable it, leaving
160f excluded because of LU-10680.

sanity/test_160g has also been reworked to become
fully DNE aware.

Test-Parameters: trivial
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I8a079ba2ba1822b488f65ad9703204d6296fada0
Reviewed-on: https://review.whamcloud.com/31604
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-11079 llite: control concurrent statahead instances 90/32690/7
Fan Yong [Wed, 13 Jun 2018 14:33:55 +0000 (22:33 +0800)]
LU-11079 llite: control concurrent statahead instances

It was found that if there are too many concurrent statahead
instances, the related statahead RPCs may accumulate on the
client import (for MDT) RPC lists
(imp_sending_list/imp_delayed_list/imp_unreplied_list) and
seriously affect the efficiency of spin_lock when the MDT is
overloaded or in recovery. As a temporary solution,
restrict the number of concurrent statahead instances.

To support more concurrent statahead instances, please
consider decentralizing the RPC lists attached to the related import.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I7251cc536f11d184f768e3d3704ba6717644541e
Reviewed-on: https://review.whamcloud.com/32690
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-10893 tests: allow to disable dm-flakey layer 58/32658/4
Alexander Boyko [Thu, 7 Jun 2018 13:54:41 +0000 (09:54 -0400)]
LU-10893 tests: allow to disable dm-flakey layer

The patch 54b9e3f789358bd9dfb94b77fe33a4faa1e28ab2 adds the flakey layer
to the test framework. But it also adds a regression: you can't run tests
separately from a setup. Before dm-flakey, it was easy to create a
configuration at ncli, set up a cluster, and start a test. But now it
is impossible. For example:
sudo MDSDEV=/dev/sdb MDSDEV1=/dev/sdb sh lustre/tests/llmount.sh
sudo MDSDEV=/dev/sdb MDSDEV1=/dev/sdb ONLY=0 sh lustre/tests/conf-sanity.sh
Format mds1: /dev/sdb
mkfs.lustre FATAL: Unable to build fs /dev/sdb (256)
mkfs.lustre FATAL: mkfs failed 256

The fix disables the dm-flakey layer when FLAKEY=false is set.

Test-Parameters: envdefinitions=FLAKEY=false
Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: LUS-5851
Change-Id: I248be2307cff5fe6b4b2524478ca8e4cd96a77d2
Reviewed-on: https://review.whamcloud.com/32658
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-11064 lnd: determine gaps correctly 86/32586/4
Amir Shehata [Wed, 30 May 2018 20:22:11 +0000 (13:22 -0700)]
LU-11064 lnd: determine gaps correctly

We're allowed to start at a non-aligned page offset in the first
fragment and end at a non-aligned page offset in the last fragment.

When checking the iovec, exclude both the first and last fragments
from the tx_gaps check.
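
The rule can be sketched as follows. This is one reading of the
description above, not the LND code itself; struct frag,
frags_have_gaps() and the 4096-byte page size are assumptions made
for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FRAG_PAGE_SIZE 4096u    /* assumed page size for this sketch */

struct frag {
    unsigned int offset;    /* start offset within its page */
    unsigned int len;       /* bytes used in this fragment */
};

/*
 * Return true if the fragment list has a gap, i.e. an interior
 * boundary that is not page aligned. The start of the first fragment
 * and the end of the last fragment are exempt from the check.
 */
static bool frags_have_gaps(const struct frag *f, size_t n)
{
    assert(f != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        /* Only the first fragment may start inside a page. */
        if (i > 0 && f[i].offset != 0)
            return true;
        /* Only the last fragment may end inside a page. */
        if (i + 1 < n &&
            (f[i].offset + f[i].len) % FRAG_PAGE_SIZE != 0)
            return true;
    }
    return false;
}
```

A single fragment is both first and last, so it never reports a gap
regardless of its offset or length.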

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I8a9231db7db404a5d5a6294ff263c1bd2ac28e6c
Reviewed-on: https://review.whamcloud.com/32586
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-11117 ptlrpc: don't zero request handle 81/32781/4
Alexander Boyko [Fri, 15 Jun 2018 09:02:36 +0000 (05:02 -0400)]
LU-11117 ptlrpc: don't zero request handle

LNet can retransmit a request at any time if it has not been replied to.
ptlrpc_resend_req zeroes the request handle and ptlrpc_send_rpc
sets it. If a retransmission happens with a zeroed handle, the client
can't find a valid export by handle, sets rq_export to NULL, and
replies with ENOTCONN. The server evicts the client on this error.

client (nid x.x.x.x@tcp) returned error from blocking AST
(req status -107 rc -107), evict it

Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: LUS-6037
Change-Id: I198666d386fea99b46994f965c1519acb5743d75
Reviewed-on: https://review.whamcloud.com/32781
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-7816 quota: add default quota setting support 06/32306/16
Hongchao Zhang [Tue, 5 Jun 2018 22:23:42 +0000 (18:23 -0400)]
LU-7816 quota: add default quota setting support

This function is motivated by a similar one in GPFS; it is a friendly
feature that helps cluster administrators manage quota.

Lazy default quota setting support; here is the basic idea:

The default quota setting is a global quota setting for user, group,
and project quotas. If a default quota is set for one quota type,
newly created users/groups/projects inherit this setting
automatically. Since Lustre itself has no idea when new
users are created, it can only learn of them when these users try to
acquire space from Lustre.

So we implement lazy inheritance of the quota setting. The slave first
checks whether a default quota setting exists; if it does, the slave is
forced to acquire quota from the master. The master detects
whether the default quota is set, then sets this quota and also
returns the proper granted space to the slave.

To implement this and reuse the existing quota APIs, we manage
the default quota in the quota record of ID 0, and enforce the
quota check when reading the quota record from disk.

In the current Lustre implementation, the grace time is either
the time or the timestamp to be used after some quota ID exceeds
the soft limit, so 48 bits should be enough for it. Its high 16 bits
can be used as quota flags; this patch uses one of
them as the default quota flag.

The global quota record used by the default quota sets its soft
and hard limits to zero; its grace time contains the default flag.
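
The bit layout described above can be sketched like this. Only the
48/16 split comes from the text; the macro and function names, and the
choice of bit 0 of the flag word for the default flag, are assumptions
made for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; 48 bits of time and 16 bits of flags per the text. */
#define GRACE_BITS          48
#define GRACE_MASK          ((UINT64_C(1) << GRACE_BITS) - 1)
#define GRACE_FLAG_DEFAULT  UINT64_C(0x0001)    /* assumed flag bit */

/* Combine a grace time and a 16-bit flag word into one 64-bit field. */
static uint64_t grace_pack(uint64_t time, uint64_t flags)
{
    assert(time <= GRACE_MASK);
    assert(flags <= UINT64_C(0xffff));
    return (time & GRACE_MASK) | (flags << GRACE_BITS);
}

/* Low 48 bits: the grace time or timestamp. */
static uint64_t grace_time(uint64_t field)
{
    return field & GRACE_MASK;
}

/* High 16 bits: the quota flags. */
static uint64_t grace_flags(uint64_t field)
{
    return field >> GRACE_BITS;
}

/* True if this record is marked as using the default quota. */
static bool grace_is_default(uint64_t field)
{
    return (grace_flags(field) & GRACE_FLAG_DEFAULT) != 0;
}
```

Existing readers of the grace time would need to mask with GRACE_MASK
so that the flag bits are not mistaken for a very large time value.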

Use lfs setquota -U/-G/-P <mnt> to set default quota.
Use lfs setquota -u/-g/-p foo -d <mnt> to set foo to use default quota.
Use lfs quota -U/-G/-P <mnt> to show default quota.

Test-Parameters: envdefinitions=DEBUG_SIZE=64

Change-Id: Ib23007360921832b3c7d5710ab50324bc5067286
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/32306
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-11003 ldlm: don't add canceling lock back to LRU 92/32692/2
Mikhail Pershin [Mon, 11 Jun 2018 06:44:01 +0000 (09:44 +0300)]
LU-11003 ldlm: don't add canceling lock back to LRU

When a lock is converted, check that it is not canceling before
adding it back to the LRU.

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: I278389f2a23b304d812f82ffb2dcee2ca70f5b21
Reviewed-on: https://review.whamcloud.com/32692
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-11004 ptlrpc: Serialize procfs access to scp_hist_reqs using mutex 07/32307/2
Andriy Skulysh [Thu, 12 Apr 2018 13:12:05 +0000 (16:12 +0300)]
LU-11004 ptlrpc: Serialize procfs access to scp_hist_reqs using mutex

The scp_hist_reqs list can be quite long, so many userland
processes can waste CPU time spinning on the spinlock.

Change-Id: Ic0fa7338569f9a19213a1dc31f5479c96a76d23a
Cray-bug-id: LUS-5833
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-on: https://review.whamcloud.com/32307
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-10527 obdclass: don't recycle loghandle upon ENOSPC 97/30897/4
Bruno Faccini [Wed, 17 Jan 2018 15:22:58 +0000 (16:22 +0100)]
LU-10527 obdclass: don't recycle loghandle upon ENOSPC

In llog_cat_add_rec(), upon -ENOSPC error being returned from
llog_cat_new_log(), don't reset "cathandle->u.chd.chd_current_log"
to NULL.
This avoids having llog_cat_declare_add_rec() repeatedly
and unnecessarily create new, partially initialized LLOGs/llog_handle
and assign them to "cathandle->u.chd.chd_current_log" without
llog_init_handle() ever being called to initialize
"loghandle->lgh_hdr".

Also, an unnecessary LASSERT(llh) has been removed in
llog_cat_current_log(), as it prevented handling this
case gracefully by simply returning the loghandle.
Thanks to S.Cheremencev (Cray) for reporting this.

Both fixes have been kept in the patch, as the first allows for
better performance in terms of the number of FS operations done
under a permanent changelog ENOSPC condition, even if this covers
a somewhat unlikely situation.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I526f788dc283fa7136ba518179d9337e1d5e3714
Reviewed-on: https://review.whamcloud.com/30897
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-10175 ldlm: handle lock converts in cancel handler 14/32314/5
Mikhail Pershin [Mon, 7 May 2018 20:36:55 +0000 (23:36 +0300)]
LU-10175 ldlm: handle lock converts in cancel handler

- Use cancel portals and high-priority handling for lock
  converts. Update ldlm_cancel_handler to understand
  LDLM_CONVERT RPC for that.
- Use ns_dirty_age_limit for lock convert - don't convert too old
  locks.
- Check for empty converts and skip such

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: I767626acd974ad88bbbf0bb3b0a46744c45b7897
Reviewed-on: https://review.whamcloud.com/32314
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoRevert "LU-8066 llite: replace ll_process_config with class_modify_config" 21/32721/2
Oleg Drokin [Thu, 14 Jun 2018 18:08:55 +0000 (18:08 +0000)]
Revert "LU-8066 llite: replace ll_process_config with class_modify_config"

This patch was landed by mistake.

This reverts commit db67e686d9abcf750359820bfbdb754ab611bf5c.

Change-Id: I2cbfe808eb7d5c448bdf06d4c36229813e6978d2
Reviewed-on: https://review.whamcloud.com/32721
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-8066 llite: replace ll_process_config with class_modify_config 95/32495/5
James Simmons [Sat, 9 Jun 2018 14:16:59 +0000 (10:16 -0400)]
LU-8066 llite: replace ll_process_config with class_modify_config

The current method of handling tunables with ll_process_config can
not work with sysfs. So replace ll_process_config handling with
class_modify_config() which can handle sysfs, debugfs and procfs.

Change-Id: I40611930ab2b769c0661aa7dce0c7dd0f2d90204
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32495
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
3 years agoLU-10560 osd: bio_integrity_enabled was removed 21/32621/3
Li Dongyang [Tue, 5 Jun 2018 01:40:43 +0000 (11:40 +1000)]
LU-10560 osd: bio_integrity_enabled was removed

T10PI bio support patches used bio_integrity_enabled,
which is no longer available in recent kernels.
Fix this so we can have server support back on 4.13+
kernels.

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I32eeea244ad599c7af2d551b9b2b173e982d07d3
Reviewed-on: https://review.whamcloud.com/32621
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-11065 kernel: kernel update [SLES12 SP3 4.4.132-94.33] 99/32599/3
Bob Glossman [Thu, 31 May 2018 13:57:52 +0000 (06:57 -0700)]
LU-11065 kernel: kernel update [SLES12 SP3 4.4.132-94.33]

Update target, kernel_config, and ldiskfs files for the new version.
One ldiskfs patch revised for ext4 changes.
Old unchanged ldiskfs patch kept for use with sles12sp2.

Test-Parameters: clientdistro=sles12sp3 testgroup=review-ldiskfs \
  mdsdistro=sles12sp3 ossdistro=sles12sp3 \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Ic6d0219a7133825d1dba0b2bfadf8354442cddb3
Reviewed-on: https://review.whamcloud.com/32599
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-11051 obd: remove obd_{get,put}ref() 29/32529/3
John L. Hammond [Thu, 17 May 2018 16:36:23 +0000 (11:36 -0500)]
LU-11051 obd: remove obd_{get,put}ref()

obd_getref() and obd_putref() are only used in the lov layer and only
implemented by the lov layer. So they can be removed in favor of
direct calls. Rename lov_{get,put}ref() to lov_tgts_{get,put}ref()
since they do not manage references on the lov device but on its
targets array.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I0f48eaf4bb42b81b2155c599f361a17dd7bb1ae3
Reviewed-on: https://review.whamcloud.com/32529
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10921 utils: improve lfs setstripe error message 42/32442/4
Andreas Dilger [Wed, 16 May 2018 22:18:33 +0000 (16:18 -0600)]
LU-10921 utils: improve lfs setstripe error message

Improve the error messages when "lfs setstripe" or "lfs setdirstripe"
is run on an existing file/directory.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I3b21fb65847822c73713e9a26d6dea978b3cab07
Reviewed-on: https://review.whamcloud.com/32442
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10175 ptlrpc: add LOCK_CONVERT connection flag 93/32593/3
Mikhail Pershin [Sun, 20 May 2018 18:00:23 +0000 (21:00 +0300)]
LU-10175 ptlrpc: add LOCK_CONVERT connection flag

Add the LOCK_CONVERT connection flag so that the lock
convert feature is not used with old servers.

Test-Parameters: trivial
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: Ie860f43955314017609774d692f89cfe3c2ab896
Reviewed-on: https://review.whamcloud.com/32593
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
3 years agoLU-10963 gnilnd: stats variables overflow assert 84/32184/4
Chuck Fossen [Thu, 26 Apr 2018 20:04:25 +0000 (15:04 -0500)]
LU-10963 gnilnd: stats variables overflow assert

Reverse bte rdma transactions stats were being
incremented by kgnilnd_admin_addref() which asserts when the value
goes negative. These stats should be incremented with atomic_inc
instead.

Test-Parameters: trivial
Cray-bug-id: LUS-5940
Signed-off-by: Chuck Fossen <chuckf@cray.com>
Change-Id: I06426bc078cc76f14c7b3efb5f3ceb71054c2d09
Reviewed-on: https://review.whamcloud.com/32184
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-4423 ldlm: use delayed_work for ldlm_pools_recalc 05/31705/8
NeilBrown [Thu, 31 May 2018 16:44:57 +0000 (12:44 -0400)]
LU-4423 ldlm: use delayed_work for ldlm_pools_recalc

ldlm currently has a kthread which wakes up every so often and calls
ldlm_pools_recalc(). The thread is started and stopped, but no other
external interactions happen.

This can trivially be replaced by a delayed_work if we have
ldlm_pools_recalc() reschedule the work rather than just report when to
do that.

Change-Id: I85f8bc79ef86d1c7a6cbe159e6970445eb7f8389
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: https://review.whamcloud.com/31705
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10370 ofd: truncate does not update blocks count on client 73/31073/10
Arshad Hussain [Fri, 9 Feb 2018 19:11:51 +0000 (00:41 +0530)]
LU-10370 ofd: truncate does not update blocks count on client

The 'truncate' call correctly updates the size and blocks
count on the server side. However, on the client
side all the metadata is correctly updated except the
blocks count, which still reflects the old count from before
the truncate call. This patch fixes the issue by
modifying ofd_punch_hdl() to update repbody with the
updated blocks count.

A new test case is added under sanity to verify that
the blocks count is correctly updated after a truncate call.

Change-Id: I8f3f44e1668fab925339350074d1ad8ab681fc95
Co-authored-by: Abrarahmed Momin <abrar.momin@gmail.com>
Signed-off-by: Abrarahmed Momin <abrar.momin@gmail.com>
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/31073
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10120 lsnapshot: handle dash in fsname 26/30626/2
Fan Yong [Thu, 21 Dec 2017 12:09:30 +0000 (20:09 +0800)]
LU-10120 lsnapshot: handle dash in fsname

'-' is a valid character for Lustre fsname. Replace "strchr()"
with "strrchr()" to correctly parse fsname from configuration.
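
The difference can be illustrated with a small sketch. It assumes the
string has the form "<fsname>-<target>", for example "my-fs-MDT0000";
the helper name fsname_from_target() is hypothetical and is not the
function changed by this patch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the fsname part of a "<fsname>-<target>" string into buf.
 * The split is made at the LAST '-', so dashes inside the fsname
 * survive. Returns 0 on success, -1 if there is no '-' or buf is
 * too small.
 */
static int fsname_from_target(const char *name, char *buf, size_t buflen)
{
    const char *dash;
    size_t len;

    assert(name != NULL && buf != NULL);

    dash = strrchr(name, '-');
    if (dash == NULL)
        return -1;
    len = (size_t)(dash - name);
    if (len + 1 > buflen)
        return -1;
    memcpy(buf, name, len);
    buf[len] = '\0';
    return 0;
}
```

With strchr() the split would happen at the first dash, so
"my-fs-MDT0000" would yield "my" instead of "my-fs".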

Signed-off-by: Fan Yong <fan.yong@intel.com>
Signed-off-by: Darby Vicker <darby.vicker-1@nasa.gov>
Change-Id: Ib972288668f1b7bcf1f9188c0e9cc77027e7ceeb
Reviewed-on: https://review.whamcloud.com/30626
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
3 years agoLU-9751 snapshot: set PATH for remote zfs commands 99/27999/15
Fan Yong [Mon, 9 Apr 2018 14:55:14 +0000 (22:55 +0800)]
LU-9751 snapshot: set PATH for remote zfs commands

It is possible that the remote zfs/zpool commands for Lustre
snapshot are NOT in the remote shell's execute/search path, so
the PATH variable needs to be set for the remote shell commands.

It is inconvenient for the admin to specify the PATH option
via a single lsnapshot command for each Lustre target. So the
patch sets the remote PATH environment variable to the
local PATH environment variable. This requires all Lustre
servers to have a broadly consistent zfs tools installation in
that PATH.

It also contains some macro definitions for code cleanup.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I2b1ce630d4aad63ab20e6c323f2222dccb51ed6e
Reviewed-on: https://review.whamcloud.com/27999
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9764 lfsck: reset LFSCK trace file if fail to load it 97/27997/2
Fan Yong [Wed, 12 Jul 2017 04:50:41 +0000 (12:50 +0800)]
LU-9764 lfsck: reset LFSCK trace file if fail to load it

If the on-disk LFSCK trace file is corrupted, then LFSCK
may fail to load it. In that case, LFSCK
should forcibly reset (recreate) the trace files.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I0237a88ff23cdec680303ac3976a53c1632598fe
Reviewed-on: https://review.whamcloud.com/27997
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
3 years agoLU-10048 osd: async truncate 88/27488/43
Alex Zhuravlev [Wed, 7 Jun 2017 13:32:39 +0000 (17:32 +0400)]
LU-10048 osd: async truncate

osd-ldiskfs should execute truncate outside of main transaction
handle. This avoids restarting truncate transaction handles in
main transaction, and allows "transaction first, locking second"
model on OST.

Change-Id: Iffe45c42834c26ca72b65e068ad25ac61d0607c8
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/27488
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
3 years agoLU-6511 osd-ldiskfs: Fix all irregular indentation for osd_iam.c 98/19598/4
Parinay Kondekar [Sat, 26 May 2018 20:02:49 +0000 (01:32 +0530)]
LU-6511 osd-ldiskfs: Fix all irregular indentation for osd_iam.c

"osd_iam.c" had irregular and inconsistent indentation
throughout the file. This patch fixes all the indentation
and space warnings throughout the file. There are still a few
'checkpatch' errors/warnings left. However, to keep the patch
consistent, only spaces and indents are corrected in this patch.

Test-Parameters: trivial
Change-Id: I55f650175b7efc85f87f216d8225b0517e8a3d94
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Signed-off-by: Parinay Kondekar <parinay.kondekar@seagate.com>
Reviewed-on: https://review.whamcloud.com/19598
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
3 years agoLU-7236 ptlrpc: idle connections can disconnect 82/16682/123
Alex Zhuravlev [Mon, 28 Sep 2015 13:50:15 +0000 (16:50 +0300)]
LU-7236 ptlrpc: idle connections can disconnect

 - when new request is being allocated ptlrpc initiates
   connection if it's not connected yet
 - if the import is idle (no locks, no active RPCs, no
   non-PING reply for last osc_idle_timeout seconds),
   then pinger tries to disconnect asynchronously
 - currently only client-to-OST connections can be idle
 - lctl set_param osc.*.idle_timeout=N controls new feature:
   N=0 - disable
   N>0 - seconds to idle before disconnect
 - lctl set_param osc.*.idle_connect=N to reconnect if idle
   (N is positive number)
 - OSC module parameter osc_idle_timeout controls the default
   idle timeout and is set to 20 seconds by default

Change-Id: I4b90eb5209a0b0e62d85fd55ad6e9cab8c03fd14
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/16682
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
3 years agoLU-11066 systemd: Add IB dependencies to lnet.service 46/32646/3
Nathaniel Clark [Thu, 31 May 2018 14:45:47 +0000 (10:45 -0400)]
LU-11066 systemd: Add IB dependencies to lnet.service

Add ordering for in-kernel (rdma.service) and Mellanox MOFED
(openibd.service).

This ensures that systemd will shut down lnet prior to IB, thus
preventing it from hanging.

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ia0be1ca60eb8f54edd2f4f6bfbca10cbc01cc638
Reviewed-on: https://review.whamcloud.com/32646
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
3 years agoLU-11049 ssk: correctly handle null byte by lgss_sk 10/32510/4
Sebastien Buisson [Tue, 22 May 2018 15:50:53 +0000 (17:50 +0200)]
LU-11049 ssk: correctly handle null byte by lgss_sk

lgss_sk must include the null byte with the fsname and nodemap info
taken from the command line.
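
A minimal sketch of the idea, assuming the value is copied into a
fixed-size field. The helper name key_field_set() is hypothetical and
does not reflect the actual lgss_sk key layout:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a command-line string into a fixed-size key field, including
 * its terminating null byte. Returns the number of bytes stored
 * (strlen + 1), or 0 if the string plus terminator does not fit.
 */
static size_t key_field_set(char *field, size_t fieldlen, const char *arg)
{
    size_t len;

    assert(field != NULL && arg != NULL);

    len = strlen(arg) + 1;  /* + 1 keeps the null byte */
    if (len > fieldlen)
        return 0;
    memcpy(field, arg, len);
    return len;
}
```

Copying only strlen(arg) bytes would leave the field unterminated
unless it had been zeroed beforehand.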

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ie98444c930b8df521482468c4897e080ded0d2f6
Reviewed-on: https://review.whamcloud.com/32510
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10680 mdd: create gc thread when no current transaction 76/31376/40
Bruno Faccini [Thu, 22 Feb 2018 15:23:18 +0000 (16:23 +0100)]
LU-10680 mdd: create gc thread when no current transaction

A kthread can't be created while a journal transaction is being
filled, because otherwise a deadlock can happen if memory reclaim is
triggered by kthreadd when forking the new thread: I/Os could then
be attempted to the same device from shrinkers, requiring a new
journal transaction to be started when the current one could never
complete.

Thus this patch moves kthread_run() of gc_task into mdd_trans_stop().

The comment in mdd_changelog_max_idle_time_seq_write() has been updated
to reflect the need to limit the value to about 68 years, which allows
32-bit operands to be kept for comparison.
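
The 68-year figure follows from the range of a signed 32-bit count of
seconds. A small sketch of the arithmetic, with hypothetical helper
names and a 365-day year assumed:

```c
#include <assert.h>
#include <stdint.h>

/* Non-leap year: 365 * 24 * 60 * 60 = 31,536,000 seconds. */
#define SECONDS_PER_YEAR (365L * 24 * 60 * 60)

/* Whole years that fit in a signed 32-bit count of seconds. */
static long max_idle_years_s32(void)
{
    return INT32_MAX / SECONDS_PER_YEAR;
}

/* Clamp a requested idle time so it still fits a signed 32-bit operand. */
static int32_t clamp_idle_time(int64_t seconds)
{
    assert(seconds >= 0);
    return seconds > INT32_MAX ? INT32_MAX : (int32_t)seconds;
}
```

INT32_MAX is 2,147,483,647, so the division gives 68 whole years.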

As it will go away with recent kernels, get_seconds() usage has
been replaced by calls to ktime_get_real_seconds() for user idle
time initialization and comparison.

Also, enable Changelog GC, as it is no longer the default, in the
sanity/test_160f sub-test and remove it from ALWAYS_EXCEPT to
re-enable it, leaving only 160g excluded because of LU-10734. In
addition, changes in sanity/test_160f have been added to make
it fully DNE-compatible.

With this patch, GC-thread can be stopped upon MDT umount, and
remaining orphan ChangeLog records clean-up will occur upon next
restart. New sanity/test_160h sub-test checks this scenario.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I7ec076bc04594b230c57348d7ac92acc58c258e1
Reviewed-on: https://review.whamcloud.com/31376
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-11069 llite: correct file position after appending writes 41/32641/3
John L. Hammond [Wed, 6 Jun 2018 13:14:50 +0000 (08:14 -0500)]
LU-11069 llite: correct file position after appending writes

In ll_file_io_generic() use the position returned in the kiocb to set
the returned file position. This ensures that the file position is set
correctly after an appending write. Add sanity test_23d() to check
that calling lseek() for the current offset returns the correct value
in this situation.
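
The behaviour being checked can be reproduced from user space with
plain POSIX calls. This is a sketch of the idea only, not the code of
sanity test_23d; the helper name append_and_tell() is hypothetical:

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Append len bytes to an existing file and return the file position
 * reported by lseek(fd, 0, SEEK_CUR) afterwards, or -1 on error.
 * After an appending write this should equal the new file size.
 */
static off_t append_and_tell(const char *path, const void *data, size_t len)
{
    off_t pos = -1;
    int fd;

    assert(path != NULL && data != NULL);

    fd = open(path, O_WRONLY | O_APPEND);
    if (fd < 0)
        return -1;
    if (write(fd, data, len) == (ssize_t)len)
        pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return pos;
}
```

For a file that already holds 3 bytes, appending 4 bytes should report
position 7, even though the descriptor was opened at offset 0.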

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ic76ce49db6e87d5294e18546d5b75a12793aa99c
Reviewed-on: https://review.whamcloud.com/32641
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10419 lfsck: signal master engine when stop 27/31627/5
Fan Yong [Fri, 20 Apr 2018 21:53:50 +0000 (05:53 +0800)]
LU-10419 lfsck: signal master engine when stop

It is possible that during LFSCK scanning, some server, MDT
or OST, may be offline. At that time, if the LFSCK needs to talk
with such an offline server, the related RPC will trigger a reconnect
to the offline server, and the LFSCK engine has to wait until the
offline server comes back online or someone deactivates the server by
force. To avoid lfsck_stop() being blocked in such a case,
the stop logic sends a SIGINT signal to the LFSCK engines. But we
only did that for the LFSCK assistant engines, and forgot to do it
for the LFSCK master engine. This patch fixes that.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I5d51ab49524e8ae54f0853e93b94e78913f65e8a
Reviewed-on: https://review.whamcloud.com/31627
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-11058 tests: stop running sanity test 77k 85/32685/2
James Nunez [Fri, 8 Jun 2018 18:46:10 +0000 (12:46 -0600)]
LU-11058 tests: stop running sanity test 77k

sanity test 77k is failing for a variety of Lustre
file system configurations. Stop running test 77k by
adding it to the ALWAYS_EXCEPT list.

When this issue is resolved, we need to resume running
sanity test 77k by removing it from the ALWAYS_EXCEPT list.

Test-Parameters: trivial
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I3cd53e721b1b3ede633603273dafd54c9f5701c4
Reviewed-on: https://review.whamcloud.com/32685
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
3 years agoLU-11054 lnet: remove non-error error message 60/32560/2
John L. Hammond [Fri, 25 May 2018 14:36:31 +0000 (09:36 -0500)]
LU-11054 lnet: remove non-error error message

In lnet_ipif_enumerate(), remove the CERROR() that prints each device.

Test-Parameters: trivial
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ida8d1636e9e608087205defabda865f930fd38a1
Reviewed-on: https://review.whamcloud.com/32560
Tested-by: Jenkins
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-11043 kernel: kernel update RHEL7.5 [3.10.0-862.3.2.el7] 13/32513/3
Bob Glossman [Mon, 21 May 2018 23:20:05 +0000 (16:20 -0700)]
LU-11043 kernel: kernel update RHEL7.5 [3.10.0-862.3.2.el7]

Update RHEL 7.5 kernel to 3.10.0-862.3.2.el7.

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I0defa14e83ce098c48b3228b4867afa73a2d9185
Reviewed-on: https://review.whamcloud.com/32513
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Cliff White <cliff.white@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10808 lod: remove DoM component if DoM is disabled 82/32482/5
Mikhail Pershin [Mon, 21 May 2018 18:24:05 +0000 (21:24 +0300)]
LU-10808 lod: remove DoM component if DoM is disabled

If a file is created with a DoM component but the server disables
DoM file creation, then remove the DoM entry from the file layout
and keep the other components.
If the layout has only a DoM entry, then just return an error.
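
A simplified sketch of that rule on an in-memory array. struct comp,
layout_strip_dom() and the -EINVAL return value are assumptions made
for illustration; the sketch also does not adjust the extents of the
remaining components, which real layout code may need to do:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified layout component. */
struct comp {
    bool is_dom;            /* true for a Data-on-MDT component */
    unsigned long start;    /* extent start */
    unsigned long end;      /* extent end */
};

/*
 * Drop DoM components from comps[0..*count) in place, keeping the
 * order of the others. Returns 0 on success, or -EINVAL (an assumed
 * error code) if no non-DoM component remains; in that case the
 * array and *count are left unchanged.
 */
static int layout_strip_dom(struct comp *comps, size_t *count)
{
    size_t kept = 0;

    assert(comps != NULL && count != NULL);

    for (size_t i = 0; i < *count; i++) {
        if (!comps[i].is_dom)
            comps[kept++] = comps[i];
    }
    if (kept == 0)
        return -EINVAL;
    *count = kept;
    return 0;
}
```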

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: Ibafd0269d76dc5de4599efca064930607dc556eb
Reviewed-on: https://review.whamcloud.com/32482
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-11014 mdt: intent handling simplification 57/32357/3
John L. Hammond [Fri, 11 May 2018 14:01:32 +0000 (09:01 -0500)]
LU-11014 mdt: intent handling simplification

Remove the obsolete constants MDT_IT_CREATE, MDT_IT_READDIR,
MDT_IT_UNLINK, and MDT_IT_TRUNC from enum mdt_it_code. Also remove
MDT_IT_OCREAT, since (at this level) it can be handled identically to
MDT_IT_OPEN. Rename mdt_intent_reint() to mdt_intent_open() since it
only handles open. Move the definition of the mdt_it_flavor array down
and remove the then unneeded forward declarations of mdt_intent_*().
In struct mdt_it_flavor, remove the obsolete it_reint member and
rename the it_flags member to it_handler_flags to avoid confusion with
LDLM flags. Use 'enum tgt_handler_flags' rather than __u32 for several
parameters used to hold values of that type.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I297ef397c879fcc7711d725e0315e73439d95826
Reviewed-on: https://review.whamcloud.com/32357
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10977 test: add version check to sanity test_60ab 43/32343/4
Saurabh Tandan [Wed, 9 May 2018 21:27:15 +0000 (14:27 -0700)]
LU-10977 test: add version check to sanity test_60ab

Skip sanity.sh test_60ab if the server version is less than
or equal to 2.11.51.
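
The comparison can be sketched by packing the version into a single
integer. The real check lives in the shell test script; version_code()
and skip_test_60ab() below are hypothetical helpers used only to show
the "less than or equal" boundary:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pack a three-part version into one comparable integer. */
static uint32_t version_code(unsigned int major, unsigned int minor,
                             unsigned int patch)
{
    assert(major < 256 && minor < 256 && patch < 256);
    return ((uint32_t)major << 16) | ((uint32_t)minor << 8) | patch;
}

/* The test is skipped when the server is at or below 2.11.51. */
static bool skip_test_60ab(uint32_t server_version)
{
    return server_version <= version_code(2, 11, 51);
}
```

Under this rule a 2.11.51 server is still skipped, and 2.11.52 is the
first version that runs the test.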

Test-Parameters:trivial testlist=sanity envdefinitions=ONLY=60ab serverjob=lustre-b2_10 serverbuildno=69
Signed-off-by: Saurabh Tandan <saurabh.tandan@intel.com>
Change-Id: Ie9d2728790e19ac2a24c94e7c13ade28b5a5bbbe
Reviewed-on: https://review.whamcloud.com/32343
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-4423 obd: backport of lu_object changes upstream 25/32325/3
NeilBrown [Wed, 9 May 2018 02:46:29 +0000 (22:46 -0400)]
LU-4423 obd: backport of lu_object changes upstream

fold lu_object_new() into lu_object_find_at()

lu_object_new() duplicates a lot of code that is in
lu_object_find_at().
There is no real need for a separate function, it is simpler just
to skip the bits of lu_object_find_at() that we don't
want in the LOC_F_NEW case.

Linux-commit: 775c4dc274343e5e2959fa1171baf2fc01028840

discard extra lru count.

lu_object maintains 2 lru counts.
One is a per-bucket lsb_lru_len.
The other is the per-cpu ls_lru_len_counter.

The only times the per-bucket counters are used are:
 - a debug message when an object is added
 - in lu_site_stats_get when all the counters are combined.

The debug message is not essential, and the per-cpu counter
can be used to get the combined total.

So discard the per-bucket lsb_lru_len.
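
As a conceptual illustration of why the per-bucket count is redundant,
the sketch below models a counter split into per-CPU slots whose sum
gives the combined total. It is a plain Python model under that
assumption, not the kernel's per-cpu counter API; `PerCpuCounter` and
its methods are hypothetical names.

```python
class PerCpuCounter:
    """Toy model of a counter kept as one slot per CPU."""

    def __init__(self, ncpus: int):
        self.slots = [0] * ncpus

    def add(self, cpu: int, delta: int) -> None:
        """Update only the slot of the CPU doing the work."""
        self.slots[cpu] += delta

    def total(self) -> int:
        """Combine all slots; this is the only place a global view is needed."""
        return sum(self.slots)


if __name__ == "__main__":
    lru_len = PerCpuCounter(ncpus=4)
    lru_len.add(0, 3)    # three objects added to the LRU on CPU 0
    lru_len.add(2, 5)    # five objects added on CPU 2
    lru_len.add(0, -1)   # one object removed on CPU 0
    print(lru_len.total())
```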

Linux-commit: e167b370360f8887cf21a2a82f83e7118a2aeb11

make struct lu_site_bkt_data private

This data structure only needs to be public so that
various modules can access a wait queue to wait for object
destruction.
If we provide a function to get the wait queue, rather than the
whole bucket, the structure can be made private.

Linux-commit: bc5e7fb40d36edb95ce8f661596811bec3f7d5cf

Change-Id: I26203f331a0c73ae4e23878eb10b15d9fcf546c5
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32325
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10971 tests: use changelog routines in lustre-rsync-test 08/32208/4
James Nunez [Mon, 30 Apr 2018 19:12:49 +0000 (13:12 -0600)]
LU-10971 tests: use changelog routines in lustre-rsync-test

The lustre-rsync-test script has two subroutines to register
and deregister changelog users. These subroutines should be
updated to use changelog_register() and changelog_deregister()
found in test-framework.sh.

Test-Parameters: trivial clientcount=2 mdscount=2 mdtcount=4 osscount=1 ostcount=8 mdtfilesystemtype=zfs ostfilesystemtype=zfs testlist=lustre-rsync-test
Test-Parameters: clientcount=2 mdscount=2 mdtcount=4 osscount=1 ostcount=8 mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs testlist=lustre-rsync-test
Test-Parameters: clientcount=2 mdscount=1 mdtcount=1 osscount=1 ostcount=8 mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs testlist=lustre-rsync-test
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ia54095a6e039f6835def0f9c49157b71088d9e51
Reviewed-on: https://review.whamcloud.com/32208
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10808 lod: align wrong DoM stripe values with defaults 73/32073/5
Mikhail Pershin [Thu, 19 Apr 2018 13:29:54 +0000 (16:29 +0300)]
LU-10808 lod: align wrong DoM stripe values with defaults

- Align DoM component size to the server limit size instead of
  returning an error. An error is still returned if DoM file creation
  is disabled on the server (DoM limit is set to 0).
- Correct wrong values for the dom_stripesize parameter by using the
  minimal stripe size if the provided value is lower and by aligning
  it to be a multiple of that minimal size.
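
The two corrections above can be sketched as follows. This is a Python
illustration only, not the lod code: the function names are
hypothetical, the 64 KiB minimum is a placeholder value, and rounding
down to a multiple of the minimum is an assumption, since the commit
message does not state the rounding direction.

```python
MIN_STRIPE_SIZE = 64 * 1024  # placeholder minimal stripe size (assumption)


def clamp_dom_component(requested: int, server_limit: int) -> int:
    """Clamp a requested DoM component size to the server limit.

    A limit of 0 means DoM file creation is disabled, which is still an error.
    """
    if server_limit == 0:
        raise PermissionError("DoM file creation is disabled on the server")
    return min(requested, server_limit)


def fix_dom_stripesize(value: int, minimum: int = MIN_STRIPE_SIZE) -> int:
    """Raise too-small values to the minimum, then align to a multiple of it."""
    if value < minimum:
        return minimum
    return value - (value % minimum)  # assumed: round down


if __name__ == "__main__":
    print(clamp_dom_component(4 << 20, 1 << 20))
    print(fix_dom_stripesize(1000))
    print(fix_dom_stripesize(3 * MIN_STRIPE_SIZE + 5))
```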

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: Ifcdf60fddda65acda92509bb7e69c9b2951fb6bd
Reviewed-on: https://review.whamcloud.com/32073
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-4423 ptlrpc: use delayed_work in sec_gc 24/31724/3
Dmitry Eremin [Thu, 22 Mar 2018 15:51:00 +0000 (18:51 +0300)]
LU-4423 ptlrpc: use delayed_work in sec_gc

The garbage collection for security contexts currently has a dedicated
kthread which wakes up every 30 minutes to discard old garbage.

Replace this with a simple delayed_work item on the system work queue.

Change-Id: I5cdb023783104b5e21f4139731065946ed162af1
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/31724
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10648 ldlm: Reduce debug to console during eviction 37/31237/4
Patrick Farrell [Fri, 26 Aug 2016 16:03:33 +0000 (11:03 -0500)]
LU-10648 ldlm: Reduce debug to console during eviction

During an eviction, Lustre calls ldlm_namespace_cleanup,
and it will sometimes end up dumping all of the locks on a
particular resource to the console log
(ldlm_resource_complain), which is very wasteful and only
rarely helpful.

Move the debug level for this to D_NETERROR since it is in the
default debug mask.

Change-Id: I8a00f030393ce1748914d70fa8edb4690273e08a
Cray-bug-id: LUS-1418
Signed-off-by: Chris Horn <hornc@cray.com>
Signed-off-by: Patrick Farrell <paf@cray.com>
Reviewed-on: https://review.whamcloud.com/31237
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10472 osc: add T10PI support for RPC checksum 80/30980/37
Li Xi [Tue, 23 Jan 2018 07:17:17 +0000 (02:17 -0500)]
LU-10472 osc: add T10PI support for RPC checksum

T10 Protection Information (T10 PI), previously known as Data
Integrity Field (DIF), is a standard for end-to-end data integrity
validation. T10 PI prevents silent data corruption, ensuring that
incomplete and incorrect data cannot overwrite good data.

The Lustre file system already supports an RPC-level checksum which
validates the data in bulk RPCs when writing/reading data to/from
objects on OSTs. The RPC-level checksum can detect data corruption that
happens while an RPC is being transferred over the wire. However, it
cannot prevent silent data corruption happening in other
conditions, for example, memory corruption when data is cached in
the page cache. And by using the existing checksum mechanism, only
disjoint protection coverage is provided. Thus, in order to provide
end-to-end data protection, T10PI support for Lustre should be added.

In order to provide end-to-end data integrity validation, the T10 PI
checksum of data in a sector needs to be calculated on the Lustre client
side and validated later on the Lustre OSS side. The T10 protection
information should be sent together with the data in the RPC.
However, in order to avoid significant performance degradation,
instead of sending all original guard tags for all sectors in a bulk
RPC, the existing checksum feature of bulk RPC will be integrated
together with the new T10PI feature.

When an OST starts, the necessary T10PI information will be extracted
from storage, i.e. the T10PI DIF type and sector size. The DIF type can
be one of TYPE1_IP, TYPE1_CRC, TYPE3_IP and TYPE3_CRC, and the sector
size can be either 512 or 4K bytes.

When an OSC connects to an OST, the OSC and OST negotiate the
checksum types. New checksum types are added for T10PI support,
including OBD_CKSUM_T10IP512, OBD_CKSUM_T10IP4K, OBD_CKSUM_T10CRC512,
and OBD_CKSUM_T10CRC4K. If the OST storage has T10PI support, the
only selectable T10PI checksum type is the one that matches the
T10PI type of the hardware. The other existing checksum types (crc32,
crc32c, adler32) are still valid options for the RPC checksum type.

When calculating the RPC checksum for T10PI, the T10PI checksums of all
sectors are calculated first using the T10PI checksum type, i.e.
16-bit CRC or IP checksum. The RPC checksum is then calculated over
all of the T10PI checksums. The RPC checksum type used in this step is
always adler32. Considering that the checksum-of-checksums is only
computed on a 4KB chunk of GRD tags for a 1MB RPC for 512B sectors,
or 16KB of GRD tags for 16MB of 4KB sectors, this is only 1/256 or
1/1024 of the total data being checksummed, so the checksum type used
here should not affect overall system performance noticeably.
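
The two-step calculation can be sketched in Python as below. This is an
illustration under stated assumptions, not the osc/ofd implementation:
it covers only the CRC variant with 512-byte sectors, uses a plain
bitwise CRC-16 with the T10-DIF polynomial 0x8BB7, and packs each
16-bit guard tag big-endian. The function names are hypothetical.

```python
import zlib


def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16 with polynomial 0x8BB7, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def guard_tags(data: bytes, sector_size: int = 512) -> bytes:
    """Step 1: one 16-bit guard tag per sector, concatenated."""
    tags = bytearray()
    for offset in range(0, len(data), sector_size):
        sector = data[offset:offset + sector_size]
        tags += crc16_t10dif(sector).to_bytes(2, "big")  # byte order assumed
    return bytes(tags)


def rpc_checksum(data: bytes, sector_size: int = 512) -> int:
    """Step 2: adler32 over the guard tags rather than over the data itself."""
    return zlib.adler32(guard_tags(data, sector_size))


if __name__ == "__main__":
    payload = bytes(range(256)) * 16  # 4 KiB of sample data, 8 sectors
    print(len(guard_tags(payload)), hex(rpc_checksum(payload)))
```

With 512-byte sectors each sector contributes 2 bytes of guard tag, so
the second step reads 1/256 of the payload size, matching the ratio
quoted above for a 1MB RPC.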

obdfilter.*.enforce_t10pi_cksum can be used to tune whether to enforce
T10-PI checksum or not.

If the OST supports the T10-PI feature and the T10-PI checksum is enforced,
clients will have no choice of RPC checksum type other than the T10PI
checksum type. This is useful for enforcing end-to-end integrity in the
whole system.

If the OST doesn't support the T10-PI feature and the T10-PI checksum is
enforced, together with other checksums with reasonably good speeds (e.g.
crc32, crc32c, adler, etc.), all the T10-PI checksum types (t10ip512,
t10ip4K, t10crc512, t10crc4K) will be added to the available checksum
types, regardless of the speeds of the T10-PI checksums. This is useful
for testing T10-PI checksums of RPCs.

If the OST supports the T10-PI feature and the T10-PI checksum is NOT
enforced, the corresponding T10-PI checksum type will be added to the
checksum type list, regardless of the speed of the T10-PI checksum. This
gives clients the flexibility to choose whether to enable end-to-end
integrity or not.

If the OST does NOT support the T10-PI feature and the T10-PI checksum is
NOT enforced, together with other checksums with reasonably good speeds,
all the T10-PI checksum types with good speeds will be added to the
checksum type list. Note that a T10-PI checksum type with a speed worse
than half of Adler will NOT be added as an option. In this circumstance,
T10-PI checksum types behave the same as other normal checksum
types.

Clients that have no T10-PI RPC checksum support will not be affected
by the above-mentioned logic. That logic will only be applied to
newly connected clients after changing obdfilter.*.enforce_t10pi_cksum on
an OST.

The following are the speeds of different checksum types on a server with
an Intel(R) Xeon(R) E5-2650 @ 2.00GHz CPU:

crc: 1575 MB/s
crc32c: 9763 MB/s
adler: 1255 MB/s
t10ip512: 6151 MB/s
t10ip4k: 7935 MB/s
t10crc512: 1119 MB/s
t10crc4k: 1531 MB/s
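
The four enforce/support combinations described above can be summarised
as a small decision function. The Python below is a sketch, not the
server code: `available_checksums` is a hypothetical name, and applying
the "at least half the speed of adler" threshold to the non-T10 types as
well is an assumption, since the message only calls those "reasonably
good". The speeds are the figures listed above.

```python
T10_TYPES = ("t10ip512", "t10ip4k", "t10crc512", "t10crc4k")

# Speeds in MB/s, copied from the list above.
SPEEDS = {
    "crc": 1575, "crc32c": 9763, "adler": 1255,
    "t10ip512": 6151, "t10ip4k": 7935, "t10crc512": 1119, "t10crc4k": 1531,
}


def available_checksums(speeds, hw_t10_type, enforce):
    """Return the checksum types offered to a newly connecting client.

    hw_t10_type is the T10-PI type of the OST storage, or None if unsupported.
    """
    threshold = speeds["adler"] / 2
    fast = [name for name, speed in speeds.items() if speed >= threshold]
    fast_plain = [name for name in fast if name not in T10_TYPES]

    if hw_t10_type is not None and enforce:
        return [hw_t10_type]                   # only the hardware type
    if hw_t10_type is None and enforce:
        return fast_plain + list(T10_TYPES)    # all T10 types, any speed
    if hw_t10_type is not None:
        return fast_plain + [hw_t10_type]      # hardware type, any speed
    return fast_plain + [name for name in fast if name in T10_TYPES]


if __name__ == "__main__":
    for hw, enforce in (("t10crc512", True), (None, True),
                        ("t10crc512", False), (None, False)):
        print(hw, enforce, available_checksums(SPEEDS, hw, enforce))
```

On the measured machine every type exceeds half of adler's 1255 MB/s, so
the speed filter only matters on hardware where a T10-PI checksum is
much slower.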

Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: I6468680edeab0917bb71dbd8cd9ea16c65e935f5
Reviewed-on: https://review.whamcloud.com/30980
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
3 years agoLU-8066 llite: Preparation to move /proc/fs/lustre/llite to sysfs 31/24031/23
James Simmons [Fri, 25 May 2018 01:16:18 +0000 (21:16 -0400)]
LU-8066 llite: Preparation to move /proc/fs/lustre/llite to sysfs

Add necessary infrastructure, add support for mountpoint
registration in /sys/fs/lustre/llite

This is a heavily modified version of

Linux-commit: fd0d04ba85f95169106701397417360541a983b3

due to the large amount of changes to the OpenSFS/Intel branch.

Change-Id: Ic9ca2044249a59dc79ebc86553c8b7ce7afbf710
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/24031
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10264 misc: fix possible array overflow 42/32242/2
Andreas Dilger [Fri, 9 Mar 2018 23:18:53 +0000 (16:18 -0700)]
LU-10264 misc: fix possible array overflow

Fix a static analysis error.

lustre/obdclass/obd_mount_server.c:1830 in osd_start(), buffer
    flagstr has size 16 but length of format string "%lu:%lu" is 31.
Increase the size of buffer to hold maximal-sized strings plus NUL.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I3cc80d66bbb537161a561f4f2ba7830dde2cab07
Reviewed-on: https://review.whamcloud.com/32242
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-8972 tests: remove conf-sanity test from ALWAYS_EXCEPT 20/32220/4
James Nunez [Tue, 1 May 2018 19:20:28 +0000 (13:20 -0600)]
LU-8972 tests: remove conf-sanity test from ALWAYS_EXCEPT

A patch landed to fix the issue reported in LU-8972. We need
to run conf-sanity test 101 to ensure that the issue is
fixed and does not regress.

Remove conf-sanity test 101 from the ALWAYS_EXCEPT list.

Test-Parameters: trivial
Test-Parameters: trivial clientcount=2 mdscount=2 mdtcount=4 osscount=1 ostcount=8 mdtfilesystemtype=zfs ostfilesystemtype=zfs testlist=conf-sanity
Test-Parameters: clientcount=2 mdscount=2 mdtcount=4 osscount=1 ostcount=8 mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs testlist=conf-sanity
Test-Parameters: clientcount=2 mdscount=1 mdtcount=1 osscount=1 ostcount=8 mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs testlist=conf-sanity
Test-Parameters: clientcount=2 mdscount=1 mdtcount=1 osscount=1 ostcount=8 mdtfilesystemtype=zfs ostfilesystemtype=zfs testlist=conf-sanity
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ic678c7527a60cab2de6139041cca81017d4aa75e
Reviewed-on: https://review.whamcloud.com/32220
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-6160 osd-zfs: Fix refcount_add call 44/28544/2
Giuseppe Di Natale [Mon, 14 Aug 2017 16:51:52 +0000 (09:51 -0700)]
LU-6160 osd-zfs: Fix refcount_add call

Correct the refcount_add in osd-zfs module's osd_fix_new_dnode
function. The variable 'tag' was undefined and caused osd-zfs
to fail builds against zfs packages with debug enabled.

This small change should enable lustre to be built against
zfs packages that have debug enabled.

Test-Parameters: trivial
Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Change-Id: If95f0af6178cf0ea78724658edfaece1ee16a3f1
Reviewed-on: https://review.whamcloud.com/28544
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9727 tests: exercise new changelog fields and records 35/32335/4
Sebastien Buisson [Fri, 19 Jan 2018 17:22:40 +0000 (02:22 +0900)]
LU-9727 tests: exercise new changelog fields and records

Add new tests in sanity-hsm to exercise new changelog fields
and also record types.

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Signed-off-by: Quentin Bouget <quentin.bouget@cea.fr>
Change-Id: I1cd7282983d936105e1616aa859c47fd453e6017
Reviewed-on: https://review.whamcloud.com/32335
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10796 tests: standardize changelog testing in sanity-hsm 13/31613/13
Quentin Bouget [Fri, 9 Mar 2018 14:20:03 +0000 (14:20 +0000)]
LU-10796 tests: standardize changelog testing in sanity-hsm

To manage changelog users and changelog records, sanity-hsm used to
define:
 - changelog_setup
 - changelog_cleanup
 - changelog_get_flags

test-framework.sh implements similar functions:
 - changelog_register
 - changelog_deregister
 - changelog_dump
 - changelog2array (new in this patch)

This patch removes the implementations of sanity-hsm in favor of
those in test-framework.

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: Quentin Bouget <quentin.bouget@cea.fr>
Change-Id: Ie3db8ef646fa48d06bf41b6025b3443de026cabd
Reviewed-on: https://review.whamcloud.com/31613
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-8130 ldlm: store name directly in namespace. 08/32408/2
NeilBrown [Tue, 15 May 2018 17:29:27 +0000 (13:29 -0400)]
LU-8130 ldlm: store name directly in namespace.

Rather than storing the name of a namespace in the
hash table, store it directly in the namespace.
This will allow the hashtable to be changed to use
rhashtable.

Linux-commit: 648ae363628c84faa8d8861e3246e096b8c0a392

Change-Id: Ie5bb8092c9e1831fbc38beade46be6d35f3256dc
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/32408
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
3 years agoLU-11017 quota: ignore quota for CAP_SYS_RESOURCE properly 78/32378/10
Wang Shilong [Wed, 16 May 2018 02:13:13 +0000 (10:13 +0800)]
LU-11017 quota: ignore quota for CAP_SYS_RESOURCE properly

Currently, lustre quota will ignore this type of quota if the
quota id is 0 or if we force it to be ignored.

For writes, we have passed CAP_SYS_RESOURCE properly, but
for metadata operations this is not done.

Test-Parameters: testlist=sanity-quota
Change-Id: Ibcdc0e53ad125042d4889ac51a9a9ead4066c0c8
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/32378
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-11015 lov: Move lov_tgts_kobj init to lov_setup 67/32367/4
Oleg Drokin [Fri, 18 May 2018 01:43:18 +0000 (21:43 -0400)]
LU-11015 lov: Move lov_tgts_kobj init to lov_setup

and free it in lov_cleanup.
This looks like a more robust solution vs doing it in lov_putref,
esp. since we know the refcount there crosses 0 repeatedly, confusing
things.

Change-Id: I49b1a1e97464bd388fe20a97b903468139730213
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: https://review.whamcloud.com/32367
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
3 years agoLU-11010 tests: remove calls to return after skip() 46/32346/2
James Nunez [Wed, 9 May 2018 22:11:19 +0000 (16:11 -0600)]
LU-11010 tests: remove calls to return after skip()

The skip() routine now contains a call to exit. All calls
to skip() and skip_env() should be reviewed and calls to
return that followed skip should be removed.

A problem with the skip message not being printed is corrected.

Test-Parameters: trivial testlist=sanity-pfl
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I1a52e9bd79a71de4ab4c0cea9c569f379115a603
Reviewed-on: https://review.whamcloud.com/32346
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-11003 ldlm: fix for l_lru usage 09/32309/3
Yang Sheng [Mon, 7 May 2018 15:59:01 +0000 (23:59 +0800)]
LU-11003 ldlm: fix for l_lru usage

Fixes for lock convert code to prevent false assertions and
busy locks in the LRU:
- ensure there are no l_readers and l_writers when adding a lock to
  the LRU after convert.
- don't verify l_lru without ns_lock.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: I8bcbdef3cb72db241ad03c50f5ce2b968e3ee3e4
Reviewed-on: https://review.whamcloud.com/32309
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10855 llog: remove obsolete llog handlers 02/32202/2
John L. Hammond [Mon, 30 Apr 2018 14:08:48 +0000 (09:08 -0500)]
LU-10855 llog: remove obsolete llog handlers

Remove the obsolete llog RPC handling for cancel, close, and
destroy. Remove llog handling from ldlm_callback_handler(). Remove the
unused client side method llog_client_destroy().

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ieab44f3796971a7d3c65d6044e4c0be4afb4b508
Reviewed-on: https://review.whamcloud.com/32202
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10938 ptlrpc: Add WBC connect flag 41/32241/5
Oleg Drokin [Wed, 2 May 2018 07:03:38 +0000 (03:03 -0400)]
LU-10938 ptlrpc: Add WBC connect flag

It denotes the ability of the node to understand additional
types of intent requests, exclusive metadata locks issued
to clients and server operations performed under such
locks while still held by clients.

Test-Parameters: trivial
Change-Id: I72c1ddfdf94edea3b357d82da6c410bc2d79a75c
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: https://review.whamcloud.com/32241
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
3 years agoLU-10945 ldlm: fix l_last_activity usage 33/32133/3
Alexander Boyko [Tue, 24 Apr 2018 07:06:42 +0000 (03:06 -0400)]
LU-10945 ldlm: fix l_last_activity usage

When a race happens between ldlm_server_blocking_ast() and
ldlm_request_cancel(), at_measured() is called with a wrong
value equal to the current time. Even worse, ldlm_bl_timeout() can
return current_time*1.5.
Before the time functions were fixed by LU-9019 (e920be681) for 64bit,
this race led to ETIMEDOUT at ptlrpc_import_delay_req() and
client eviction during bl ast sending. The wrong type conversion
takes place in ptlrpc_send_limit_expired() at cfs_time_seconds().

We should not take cancels into account if the BLAST is not sent,
because in that case last_activity is not properly initialised and it
destroys the AT completely.
The patch divides l_last_activity into the client l_activity and
server l_blast_sent for better understanding. The l_blast_sent is
used for blocking ast only to measure time between BLAST and
cancel request.

For example:
 server cancels blocked lock after 1518731697s
 waiting_locks_callback()) ### lock callback timer expired after 0s:
 evicting client
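
The huge value in the log excerpt is what "now minus an uninitialised
timestamp" produces. The Python sketch below illustrates the guard the
message argues for, namely measuring only when a BLAST was actually
sent. The function names are hypothetical, and treating 0 as "not sent"
is an assumption made for the illustration, not a description of the
ldlm code.

```python
def naive_latency(now: int, last_activity: int) -> int:
    """Old behaviour: always subtract, even from an unset (0) timestamp."""
    return now - last_activity


def blast_to_cancel_latency(now: int, blast_sent: int):
    """Return seconds between BLAST and cancel, or None if no BLAST was sent."""
    if blast_sent == 0:
        return None  # nothing to measure; keep it out of the adaptive timeout
    return now - blast_sent


if __name__ == "__main__":
    now = 1518731697
    print(naive_latency(now, 0))                       # bogus sample
    print(blast_to_cancel_latency(now, 0))             # skipped
    print(blast_to_cancel_latency(now, 1518731690))    # real sample
```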

Signed-off-by: Alexander Boyko <c17825@cray.com>
Change-Id: I44962d2b3675b77e09182bbe062bdd78d6cb0af5
Cray-bug-id: LUS-5736
Reviewed-on: https://review.whamcloud.com/32133
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-4684 xattr: add list support for remote object 26/31426/9
Lai Siyao [Sun, 21 Jan 2018 07:57:22 +0000 (15:57 +0800)]
LU-4684 xattr: add list support for remote object

XATTR_LIST may be issued to a remote object in directory migration,
add this support for OSP and OUT.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: I9681e149703de2837a04dc1448d1bd583659205d
Reviewed-on: https://review.whamcloud.com/31426
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-4684 mdt: improve directory stripe lock 25/31425/9
Lai Siyao [Sun, 21 Jan 2018 05:22:24 +0000 (13:22 +0800)]
LU-4684 mdt: improve directory stripe lock

A striped directory implies that the first stripe is
local and the others are remote, but this is not true for a migrating
directory because its stripes consist of both the original and
the newly created stripes.

This patch also puts striped directory master object locking and
stripes locking into one function called mdt_reint_striped_lock(),
which simplifies locking code.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: I4724447e5b10c301b6799e1827f6d13a40876945
Reviewed-on: https://review.whamcloud.com/31425
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-11026 lustre-dkms should require patch or quilt 31/32431/3
Joe Grund [Wed, 16 May 2018 17:18:56 +0000 (13:18 -0400)]
LU-11026 lustre-dkms should require patch or quilt

Add patch requirement to lustre-dkms.spec.in
as it (or quilt) is needed for lustre-build-ldiskfs.

  - Add requires patch.

Change-Id: I640bae382511502c02a0237694c93c304047f339
Signed-off-by: Joe Grund <joe.grund@intel.com>
Reviewed-on: https://review.whamcloud.com/32431
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
3 years agoLU-10964 build: armv7 client build fixes 94/32194/8
Andrew Perepechko [Sat, 28 Apr 2018 08:32:03 +0000 (11:32 +0300)]
LU-10964 build: armv7 client build fixes

This commit is intended to fix the armv7 Lustre client
build; the changes are mostly related to 64-bit division.

Change-Id: I93d83d577351c1a1053e39a162cb1e85fc4e8aa3
Signed-off-by: Andrew Perepechko <c17827@cray.com>
Reviewed-on: https://review.whamcloud.com/32194
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-7943 mdd: Move assignment after LASSERT() 76/32376/2
Arshad Hussain [Sat, 12 May 2018 08:43:54 +0000 (14:13 +0530)]
LU-7943 mdd: Move assignment after LASSERT()

This patch moves the 'sname->ln_namelen' assignment after the LASSERT() call.
This avoids the case where the 'sname' parameter is NULL and dereferencing
the NULL pointer would fault before reaching LASSERT().

Change-Id: I68b07f7ca33fd21ee0599b7bb73d6e41546bd2d8
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/32376
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
3 years agoLU-10897 kernel: kernel upgrade RHEL7.5 [3.10.0-862.2.3.el7] 70/32370/5
Bob Glossman [Thu, 10 May 2018 14:46:35 +0000 (07:46 -0700)]
LU-10897 kernel: kernel upgrade RHEL7.5 [3.10.0-862.2.3.el7]

With this mod we switch our supported el7 version to RHEL 7.5

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Iedcea9498591d15eab69187274e4c32c57879e4e
Reviewed-on: https://review.whamcloud.com/32370
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-11019 build: Update ZFS/SPL to 0.7.9 88/32388/2
Nathaniel Clark [Mon, 14 May 2018 19:06:45 +0000 (15:06 -0400)]
LU-11019 build: Update ZFS/SPL to 0.7.9

This updates the ZFS version to 0.7.9.

https://github.com/zfsonlinux/zfs/releases/tag/zfs-0.7.9

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I9452a589d9dc719de7a63d3ed287dec8b6f7c0b6
Reviewed-on: https://review.whamcloud.com/32388
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
3 years agoLU-4423 libcfs: disable preempt while sampling processor id. 60/32360/3
NeilBrown [Sat, 12 May 2018 17:52:58 +0000 (13:52 -0400)]
LU-4423 libcfs: disable preempt while sampling processor id.

Calling smp_processor_id() without disabling preemption
triggers a warning (if CONFIG_DEBUG_PREEMPT).
I think the result of cfs_cpt_current() is only used as a hint for
load balancing, rather than as a precise and stable indicator of
the current CPU.  So it doesn't need to be called with
preemption disabled.

So disable preemption inside cfs_cpt_current() to silence the warning.

Linux-commit: dbeccabf5294e80f7cc9ee566746c42211bed736

Change-Id: Iaa930acc7a2633c0e40bcabbe6bd309a3d767325
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32360
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9859 libcfs: rearrange placement of CPU partition management code. 59/32359/3
NeilBrown [Sat, 12 May 2018 13:51:30 +0000 (09:51 -0400)]
LU-9859 libcfs: rearrange placement of CPU partition management code.

Currently the code for cpu-partition tables lives in various places.
The non-SMP code is partly in libcfs/libcfs_cpu.h as static inlines,
and partly in lnet/libcfs/libcfs_cpu.c - some of the functions are
tiny and could well be inlines.

The SMP code is all in lnet/libcfs/linux/linux-cpu.c.

This patch moves all the trivial non-SMP functions into
libcfs_cpu.h as inlines, and all the SMP functions into libcfs_cpu.c
with the non-trivial !SMP code.

Now when you go looking for some function, it is easier to find both
versions together when neither is trivial.

There is no code change here - just code movement.

Linux-commit: 93aa2c2a5091bd47819a3ead4af70fb57fda5065

Change-Id: I5250a52cad576eaeec17de176a3ca45ad076c4b9
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32359
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>