Whamcloud - gitweb
fs/lustre-release.git
6 years agoLU-10565 osd: unify interface for vfs 46/31646/6
Yang Sheng [Fri, 26 Jan 2018 17:26:26 +0000 (01:26 +0800)]
LU-10565 osd: unify interface for vfs

Some VFS changes were applied to other parts of the code but not
to the OSD. Apply them to the OSD layer as well.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: Ia3e907964d6321571f52e4c24a46a8ab64e4d056
Reviewed-on: https://review.whamcloud.com/31646
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10710 tests: fix run_write_disjoint line continuation 45/31645/2
James Nunez [Wed, 14 Mar 2018 18:42:39 +0000 (12:42 -0600)]
LU-10710 tests: fix run_write_disjoint line continuation

There is a problem with creating a command in
run_write_disjoint() due to a line continuation followed
on the next line by tabs. We need to remove the closing
quotation mark before the line continuation and the opening
quotation mark on the following line.

Test-Parameters: trivial testlist=parallel-scale
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I32dc620dd5c3e3d305d0bf985a096e69c18404d1
Reviewed-on: https://review.whamcloud.com/31645
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10565 osd: bi_error, pagevec_init, PAGE_CACHE_SHIFT changes 44/31644/2
Yang Sheng [Wed, 14 Mar 2018 09:36:48 +0000 (17:36 +0800)]
LU-10565 osd: bi_error, pagevec_init, PAGE_CACHE_SHIFT changes

 - bi_error is replaced by bi_status in struct bio
 - pagevec_init() takes one parameter
 - PAGE_CACHE_SHIFT has been removed

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: Ia04124d6d636d132550a63e1f8144c26cab39f8e
Reviewed-on: https://review.whamcloud.com/31644
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10819 o2ib: use splice in kiblnd_peer_connect_failed() 43/31643/2
John L. Hammond [Wed, 14 Mar 2018 17:12:06 +0000 (12:12 -0500)]
LU-10819 o2ib: use splice in kiblnd_peer_connect_failed()

In kiblnd_peer_connect_failed() replace a backwards list_add() and
list_del() with list_splice_init().
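
A minimal sketch of why the two forms are equivalent, using a toy Python model of the kernel's circular doubly-linked lists rather than the real C macros. The helper names mirror the kernel API but are reimplemented here purely for illustration, and list_del_init() is assumed for the "backwards" form so that the old head ends up empty in both cases.

```python
class ListHead:
    """Toy model of struct list_head (circular, doubly linked)."""
    def __init__(self, name=None):
        self.name = name
        self.next = self
        self.prev = self

def list_add(new, head):
    """Insert 'new' immediately after 'head'."""
    new.next = head.next
    new.prev = head
    head.next.prev = new
    head.next = new

def list_add_tail(new, head):
    """Insert 'new' immediately before 'head'."""
    new.prev = head.prev
    new.next = head
    head.prev.next = new
    head.prev = new

def list_del_init(entry):
    """Unlink 'entry' and leave it as an empty list."""
    entry.prev.next = entry.next
    entry.next.prev = entry.prev
    entry.next = entry.prev = entry

def list_splice_init(src, head):
    """Move all entries of 'src' to the front of 'head', then empty 'src'."""
    if src.next is src:
        return
    first, last = src.next, src.prev
    first.prev = head
    last.next = head.next
    head.next.prev = last
    head.next = first
    src.next = src.prev = src

def entries(head):
    """Names of the entries on a list, in order."""
    out, node = [], head.next
    while node is not head:
        out.append(node.name)
        node = node.next
    return out

def make_queue(names):
    head = ListHead()
    for n in names:
        list_add_tail(ListHead(n), head)
    return head

# "Backwards" form: add the new head into the old list, then unlink the old head.
old_a, zombies_a = make_queue(["tx1", "tx2", "tx3"]), ListHead()
list_add(zombies_a, old_a)
list_del_init(old_a)

# Splice form: states the intent directly.
old_b, zombies_b = make_queue(["tx1", "tx2", "tx3"]), ListHead()
list_splice_init(old_b, zombies_b)
```

Both variants leave the entries on the new head in their original order and the old head empty; the splice form just says so in one call.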

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ib00d5d911d1070b6c8b49f14a2c7fc3552da553c
Reviewed-on: https://review.whamcloud.com/31643
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-4939 obdclass: llog_print params file 20/31620/8
Ben Evans [Fri, 9 Mar 2018 20:51:26 +0000 (15:51 -0500)]
LU-4939 obdclass: llog_print params file

Allow llog_print to handle the params file in YAML.

Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: Icf286bca7a1466bf3c8d9084971e58d2e8b8a651
Test-Parameters: trivial testlist=sanity
Reviewed-on: https://review.whamcloud.com/31620
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
6 years agoLU-10785 llite: use xattr_handler name for ACLs 95/31595/5
John L. Hammond [Thu, 8 Mar 2018 21:27:28 +0000 (15:27 -0600)]
LU-10785 llite: use xattr_handler name for ACLs

If struct xattr_handler has a name member then use it (rather than
prefix) for the ACL xattrs. This avoids a bug where ACL operations
failed for some kernels.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I28f6c5dbe3cdc4155e93d388d2c413092e02c082
Reviewed-on: https://review.whamcloud.com/31595
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10787 llite: correct removexattr detection 94/31594/3
John L. Hammond [Thu, 8 Mar 2018 19:30:46 +0000 (13:30 -0600)]
LU-10787 llite: correct removexattr detection

In ll_xattr_set_common() detect the removexattr() case correctly by
testing for a NULL value as well as XATTR_REPLACE.
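
The check described above can be sketched as follows. This is a Python illustration, not the llite code: None stands in for a NULL value pointer, the flag values are the ones from Linux's <linux/xattr.h>, and the function name is hypothetical.

```python
XATTR_CREATE = 0x1   # value from <linux/xattr.h>
XATTR_REPLACE = 0x2  # value from <linux/xattr.h>

def is_removexattr(value, flags):
    """True only for the removexattr() case: NULL value AND XATTR_REPLACE."""
    return value is None and flags == XATTR_REPLACE
```

Testing the flag alone would misclassify an ordinary setxattr(..., XATTR_REPLACE) call, which carries a non-NULL value (possibly of zero length).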

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I29a29851ad4ac432e257b63088e2d7a7dfc39605
Reviewed-on: https://review.whamcloud.com/31594
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10788 llite: pass flags through __vfs_setxattr() 93/31593/3
John L. Hammond [Thu, 8 Mar 2018 19:23:34 +0000 (13:23 -0600)]
LU-10788 llite: pass flags through __vfs_setxattr()

In the compat definition of __vfs_setxattr() pass the flags we
received down to the handler. For consistency with upstream return
-EOPNOTSUPP if no handler could be found.
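
A rough Python model of the compat behaviour described above. The handler table, the prefix matching, and the function name are assumptions made for illustration; only the two points from the message (flags are forwarded, -EOPNOTSUPP when no handler matches) are taken from the source.

```python
import errno

def compat_vfs_setxattr(handlers, name, value, flags):
    """Dispatch to the first handler whose prefix matches 'name'.

    'handlers' maps a prefix such as "user." to a callable taking
    (suffix, value, flags). The caller's flags are passed through unchanged.
    """
    for prefix, handler in handlers.items():
        if name.startswith(prefix):
            return handler(name[len(prefix):], value, flags)
    # No handler found: same error the message says upstream returns.
    return -errno.EOPNOTSUPP
```

With a table containing only a "user." handler, setting "user.foo" reaches that handler with the original flags, while "bogus.foo" yields -EOPNOTSUPP.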

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I78b88d1521dd000e328f1add1a6159c70d16f5a7
Reviewed-on: https://review.whamcloud.com/31593
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
6 years agoLU-10792 llite: remove unused parameters from md_{get,set}xattr() 92/31592/3
John L. Hammond [Thu, 8 Mar 2018 19:03:54 +0000 (13:03 -0600)]
LU-10792 llite: remove unused parameters from md_{get,set}xattr()

md_getxattr() and md_setxattr() each have several unused
parameters. Remove them and improve the naming of the remaining
parameters.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I578bdd5dab70745ba7f8fbb9f047fa9eb1f6ee9a
Reviewed-on: https://review.whamcloud.com/31592
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10541 llite: setxattr directly in ll_set_acl 88/31588/4
John L. Hammond [Thu, 8 Mar 2018 18:55:42 +0000 (12:55 -0600)]
LU-10541 llite: setxattr directly in ll_set_acl

Call md_setxattr() directly from ll_set_acl().

Test-Parameters: alwaysuploadlogs clientdistro=sles12sp3 testlist=parallel-scale-nfsv3
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ie266ee4fe7a67338122a6a3effb545d3dbaee008
Reviewed-on: https://review.whamcloud.com/31588
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10779 llite: rename FSFILT_IOC_* to system flags 46/31546/3
Jinshan Xiong [Tue, 6 Mar 2018 16:54:11 +0000 (08:54 -0800)]
LU-10779 llite: rename FSFILT_IOC_* to system flags

Those definitions were probably created for compatibility. Now that
FS_IOC_* have existed in the kernel for a long time, we should use
them to avoid confusion.

Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Change-Id: Id3b72233b619f1cf761ec5769e27b94af862cd22
Reviewed-on: https://review.whamcloud.com/31546
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
6 years agoLU-10776 osc: Do not request more than 2GiB grant 33/31533/2
Patrick Farrell [Mon, 5 Mar 2018 16:24:32 +0000 (10:24 -0600)]
LU-10776 osc: Do not request more than 2GiB grant

The server enforces a grant limit of 2 GiB, which the
client must honor.  The existing client code combined with
16 MiB RPCs makes it possible for the client to ask for
more than this limit.

Make this limit explicit, and also fix an overflow bug in
o_undirty calculation in osc_announce_cached.  (o_undirty
is a 32 bit value and 16 MiB*256 rpcs_in_flight = 4 GiB.
4 GiB + extra grant components overflows o_undirty.)
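
The arithmetic in the parenthesis can be checked with a short Python sketch. The masking models a 32-bit field; the clamp to exactly 2 GiB and the function names are assumptions for illustration, since the message does not give the constant the patch uses.

```python
MiB = 1 << 20
GiB = 1 << 30
U32_MASK = 0xFFFFFFFF
GRANT_LIMIT = 2 * GiB  # server-side limit named in the message

def undirty_u32(rpc_size, rpcs_in_flight, extra=0):
    """What a 32-bit field would hold if the sum were stored unclamped."""
    return (rpc_size * rpcs_in_flight + extra) & U32_MASK

def undirty_clamped(rpc_size, rpcs_in_flight, extra=0):
    """Compute in full width first, then cap at the grant limit."""
    return min(rpc_size * rpcs_in_flight + extra, GRANT_LIMIT)

wrapped = undirty_u32(16 * MiB, 256, extra=4096)      # 4 GiB + 4 KiB wraps to 4096
capped = undirty_clamped(16 * MiB, 256, extra=4096)   # stays at 2 GiB
```

16 MiB * 256 is exactly 2^32, so an unclamped 32-bit field keeps only the small "extra" component and the client would announce almost nothing.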

Cray-bug-id: LUS-5750
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: Ifcb8a9ea7529eae4cd209dc72223ed039c6f6a0d
Reviewed-on: https://review.whamcloud.com/31533
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
6 years agoLU-9444 tests: replace SINGLEMDS1 with SINGLEMDS 20/31420/4
James Nunez [Mon, 26 Feb 2018 17:59:50 +0000 (10:59 -0700)]
LU-9444 tests: replace SINGLEMDS1 with SINGLEMDS

In conf-sanity test 87, we use the global variable SINGLEMDS1
to get the version of the MDS. SINGLEMDS1 is not defined and
the test should use SINGLEMDS to check the version of the MDS.

Test-Parameters: trivial testlist=conf-sanity
Test-Parameters: envdefinitions=ONLY=87 mdsjob=lustre-b2_9 ossjob=lustre-b2_9 serverbuildno=22 testlist=conf-sanity
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ic8b1f32b87cc596fcc2e98d5b6095b6e4171bfd7
Reviewed-on: https://review.whamcloud.com/31420
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
6 years agoLU-10629 lod: Clear OST pool with setstripe 64/31364/12
Ben Evans [Wed, 21 Feb 2018 18:17:58 +0000 (13:17 -0500)]
LU-10629 lod: Clear OST pool with setstripe

When setstripe -d is run on a directory, we should
clear the OST pool along with all the other settings.
Currently there is no way to clear an OST pool,
only to change it.

Signed-off-by: Ben Evans <bevans@cray.com>
Cray-bug-id: LUS-5696
Change-Id: I50426ce79ab153a715d29cc5d54b0ce70726da41
Reviewed-on: https://review.whamcloud.com/31364
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10649 llite: yield cpu after call to ll_agl_trigger 40/31240/4
Ann Koehler [Wed, 7 Jun 2017 19:28:03 +0000 (14:28 -0500)]
LU-10649 llite: yield cpu after call to ll_agl_trigger

The statahead and agl threads loop over all entries in the
directory without yielding the CPU. If the number of entries in
the directory is large enough then these threads may trigger
soft lockups. The fix is to add calls to cond_resched() after
calling ll_agl_trigger(), which gets the glimpse lock for a
file.

Change-Id: I4fbc72a3c6bc77f2ffd8e3fd0daf4c8906bb954a
Cray-bug-id: LUS-2584
Signed-off-by: Chris Horn <hornc@cray.com>
Reviewed-on: https://review.whamcloud.com/31240
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10643 ptlrpc: ptlrpc_register_bulk() LBUG on ENOMEM 28/31228/8
Andriy Skulysh [Tue, 19 Dec 2017 09:20:21 +0000 (11:20 +0200)]
LU-10643 ptlrpc: ptlrpc_register_bulk() LBUG on ENOMEM

Assertion fails on !desc->bd_registered during
retry after ENOMEM.

Drop bd_registered flag and exit via cleanup_bulk
to ensure that bulk is fully unregistered.

Cray-bug-id: MRP-4733
Change-Id: I51be5ec041ef903040bf8508156da8079511c9f7
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/31228
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10598 obdclass: ignore IGIF formatted last_id 40/31140/2
Fan Yong [Fri, 2 Feb 2018 07:44:26 +0000 (15:44 +0800)]
LU-10598 obdclass: ignore IGIF formatted last_id

All FIDs with a sequence within [FID_SEQ_IGIF, FID_SEQ_IGIF_MAX]
are valid IGIFs regardless of what the f_oid is. So an IGIF with a zero
f_oid is also a valid IGIF, not a last_id, and the last_id check logic
should ignore IGIF formatted last_id.
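
A small Python sketch of the rule stated above. The numeric values of FID_SEQ_IGIF and FID_SEQ_IGIF_MAX and the "f_oid == 0" last_id test are assumptions for illustration; the real definitions live in the Lustre headers.

```python
FID_SEQ_IGIF = 12              # assumed value, for illustration only
FID_SEQ_IGIF_MAX = 0xFFFFFFFF  # assumed value, for illustration only

def fid_seq_is_igif(seq):
    """True if the sequence falls in the IGIF range."""
    return FID_SEQ_IGIF <= seq <= FID_SEQ_IGIF_MAX

def looks_like_last_id(seq, oid):
    """A zero f_oid marks a last_id, except inside the IGIF range."""
    if fid_seq_is_igif(seq):
        return False
    return oid == 0
```

Under these assumptions a FID with an IGIF sequence and f_oid 0 is treated as an ordinary object, while the same f_oid outside the IGIF range is still recognised as a last_id.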

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I81dc7b237e91688b09f360e43899a1de2c44bf78
Reviewed-on: https://review.whamcloud.com/31140
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10560 libcfs: Use kernel_write when appropriate 54/31154/17
Mike Marciniszyn [Tue, 27 Feb 2018 15:25:59 +0000 (10:25 -0500)]
LU-10560 libcfs: Use kernel_write when appropriate

Changes in the upstream kernel might have removed
vfs_write() in favor of kernel_write().

Unfortunately, kernel_write() was initially exported
with an API that is not plug compatible with vfs_write().

The fallback order is:
- kernel_write new API
- vfs_write

Change-Id: I67f73786308561dc42b06d51c26bfb94021b7589
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-on: https://review.whamcloud.com/31154
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10461 tests: call exit in the skip routine 64/30964/8
James Nunez [Sun, 21 Jan 2018 22:31:31 +0000 (15:31 -0700)]
LU-10461 tests: call exit in the skip routine

There are many reasons to not run, or skip, a test; the test
may require a certain number of servers or a certain Lustre version.
In these cases, the skip() or skip_env() routine is called. When we
call skip, the intention is to exit the routine early. Thus, call
‘exit 0’ at the end of the skip() routine.

Some calls to skip() are changed to skip_env() when a test is being
skipped due to the Lustre configuration or test environment.

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I42fd9535c0a803f334dfc5685f451a6bdc85e84b
Reviewed-on: https://review.whamcloud.com/30964
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10383 hsm: ignore compound_id 49/30949/5
John L. Hammond [Mon, 18 Dec 2017 15:24:33 +0000 (09:24 -0600)]
LU-10383 hsm: ignore compound_id

Ignore request compound ids in the HSM coordinator. Compound ids
prevent batching of CDT to CT requests and degrade HSM
performance. Use CT/archive id compatibility when deciding which HSM
actions to put in a request.

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I38513f3b75313eb78bfb9811ab4e40e3e2b904c7
Reviewed-on: https://review.whamcloud.com/30949
Tested-by: Jenkins
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10383 hsm: add action count to hsm scan data 35/31235/3
John L. Hammond [Thu, 8 Feb 2018 19:19:39 +0000 (13:19 -0600)]
LU-10383 hsm: add action count to hsm scan data

Add an 'hsm_action_count' member to struct hsm_scan_data to count the
total number of actions in all requests in the hsd. Add an 'hsd_'
prefix to all pre-existing members of struct hsm_scan_data.

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Iab784a0f281d697bc0db758f20ce500315b8194a
Reviewed-on: https://review.whamcloud.com/31235
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10383 hsm: remove struct hsm_thread_data 34/31234/3
John L. Hammond [Thu, 8 Feb 2018 17:31:37 +0000 (11:31 -0600)]
LU-10383 hsm: remove struct hsm_thread_data

Remove struct hsm_thread_data. Move allocation of the HSM scan data
requests array to mdt_coordinator().

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I12075dc000c312d2432c8e32787ed36560d1ae42
Reviewed-on: https://review.whamcloud.com/31234
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10802 nrs: mismatch problem for wildcard in jobid TBF 62/29162/7
Qian Yingjin [Fri, 22 Sep 2017 03:02:24 +0000 (11:02 +0800)]
LU-10802 nrs: mismatch problem for wildcard in jobid TBF

When the NRS JOBID rule
"start runas jobid={*.500} rate=10" is set and dd is run as user 500,
the RPC rate is not limited.
This patch fixes this mismatch problem for wildcards in TBF JOBID.
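
The intended matching can be illustrated in Python. Here fnmatch stands in for the kernel-side matcher, which this log does not show, and the function name and the "dd.500" jobid format (procname.uid) are assumptions for the example.

```python
from fnmatch import fnmatchcase

def jobid_rule_matches(patterns, jobid):
    """True if 'jobid' matches any glob pattern from a jobid={...} rule."""
    return any(fnmatchcase(jobid, pattern) for pattern in patterns)

hit = jobid_rule_matches(["*.500"], "dd.500")    # True: dd run by uid 500
miss = jobid_rule_matches(["*.500"], "dd.501")   # False: different uid
```

A rule that only compares the pattern literally would never match "dd.500" against "*.500", which is the kind of mismatch the message describes.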

Test-Parameters: trivial testlist=sanityn
Change-Id: I39a8e691c9dc8273ed9fce686eeef71be1ac3e43
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/29162
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9043 test: remove conf-sanity test 24a ALWAYS_EXCEPT 40/28540/10
dilip krishnagiri [Tue, 24 Jan 2017 17:32:39 +0000 (10:32 -0700)]
LU-9043 test: remove conf-sanity test 24a ALWAYS_EXCEPT

conf-sanity test 24a was added to the ALWAYS_EXCEPT list
due to bugzilla 23573. The issue described in bugzilla
23573 was fixed and landed to master.

conf-sanity test 24a should be removed from the
ALWAYS_EXCEPT list.

Signed-off-by: dilip krishnagiri <dilipx.krishnagiri@intel.com>
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Id42b846d5fb34e8ebeb7fab63aeeafea40782321
Reviewed-on: https://review.whamcloud.com/28540
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9658 ptlrpc: Add QoS for uid and gid in NRS-TBF 08/27608/36
Teddy Chan [Fri, 9 Mar 2018 10:20:40 +0000 (18:20 +0800)]
LU-9658 ptlrpc: Add QoS for uid and gid in NRS-TBF

This patch adds a new QoS feature to the TBF policy which
limits the rate based on uid or gid. The policy is able to
limit the rate on both the MDT and OSS side.
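
As background, TBF stands for token bucket filter. The sketch below shows the general token bucket idea applied per uid/gid rule; it is not the NRS implementation, and the class, the bucket depth, and the rule representation are all hypothetical.

```python
class TokenBucket:
    """Generic token bucket: 'rate' tokens per second, at most 'depth' stored."""
    def __init__(self, rate, depth, now=0.0):
        self.rate = rate
        self.depth = depth
        self.tokens = float(depth)
        self.last = now

    def allow(self, now):
        """Refill for the elapsed time, then spend one token if available."""
        self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

def rule_matches(rule, uid, gid):
    """'rule' is e.g. {"uid": {500}} or {"uid": {500}, "gid": {500}}.

    Every key present in the rule must match (the combined form).
    """
    ids = {"uid": uid, "gid": gid}
    return all(ids[key] in allowed for key, allowed in rule.items())

bucket = TokenBucket(rate=100, depth=3)
burst = [bucket.allow(0.0) for _ in range(4)]   # three pass, the fourth waits
later = bucket.allow(1.0)                       # refilled after one second
```

A request is first matched against the rules by its uid and gid; the matching rule's bucket then decides whether it may be served now or must wait for a token.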

The commands for this feature look like this:
Start the tbf uid QoS on OST:
    lctl set_param ost.OSS.*.nrs_policies="tbf uid"
Limit the rate of ptlrpc requests of the uid 500
    lctl set_param ost.OSS.*.nrs_tbf_rule=
 "start tbf_name uid={500} rate=100"

Start the tbf gid QoS on OST:
    lctl set_param ost.OSS.*.nrs_policies="tbf gid"
Limit the rate of ptlrpc requests of the gid 500
    lctl set_param ost.OSS.*.nrs_tbf_rule=
 "start tbf_name gid={500} rate=100"

or use generic tbf rule to mix them on OST:
    lctl set_param ost.OSS.*.nrs_policies="tbf"
Limit the rate of ptlrpc requests of the uid 500 gid 500
    lctl set_param ost.OSS.*.nrs_tbf_rule=
 "start tbf_name uid={500}&gid={500} rate=100"

Also, you can use the following rule to control all requests
to the MDS:
Start the tbf uid QoS on MDS:
    lctl set_param mds.MDS.*.nrs_policies="tbf uid"
Limit the rate of ptlrpc requests of the uid 500
    lctl set_param mds.MDS.*.nrs_tbf_rule=
 "start tbf_name uid={500} rate=100"

Change-Id: I440ad087dd3dbacd8b5228717b0a1724ef47e3b4
Signed-off-by: Teddy Chan <teddy@ddn.com>
Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/27608
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9592 tests: remove sanity-quota tests from ALWAYS_EXCEPT 10/27410/6
dilip krishnagiri [Wed, 9 Aug 2017 17:38:12 +0000 (11:38 -0600)]
LU-9592 tests: remove sanity-quota tests from ALWAYS_EXCEPT

Remove sanity-quota tests
34 "Usage transfer for user & group & project"
35 "Usage is still accessible across reboot"
from ALWAYS_EXCEPT list.

Test-Parameters: trivial testlist=sanity-quota mdtfilesystemtype=zfs ostfilesystemtype=zfs

Signed-off-by: dilip krishnagiri <dilipx.krishnagiri@intel.com>
Change-Id: I41390699480d9f88b1019c459c142d36fea624fb
Reviewed-on: https://review.whamcloud.com/27410
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10795 quota: fix wrong skipping of reintegration 07/31607/2
Wang Shilong [Fri, 9 Mar 2018 07:38:51 +0000 (15:38 +0800)]
LU-10795 quota: fix wrong skipping of reintegration

There are two problems addressed by this patch:
1) In qsd_prepare(), if @qqi_acct_failed is true,
that only means one type of quota failed; quota
handling should continue.
2) In qsd_config(), only trigger reintegration if
this type of quota is newly enabled. This avoids
annoying messages when the admin runs
$ lctl conf_param lustre.quota.mdt=ug

LustreError: 0-0: lustre-MDT0000: can't enable
quota enforcement since space accounting isn't
functional. Please run tunefs.lustre --quota on
an unmounted filesystem if not done already
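
The second item can be sketched in Python as a comparison of the old and new enforcement settings. The bitmask representation and all names here are assumptions for illustration, not the qsd_config() code.

```python
USRQUOTA, GRPQUOTA = 0, 1  # hypothetical bit positions for "u" and "g"

def newly_enabled_types(old_mask, new_mask):
    """Quota types enabled in new_mask that were not enabled in old_mask."""
    return [qtype for qtype in (USRQUOTA, GRPQUOTA)
            if (new_mask >> qtype) & 1 and not (old_mask >> qtype) & 1]

# "u" already enabled, admin sets "ug": only group quota needs reintegration.
to_reintegrate = newly_enabled_types(0b01, 0b11)
# Setting "ug" again changes nothing, so nothing is triggered.
unchanged = newly_enabled_types(0b11, 0b11)
```

Restricting reintegration to the types returned here means repeating the same conf_param setting does not retrigger the work, or the error message, for types that were already enabled.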

Change-Id: I9bad618e7e8fa836902cac9f446714cd6c03f98a
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/31607
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-6032 obdclass: new wrapper to convert NID to string 56/12956/14
Liang Zhen [Fri, 5 Dec 2014 14:06:52 +0000 (22:06 +0800)]
LU-6032 obdclass: new wrapper to convert NID to string

This patch includes a couple of changes:
- add new wrapper function obd_import_nid2str
- use obd_import_nid2str and obd_export_nid2str to replace all
  libcfs_nid2str conversions for NID of export/import connection

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: I57d08e6ef902c6a34c705663de0ed73bb3dc76f2
Reviewed-on: https://review.whamcloud.com/12956
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-6032 ldlm: don't disable softirq for exp_rpc_lock 57/12957/12
Liang Zhen [Fri, 5 Dec 2014 14:13:17 +0000 (22:13 +0800)]
LU-6032 ldlm: don't disable softirq for exp_rpc_lock

It is not necessary to call ldlm_lock_busy() in the context of the timer
callback; we can call it in the thread context of expired_lock_main.
With this change, we don't need to disable softirq for exp_rpc_lock.

Instead of moving busy locks to the end of the waiting list one
at a time in the context of the timer callback, move any locks
that may be expired onto the expired list.  If these locks are
still being used by RPCs being processed, then put them back
onto the end of the waiting list instead of evicting the client.
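
The split of work described above can be modelled in a few lines of Python. This is a sketch of the control flow only: locks are plain dicts, the busy test is passed in as a callable, and the function names are hypothetical.

```python
from collections import deque

def timer_callback(waiting, expired, now):
    """Timer context: move possibly expired locks, with no busy check here."""
    while waiting and waiting[0]["deadline"] <= now:
        expired.append(waiting.popleft())

def expired_lock_thread(waiting, expired, is_busy, new_deadline):
    """Thread context: requeue busy locks, report the rest for eviction."""
    evicted = []
    while expired:
        lock = expired.popleft()
        if is_busy(lock):
            lock["deadline"] = new_deadline
            waiting.append(lock)   # back onto the end of the waiting list
        else:
            evicted.append(lock["client"])
    return evicted
```

Because the busy check only runs in the thread, the lock it needs is never taken from timer (softirq) context in this model, which is the point the message makes about exp_rpc_lock.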

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: Ic3da0dd4e81b758c7448d9613ccd4786693e075d
Reviewed-on: https://review.whamcloud.com/12957
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-8999 quota: fix issue of multiple call of seq start 21/31721/4
Hongchao Zhang [Wed, 21 Mar 2018 15:17:03 +0000 (23:17 +0800)]
LU-8999 quota: fix issue of multiple call of seq start

Multiple calls of lprocfs_quota_seq_start could change the block
order in the lower levels of the quota tree, which will cause
quota entries to be skipped.

This patch also fixes a problem in walk_tree_dqentry, where some
entries could be skipped because the "index" can be incremented even if
a valid quota entry has been found.

Change-Id: I44936c70d4060bd83db22aba0e3f665981cfa50a
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/31721
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10698 obdclass: allow specifying complex jobids 91/31691/7
Andreas Dilger [Tue, 20 Mar 2018 09:45:36 +0000 (03:45 -0600)]
LU-10698 obdclass: allow specifying complex jobids

Allow specifying a format string for the jobid_name variable to create
a jobid for processes on the client.  The jobid_name is used when
jobid_var=nodelocal, if jobid_name contains "%j", or as a fallback if
getting the specified jobid_var from the environment fails.

The jobid_name string allows the following escape sequences:

    %e = executable name
    %g = group ID
    %h = hostname (system utsname)
    %j = jobid from jobid_var environment variable
    %p = process ID
    %u = user ID

Any unknown escape sequences are dropped. Other arbitrary characters
pass through unmodified, up to the maximum jobid string size of 32,
though whitespace within the jobid is not copied.

This allows, for example, specifying an arbitrary prefix, such as the
cluster name, in addition to the traditional "procname.uid" format,
to distinguish between jobs running on clients in different clusters:

    lctl set_param jobid_var=nodelocal jobid_name=cluster2.%e.%u
or
    lctl set_param jobid_var=SLURM_JOB_ID jobid_name=cluster2.%j.%e

To use an environment-specified JobID, if available, but fall back to
a static string for all processes that do not have a valid JobID:

    lctl set_param jobid_var=SLURM_JOB_ID jobid_name=unknown
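
The expansion rules above can be sketched in Python. This is an illustration of the described behaviour, not the obdclass code: the function name and keyword arguments are hypothetical, whitespace is assumed to be dropped from the whole result, and a trailing lone "%" is assumed to be dropped as well.

```python
JOBID_SIZE = 32  # includes the trailing NUL, so 31 usable characters

def expand_jobid(fmt, *, exe, gid, host, jobid, pid, uid):
    """Expand %e %g %h %j %p %u in 'fmt' following the rules listed above."""
    subs = {"e": exe, "g": gid, "h": host, "j": jobid, "p": pid, "u": uid}
    out = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch == "%":
            if i + 1 < len(fmt):
                # Unknown escape sequences expand to nothing (dropped).
                out.append(str(subs.get(fmt[i + 1], "")))
            i += 2
            continue
        out.append(ch)
        i += 1
    text = "".join(c for c in "".join(out) if not c.isspace())
    return text[:JOBID_SIZE - 1]

ctx = dict(exe="dd", gid=100, host="node1", jobid="4242", pid=1234, uid=500)
name_a = expand_jobid("cluster2.%e.%u", **ctx)   # "cluster2.dd.500"
name_b = expand_jobid("cluster2.%j.%e", **ctx)   # "cluster2.4242.dd"
```

The two calls correspond to the two lctl examples: the first reproduces the traditional "procname.uid" form with a cluster prefix, the second embeds the scheduler's job ID instead.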

Implementation notes:

The LUSTRE_JOBID_SIZE includes a trailing NUL, so don't use
"LUSTRE_JOBID_SIZE + 1" anywhere, as that is misleading.

Rename the "obd_jobid_node" variable to "obd_jobid_name" to match
the /proc "jobid_name" parameter name to avoid confusion.

Rename "struct jobid_to_pid_map" to "jobid_pid_map" since this is
not actually mapping from a jobid *to* a PID, but the reverse.
Save jobid length, and reorder fields to avoid holes in structure.

Consolidate PID->jobid cache handling in jobid_get_from_cache(),
which only does environment lookups and caches the results.
The fallback to using obd_jobid_name is handled by the caller.

Rename check_job_name() to jobid_name_is_valid(), since that makes
it clear to the reader a "true" return is a valid name.

In jobid_cache_init() there is no benefit for locking the jobid_hash
creation, since the spinlock is just initialized in this function,
so multiple callers of this function would already be broken.

Pass the buffer size from the callers (who know the buffer size) to
lustre_get_jobid() instead of assuming it is LUSTRE_JOBID_SIZE.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Iad350e87b446c7d2356718cf2e5f9563e63ebbe5
Reviewed-on: https://review.whamcloud.com/31691
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9273 tests: disable random I/O in replay-ost-single/5 71/31671/2
Alex Zhuravlev [Fri, 16 Mar 2018 11:28:57 +0000 (14:28 +0300)]
LU-9273 tests: disable random I/O in replay-ost-single/5

Disable random I/O in replay-ost-single/5 as it's very slow
on ZFS. This is due to grants, as the client consumes them
way too quickly: 1MB blocksize + ~0.5MB metadata overhead
for each random 4K written by iozone.

Test-Parameters: trivial ostcount=7 clients=2 ostfilesystemtype=zfs mdtfilesystemtype=zfs testlist=replay-ost-single envdefinitions=SLOW=yes,ONLY=5
Test-Parameters: trivial ostcount=7 clients=2 ostfilesystemtype=zfs mdtfilesystemtype=zfs testlist=replay-ost-single envdefinitions=SLOW=yes,ONLY=5
Test-Parameters: trivial ostcount=7 clients=2 ostfilesystemtype=zfs mdtfilesystemtype=zfs testlist=replay-ost-single envdefinitions=SLOW=yes,ONLY=5
Test-Parameters: trivial ostcount=7 clients=2 ostfilesystemtype=zfs mdtfilesystemtype=zfs testlist=replay-ost-single envdefinitions=SLOW=yes,ONLY=5
Test-Parameters: trivial ostcount=7 clients=2 ostfilesystemtype=zfs mdtfilesystemtype=zfs testlist=replay-ost-single envdefinitions=SLOW=yes,ONLY=5

Change-Id: Ic49429b8c681fdc16e5f95f483d78198b6f4804c
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/31671
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
6 years agoLU-10264 mdc: fix possible NULL pointer dereference 21/31621/3
Andreas Dilger [Fri, 9 Mar 2018 23:18:53 +0000 (16:18 -0700)]
LU-10264 mdc: fix possible NULL pointer dereference

Fix two static analysis errors.

lustre/mdc/mdc_dev.c: in mdc_enqueue_send(), pointer 'matched' return
    from call to function 'ldlm_handle2lock' at line 704 may be NULL
    and will be dereferenced at line 705.
If the client is evicted between ldlm_lock_match() and ldlm_handle2lock()
the lock pointer could be NULL.

lustre/lov/lov_dev.c:488 in lov_process_config, sscanf format
    specification '%d' expects type 'int' for 'd', but parameter 3
    has a different type '__u32'.
Converting to kstrtou32() requires changing the "index" variable type
from __u32 to u32, which is fine since it is only used internally.
Fix up the few functions that are also passing "__u32 index" and the
resulting checkpatch.pl warnings.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I3cc80d66bbb537161a561f4f2ba7830ddebcab07
Reviewed-on: https://review.whamcloud.com/31621
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-7420 echo: fix echo server to work with unified target 43/18443/11
Mikhail Pershin [Tue, 27 Mar 2018 11:00:47 +0000 (14:00 +0300)]
LU-7420 echo: fix echo server to work with unified target

After the introduction of the Unified Target, the echo server lost its
ability to serve incoming requests, i.e. to work like a fake OFD.
This patch restores that functionality, so the echo server is able to
process requests from the echo client via the network.

Test-Parameters: trivial testlist=obdfilter-survey
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Change-Id: I0c0d347486463ce320c7c66a1f85f6979b9a3681
Reviewed-on: https://review.whamcloud.com/18443
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10752 build: fix rpm packaging issues for gss 57/31757/7
James Simmons [Thu, 29 Mar 2018 17:02:42 +0000 (13:02 -0400)]
LU-10752 build: fix rpm packaging issues for gss

Lustre can create rpms in two ways. One is with make rpm and the
other is using the actual source rpm that is provided. There are
several issues with how GSS is handled with rpm packaging.

The first problem is that ./configure --disable-gss has never been
handled. Secondly, if you configure with disable-gss it is still
possible to have the option enable-gss-keyring set to yes. The reason
it was never seen before is that everything was handled through the
keyring option. Now if the user sets enable-gss to no then
enable-gss-keyring will also be set to no, even if the user tries to
set it to yes. This is done by properly setting $enable_gss and
$enable_gss_keyring in lustre-core.m4.
In the spec file, create the bcond gss to handle the gss-only case,
and turn on gss if gss_keyring is true. Move lgssc.conf under the
with_gss_keyring bcond, which is only needed for server builds,
alongside lsvcgss.

It is impossible to know if it can build due to the spec file not
properly handling build dependencies for GSS and not knowing if
the kernel is too new for GSS. So the user has to provide the
options --with gss and / or --with gss-keyring to rpmbuild. If
the user only provides gss-keyring option to rpmbuild make sure
it enables gss as well. That is handled in the spec file.

For the case of make rpms fix it up so if gss-keyring is enabled
then by default the core gss handling is enabled. Also handle the
long ignored enable-gss case.

Test-Parameters: trivial

Change-Id: Ieed9df98a27bd6e77504486762d6e60ddca5a916
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31757
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9551 utils: add l_tunedisk to fix disk tunings 64/31464/6
Nathaniel Clark [Wed, 28 Feb 2018 22:18:09 +0000 (17:18 -0500)]
LU-9551 utils: add l_tunedisk to fix disk tunings

This adds the l_tunedisk utility, which uses the osd_tune_lustre call
from mount_utils.h.  This can be called from udev.
This also adds a udev rule to fix disk tunings.
This in some ways duplicates LU-9132, which sets this value at mount
time, but if a multipath component is removed and then re-added, the
multipath's max_sectors_kb will not propagate to the newly added device,
and this will now cause an error for I/Os that would violate this.

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I35330ebe75552d71b71212f9fae00cfdcc028ea1
Reviewed-on: https://review.whamcloud.com/31464
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
6 years agoLU-10773 obdclass: yield cpu during changelog_block_trim_ext 16/31516/2
Fan Yong [Mon, 5 Mar 2018 15:11:21 +0000 (23:11 +0800)]
LU-10773 obdclass: yield cpu during changelog_block_trim_ext

Yield the CPU to avoid a soft lockup if there are too many records to
be handled. The patch also filters out zero-sized records to avoid an
endless loop.
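
The two ideas can be sketched in userspace as follows. This is only an
illustration under assumptions: struct demo_rec, DEMO_YIELD_INTERVAL and
demo_process_records() are hypothetical names, the real changelog record
layout differs, and kernel code would typically call cond_resched()
where this sketch calls sched_yield().

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical record; only the length matters for this sketch. */
struct demo_rec {
	size_t len;	/* record length in bytes, 0 means "empty" */
};

#define DEMO_YIELD_INTERVAL 1024	/* assumed value, not from the patch */

/* Walk the records, skipping empty ones and giving up the CPU every
 * DEMO_YIELD_INTERVAL iterations. Returns the number of handled records. */
static size_t demo_process_records(const struct demo_rec *recs, size_t count)
{
	size_t i, handled = 0;

	assert(recs != NULL || count == 0);

	for (i = 0; i < count; i++) {
		if ((i + 1) % DEMO_YIELD_INTERVAL == 0)
			sched_yield();	/* let other tasks run */

		if (recs[i].len == 0)
			continue;	/* filter out zero-sized records */

		handled++;
	}
	return handled;
}
```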

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ia094f9153b5ef2602103d2ee13ee7ad3ffe6dc4f
Reviewed-on: https://review.whamcloud.com/31516
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10761 osd-ldiskfs: not create REMOTE_PARENT_DIR on OST 08/31508/3
Fan Yong [Fri, 16 Mar 2018 06:28:01 +0000 (14:28 +0800)]
LU-10761 osd-ldiskfs: not create REMOTE_PARENT_DIR on OST

The REMOTE_PARENT_DIR is used to link a remote object whose parent
resides on a remote MDT into the global namespace. It is only useful
on an MDT, so it is unnecessary to create this directory on an OST.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I240de3f69cde04740cb7f71ebaf9048407a900dc
Reviewed-on: https://review.whamcloud.com/31508
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10837 ldiskfs: skip bitmap check if block bitmap is uninitialized 20/31720/3
Wang Shilong [Thu, 22 Mar 2018 05:59:55 +0000 (13:59 +0800)]
LU-10837 ldiskfs: skip bitmap check if block bitmap is uninitialized

See comments in ext4_free_clusters_after_init:
/* Return the number of free blocks in a block group.  It is used when
 * the block bitmap is uninitialized, so we can't just count the bits
 * in the bitmap. */
So the extra check we added here is wrong if this block group's
bitmap is uninitialized, since we only check bitmaps here.

Further, looking at the code that clears EXT4_BG_BLOCK_UNINIT, the
kernel reinitializes free_clusters_count when it clears the flag, so an
extra check for uninitialized block bitmaps doesn't make much sense.

Let's skip the block bitmap check if EXT4_BG_BLOCK_UNINIT is set,
since whatever free count the group descriptor recorded cannot be trusted.

Change-Id: I845f2e0e17e53b7e3073399bd8b0a85e3db66ef8
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/31720
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
6 years agoLU-10703 nodemap: save and clear fileset correctly 50/31450/13
Emoly Liu [Tue, 20 Mar 2018 09:42:29 +0000 (17:42 +0800)]
LU-10703 nodemap: save and clear fileset correctly

This patch is to fix the following two issues:
- When processing the nodemap_idx_type "NODEMAP_CLUSTER_IDX" in
  nodemap_process_keyrec(), fileset should be saved, otherwise,
  it will be changed to empty every time when client is notified
  to fetch nodemap logs (mgc_process_recover_nodemap_log()->
  nodemap_process_idx_pages()->nodemap_process_keyrec()).
- Allow 'fileset=clear' in addition to 'fileset=""' to clear
  fileset because either 'lctl set_param -P *.*.fileset=""' or
  'lctl nodemap_set_fileset --fileset ""' can only work on MGS,
  while on other non-MGS servers, they both will invoke upcall
  "/usr/sbin/lctl set_param nodemap.default.fileset=" by function
  process_param2_config(), which will cause "no value" error and
  won't clear fileset. 'fileset=""' is still kept for compatibility
  reasons.

Also, sanity-sec.sh test_27a is modified and test_27b is added to
verify this patch.
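
The second fix amounts to treating two spellings as a clear request.
A minimal sketch, assuming a hypothetical helper name
(demo_fileset_is_clear_request() is not taken from the patch):

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the value asks for the fileset to be cleared: either
 * the legacy empty string or the literal word "clear". */
static int demo_fileset_is_clear_request(const char *value)
{
	assert(value != NULL);

	return value[0] == '\0' || strcmp(value, "clear") == 0;
}
```

Accepting the word "clear" avoids passing an empty value through the
upcall, which is where the "no value" error described above arises.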

Change-Id: I23236a4f1b67ac555713d6b3f059df699fdc91dc
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-on: https://review.whamcloud.com/31450
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10830 utils: fix create mode for lfs setstripe 47/31747/4
Andreas Dilger [Fri, 23 Mar 2018 06:01:58 +0000 (00:01 -0600)]
LU-10830 utils: fix create mode for lfs setstripe

Fix create mode for files created by "lfs setstripe" and also
"lfs mirror create" to match regular file creates, which are
filtered by umask to determine the final file create mode.
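
For reference, the umask filtering applied to a regular create can be
sketched as below. The helper names are hypothetical and this is not
the code from the patch; it only shows the arithmetic.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Mode a regular create ends up with: the requested permission bits
 * with every bit set in the umask removed. */
static mode_t demo_effective_mode(mode_t requested, mode_t mask)
{
	return requested & ~mask & 07777;
}

/* Read the process umask. umask() can only be queried by setting it,
 * so restore it right away; this is not safe against other threads. */
static mode_t demo_current_umask(void)
{
	mode_t mask = umask(0);

	umask(mask);
	return mask;
}
```

With a umask of 022, a request for 0666 yields 0644.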

Add test case to verify umask is working correctly in all cases.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I0c9d6730f437dbfbafda4902a035cc0f0ed916b0
Reviewed-on: https://review.whamcloud.com/31747
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoNew tag 2.11.50 2.11.50 v2_11_50 v2_11_50_0
Oleg Drokin [Tue, 3 Apr 2018 17:30:11 +0000 (13:30 -0400)]
New tag 2.11.50

Start of 2.12 development

Change-Id: Ic96437e600ab6d460ea33cf48b36c88913f5d864
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoNew release 2.11 b2_11 2.11.0 v2_11_0 v2_11_0_0
Oleg Drokin [Tue, 3 Apr 2018 17:25:32 +0000 (13:25 -0400)]
New release 2.11

Change-Id: I2e6ea245c130823534a50a14056c4865572f181e
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoNew RC 2.11.0-RC3 2.11.0-RC3 v2_11_0_0_RC3 v2_11_0_RC3
Oleg Drokin [Fri, 30 Mar 2018 22:06:39 +0000 (18:06 -0400)]
New RC 2.11.0-RC3

Change-Id: Iee4f556142bf4f2a9efe61469c95e09fe460ddc0
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10822 utils: stop bogus buffer overflow errors 22/31822/6
Andreas Dilger [Wed, 28 Mar 2018 21:42:06 +0000 (15:42 -0600)]
LU-10822 utils: stop bogus buffer overflow errors

Over-zealous Fortify checks assume that the buffer being used for
snprintf() in get_lmd_info() is sizeof(*lmd) when in fact a larger
buffer has been allocated.  This causes runtime checks to fail and
lfs to core dump:

   *** buffer overflow detected ***: /usr/bin/lfs terminated

Instead of printing directly into "struct lov_user_mds_data", use
a generic buffer to hold the filename passed into the ioctl and
the return data.

There are several places in the code which do the same operations,
namely cb_getstripe(), get_lmd_info(), and ct_md_getattr(), so
change them all to call get_lmd_info() or a new get_lmd_info_fd()
helper to consolidate common code.  Also check the return values
from snprintf() in case there are new callers of this code in the
future that do not actually pass large-enough buffers.
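
Checking the snprintf() return value, as described above, can be
sketched like this. demo_copy_name() is a hypothetical helper and the
choice of -ENAMETOOLONG is an assumption, not necessarily what the
patch returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Copy a file name into a caller-supplied buffer. snprintf() returns
 * the length it wanted to write, so a result >= buflen means the
 * output was truncated. */
static int demo_copy_name(char *buf, size_t buflen, const char *name)
{
	int rc;

	assert(buf != NULL && name != NULL);

	rc = snprintf(buf, buflen, "%s", name);
	if (rc < 0)
		return -EINVAL;		/* output error */
	if ((size_t)rc >= buflen)
		return -ENAMETOOLONG;	/* buffer too small */

	return 0;
}
```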

Test-Parameters: clientdistro=ubuntu1604 serverdistro=el7 testlist=sanity
Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I41b1fcba1f7937fbce3cc7180ed5d73d067cab07
Reviewed-on: https://review.whamcloud.com/31822
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10334 tests: add tests to ALWAYS_EXCEPT for Ubuntu 28/31828/5
James Nunez [Thu, 29 Mar 2018 19:25:36 +0000 (13:25 -0600)]
LU-10334 tests: add tests to ALWAYS_EXCEPT for Ubuntu

Several tests are known to fail when running on Ubuntu clients:
tests 103a, 130a, 130b, 130c, 130d, 130e, 400a, and 410.

Add these tests to the ALWAYS_EXCEPT list to allow Ubuntu
testing to pass.

Test-Parameters: trivial
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I5ff51e94536f4382d670c9a4a1ce0af0c2832b4c
Reviewed-on: https://review.whamcloud.com/31828
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
6 years agoLU-10864 build: update changelog for ubuntu 16.04 26/31826/3
Minh Diep [Thu, 29 Mar 2018 17:37:59 +0000 (10:37 -0700)]
LU-10864 build: update changelog for ubuntu 16.04

Update the changelog to match the kernel we are building.

Test-Parameters: trivial

Change-Id: Ic3a6accda4fc19d56676e2fb84f65942bc107539
Signed-off-by: Minh Diep <minh.diep@intel.com>
Reviewed-on: https://review.whamcloud.com/31826
Reviewed-by: Peter Jones <peter.a.jones@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Joseph Gmitter <joseph.gmitter@intel.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10858 build: handle yaml library packaging on SLES systems 15/31815/3
James Simmons [Wed, 28 Mar 2018 18:07:45 +0000 (14:07 -0400)]
LU-10858 build: handle yaml library packaging on SLES systems

Newer distributions like SLES12 renamed the libyaml package to
libyaml-0-2. Update the spec file to handle this change.

Test-Parameters: clientdistro=sles12sp3 \
ossdistro=sles12sp3 mdsdistro=sles12sp3 \
testlist=sanity,sanity-pfl,sanity-flr

Change-Id: I876d05718194dd555d7d6ffa6433bcc9f445f97e
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31815
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoRevert "LU-6867 test: detect active facet based on current state" 98/31798/5
James Nunez [Tue, 27 Mar 2018 18:28:49 +0000 (18:28 +0000)]
Revert "LU-6867 test: detect active facet based on current state"

This reverts commit 643e3b4316b6c59009c259b96d38495152989df4.

conf-sanity is failing with rmmod errors for Ubuntu clients; LU-10827.
Reverting this patch fixes the issue.

Change-Id: I455d87ea1e2f661c6129c9de577fc660d68d4c4b
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Reviewed-on: https://review.whamcloud.com/31798
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoNew RC 2.11-RC2 2.11.0-RC2 v2_11_0_0_RC2 v2_11_0_RC2
Oleg Drokin [Mon, 26 Mar 2018 22:31:06 +0000 (18:31 -0400)]
New RC 2.11-RC2

Change-Id: Ib5387f4cc463759452d26d4ad539201bd4c82717
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10829 utils: don't print lmm_stripe_offset for DoM layout 02/31702/2
Andreas Dilger [Tue, 20 Mar 2018 23:37:04 +0000 (17:37 -0600)]
LU-10829 utils: don't print lmm_stripe_offset for DoM layout

Running "lfs getstripe" on a DoM file prints out a non-zero value for
"lmm_stripe_offset:" on the 'mdt' component, even though this doesn't
make any sense.  Also, it prints an "lmm_objects:" header for the
component, even though it does not have any objects allocated to it.

  lcm_layout_gen:    4
  lcm_mirror_count:  1
  lcm_entry_count:   3
    lcme_id:             1
    lcme_mirror_id:      0
    lcme_flags:          init
    lcme_extent.e_start: 0
    lcme_extent.e_end:   1048576
      lmm_stripe_count:  0
      lmm_stripe_size:   1048576
      lmm_pattern:       mdt
      lmm_layout_gen:    0
      lmm_stripe_offset: 2
      lmm_objects:

Always print '0' for lmm_stripe_offset of DoM components, and don't
print "lmm_objects:" for these components at all.

Test-Parameters: trivial testlist=sanity-dom,sanity-flr
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I5430ff74d26ad2acd51d07ec23810cc9033ebbe5
Reviewed-on: https://review.whamcloud.com/31702
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10556 build: add require libyaml and zlib 10/31710/2
Minh Diep [Wed, 21 Mar 2018 19:30:42 +0000 (12:30 -0700)]
LU-10556 build: add require libyaml and zlib

Missing libyaml and zlib dev package

Test-Parameters: trivial

Change-Id: I167187c7bd11a2d92a6cc1fa8ccd7076f7ed5a85
Signed-off-by: Minh Diep <minh.diep@intel.com>
Reviewed-on: https://review.whamcloud.com/31710
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
6 years agoLU-10569 build: properly package lustre for Debian/Ubuntu 48/31348/17
James Simmons [Tue, 6 Mar 2018 20:10:56 +0000 (15:10 -0500)]
LU-10569 build: properly package lustre for Debian/Ubuntu

Remove the obsolete linux-patch since patched kernels for lustre
clients have been long gone. Place only the static libraries and
*.so symlinks for the dynamic libraries in lustre-dev. The normal
dynamic libraries are placed into the utilities packages. Add in
all the missing dependencies and fix how the lustre debs are
dependent on each other. Lastly add in the missing lustre-iokit
that is present for rpm packages. The only thing missing is a package
for lustre resources, which can be done at a later time.

Test-Parameters: trivial

Change-Id: I5fd2a23bc1ae73434cef8dcf3679b50878256ab3
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Minh Diep <minh.diep@intel.com>
Reviewed-on: https://review.whamcloud.com/31348
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Thomas Stibor <t.stibor@gsi.de>
Tested-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoNew release candidate 2.11-RC1 2.11.0-RC1 v2_11_0_0_RC1 v2_11_0_RC1
Oleg Drokin [Mon, 19 Mar 2018 18:47:23 +0000 (14:47 -0400)]
New release candidate 2.11-RC1

Change-Id: I8ced43420aa756e242a87f50ffd3601b76b4eb9e
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoRevert "LU-9796 kernel: improve metadata performaces for RHEL7" 83/31683/3
Andreas Dilger [Mon, 19 Mar 2018 01:20:24 +0000 (01:20 +0000)]
Revert "LU-9796 kernel: improve metadata performaces for RHEL7"

This reverts commit 17fe3c192e101ac due to suspected
problems hit in some deployments.

Change-Id: I8cb28b4c69f67583356a7e07cf94ba897ffeb6ee
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-on: https://review.whamcloud.com/31683
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Jenkins
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10800 lnet: Revert "LU-10270 lnet: remove an early rx code" 75/31675/3
John L. Hammond [Fri, 16 Mar 2018 15:20:42 +0000 (10:20 -0500)]
LU-10800 lnet: Revert "LU-10270 lnet: remove an early rx code"

This reverts commit c3894ff80fe4b48f2d62ea33ddc54fb5891e6484. Dropping
early receives caused pings to be ignored and interacted badly with
dynamic discovery.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I99a87a8f58ea67c59d5e85b964295472c2e15de4
Reviewed-on: https://review.whamcloud.com/31675
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10800 lnet: reduce discovery timeout 63/31663/3
Amir Shehata [Thu, 15 Mar 2018 19:12:04 +0000 (12:12 -0700)]
LU-10800 lnet: reduce discovery timeout

Discovery protocol sends a ping (GET) to the peer and expects a
REPLY back with the interface information. Discovery uses the
DEFAULT_PEER_TIMEOUT, which is 180s. This could lead to an extended delay
during mounting if the OSTs are down or if the ping fails for
any reason.

This patch adds a module parameter lnet_transaction_timeout which
defaults to 5 seconds. lnet_transaction_timeout is used for the
discovery timeout.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ida1e19f55552b24e83c8094aa88a37c2748126cf
Reviewed-on: https://review.whamcloud.com/31663
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10804 echo: allow echo server to setup procfs 64/31664/4
Mikhail Pershin [Thu, 15 Mar 2018 19:30:39 +0000 (22:30 +0300)]
LU-10804 echo: allow echo server to setup procfs

Restore lprocfs init for echo server. It is still using
procfs for stats.

Fixes: 0100ab268c3120aa84847a88a2493988f38dee6b
Test-Parameters: trivial envdefinitions=SLOW=yes testlist=obdfilter-survey osscount=1 ostcount=4 mdscount=1 mdtcount=1
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: I7a1bf7de3d7c3202e6da7545da63979555ce6624
Reviewed-on: https://review.whamcloud.com/31664
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
6 years agoLU-5761 tests: fix test_89 to use fs_log_size() 20/31120/6
Andreas Dilger [Sat, 3 Feb 2018 08:27:42 +0000 (01:27 -0700)]
LU-5761 tests: fix test_89 to use fs_log_size()

The test_89 checks should use fs_log_size() to determine how much
space might be leaked "normally" (due to log files, etc), and how
much data should be written to ensure that we do not misinterpret
this as a leak of blocks.

Also, fix up fs_log_size() to use the correct grant_block_size
units, which are in bytes, but fs_log_size() returns size in KB.
Allow a margin of 2 large blocks to be allocated for ZFS.

Test-Parameters: trivial ostfilesystemtype=zfs mdtfilesystemtype=zfs testlist=replay-single,replay-single,replay-single
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Id55175fc7f25fea52345d1c4443673b7efcec230
Reviewed-on: https://review.whamcloud.com/31120
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10675 tests: increase default MDSSIZE 25/31325/7
Alex Zhuravlev [Thu, 15 Feb 2018 19:52:48 +0000 (22:52 +0300)]
LU-10675 tests: increase default MDSSIZE

Increase the default MDSSIZE and fix a few tests to release space.

Change-Id: Ie5e5b3f440e3abbd1f75486d2c6a3928a382be7d
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/31325
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
6 years agoLU-10793 test: re-add test_14b to replay-dual ALWAYS_EXCEPT 05/31605/3
Hongchao Zhang [Thu, 15 Feb 2018 12:39:12 +0000 (20:39 +0800)]
LU-10793 test: re-add test_14b to replay-dual ALWAYS_EXCEPT

The test_14b in replay-dual was removed from the ALWAYS_EXCEPT list
in LU-10052 by https://review.whamcloud.com/#/c/30916/, but
the corresponding implementation is not ready, so this patch
re-adds it to ALWAYS_EXCEPT.

Test-Parameters: trivial testlist=replay-dual

Change-Id: I1027046b668e21f9fe4a47a0f46810f64b1ee954
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/31605
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
6 years agoLU-5490 tests: Sanity/133d ensure stats read is on correct MDT 85/31585/3
Nathaniel Clark [Thu, 8 Mar 2018 15:40:40 +0000 (10:40 -0500)]
LU-5490 tests: Sanity/133d ensure stats read is on correct MDT

Ensure directories used to collect rename_stats are on the MDT
that is checked.  This ensures directories are created on
MDT0 and not striped and then rename_stats is read from MDT0.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ib27f5c531f2d8bd664ec3a4732c512b0c389dc43
Reviewed-on: https://review.whamcloud.com/31585
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
6 years agoLU-10764 hsm: Correct debug print in ct_archive 09/31509/3
Oleg Drokin [Mon, 5 Mar 2018 06:15:50 +0000 (01:15 -0500)]
LU-10764 hsm: Correct debug print in ct_archive

As is, it is never printed due to a misplaced curly bracket.

Test-Parameters: trivial
Change-Id: I15c60f2ec44aaaa723945068d576dc59e04a2b95
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: https://review.whamcloud.com/31509
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
6 years agoLU-10759 test: sanity tests need check for 2 or more OSTs 95/31495/3
Bobi Jam [Fri, 2 Mar 2018 18:31:06 +0000 (11:31 -0700)]
LU-10759 test: sanity tests need check for 2 or more OSTs

Sanity tests 27F, 311, and 314 need two or more OSTs. Add a check
on the number of OSTs in these test cases.

Test-Parameters: trivial osscount=1 ostcount=1

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ie802429bfeb44ee19d8867614b420de7bceebfa2
Reviewed-on: https://review.whamcloud.com/31495
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
6 years agoLU-4315 doc: correct lfs migrate man page separation 06/31406/3
Andreas Dilger [Wed, 21 Feb 2018 06:40:37 +0000 (23:40 -0700)]
LU-4315 doc: correct lfs migrate man page separation

The "--block" and "--non-block" options are not relevant for
lfs-setstripe.1, only lfs-migrate.1 so move their descriptions
there.

Also remove duplicate setstripe option descriptions from
lfs-migrate.1, since they are becoming increasingly complex.
Instead, just refer to the lfs-setstripe.1 man page for other
options.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I13a28b33be2dc29aa0f44d177a62dbd2e13ebbe5
Reviewed-on: https://review.whamcloud.com/31406
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10673 tests: sanity test_56a fixes 26/31326/4
Alyona Romanenko [Thu, 15 Feb 2018 19:01:05 +0000 (22:01 +0300)]
LU-10673 tests: sanity test_56a fixes

The $filenum is not equal to $found if stripe_count is
more than 1.
The $filenum is not equal to $found if stripe_index
is not default.
The patch fixes the following:
 Files with two stripes are counted twice, as they
 have objects on both stripes.
 Remove the dir's stripe-offset from the sum of stripe-offsets of
 the test dirs/files obtained by getstripe -ir.

Author: Alyona Romanenko <alyona.romanenko@seagate.com>

Signed-off-by: Alyona Romanenko <alyona.romanenko@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: trivial testlist=sanity envdefinitions=ONLY=56a
Cray-bug-id: MRP-2738
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Parinay Kondekar <parinay.kondekar@seagate.com>
Change-Id: I911bf8b40b7688b4341f48409d9c5b57386cfe3d
Reviewed-on: https://review.whamcloud.com/31326
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10210 tests: Add lustre_routes_conversion script in PATH 73/31173/2
Sonia Sharma [Mon, 5 Feb 2018 18:53:00 +0000 (10:53 -0800)]
LU-10210 tests: Add lustre_routes_conversion script in PATH

Fix the typo in test-framework.sh so that test_67 in
conf-sanity.sh finds the lustre_routes_conversion script
when running out of the build tree.

The typo was in the export of LUSTRE_ROUTES_CONVERSION in
test-framework.sh. With the fix, the lustre_routes_conversion
script is picked up from $LUSTRE/scripts so that it is visible when
running out of the build tree.

Change-Id: I1bd9a28e036b9c7b60eaa9886e641610d414c8ee
Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-on: https://review.whamcloud.com/31173
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-7854 tests: start gss daemons in sanity-gss 83/27383/15
Sebastien Buisson [Thu, 1 Jun 2017 18:37:32 +0000 (14:37 -0400)]
LU-7854 tests: start gss daemons in sanity-gss

In sanity-gss, launch lsvcgssd with '-z' flag prior to
commencing actual tests. And stop daemons at the end of the script.
The purpose of this patch is just to fix the test script, so passing
test_1 only is fine.

Test-Parameters: trivial envdefinitions=ONLY=1 testlist=sanity-gss
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ib118b3735c74bb74a54b323ee8eec91d05491edf
Reviewed-on: https://review.whamcloud.com/27383
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
6 years agoLU-7854 gss: install lgssc.conf under /etc/request-key.d/ 17/31317/4
Sebastien Buisson [Thu, 15 Feb 2018 14:24:37 +0000 (23:24 +0900)]
LU-7854 gss: install lgssc.conf under /etc/request-key.d/

GSS keys for Lustre are generated via the lgss_keyring user-space
tool. But request-key system tool needs to know how to call
lgss_keyring in order to generate keys for Lustre.
This is done by adding the lgssc.conf file under
/etc/request-key.d/, with the following content:
create lgssc * * /usr/sbin/lgss_keyring %o %k %t %d %c %u %g %T %P %S

This file is not packaged if gss keyring is explicitly disabled at
configure time.

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ibf2eb04584f6a100a57bf00070335cf4cf2c620c
Reviewed-on: https://review.whamcloud.com/31317
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-7947 obdclass: Move assignment below LASSERT() 61/24561/5
Arshad Hussain [Tue, 27 Dec 2016 16:45:39 +0000 (22:15 +0530)]
LU-7947 obdclass: Move assignment below LASSERT()

This patch moves the 'loghandle->lgh_hdr' assignment
below the LASSERT(). This avoids the case where the loghandle
parameter is NULL and dereferencing it would fault
before reaching the LASSERT().
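
The ordering issue can be illustrated with a small userspace sketch, in
which assert() stands in for LASSERT() and the structures are reduced,
hypothetical versions of the real ones:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced versions of the log structures. */
struct demo_log_hdr {
	int flags;
};

struct demo_log_handle {
	struct demo_log_hdr *lgh_hdr;
};

static int demo_log_flags(struct demo_log_handle *loghandle)
{
	/* Do NOT initialize hdr from loghandle here: that would
	 * dereference the pointer before it has been checked. */
	struct demo_log_hdr *hdr;

	assert(loghandle != NULL);	/* stands in for LASSERT() */
	hdr = loghandle->lgh_hdr;	/* safe, pointer was checked above */

	return hdr != NULL ? hdr->flags : -1;
}
```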

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@seagate.com>
Change-Id: Ie9bcd172a264e104dca300a8bac04d2bd132efb0
Reviewed-on: https://review.whamcloud.com/24561
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-8503 tests: fix replay-single/66b test 41/21941/5
Elena Gryaznova [Fri, 16 Feb 2018 09:46:30 +0000 (12:46 +0300)]
LU-8503 tests: fix replay-single/66b test

In replay-single test_66b replace lookup with touch
to ensure a new RPC is sent on each test invocation.

Author: Abrarahmed Momin <abrar.habib@seagate.com>

Signed-off-by: Abrarahmed Momin <abrar.habib@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Signed-off-by: Ashish Purkar <ashish.purkar@seagate.com>
Test-Parameters: trivial testlist=replay-single envdefinitions=ONLY=66b
Cray-bug-id: LUS-4868
Seagate-bug-id: MRP-3386
Reviewed-by: Rahul Deshmukh <rahul.deshmukh@seagate.com>
Reviewed-by: Ujjwal Lanjewar <ujjwal.lanjewar@seagate.com>
Change-Id: I87e91cccd8af92fe9ca2002127af934b8b02edfb
Reviewed-on: https://review.whamcloud.com/21941
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10772 utils: incorrect NULL check in free_node() 51/31551/2
Sonia Sharma [Tue, 6 Mar 2018 18:28:09 +0000 (10:28 -0800)]
LU-10772 utils: incorrect NULL check in free_node()

In lnet/utils/lnetconfig/cyaml.c, free_node() should check
for a NULL pointer before dereferencing it.

Issue found by static analysis.

Change-Id: I6298f0f09175b6fd210db5717d44d050b1cb9d8d
Signed-off-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-on: https://review.whamcloud.com/31551
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10658 utils: check mount_lustre for allocation failure 00/31400/3
Andreas Dilger [Fri, 23 Feb 2018 18:26:40 +0000 (11:26 -0700)]
LU-10658 utils: check mount_lustre for allocation failure

Check if calloc() failed and return an error rather than dereferencing
the NULL pointer.  Found by static analysis.
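
The pattern is the usual one for allocation failures. A minimal sketch,
assuming a hypothetical structure and helper name rather than the
actual mount_lustre code:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical options structure, for illustration only. */
struct demo_mount_opts {
	int flags;
	char *source;
};

/* Allocate a zeroed options structure. Returns 0 and stores the
 * pointer in *out on success, or -ENOMEM if calloc() failed. */
static int demo_opts_alloc(struct demo_mount_opts **out)
{
	struct demo_mount_opts *mop;

	assert(out != NULL);

	mop = calloc(1, sizeof(*mop));
	if (mop == NULL)
		return -ENOMEM;	/* report the error instead of dereferencing */

	*out = mop;
	return 0;
}
```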

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ie4a5e3341fab1de77990fc99df54cdc562dcab07
Reviewed-on: https://review.whamcloud.com/31400
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10752 build: properly package lsvcgss for rpm builds 85/31485/9
James Simmons [Mon, 5 Mar 2018 17:15:16 +0000 (12:15 -0500)]
LU-10752 build: properly package lsvcgss for rpm builds

On some platforms rpm building will fail with the following
errors:

RPM build errors:
    Installed (but unpackaged) file(s) found:
   /etc/init.d/lsvcgss

Technically lsvcgss is a server only file so we can just include
it for server builds and only if GSS_KEYRING is set.

Test-Parameters: trivial

Change-Id: I2525916cd10ddea0b99337e1ff4ff967bd9f7f9a
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31485
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-8856 osd: mark specific transactions netfree 44/31444/9
Alex Zhuravlev [Wed, 3 May 2017 12:45:13 +0000 (15:45 +0300)]
LU-8856 osd: mark specific transactions netfree

osd-zfs should mark some transactions netfree. This means those
transactions are expected to release space (rather than consume it),
and for this kind of transaction half of the reserved space is available.

Change-Id: Ia5ca247843b296319376c4ac69efad68b557df9f
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/31444
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
6 years agoLU-684 tests: replace dev_read_only patch with dm-flakey 00/7200/40
Hongchao Zhang [Sat, 3 Mar 2018 06:36:32 +0000 (22:36 -0800)]
LU-684 tests: replace dev_read_only patch with dm-flakey

The dev_read_only kernel patch is mainly used for testing,
in order to simulate a server crash for ldiskfs by discarding
all of the writes to the device.

Since Linux kernel 3.0, this testing functionality can be
simulated by using "dm-flakey" target for device-mapper,
which supports a "drop_writes" parameter that could be used
in place of our dev_read_only kernel patch.
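
As a rough illustration of the "drop_writes" approach, the sketch below
builds a device-mapper table line for the flakey target. The helper name
flakey_table, the device path, and the interval values are assumptions
made for this example; they are not taken from the patch.

```python
def flakey_table(dev, sectors, up=0, down=1800):
    """Build a dm-flakey table line that silently drops writes.

    Field order follows the dm-flakey target documentation:
    <start> <length> flakey <dev> <offset> <up> <down> <nfeatures> <features>
    The up/down intervals (in seconds) are illustrative values only.
    """
    fields = ["0", str(sectors), "flakey", dev, "0", str(up), str(down)]
    # One feature argument: drop_writes discards writes during the
    # "down" interval while reads still succeed.
    fields += ["1", "drop_writes"]
    return " ".join(fields)
```

A table line produced this way could be loaded with "dmsetup load" on an
existing mapped device; the exact intervals used by the test framework may
differ.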

Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I51ff9a1a10fb5bacdc1afa2716b769b5eda41863
Reviewed-on: https://review.whamcloud.com/7200
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10803 ptlrpc: fix req_buffers_max and req_history_max setting 22/31622/3
Wang Shilong [Mon, 12 Mar 2018 11:51:23 +0000 (19:51 +0800)]
LU-10803 ptlrpc: fix req_buffers_max and req_history_max setting

We hit LU-9372 OOM problems, and after applying
LU-9372 ptlrpc: allow to limit number of service's rqbds
we found two problems:

1) Since 0 is a reserved value for @srv_nrqbds_max that means
unlimited, the procfs write interface should accept this value;
otherwise there is no way to change the behavior back to the
default.

2) The check in ptlrpc_lprocfs_req_history_max_seq_write() was broken
by that patch: the following check always succeeds for any positive
value if @srv_nrqbds_max is kept at its default value of 0:

val > svc->srv_nrqbds_max/2
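
The arithmetic behind problem 2 can be sketched as follows. Both function
names are hypothetical, and the "fixed" variant only shows one plausible
way to treat 0 as unlimited; it is not claimed to be the code in the patch.

```python
def rejected_by_old_check(val, nrqbds_max):
    # Mirrors the quoted condition with C-style integer division.
    return val > nrqbds_max // 2


def rejected_when_zero_means_unlimited(val, nrqbds_max):
    # Assumption: skip the limit test entirely when nrqbds_max is 0.
    return nrqbds_max != 0 and val > nrqbds_max // 2
```

With nrqbds_max left at 0, the old condition rejects every positive val,
because 0 // 2 is 0. The second variant only rejects values once a
non-zero limit has been configured.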

Change-Id: Ida0796fa500fe595e003accc11d20fdad5e60c03
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/31622
Tested-by: Jenkins
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoNew tag 2.10.59 2.10.59 v2_10_59 v2_10_59_0
Oleg Drokin [Mon, 12 Mar 2018 15:08:38 +0000 (11:08 -0400)]
New tag 2.10.59

Change-Id: I3d21da3edd4b9851191db9dd0467015787acd5a5
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10794 lfs: make quota work for grace time 06/31606/3
Wang Shilong [Fri, 9 Mar 2018 05:25:07 +0000 (13:25 +0800)]
LU-10794 lfs: make quota work for grace time

The following commit:
LU-10011 utils: refactor lfs quota codes

introduced a regression that makes 'lfs quota -t'
output nothing. Fix this bug and also add
a test case to sanity-quota.sh in case it breaks
again in the future.

Test-Parameters: trivial testlist=sanity-quota
Change-Id: I2063552505cf07464d9924f66c29fc2504bc56ce
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/31606
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10783 kernel: kernel update RHEL7.4 [3.10.0-693.21.1.el7] 12/31612/2
Bob Glossman [Tue, 6 Mar 2018 22:06:18 +0000 (14:06 -0800)]
LU-10783 kernel: kernel update RHEL7.4 [3.10.0-693.21.1.el7]

update RHEL 7.4 kernel to 3.10.0-693.21.1.el7

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Ib7d5233d438798e1cdd1c31bb6728f8ea6697959
Reviewed-on: https://review.whamcloud.com/31612
Tested-by: Jenkins
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10750 mdd: declare changelogs only when enabled 77/31477/8
John L. Hammond [Thu, 1 Mar 2018 16:02:09 +0000 (10:02 -0600)]
LU-10750 mdd: declare changelogs only when enabled

In the mdd layer, rename recording_changelog() to
mdd_changelog_enabled() and add the changelog record type as a
parameter. In mdd_changelog_enabled() test to see if the type is
enabled in addition to checking if changelogs are generally enabled
and only lookup the ucred if the other tests pass. Add a type
parameter to mdd_declare_changelog_store() so that this information
can be passed to mdd_changelog_enabled(). In mdd_close() check
if CLOSE changelogs are enabled before opening a transaction and
declaring the record.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Idd7604de5e97bad72a802cb4b49dae4668b2644a
Reviewed-on: https://review.whamcloud.com/31477
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
6 years agoLU-10465 lov: decrease default stripe size to 1MB 89/31589/2
Jian Yu [Thu, 8 Mar 2018 19:08:59 +0000 (11:08 -0800)]
LU-10465 lov: decrease default stripe size to 1MB

Commit 3f5abc6fa30e7c0256077ccf6a149d1809450465 increased
the default stripe size from 1MB to 4MB. However, this
caused a usability issue in LU-10786 for PFL/DoM files.

This patch changes the default stripe size back to 1MB
until we have a better method of handling DoM components.
Otherwise, it means that DoM files will not be created
easily with default settings.

Change-Id: Ie6b6fe97596ed65abec771b3f37afd950dc821c8
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/31589
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
6 years agoRevert "LU-10419 lfsck: skip dead target" 00/31600/2
Oleg Drokin [Fri, 9 Mar 2018 00:19:51 +0000 (00:19 +0000)]
Revert "LU-10419 lfsck: skip dead target"

This is causing uninterruptible lfsck instances in soak testing, as documented in LU-10419 by Cliff.

This reverts commit 012834c5e7c7be50ff117cee4ac473d7fee4294d.

Change-Id: I119d21c7ce3375140fbbb25a300e65b4c6aa9e73
Reviewed-on: https://review.whamcloud.com/31600
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10722 test: Add version check to sanity-quota test_55 31/31531/2
Wei Liu [Mon, 5 Mar 2018 18:38:43 +0000 (10:38 -0800)]
LU-10722 test: Add version check to sanity-quota test_55

Skip sanity-quota test_55 if server is older than 2.10.58

Test-Parameters: trivial testlist=sanity-quota
Signed-off-by: Wei Liu <wei3.liu@intel.com>
Change-Id: Ia8a129298d75fb019699adda07fecd2f4d9eb46a
Reviewed-on: https://review.whamcloud.com/31531
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10705 utils: add "lfs find --blocks" 93/31393/4
Andreas Dilger [Fri, 23 Feb 2018 07:34:22 +0000 (00:34 -0700)]
LU-10705 utils: add "lfs find --blocks"

Add support for "lfs find --blocks|-b <block>" to be able to find
files with the specified number of allocated blocks (in kilobytes or
other specified units). This is distinct from "--size <size>" since
that doesn't properly check the space used for sparse files.
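
The difference between apparent size and allocated space can be seen with
a sparse file. This is a minimal sketch using plain stat(); the helper
names are made up for the example, and it assumes a platform such as Linux
where st_blocks is reported in 512-byte units.

```python
import os
import tempfile


def size_and_allocated(path):
    """Return (apparent size, allocated bytes) for a file."""
    st = os.stat(path)
    # Assumption: st_blocks counts 512-byte units, as on Linux.
    return st.st_size, st.st_blocks * 512


def sparse_file_demo(length=16 * 1024 * 1024):
    """Create a hole-only file and report its size and allocation."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, length)  # extends the file without writing data
        return size_and_allocated(path)
    finally:
        os.close(fd)
        os.unlink(path)
```

On file systems that support sparse files, the allocated figure is
typically far smaller than the apparent size. That gap is what a
block-based match can see and a size-based match cannot.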

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I7d48f919d95242c11ef7d3075ecc3f7e963ebbe5
Reviewed-on: https://review.whamcloud.com/31393
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10596 tests: skip tests require remote server with nodsh 21/31121/6
Elena Gryaznova [Sun, 4 Mar 2018 18:17:38 +0000 (21:17 +0300)]
LU-10596 tests: skip tests require remote server with nodsh

Patch fixes the following tests to be skipped for remote
servers with nodsh set:
sanity 56c, 60aa, 77c, 101g, 160f, 160g, 161d
Patch skips 160f and 160g for old MDS.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: trivial testlist=sanity
Cray-bug-id: MRP-4757, LUS-5710
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I44f35129df5bc5c8c6e6ace3e68f3f2d400db86c
Reviewed-on: https://review.whamcloud.com/31121
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
6 years agoLU-10336 osp: wakeup opd_pre_waitq when decrement opd_pre_reserved 97/30397/4
Sergey Cheremencev [Wed, 6 Dec 2017 13:52:33 +0000 (16:52 +0300)]
LU-10336 osp: wakeup opd_pre_waitq when decrement opd_pre_reserved

osp_precreate_cleanup_orphans could be blocked due to
reserved objects. In such a case it sets the opd_pre_recovering
flag and waits until opd_pre_reserved becomes 0.
Thus we need to wake it up when opd_pre_reserved is reset
to 0.

Change-Id: Ib8d4708685c3c9675872577985a4c6897e3ee385
Signed-off-by: Sergey Cheremencev <c17829@cray.com>
Cray-bug-id: MRP-3623
Reviewed-on: https://review.whamcloud.com/30397
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9160 ldiskfs: preload block group descriptors 22/25722/7
Artem Blagodarenko [Sat, 18 Feb 2017 09:00:13 +0000 (12:00 +0300)]
LU-9160 ldiskfs: preload block group descriptors

With a 300TB OST, mount time was slow, taking 13 minutes.
With this patch applied, it is reduced to 30s. Backport the
patch from upstream Linux.

Linux-commit: 85c8f176a6111ecde9c158109989dbd445a0e59a

With the meta_bg option enabled, block group descriptor
reads are not sequential and require optimization.

Seagate-bug-id: MRP-4129
Signed-off-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@seagate.com>
Change-Id: Iaa621c11ff88364021887d9f9dcec250dd5fd955
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/25722
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
6 years agoLU-10723 tests: disable sanity 232b before 2.10.58 87/31487/2
Quentin Bouget [Fri, 2 Mar 2018 08:22:25 +0000 (08:22 +0000)]
LU-10723 tests: disable sanity 232b before 2.10.58

The fix that allows test_232b of sanity.sh to pass was introduced in
lustre 2.10.58 so the test should not be run before this version.
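
The version gate described here amounts to a numeric comparison of dotted
version strings. The sketch below is an assumption about how such a check
could be written; the function names are hypothetical and it only handles
purely numeric versions such as "2.10.58" or "2.10.57.90".

```python
def parse_version(text, width=4):
    """Turn '2.10.58' into (2, 10, 58, 0) for tuple comparison."""
    parts = [int(p) for p in text.split(".")]
    # Pad with zeros so '2.10.58' and '2.10.58.0' compare equal.
    parts += [0] * (width - len(parts))
    return tuple(parts)


def should_skip(server_version, minimum="2.10.58"):
    """True when the server predates the version carrying the fix."""
    return parse_version(server_version) < parse_version(minimum)
```

For example, should_skip("2.10.57.90") is True while should_skip("2.10.58")
is False, so the test would run only on servers that contain the fix.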

Signed-off-by: Quentin Bouget <quentin.bouget@cea.fr>
Change-Id: I7c625e916bfd0d4a614cc9924670bffe4ba3b8b0
Reviewed-on: https://review.whamcloud.com/31487
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10483 lustre: replace FMODE_{READ,WRITE} with MDS_* equivs 24/30824/10
Sebastien Buisson [Wed, 10 Jan 2018 14:37:24 +0000 (23:37 +0900)]
LU-10483 lustre: replace FMODE_{READ,WRITE} with MDS_* equivs

In file lustre/include/uapi/linux/lustre/lustre_user.h, replace direct
use of FMODE_READ and FMODE_WRITE with MDS_* equivalents.
That will avoid name clashes with the kernel symbols, and avoid
problems if their values ever change.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I07e77d8d025c5ddb3dc4e085738645e20fb77d0c
Reviewed-on: https://review.whamcloud.com/30824
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10003 lnet: remove lctl deprecation messages 34/31534/3
John L. Hammond [Mon, 5 Mar 2018 23:11:25 +0000 (17:11 -0600)]
LU-10003 lnet: remove lctl deprecation messages

Defer deprecation of these commands for now.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I09b97bacded9ac65a8c5df3ba47867a6a19fbf7b
Reviewed-on: https://review.whamcloud.com/31534
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
6 years agoLU-10419 lfsck: skip dead target 75/31475/2
Fan Yong [Thu, 1 Mar 2018 06:30:36 +0000 (14:30 +0800)]
LU-10419 lfsck: skip dead target

Do not send LFSCK RPC to dead targets to avoid being blocked.
The patch adds a warning message when trying to send an LFSCK RPC
on a non-full connection, which helps explain why
LFSCK may be blocked.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I0599eb961f1aabd58d0de53fd51f25ca1ec8ff34
Reviewed-on: https://review.whamcloud.com/31475
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10769 osd-zfs: fix deadlock on osd_object::oo_guard 11/31511/4
Fan Yong [Mon, 5 Mar 2018 11:35:02 +0000 (19:35 +0800)]
LU-10769 osd-zfs: fix deadlock on osd_object::oo_guard

There is a race condition inside osd-zfs that may cause a deadlock.
Consider the following scenario:

1) Thread1 calls osd_attr_set() to set flags on the object.
   osd_attr_set() calls osd_xattr_get() while holding the
   read mode semaphore on object::oo_guard.

2) Thread2 calls osd_declare_destroy() to destroy the same
   object. It calls down_write() on object::oo_guard, but is
   blocked by Thread1's granted read mode semaphore.

3) The osd_xattr_get() triggered by osd_attr_set() also calls
   down_read() on object::oo_guard, but it is blocked by
   Thread2's pending down_write() request.

Thread1 and Thread2 then deadlock.
This patch makes osd_attr_set() call the lockless xattr_get
variant, osd_xattr_get_internal(), to avoid the deadlock.
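
The three steps can be replayed with a toy, single-threaded model of a
read/write semaphore in which a waiting writer blocks new readers. The
class and function names are invented for this sketch; it models only the
ordering described above, not the kernel rwsem implementation.

```python
class FairRWSem:
    """Toy rwsem: new readers queue behind any waiting writer."""

    def __init__(self):
        self.readers = 0
        self.writer_active = False
        self.writers_waiting = 0

    def try_down_read(self):
        # A reader is refused if a writer holds or is waiting for the lock.
        if self.writer_active or self.writers_waiting:
            return False
        self.readers += 1
        return True

    def try_down_write(self):
        # A writer needs exclusive access; otherwise it joins the queue.
        if self.writer_active or self.readers:
            self.writers_waiting += 1
            return False
        self.writer_active = True
        return True


def replay_scenario():
    sem = FairRWSem()
    t1_outer = sem.try_down_read()   # step 1: Thread1 read lock granted
    t2_write = sem.try_down_write()  # step 2: Thread2 writer must wait
    t1_inner = sem.try_down_read()   # step 3: Thread1 nested read must wait
    return t1_outer, t2_write, t1_inner
```

replay_scenario() returns (True, False, False): Thread1 still holds its
outer read lock while its nested read waits behind Thread2, and Thread2
waits for Thread1, so neither can make progress. Skipping the nested
down_read() removes step 3 from the cycle.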

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Iaac2e414b5f1fd197303bb7ec7d5e2763b6f3e9a
Reviewed-on: https://review.whamcloud.com/31511
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
6 years agoLU-10681: Disable tiny writes for append 53/31353/8
Patrick Farrell [Sat, 3 Mar 2018 22:59:43 +0000 (16:59 -0600)]
LU-10681: Disable tiny writes for append

Unfortunately, tiny writes do not work correctly with
appending to files.  When appending to a file, we must take
DLM locks to EOF on all stripes, in order to protect file
size so we can append correctly.

If we dirty a page with a normal write then append to it
with a tiny write, these DLM locks are not present, and we
can use an incorrect size if another client writes to a
different stripe, increasing the size without cancelling
the lock which is protecting our dirty page.

We could theoretically check to make sure the required DLM
locks are held, but this would be time consuming.

The simplest solution is to just not allow tiny writes when
appending.

Also add an option to disable tiny writes at runtime.

Cray-bug-id: LUS-5723

Change-Id: Ic9421faa3d0268d907040881e8ba3c894261fd49
Signed-off-by: Patrick Farrell <paf@cray.com>
Reviewed-on: https://review.whamcloud.com/31353
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10520 mkfs: enable extents for big MDT 37/31037/13
Yang Sheng [Fri, 26 Jan 2018 13:35:33 +0000 (21:35 +0800)]
LU-10520 mkfs: enable extents for big MDT

Enable extents when the MDT size is larger than 16T.

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: Iccd39c48e715a3f084cb5ee803be0541563f5d10
Reviewed-on: https://review.whamcloud.com/31037
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
6 years agoLU-10680 mdd: disable changelog garbage collection by default 52/31552/2
John L. Hammond [Tue, 6 Mar 2018 19:25:50 +0000 (13:25 -0600)]
LU-10680 mdd: disable changelog garbage collection by default

Changelog garbage collection has introduced some instability so
disable it by default.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I708198d76af060cb796de89266ee74a968f92ac1
Reviewed-on: https://review.whamcloud.com/31552
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10786 tests: add stripe size to lfs setstripe 69/31569/3
James Nunez [Wed, 7 Mar 2018 16:27:59 +0000 (09:27 -0700)]
LU-10786 tests: add stripe size to lfs setstripe

Since the default stripe size increased from one to four
MB, we need to add the stripe size parameter to calls
to 'lfs setstripe' for composite files when the component
size is less than the file system stripe size. Thus, add
the stripe size parameter to calls to 'lfs setstripe' for
sanity-flr tests 45 and 46 and sanity-pfl test 16.

Test-Parameters: trivial testlist=sanity-flr,sanity-pfl

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ic169eaebd922175467f010b159a2b065fb91b3fb
Reviewed-on: https://review.whamcloud.com/31569
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-8066 fid: move all files from procfs to debugfs 66/28366/10
James Simmons [Sun, 21 Jan 2018 16:55:10 +0000 (11:55 -0500)]
LU-8066 fid: move all files from procfs to debugfs

Linux-commit: f3aa79fbef7942971825fb2084a88e9527c6b04c

Besides the client port from upstream, also port the server
side proc entries to debugfs.

Change-Id: I934fc5a39c8c407799abd0d6154240d3a579c93e
Signed-off-by: Dmitry Eremin <dmiter4ever@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28366
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-8066 obd: final pieces for sysfs/debugfs support. 08/28108/24
James Simmons [Thu, 22 Feb 2018 17:26:16 +0000 (12:26 -0500)]
LU-8066 obd: final pieces for sysfs/debugfs support.

This patch puts in place the basics needed for debugfs.
It also creates class_setup_tunables so sysfs kobject
creation is handled for both obd_devices and llite. Add a
special LDEBUGFS_FOPS_WR_ONLY since often in this case
i_private is not set so any attempt to call PDE_DATA(inode)
will cause it to crash. Make lprocfs_obd_setup select either
debugfs or procfs but not both.

Handle the special symlinks needed for both debugfs
and sysfs with the server case. For lod we need to
create "lov" and for osp we create "osc" for both sysfs
and debugfs. Handle the complex case of when a node
is both a server and client. For debugfs we can take
advantage of d_lookup() and for sysfs kset_find_obj()
to avoid special access to struct obd_type. This also
places the burden on the server lod/osp modules instead
of the client lov/osc modules.

Change-Id: I87090859db4da2300ab9e2aa3c23cb3773276103
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28108
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10551 lod: obd_fid_alloc() could start a nested trans 68/31268/10
Bobi Jam [Mon, 12 Feb 2018 05:44:48 +0000 (13:44 +0800)]
LU-10551 lod: obd_fid_alloc() could start a nested trans

* obd_fid_alloc() could possibly start a nested transaction, which
  would reset the OI cache. So we add a
  osd_thread_info::oti_ins_cache_depth to prevent clearing OI cache
  in the nested transaction.

* Add more debug messages in osd_idc_find_or_init()/
  osd_idc_find_and_init()
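
The depth counter idea from the first item can be sketched as below. All
names here are hypothetical stand-ins (the real field is
osd_thread_info::oti_ins_cache_depth), and the sketch assumes the cache is
reset when a transaction starts, which the commit message implies but does
not spell out.

```python
class ThreadInfo:
    """Stand-in for per-thread state holding a cache and a depth counter."""

    def __init__(self):
        self.ins_cache = {}
        self.ins_cache_depth = 0


def trans_start(info):
    # Only the outermost transaction may reset the cache.
    if info.ins_cache_depth == 0:
        info.ins_cache.clear()
    info.ins_cache_depth += 1


def trans_stop(info):
    info.ins_cache_depth -= 1
```

With this guard, a transaction started while another is still open on the
same thread leaves the cached entries of the outer transaction in place.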

Test-Parameters: alwaysuploadlogs envdefinitions=PTLDEBUG=-1 testlist=sanity-pfl ostfilesystemtype=zfs mdtfilesystemtype=zfs mdscount=2 mdtcount=4
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Id75fd1787ffc0f47bbf110d460f23db6c34670da
Reviewed-on: https://review.whamcloud.com/31268
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>