fs/lustre-release.git
6 years ago LU-4423 lnet: free a struct kib_conn outside of the kiblnd_destroy_conn() 73/31273/3
Dmitry Eremin [Mon, 12 Feb 2018 12:37:18 +0000 (15:37 +0300)]
LU-4423 lnet: free a struct kib_conn outside of the kiblnd_destroy_conn()

To avoid confusion, this fix moves the freeing of struct kib_conn outside
of the function kiblnd_destroy_conn().
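
The ownership split described above can be sketched in plain C. This is a minimal illustration only, not the actual o2iblnd code: struct conn, conn_destroy() and conn_release() are hypothetical stand-ins for struct kib_conn and its helpers.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct kib_conn; the real struct has many more fields. */
struct conn {
	int *rx_bufs;
};

/* Tear down what the connection owns, but leave the struct itself allocated. */
void conn_destroy(struct conn *c)
{
	free(c->rx_bufs);
	c->rx_bufs = NULL;
}

/* The code path that owns the struct frees it after the teardown. */
void conn_release(struct conn *c)
{
	conn_destroy(c);
	free(c);
}
```

With this split a reader of conn_destroy() does not have to wonder whether the pointer is still valid after the call: it always is, until the owner frees it.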

Change-Id: Iae28802f5d319570064a504feb14dffd13a22b84
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: https://review.whamcloud.com/31273
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10652 tests: restructure sanity 133[f,g] 45/31245/4
Elena Gryaznova [Wed, 14 Feb 2018 14:24:06 +0000 (17:24 +0300)]
LU-10652 tests: restructure sanity 133[f,g]

sanity 133f and 133g both get skipped in CLIENTONLY mode,
but the tests should run on clients in this mode.

The fix separates code of the tests so that 133f tests
clients, while 133g runs on servers. Then in CLIENTONLY mode
only 133g is skipped.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: envdefinitions=ONLY=133 testlist=sanity
Seagate-bug-id: MRP-2438
Cray-bug-id: LUS-4289
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Change-Id: Ibba69a3fd4fd4a9f8d90729ec2a294443dd4f29e
Reviewed-on: https://review.whamcloud.com/31245
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10639 tests: rename the tests 30/31230/4
Elena Gryaznova [Fri, 9 Feb 2018 09:01:54 +0000 (12:01 +0300)]
LU-10639 tests: rename the tests

The following tests are renamed to be run separately
from other tests in the groups:
sanity-hsm:
    test_1 to test_1A
    test_9 to test_9A
    test_26 to test_26A
    test_220 to test_220A
    test_224 to test_224A

conf-sanity:
    test_28 to test_28A

lustre-rsync-test.sh:
    test_1 to test_1A

sanity.sh:
    test_239 to test_239A

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Signed-off-by: Ajay Nair <ajay.nair@seagate.com>
Cray-bug-id: LUS-2608, LUS-5328
Seagate-bug-id: MRP-4695, MRP-4121
Test-Parameters: testlist=sanity,sanity-hsm,conf-sanity
Test-Parameters: testlist=lustre-rsync-test
Reviewed-by: Ashish Purkar <ashish.purkar@seagate.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Change-Id: Ib1542d55328c0fb60c0c2c59257fa9f5742a57dc
Reviewed-on: https://review.whamcloud.com/31230
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10617 tests: Dir's and file's stripe counts are mismatched 93/31193/3
Elena Gryaznova [Tue, 13 Feb 2018 18:17:01 +0000 (21:17 +0300)]
LU-10617 tests: Dir's and file's stripe counts are mismatched

Add to test_24 of ost-pools.sh the case where the stripe count of a
directory equals -1 and the stripe count of files in that directory
must equal the OST count.

Author: Alyona Romanenko <alyona.romanenko@seagate.com>

Signed-off-by: Alyona Romanenko <alyona.romanenko@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: trivial testlist=ost-pools envdefinitions="ONLY=24"
Cray-bug-id: LUS-4467
Seagate-bug-id: MRP-2746
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Change-Id: I91e7c65e178c7706f53a95a2807e06b1bc8e0d24
Reviewed-on: https://review.whamcloud.com/31193
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10612 tests: reply_single.sh,test_48: No space left 82/31182/2
Elena Gryaznova [Tue, 6 Feb 2018 14:20:53 +0000 (17:20 +0300)]
LU-10612 tests: reply_single.sh,test_48: No space left

The MDS needs time to discover the OST state, attempt to
recover, fail and recover again.

Author: gaurav mahajan <gaurav.mahajan@seagate.com>

Signed-off-by: gaurav mahajan <gaurav.mahajan@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: trivial testlist=replay-single envdefinitions="ONLY=48"
Cray-bug-id: LUS-4384
Seagate-bug-id: MRP-2616
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Rahul Deshmukh <rahul.deshmukh@seagate.com>
Change-Id: I2b3cca70872b7c9f13c64b50e1b4373096fbc147
Reviewed-on: https://review.whamcloud.com/31182
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10600 tests: clean up sanity tests 64d and 65k 59/31159/4
James Nunez [Fri, 2 Feb 2018 23:53:51 +0000 (16:53 -0700)]
LU-10600 tests: clean up sanity tests 64d and 65k

Several sanity tests create files or modify the environment
and do not clean up or return the environment to its
original state. sanity test 64d fills an OST and does not
clean up the file after the OST is full. sanity test 65k
sets OSTs to be inactive and, on error, does not set the OST
back to active.

These two tests need to clean up after themselves.

Test-Parameters: trivial testlist=sanity,sanity
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I01bc376680798815c9dd398da7781c92c6b70b2f
Reviewed-on: https://review.whamcloud.com/31159
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10570 obd: fix statfs handling 58/31158/3
James Simmons [Sun, 4 Feb 2018 19:38:25 +0000 (14:38 -0500)]
LU-10570 obd: fix statfs handling

The function lod_qos_statfs_updates() refreshes statfs
data every N seconds. Taking lq_rw_sem can take a very long
time so the testing for stale stats had to be done again after
taking the semaphore. Now that we are using only seconds
resolution it is more likely that max_age and obd_osfs_age
will be equal compared to when the code was using jiffies.
So only release the lock right away when osfs_age has passed
the max_age.
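
The check, lock, re-check pattern described above can be sketched in userspace C. This is only an illustration under stated assumptions: a pthread mutex stands in for lq_rw_sem, struct statfs_cache and statfs_cache_update() are hypothetical names, and the strict comparison reflects one reading of the paragraph above rather than the exact kernel code.

```c
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical cache; "age" is the second at which the data was last refreshed. */
struct statfs_cache {
	pthread_mutex_t lock;
	time_t age;
	unsigned int refreshes;
};

/*
 * Refresh the cache unless it is strictly newer than max_age.
 * Returns true if this call did the refresh.
 */
bool statfs_cache_update(struct statfs_cache *c, time_t now, time_t max_age)
{
	bool refreshed = false;

	if (c->age > max_age)		/* cheap check without the lock */
		return false;

	pthread_mutex_lock(&c->lock);	/* may block for a long time */
	if (c->age <= max_age) {	/* re-check: another thread may have refreshed */
		c->age = now;
		c->refreshes++;
		refreshed = true;
	}
	pthread_mutex_unlock(&c->lock);
	return refreshed;
}
```

With one-second resolution, age == max_age is a common case, so in this sketch equality counts as stale and only a strictly newer age skips the refresh.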

The comment 'use the value of cfs_time_current + HZ' for
obd_statfs() and obd_statfs_async() needs to be updated for
the time64_t case.

Simplify llite_statfs_internal() handling by calculating
max_age inside of llite_statfs_internal(). This makes the
code cleaner.

Change-Id: I22aa5d4d78b30d6480e73998e05ec6582a316d4f
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31158
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-8854 llapi: remove lustre specific strlcpy & strlcat functions 98/29798/6
James Simmons [Sat, 10 Feb 2018 16:18:53 +0000 (11:18 -0500)]
LU-8854 llapi: remove lustre specific strlcpy & strlcat functions

In the days when Lustre supported many more platforms, some of those
platforms natively supported strl[cpy|cat], but Linux has always lacked
these functions. So Lustre ended up providing its own versions of
these functions to fill in this functionality. Today Lustre only
supports the Linux platform, which has a version of libc that will
most likely never support strl[cat|cpy]. Since this is the case we
can remove the AC_CHECK_FUNCS since they only test against libc.
We could support detecting strl[cpy|cat] in another library but
many libraries provide their own version so the chances of collision
are high. The best solution is to remove strlcpy and strlcat by
replacing those functions with string functions that are always
provided by the standard C library.
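
As an illustration of the kind of replacement meant here, a bounded copy can be written with snprintf(), which is part of standard C. This is a sketch only: copy_string() is a hypothetical helper, and the commit message does not say which standard functions each call site ended up using.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into dst (capacity size), always
 * NUL-terminating. Returns 0 on success, -1 if src did not fit.
 */
int copy_string(char *dst, size_t size, const char *src)
{
	int n;

	if (size == 0)
		return -1;

	n = snprintf(dst, size, "%s", src);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Like strlcpy(), snprintf() reports the length the full string would have needed, so truncation can still be detected by comparing the return value against the buffer size.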

Change-Id: I72df93c8f83ed1aad80653fe0d1c4d54d1d8e2f2
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/29798
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-9727 lustre: record if enable_audit is set on nodemap 14/28314/18
Sebastien Buisson [Wed, 2 Aug 2017 14:47:47 +0000 (23:47 +0900)]
LU-9727 lustre: record if enable_audit is set on nodemap

Record changelogs from a client only if it pertains to a nodemap
on which enable_audit is set, and changelogs are activated.
If a client is not explicitly assigned to a nodemap, the enable_audit
value from the default nodemap is used.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I31d361cfd8cc69db68b60298934cbbef4af0d75d
Reviewed-on: https://review.whamcloud.com/28314
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Li Xi <lixi@ddn.com>
6 years ago LU-9250 tests: add parallel-scale xdd test 76/26176/5
Elena Gryaznova [Fri, 16 Feb 2018 10:11:04 +0000 (13:11 +0300)]
LU-9250 tests: add parallel-scale xdd test

Patch adds parallel-scale xdd test.

Our customers report Lustre issues hit during
xdd tests. We need a flexible way to reproduce the
failures.

Author: Chennaiah Palla <chennaiah.palla@seagate.com>

Signed-off-by: Chennaiah Palla <chennaiah.palla@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-5206
Seagate-bug-id: MRP-3915
Test-Parameters: testlist=parallel-scale
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Change-Id: Ia4823aa8ce64aad3d43b2611b24f48a532b8796c
Reviewed-on: https://review.whamcloud.com/26176
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-6867 test: detect active facet based on current state 38/15638/17
Elena Gryaznova [Fri, 17 Jul 2015 16:32:49 +0000 (19:32 +0300)]
LU-6867 test: detect active facet based on current state

Lustre failover tests cannot be run test-by-test
on a setup with ${facet}_HOST != ${facet}failover_HOST
because t-f does not restore the facet state.
t-f keeps this info in "${facet}active" files, which are created
when facet_failover() is executed for the first time in the test session.
Before facet_failover() is executed these files are empty and
the active facet is ${facet} by default.
When tests are executed test-by-test, the active facet is
${facet}failover after the 1st test completes, and the 2nd test starts
with ${facet}failover active without this info stored in
${facet}active files.

Patch contains the following changes:
- add the active facet detection based on current lustre state;
- fix sanity-hsm defect: exit with error if agt${n}1_HOST is empty.

Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Xyratex-bug-id: MRP-2680
Reviewed-by: Alexander Lezhoev <alexander.lezhoev@seagate.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Change-Id: Ie42baaa55a6433596e6004d16eb5c18ae2ef7479
Reviewed-on: https://review.whamcloud.com/15638
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10680 mdd: fix run_gc_task uninitialized 47/31347/2
Bruno Faccini [Sun, 18 Feb 2018 19:13:04 +0000 (20:13 +0100)]
LU-10680 mdd: fix run_gc_task uninitialized

run_gc_task was mistakenly left uninitialized in the previous
patch for LU-7340. gcc silently ignored this even though the
-Wall option is used during the build, possibly because no
optimization level/option was requested and the -Wuninitialized
check may only be performed when optimization is enabled.
The side effect is that the generated assembly code completely
drops the run_gc_task usage present in the source, and thus a kthread
for ChangeLogs garbage-collection is created upon each
record creation, even when none of the garbage-collection
conditions are triggered.
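
The shape of the fix can be sketched as follows. The function name, parameters and threshold are hypothetical and much simpler than the real mdd code; the point is only that the flag gets an explicit initial value before any condition may set it.

```c
#include <stdbool.h>

/*
 * Hypothetical decision helper: idle_users and free_slots stand in
 * for the real garbage-collection conditions.
 */
bool should_run_gc(unsigned int idle_users, unsigned int free_slots)
{
	bool run_gc_task = false;	/* explicit initial value: no GC by default */

	if (idle_users > 0 && free_slots < 10)
		run_gc_task = true;

	return run_gc_task;
}
```

Without the initializer, reading run_gc_task on the path where no condition fires is undefined behaviour, which is consistent with the compiler dropping the check entirely as described above.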

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ieb9ce062ba6ebf0c365c1e6f8a57f89dd39e0a9d
Reviewed-on: https://review.whamcloud.com/31347
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
6 years ago LU-10561 flr: remove "--parent" option from lfs mirror command 98/31298/5
Jian Yu [Tue, 20 Feb 2018 22:35:52 +0000 (14:35 -0800)]
LU-10561 flr: remove "--parent" option from lfs mirror command

The "--parent" option for the "lfs mirror create/extend" command was
originally designed to use default stripe options inherited
from the parent directory. However, if the parent directory has a
composite layout, it is ambiguous which component the stripe
options should be inherited from. And if any other option is
specified, it is also inconsistent to inherit the layout of the
parent directory.

So, this patch removes "--parent" option to eliminate ambiguity.
For "--pool|-p" option, this patch supports specifying "none" to
clear the pool name and inherit from parent directory.

Unspecified stripe count, stripe size and OST pool name will
inherit from previous component. If there is no previous component,
then unspecified stripe count and stripe size attributes will
inherit from filesystem-wide default values. Unspecified or
cleared OST pool name will inherit from parent directory.

Change-Id: Ib0ec3cbc65fb307c42881f35dc676090ab8319ff
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/31298
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10663 utils: clear errno before check 05/31305/4
John L. Hammond [Wed, 14 Feb 2018 18:27:54 +0000 (12:27 -0600)]
LU-10663 utils: clear errno before check

In jt_obd_destroy() clear errno before calling strtoull() and checking
it.
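
The pattern behind this fix is standard C: strtoull() only sets errno on failure and never clears it, so a stale value from an earlier call can look like an error. A minimal sketch, where parse_u64() is a hypothetical helper and not the jt_obd_destroy() code:

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse an unsigned 64-bit value from s.
 * Returns 0 and stores the value in *out, or -1 on any parse error.
 */
int parse_u64(const char *s, unsigned long long *out)
{
	char *end;
	unsigned long long v;

	errno = 0;	/* a stale ERANGE from earlier code must not leak in */
	v = strtoull(s, &end, 0);
	if (errno != 0 || end == s || *end != '\0')
		return -1;

	*out = v;
	return 0;
}
```

Checking end as well as errno also catches inputs with no digits or trailing garbage, which strtoull() does not report through errno.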

Test-Parameters: trivial testlist=obdfilter-survey

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I686cd6eb0a57248177e5b0878df5e3f450fbc942
Reviewed-on: https://review.whamcloud.com/31305
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-9724 ldiskfs: update ext4-large-eas.patch to match upstream ext4 33/31033/9
Emoly Liu [Fri, 26 Jan 2018 07:26:00 +0000 (15:26 +0800)]
LU-9724 ldiskfs: update ext4-large-eas.patch to match upstream ext4

In order to match the enhanced ea_inode functionality being landed
to the upstream ext4 kernel tree, ext4-large-eas.patch is modified
to start properly initializing some of the fields we don't
currently use to minimize the interoperability issues.

In particular, the new EA inode refcount is initialized to 1, and
hash field is computed based on the xattr value as it is in the
upstream kernel patch.

However, ext4_xattr_inode_get_hash() has not been added to the
ldiskfs code, so this hash value is not used anywhere. If the
new checksum driver (sbi->s_chksum_driver) is not available, the hash
value will be 0 in the current implementation, until we find a way
to calculate it properly based on the xattr value.

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I2bcf45c67a580f2f545816e1a70a6322c6ccc368
Reviewed-on: https://review.whamcloud.com/31033
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
6 years ago LU-10277 utils: 'lfs mkdir -i -1' pick the less full MDTs 98/30598/11
Lai Siyao [Mon, 4 Dec 2017 07:38:25 +0000 (15:38 +0800)]
LU-10277 utils: 'lfs mkdir -i -1' pick the less full MDTs

If 'lfs mkdir -i -1 -c count' is specified, it will 'df' first,
and then randomly pick 'count' less full MDTs as specific MDTs.

Add sanity test 413.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: I2ce1720479d37b1ae397054743afae865129fee3
Reviewed-on: https://review.whamcloud.com/30598
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-9019 obd: migrate upcall cache to time64_t 64/31064/3
James Simmons [Wed, 7 Feb 2018 06:20:54 +0000 (01:20 -0500)]
LU-9019 obd: migrate upcall cache to time64_t

Move all the upcall cache time handling from jiffies to time64_t.

Change-Id: I86039c6e6e35ac83b773753c952936f1b2f5e14a
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31064
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10270 lnet: remove an early rx code 54/30254/7
Alexey Lyashkov [Thu, 23 Nov 2017 11:28:18 +0000 (14:28 +0300)]
LU-10270 lnet: remove an early rx code

Early RX was added to the o2ib LND as an attempt to handle a reordering
problem, where messages arrive before the actual connection is set up.
But this code can fill the whole incoming queue, so a normal connect
will not be processed.

Cray-bug-id: MRP-4638
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Change-Id: I2efc73534a20c4628ed462ee5055c901dbf44278
Reviewed-on: https://review.whamcloud.com/30254
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10181 tests: add FIO as test for DOM 59/30059/12
Mikhal Pershin [Mon, 13 Nov 2017 15:23:54 +0000 (18:23 +0300)]
LU-10181 tests: add FIO as test for DOM

Add FIO test for basic DOM performance tracking,
- remove unused smallfileio test,
- make parameter setting compatible with DNE,
- turn off extra stats output by default
- format test output

Test-Parameters: trivial mdssizegb=20 testlist=sanity-dom,dom-performance
Signed-off-by: Mikhal Pershin <mike.pershin@intel.com>
Change-Id: Id4236643e841165d35e7d3f0c1ab64ae8f9e1751
Reviewed-on: https://review.whamcloud.com/30059
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-9019 osd-ldiskfs: migrate to 64 bit time 57/29857/10
James Simmons [Mon, 5 Feb 2018 17:14:51 +0000 (12:14 -0500)]
LU-9019 osd-ldiskfs: migrate to 64 bit time

Replace cfs_time_current_sec() to avoid the overflow issues in
2038 with ktime_get_real_seconds(). Besides changing struct
scrub_file sf_time_* fields to time64_t for usage with
ktime_get_real_seconds() the other fields can also be moved to
time64_t as well since we don't need precision better than one
second for the scrubbing code. The dr_* time fields in struct
osd_iobuf are jiffies, which get reported in the histograms.
This was done on the assumption that jiffies equal milliseconds, which
is not always the case. Since we need better than one second
resolution move dr_* time fields to ktime. This way the value
passed to lprocfs_oh_tally_log() will always be in milliseconds.

Change-Id: Ibce7f7d9f972c8d3188271950f68dcda7663676f
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/29857
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-5695 libcfs: watchdog dispatch thread fix 55/12155/8
Alexander Zarochentsev [Fri, 16 Feb 2018 18:06:57 +0000 (13:06 -0500)]
LU-5695 libcfs: watchdog dispatch thread fix

lc_watchdogd may stop immediately after start
because nobody clears the stop flag.

Xyratex-bug-id: MRP-2108 MRP-1913
Change-Id: I1eaaf0330c111b7f2b17081c716ef8c200677d6b
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Reviewed-on: https://review.whamcloud.com/12155
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10650 obd: add check to obd_statfs 43/31243/7
Alexander Boyko [Fri, 9 Feb 2018 12:07:19 +0000 (07:07 -0500)]
LU-10650 obd: add check to obd_statfs

A race can happen between mount and lctl get_param
because procfs files are ready before obd initialization is complete.
For example:
3372:0:(dt_object.h:2509:dt_statfs()) ASSERTION( dev )
3372:0:(dt_object.h:2509:dt_statfs()) LBUG
Pid: 3372, comm: lctl
Call Trace:
libcfs_call_trace+0x4e/0x60[libcfs]
lbug_with_loc+0x4c/0xb0[libcfs]
tgt_statfs_internal+0x2ea/0x350[ptlrpc]
ofd_statfs+0x66/0x470 [ofd]
lprocfs_filesfree_seq_show+0xf6/0x520 [obdclass]
ofd_filesfree_seq_show+0x12/0x20 [ofd]

The patch adds a check of completed obd_setup to obd_statfs().
The patch adds the sanity 276 test.

Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: LUS-2665
Change-Id: I55a9ffa7e036f486388a8f548051d28974d47951
Reviewed-on: https://review.whamcloud.com/31243
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-8990 lod: put root at cleanup 43/31143/3
Lai Siyao [Fri, 2 Feb 2018 15:00:15 +0000 (23:00 +0800)]
LU-8990 lod: put root at cleanup

'lod_md_root' was put at precleanup, but the soak test shows that a
race exists: some ongoing request may re-initialize it. Move this put
to cleanup.

Also add debug code to dump remaining objects if lod device is still
referenced at lod_device_free().

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: I6f1ab0ba149ccf95279c1182c90a5588607ad8fa
Reviewed-on: https://review.whamcloud.com/31143
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10550 flr: resync RDONLY state FLR file 10/31010/8
Bobi Jam [Wed, 24 Jan 2018 15:32:37 +0000 (23:32 +0800)]
LU-10550 flr: resync RDONLY state FLR file

When some components fail to resync for various reasons,
those components will still have the STALE bit set but the file state
may become RDONLY.

This patch makes it possible to resync an RDONLY FLR file.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I2e3b518bb969aedd7f214e6b09b895079cab69ab
Reviewed-on: https://review.whamcloud.com/31010
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10356 llite: have ll_write_end to sync for DIO 59/30659/2
Vladimir Saveliev [Tue, 26 Dec 2017 19:49:58 +0000 (22:49 +0300)]
LU-10356 llite: have ll_write_end to sync for DIO

Direct IO write uses buffered write for pages which could not be
released. If non-adjacent pages are not releasable, the
vio->u.write.vui_queue list becomes non-contiguous, which makes
page_list_sanity_check() fail.

Have ll_write_commit do vvp_io_write_commit() when it is called in
the course of direct IO.

Cray-bug-id: MRP-4415
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Change-Id: I21e653c4d45553c85ff5ded8edf22017966c7ba4
Reviewed-on: https://review.whamcloud.com/30659
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-8856 osd: mark specific transactions netfree 30/26930/20
Alex Zhuravlev [Wed, 3 May 2017 12:45:13 +0000 (15:45 +0300)]
LU-8856 osd: mark specific transactions netfree

osd-zfs should mark some transactions netfree. This means those transactions
are expected to release space (rather than consume it), and for this kind of
transaction half of the reserved space is available.

Change-Id: I71605bc224882aafac26b3dfb0f3d7e82af8fde8
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/26930
Tested-by: Jenkins
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10670 test: make sanity-flr test_43 more reliable 15/31315/3
Bobi Jam [Thu, 15 Feb 2018 07:59:14 +0000 (15:59 +0800)]
LU-10670 test: make sanity-flr test_43 more reliable

Make sanity-flr test_43 more reliable by setting the active
state of the OSP device instead of the OSC device to simulate the
OST's unavailability.

Test-Parameters: testlist=sanity-flr
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ibfb4a54479a7dafff251dd3645b03ec172b6884e
Reviewed-on: https://review.whamcloud.com/31315
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10443 test: Handle file lifecycle correctly 54/31254/2
Patrick Farrell [Fri, 9 Feb 2018 15:00:13 +0000 (09:00 -0600)]
LU-10443 test: Handle file lifecycle correctly

The current lockahead_test.c removes the test file on exit,
which will destroy the locks which sanity.sh counts to
verify correct operation.  This usually works because
sanity.sh wins the race with the object destroy command
from the MDS to the OSS.

Change lockahead_test.c to remove the test file on entry,
and to use $tfile rather than its own file, so it is
automatically cleaned up by sanity.

Change-Id: I3cd1fdb7f33da167ca21476a7b3cbe5f57fd5782
Signed-off-by: Patrick Farrell <paf@cray.com>
Reviewed-on: https://review.whamcloud.com/31254
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10634 kernel: kernel update [SLES12 SP3 4.4.114-94.11] 24/31224/3
Bob Glossman [Wed, 7 Feb 2018 23:04:40 +0000 (15:04 -0800)]
LU-10634 kernel: kernel update [SLES12 SP3 4.4.114-94.11]

Update target and kernel_config files for new version

Test-Parameters: clientdistro=sles12sp3 testgroup=review-ldiskfs \
  mdsdistro=sles12sp3 ossdistro=sles12sp3 \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I3ffcd4c368b2976cffa6a517f9fabcf674781ac9
Reviewed-on: https://review.whamcloud.com/31224
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10603 ptlrpc: export req_buffers_max via procfs 62/31162/2
Alex Zhuravlev [Mon, 5 Feb 2018 10:03:17 +0000 (13:03 +0300)]
LU-10603 ptlrpc: export req_buffers_max via procfs

after LU-9372 gcc7 complains:
lustre/ptlrpc/lproc_ptlrpc.c:382:16: error: ‘ptlrpc_lprocfs_req_buffers_max_fops’ defined but not used [-Werror=unused-const-variable=]
 LPROC_SEQ_FOPS(ptlrpc_lprocfs_req_buffers_max);
                 ^

Change-Id: Ie4806b79d104c7ea9aa34b6a8a280587fccef689
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/31162
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10560 libcfs: handle rename to wait_queue_entry_t 53/31153/11
Mike Marciniszyn [Fri, 9 Feb 2018 18:22:50 +0000 (13:22 -0500)]
LU-10560 libcfs: handle rename to wait_queue_entry_t

The 4.13 kernel renames wait_queue_t to wait_queue_entry_t.

Add a probe and handle rename across the code base and have
a define to translate to the new name when indicated.

Test-Parameters: trivial

Change-Id: I8f0f5ec4d02ccb270acb72ccffe13f0ecf6bd2f7
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-on: https://review.whamcloud.com/31153
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
6 years ago LU-10560 lustre_compat: Convert GFP_TEMPORARY to GFP_KERNEL 52/31152/6
Mike Marciniszyn [Fri, 2 Feb 2018 16:45:54 +0000 (08:45 -0800)]
LU-10560 lustre_compat: Convert GFP_TEMPORARY to GFP_KERNEL

The 4.14 kernel removes this gfp.h define.

Adjust the code to use GFP_KERNEL as the upstream
patch does.

Change-Id: I40fff2724499fa17aa285507e0fd9b21f4afc070
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-on: https://review.whamcloud.com/31152
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
6 years ago LU-10574 tests: remove useless check from sanity-dom.sh 74/31074/3
Elena Gryaznova [Wed, 7 Feb 2018 20:09:58 +0000 (23:09 +0300)]
LU-10574 tests: remove useless check from sanity-dom.sh

Tests test_sanity() and test_sanityn() are skipped if not started
from the lustre/tests directory because of an incorrect check
that ./sanity.sh exists.
The patch removes the check for files which are part of
lustre/tests.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-2594
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Test-Parameters: testlist=sanity-dom
Change-Id: I51ad517fbf3ff653d9a11994eb280daee589a886
Reviewed-on: https://review.whamcloud.com/31074
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10449 nrs: Generic TBF policy can't be shown correctly 96/30696/6
Qian Yingjin [Wed, 3 Jan 2018 09:21:10 +0000 (17:21 +0800)]
LU-10449 nrs: Generic TBF policy can't be shown correctly

After setting a TBF NID/OPCode/JobID policy and switching to the generic
policy, the output of "lctl get_param ost.OSS.ost.nrs_policies"
is not displayed correctly.

Change-Id: If8dcb7ae6ade634ec7ec4dfcb5887501cda90cdf
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/30696
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-9452 tests: remove sanityn test_29 from ALWAYS_EXCEPT 46/30646/2
Andreas Dilger [Fri, 22 Dec 2017 10:19:18 +0000 (03:19 -0700)]
LU-9452 tests: remove sanityn test_29 from ALWAYS_EXCEPT

There is no longer a sanityn.sh test_29() so it shouldn't be
listed in ALWAYS_EXCEPT.

Test-Parameters: trivial testlist=sanityn
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ia8112e20cbd3203b69e85e586e2400551b94de81
Reviewed-on: https://review.whamcloud.com/30646
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10278 utils: allow to migrate without direct io 01/30301/4
Daniel Kobras [Tue, 28 Nov 2017 16:26:00 +0000 (00:26 +0800)]
LU-10278 utils: allow to migrate without direct io

Using direct i/o to copy file contents during migration minimizes
cache interference, but may significantly reduce performance.
Introduce new option -D/--non-direct to lfs migrate/lfs_migrate that
leaves the tradeoff at the discretion of the caller.

Signed-off-by: Daniel Kobras <d.kobras@science-computing.de>
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I9c2935ff204ea5385bfc38006c5476b956deb6a7
Reviewed-on: https://review.whamcloud.com/30301
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-6142 uapi: remove remaining typedef in lustre UAPI headers 75/31175/2
James Simmons [Mon, 5 Feb 2018 20:42:25 +0000 (15:42 -0500)]
LU-6142 uapi: remove remaining typedef in lustre UAPI headers

Remove the remaining typedefs in lustre UAPI headers to make them
Linux kernel compliant.

Test-Parameters: trivial

Change-Id: I13a24deed348e06c1c63bd0c332f63d4f77a0d76
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31175
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-7004 quota: make lctl set_param -P functional for quota 81/31081/5
James Simmons [Tue, 6 Feb 2018 15:39:30 +0000 (10:39 -0500)]
LU-7004 quota: make lctl set_param -P functional for quota

Currently setting up quota permanently can only be done with a
command like lctl conf_param $FSNAME.quota.ost=ug. To see if those
settings take hold we examine the 'enabled' proc file located in
the quota_slave directory in the proc tree. To make this workable
with lctl set_param -P we can make the 'enabled' proc file
writable and lustre can treat the config log change from set_param
-P for quota like any other tunable to be set permanently.

Change-Id: I6a4c1fdc9d16658930f48d21e4f79e6f36047511
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31081
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10576 tests: sleep seconds to avoid using cached statfs 02/31102/8
Fan Yong [Wed, 14 Feb 2018 00:28:55 +0000 (08:28 +0800)]
LU-10576 tests: sleep seconds to avoid using cached statfs

In sanity test_803, we check the object usage via "lfs df -i".
But the MDT may return cached statfs if two "df" calls arrive
too close to each other (within about 1 second). Sleep 3 seconds
between the two "df" calls to avoid this.

Test-Parameters: trivial envdefinitions=SLOW=yes testlist=sanity mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs mdscount=2 mdtcount=4
Signed-off-by: Fan Yong <fan.yong@intel.com>
Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: I9ce4cb6c069a88fe2b93d2d5a6304c96bdb5a0c1
Reviewed-on: https://review.whamcloud.com/31102
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-8444 tests: test for unsigned xattr inode number 47/21547/24
Artem Blagodarenko [Wed, 27 Jul 2016 16:01:06 +0000 (19:01 +0300)]
LU-8444 tests: test for unsigned xattr inode number

The patch for "MRP-3025 Hitting "LDISKFS-fs error (device md66):
ldiskfs_xattr_inode_iget: error while reading EA inode -2147483347" on
large MDT volumes with large_xattr feature enabled."

Added test:
1. MDS should have more than 2G inodes
2. mdt fs should be created with large_xattr flag.
3. set inode_goal to get higher inode number allocated:
   echo 2147483947 > /sys/fs/ldiskfs/<device>/inode_goal
4. create a file
5. start adding hard links to that file
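
The negative number in the error message is consistent with an inode
number above 2^31 - 1 being read as a signed 32-bit value. A small
Python sketch of that arithmetic follows. The inode number 2147483949
is an assumption, chosen because it reproduces the number in the
message; the commit does not say which inode was involved.

```python
import ctypes

INT32_MAX = 2**31 - 1


def as_signed_32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    return ctypes.c_int32(value & 0xFFFFFFFF).value


inode_goal = 2147483947   # value written to inode_goal in the steps above
ea_inode = 2147483949     # hypothetical inode number just past the goal

print(inode_goal > INT32_MAX)   # True: the goal is outside the signed range
print(as_signed_32(ea_inode))   # -2147483347, as in the error message
```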

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@seagate.com>
Seagate-bug-id: MRP-3378
Change-Id: Id9c0fe9d8047935e5cf5be1b9209a74588565f2e
Reviewed-on: https://review.whamcloud.com/21547
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Alexander Lezhoev <garson2@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-8602 gss: autoconf check missing "test" keyword 91/31191/2
Olaf Faaland [Tue, 6 Feb 2018 23:04:12 +0000 (15:04 -0800)]
LU-8602 gss: autoconf check missing "test" keyword

Change https://review.whamcloud.com/31095 introduced an error in the
autoconf, omitting the command "test" in an autoconf check.  Add it.

Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: I525805801d9e8166ec1064dccbf6cec6f97efdfa
Reviewed-on: https://review.whamcloud.com/31191
Reviewed-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10611 autoconf: check zlib library and zlib.h header file 86/31186/2
Jian Yu [Tue, 6 Feb 2018 20:37:23 +0000 (12:37 -0800)]
LU-10611 autoconf: check zlib library and zlib.h header file

After landing commit f1daa8fc6575e5b9e4a2f1f2ae4ceaefb889a694,
zlib library and zlib.h header file are required to compile lfs.c.
This patch adds the check in configure script.

Change-Id: Id3a8acfc780fb4fcdec0bb99b79b550c5c9e957a
Test-Parameters: trivial testlist=sanity-flr
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/31186
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10577 tests: fix lfsck-performance for separate MGT and MDT 75/31075/2
Elena Gryaznova [Mon, 29 Jan 2018 17:30:54 +0000 (20:30 +0300)]
LU-10577 tests: fix lfsck-performance for separate MGT and MDT

lfsck-performance tests 0, 1, 2 and 3 run stopall and then mount the
MDT, which makes them fail on configurations where the MGS and MDS
are not combined.
This patch fixes those test defects.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-2534
Test-Parameters: testlist=lfsck-performance
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Change-Id: I24c15b9998511bab3dc6fdd3445793e70281c890
Reviewed-on: https://review.whamcloud.com/31075
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10482 flr: enhance "lfs find" to add mirror options 69/31069/6
Jian Yu [Thu, 8 Feb 2018 06:45:02 +0000 (22:45 -0800)]
LU-10482 flr: enhance "lfs find" to add mirror options

This patch adds the following mirror related search
options to "lfs find" command:

[[!] --mirror-count|-N [+-]n]
[[!] --mirror-state <[^]state>]

--mirror-count|-N indicates mirror count.
--mirror-state indicates mirrored file state.

A mirrored file can be in one of the following states:
ro indicates the mirrored file is in read-only state.
   All of the mirrors contain the up-to-date data.
wp indicates the mirrored file is being written.
sp indicates the mirrored file is being resynchronized.

Change-Id: I3c8f5c8bb6518ba4bd73fc2f164dd52afdfac211
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/31069
Tested-by: Jenkins
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9727 doc: update llog_reader man page for Changelogs 70/30970/9
Sebastien Buisson [Mon, 22 Jan 2018 17:07:01 +0000 (02:07 +0900)]
LU-9727 doc: update llog_reader man page for Changelogs

Add new paragraph in llog_reader's man page to explain how to read
Changelogs with llog_reader, and add an example.

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I3e1123b9a5ac88334a370fd69c1d9d63597e16f7
Reviewed-on: https://review.whamcloud.com/30970
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9906 osd: use pagevec for putting pages 31/30531/5
Patrick Farrell [Mon, 5 Feb 2018 12:16:58 +0000 (06:16 -0600)]
LU-9906 osd: use pagevec for putting pages

Using a pagevec instead of individual page puts is much
more efficient.  This should reduce contention on the page
cache allocation/freeing, which becomes a bottleneck with
high speed OSTs.

Cray-bug-id: LUS-5670
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: Ic15cb8e30887ec55e9348e50af307bfd7108c7e4
Reviewed-on: https://review.whamcloud.com/30531
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10377 build: Update ZFS Version to 0.7.6 22/30522/6
Nathaniel Clark [Fri, 26 Jan 2018 14:02:02 +0000 (09:02 -0500)]
LU-10377 build: Update ZFS Version to 0.7.6

Update SPL and ZFS version that is built against.

https://github.com/zfsonlinux/zfs/releases/tag/zfs-0.7.6

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: If010d3a7e78b66a2acbd70242fe517218a438c02
Reviewed-on: https://review.whamcloud.com/30522
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9727 utils: make llog_reader decode changelog fields 15/30315/13
Sebastien Buisson [Wed, 29 Nov 2017 17:18:32 +0000 (02:18 +0900)]
LU-9727 utils: make llog_reader decode changelog fields

Make llog_reader decode all Changelogs fields and extra fields.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Idfb41607fc5664cb99b254aece4625d1796331af
Reviewed-on: https://review.whamcloud.com/30315
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
6 years agoLU-9727 lustre: record denied OPEN in Changelogs 12/28812/24
Sebastien Buisson [Tue, 29 Aug 2017 08:45:30 +0000 (17:45 +0900)]
LU-9727 lustre: record denied OPEN in Changelogs

Record denied OPEN events in Changelogs, in the same format as
successful OPEN events.
Recording denied OPEN events is useful for security audit,
in order to find out who tried to get access to some data.
An NOPEN changelog entry is in the form:
4 24NOPEN 15:45:44.947406626 2017.08.31 0x2 t=[0x200000402:0x1:0x0]
ef=0xf u=500:500 nid=10.128.11.158@tcp m=-w-
By default, disable recording of NOPEN events in Changelogs.
NOPEN entries in Changelogs are rate limited: no more than one
entry per user per file per minute, configurable via
/proc/fs/lustre/mdd/<fsname>-MDTXXX/changelog_deniednext
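
The rate limit described above can be sketched in Python as follows.
This is an illustration only, not the mdd code: the class name, the
(uid, fid) key and the interval_sec parameter are hypothetical, and
the 60 second default simply mirrors "one entry per minute".

```python
class DeniedOpenLimiter:
    """Allow at most one NOPEN record per (user, file) per interval."""

    def __init__(self, interval_sec=60):
        self.interval_sec = interval_sec
        self._last = {}   # (uid, fid) -> time of the last recorded NOPEN

    def should_record(self, uid, fid, now):
        key = (uid, fid)
        last = self._last.get(key)
        if last is not None and now - last < self.interval_sec:
            return False          # still inside the quiet period
        self._last[key] = now     # record and restart the quiet period
        return True
```

A second denied open by the same user on the same file within the
interval returns False, while a different user or file is recorded
independently.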

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ib33651dda63735e21fffeed34cb1adc803ff7eca
Reviewed-on: https://review.whamcloud.com/28812
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Matthew S <matthew.sanderson@anu.edu.au>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9727 lustre: limit OPEN and CLOSE rates in Changelogs 99/28299/31
Sebastien Buisson [Mon, 31 Jul 2017 11:50:22 +0000 (20:50 +0900)]
LU-9727 lustre: limit OPEN and CLOSE rates in Changelogs

Record OPEN only once in the Changelogs per UID/GID, for a given
open mode, as long as the file is not closed by this UID/GID.
Similarly, only record the last CLOSE per UID/GID.
For instance, it avoids flooding the Changelogs if there is an MPI
job opening the same file thousands of times from different threads.
It reduces the ChangeLog load significantly, without significantly
affecting the audit information.

To achieve this, add a list to struct mdd_object, containing uid/gid
of clients opening files. Record OPEN only if client uid/gid is not
already in list. And record CLOSE only if client uid/gid was just
removed from list.
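
A rough Python sketch of this bookkeeping, for illustration only. The
patch describes a list of uid/gid on struct mdd_object; the per-key
open count used here to decide when an entry is "just removed" is an
assumption, as are all names.

```python
from collections import Counter


class OpenTracker:
    """Decide which OPEN and CLOSE events deserve a changelog record."""

    def __init__(self):
        self._opens = Counter()   # (uid, gid, mode) -> outstanding opens

    def on_open(self, uid, gid, mode):
        key = (uid, gid, mode)
        first = self._opens[key] == 0   # not yet in the "list"
        self._opens[key] += 1
        return first                    # True: write an OPEN record

    def on_close(self, uid, gid, mode):
        key = (uid, gid, mode)
        if self._opens[key] == 0:
            return False                # unknown opener, nothing to record
        self._opens[key] -= 1
        if self._opens[key] == 0:
            del self._opens[key]        # entry just removed from the "list"
            return True                 # True: write a CLOSE record
        return False
```

With this shape, a thousand opens of one file by the same uid/gid and
mode yield one OPEN record, and only the final close yields a CLOSE
record.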

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I0fa08d11f0284d63e531ab48c03a8af6f3928487
Reviewed-on: https://review.whamcloud.com/28299
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9727 lustre: add CL_GETXATTR for Changelogs 51/28251/22
Sebastien Buisson [Thu, 6 Jul 2017 12:50:14 +0000 (21:50 +0900)]
LU-9727 lustre: add CL_GETXATTR for Changelogs

Record GETXATTR events in Changelogs, and add a new changelog
extension named changelog_ext_xattr to hold xattr name.
A GETXATTR changelog entry is in the following form:
8 23GXATR 09:22:55.886793012 2017.07.27 0x0
t=[0x200000402:0x1:0x0] ef=0xf u=500:500 nid=10.128.11.159@tcp
x=user.name0
Also, rename CL_XATTR type to CL_SETXATTR.
By default, disable recording of GETXATTR events in Changelogs.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ia02e870ca162c7d2b97eb0ce80e99fe7145b7601
Reviewed-on: https://review.whamcloud.com/28251
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
6 years agoLU-9409 llite: Add tiny write support 03/27903/29
Patrick Farrell [Mon, 5 Feb 2018 15:55:35 +0000 (09:55 -0600)]
LU-9409 llite: Add tiny write support

If a page is already dirty in the page cache, we can write
to it without a full i/o.  This improves performance for
writes of < 1 page dramatically.

Append writes are a bit tricky, requiring us to take the
range lock (which we can normally avoid), but they are
still much faster than the normal i/o path.

Performance numbers with dd, on a VM with an older Xeon.

All numbers in MiB/s.

                8 bytes 1KiB
Without patch:  .75     75
With patch:     6.5     153
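
For reference, the speedups implied by the table can be computed with
a few lines of Python. The figures are copied from the table above;
the variable and function names are only for illustration.

```python
# MiB/s from the table above: (without patch, with patch)
results = {"8 bytes": (0.75, 6.5), "1KiB": (75.0, 153.0)}


def speedup(before, after):
    """Ratio of patched to unpatched throughput."""
    return after / before


for size, (before, after) in results.items():
    print(f"{size}: {speedup(before, after):.1f}x")
# 8 bytes: 8.7x
# 1KiB: 2.0x
```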

Cray-bug-id: LUS-1705
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I75cc72ceb5f174a5394af8ffe5df4fe9583f19a3
Reviewed-on: https://review.whamcloud.com/27903
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10418 flr: replace llapi_lease_get with llapi_lease_acquire 21/31221/2
Jian Yu [Thu, 8 Feb 2018 07:09:47 +0000 (23:09 -0800)]
LU-10418 flr: replace llapi_lease_get with llapi_lease_acquire

After commit 8f1c7c1e44a4cc8870eb2b2a71da323e265881b4 landed,
we need to replace llapi_lease_{get,put} with llapi_lease_{acquire,release}.

Change-Id: Ie36abeb3c71f2d0c3345de9b4830549af3bb56a3
Test-Parameters: trivial testlist=sanity-flr
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/31221
Tested-by: Jenkins
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10448 lod: pick primary mirror for write 11/30711/12
Bobi Jam [Tue, 26 Dec 2017 10:16:40 +0000 (18:16 +0800)]
LU-10448 lod: pick primary mirror for write

When a mirrored file is written for the first time, the MDS chooses
a mirror to write the data to. This patch defines a primary-selection
policy function, lod_primary_pick(), that avoids mirrors with
unavailable OSTs.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I5d6d0459e96583294c3040a7994c33114be1e439
Reviewed-on: https://review.whamcloud.com/30711
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10181 mdt: high-priority request handling for DOM 68/29968/9
Mikhal Pershin [Tue, 7 Nov 2017 16:29:32 +0000 (19:29 +0300)]
LU-10181 mdt: high-priority request handling for DOM

Implement high-priority request handling and lock
prolongation for Data-on-MDT BRW requests to avoid incorrect
timeouts and client eviction under heavy MDS load.

Test-Parameters: mdssizegb=20 testlist=sanity-dom,dom-performance
Signed-off-by: Mikhal Pershin <mike.pershin@intel.com>
Change-Id: I589efd2774d739f3a0b471d7a6e4d6be7c6a7c2c
Reviewed-on: https://review.whamcloud.com/29968
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10438 flr: layout truncate compatibility 86/30786/8
Bobi Jam [Wed, 10 Jan 2018 04:18:29 +0000 (12:18 +0800)]
LU-10438 flr: layout truncate compatibility

In PFL design, client issues [0, size) intent to MDS to instantiate
objects in this extent. While in FLR design, the intent serves two
purposes: 1) make objects across [size, EOF) on other mirrors stale,
and 2) instantiate objects in the chosen write mirror. And original
FLR chose to use [size, EOF) as the extent of truncate write intent
request.

This patch reverts the choice, and still uses [0, size) as the
truncate write intent extent.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I12d90320ba704f01457670f864d78fe764233000
Reviewed-on: https://review.whamcloud.com/30786
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9771 util: rename LCM_FL_NOT_FLR to LCM_FL_NONE 47/31047/6
Andreas Dilger [Fri, 26 Jan 2018 21:37:35 +0000 (14:37 -0700)]
LU-9771 util: rename LCM_FL_NOT_FLR to LCM_FL_NONE

Having "lfs getstripe" print out "lcm_flags: not_flr" is not very
useful, as the composite files may not relate to FLR (e.g. PFL).
Rename "LCM_FL_NOT_FLR" to "LCM_FL_NONE" so it is more clear there
are no composite flags on the layout.

Print out "0" for flags if no flags are set, to match old behaviour.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ib3701da8368253969567b927300cd42bc33ebbe5
Reviewed-on: https://review.whamcloud.com/31047
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10286 mdt: deny 2.10 clients to open mirrored files 57/30957/6
Jinshan Xiong [Sun, 21 Jan 2018 01:58:01 +0000 (01:58 +0000)]
LU-10286 mdt: deny 2.10 clients to open mirrored files

2.10 clients would manipulate mirrored layout as PFL layout, which
would damage mirrored files.

This patch only allows mirrored files to be opened by clients who
understand mirror layout.

It also fixes a problem in the flag checks: OBD_CONNECT_FLAGS2 must
be checked before any OBD_CONNECT2_XXX flag.
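
The ordering of the two checks can be sketched as below. This is an
illustration in Python, not the mdt code: the bit values are made up,
OBD_CONNECT2_EXAMPLE stands in for whichever OBD_CONNECT2_XXX flag is
being tested, and the function name is hypothetical.

```python
OBD_CONNECT_FLAGS2 = 1 << 63     # illustrative bit value
OBD_CONNECT2_EXAMPLE = 1 << 0    # hypothetical OBD_CONNECT2_XXX flag


def client_has_flag2(connect_flags, connect_flags2, flag2):
    """Test a second-word flag only if the client says that word is valid."""
    if not connect_flags & OBD_CONNECT_FLAGS2:
        return False             # second flag word is not meaningful
    return bool(connect_flags2 & flag2)
```

A client that does not advertise OBD_CONNECT_FLAGS2 is treated as not
having the feature, whatever bits happen to be in its second flag word.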

Test-Parameters: envdefinitions=SLOW=yes clientjob=lustre-b2_10 clientbuildno=68 testlist=runtests,sanity
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I3431211ad30a1edd07f0f583d573328d6308779d
Reviewed-on: https://review.whamcloud.com/30957
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Tested-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
6 years agoLU-10560 llite: remove extra headers from rw26.c 51/31151/3
Mike Marciniszyn [Fri, 2 Feb 2018 16:45:54 +0000 (08:45 -0800)]
LU-10560 llite: remove extra headers from rw26.c

Remove headers from rw26.c that are no longer (or were possibly
never) needed.  This avoids a compile problem with 4.14 kernels:

CC [M]  lustre/llite/rw26.o
In file included from lustre/llite/rw26.c:43:0:
./arch/x86/include/asm/uaccess.h: In function ‘set_fs’:
./arch/x86/include/asm/uaccess.h:31:9:
error: dereferencing pointer to incomplete type
current->thread.addr_limit = fs;

It turns out <asm/uaccess.h> was included twice but we don't need
anything from that header.  Same for <linux/stat.h> and a number of
other extraneous headers.

Sort headers alphabetically so it is easier to avoid duplicates.

Change-Id: I4cbe2c1c7289466396b2bb2eac3c475d1041a283
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-on: https://review.whamcloud.com/31151
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
6 years agoLU-10560 llite: remove extra headers from llite_mmap.c 50/31150/4
Mike Marciniszyn [Fri, 2 Feb 2018 16:45:53 +0000 (08:45 -0800)]
LU-10560 llite: remove extra headers from llite_mmap.c

Remove headers from llite_mmap.c that are no longer (or were possibly
never) needed.  This avoids a compile problem with 4.14 kernels:

CC [M]  lustre/llite/llite_mmap.o
In file included from lustre/llite/llite_mmap.c:41:0:
./arch/x86/include/asm/uaccess.h: In function ‘set_fs’:
./arch/x86/include/asm/uaccess.h:31:9:
error: dereferencing pointer to incomplete type
current->thread.addr_limit = fs;

It turns out <asm/uaccess.h> was included twice but we don't need
anything from that header.  Same for <linux/stat.h> and a number of
other extraneous headers.

Sort headers alphabetically so it is easier to avoid duplicates.

Test-Parameters: trivial
Change-Id: Ic24a58bdaec16f92d7bea9c24172796031cad471
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-on: https://review.whamcloud.com/31150
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
6 years agoLU-10536 build: add path for libnvpair to zfslib 28/31128/3
James Simmons [Thu, 1 Feb 2018 19:37:06 +0000 (14:37 -0500)]
LU-10536 build: add path for libnvpair to zfslib

When building lustre against the ZFS source tree instead of a
ZFS installation on the node, we need to add the path
$zfssrc/lib/libnvpair/.libs so mount.lustre can
link to the library.

Test-Parameters: trivial

Change-Id: Id08cc3aed1bc201611fb9382d2f53e40e69f9544
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31128
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10580 lfsck: GPF in lfsck_namespace_repair_dirent 86/31086/3
Andriy Skulysh [Wed, 10 Jan 2018 03:07:02 +0000 (05:07 +0200)]
LU-10580 lfsck: GPF in lfsck_namespace_repair_dirent

The child object can be NULL if we just want to remove a dirent.

Change-Id: I330fc690b36c48c4be7fa4062304c39028f0b878
Cray-bug-id: MRP-4606
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/31086
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9466 tests: Error message for empty "error" calls 78/31078/7
Saurabh Tandan [Mon, 29 Jan 2018 20:01:09 +0000 (13:01 -0700)]
LU-9466 tests: Error message for empty "error" calls

Add an error message to "error" calls that were made
without one in a few tests.

The changes are to tests in sanity.sh, sanityn.sh
and sanity-hsm.sh.

Test-Parameters: trivial envdefinitions="SLOW=yes" testlist=sanityn.sh,sanity-hsm.sh
Signed-off-by: Saurabh Tandan <saurabh.tandan@intel.com>
Change-Id: Idb3440561e47f6caaab455f2c6d7e0d2a2651f95
Reviewed-on: https://review.whamcloud.com/31078
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10508 utils: use callvpe() in lustre_rsync 69/30869/4
John L. Hammond [Mon, 15 Jan 2018 16:20:00 +0000 (10:20 -0600)]
LU-10508 utils: use callvpe() in lustre_rsync

Add callvpe() to lustre/utils as a replacement for system() in cases
where the command string includes arbitrary pathnames. In
lustre_rsync, replace calls to system() with calls to callvpe().
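
The reason an argument vector is safer than a command string can be
shown with a Python analogue. This is not callvpe() itself, whose
signature is not given here; run_argv and tricky_path are
illustrative names, and subprocess.run with a list plays the role of
an exec-style call that bypasses the shell.

```python
import subprocess
import sys


def run_argv(argv):
    """Run a command from an argument vector, with no shell in between."""
    return subprocess.run(argv, capture_output=True, text=True, check=False)


# A pathname containing characters that a shell would interpret.
tricky_path = "/tmp/a file; echo injected"

# The child simply echoes its first argument back.
result = run_argv(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", tricky_path]
)
print(result.stdout.strip())
```

The pathname reaches the child as a single argument, spaces and
semicolon included, whereas pasting it into a system() command string
would let the shell split it and run the part after the semicolon.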

Test-Parameters: trivial testlist=lustre-rsync-test
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Id5ae7e25e14346a1293497c2caa221513d0ee9f3
Reviewed-on: https://review.whamcloud.com/30869
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10507 tests: use {save,restore}_layout() in test 58/30858/6
Jinshan Xiong [Mon, 5 Feb 2018 20:06:48 +0000 (20:06 +0000)]
LU-10507 tests: use {save,restore}_layout() in test

Revised test cases sanity:test_{27A,65i,65j,65m,406}(),
sanity-pfl:test_10() to use new interfaces to save and restore
layout.

Test-Parameters: trivial testlist=sanity-pfl,sanity,sanity-dom
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I11f4e5dcd486d4f7d08666c462d056041e125365
Reviewed-on: https://review.whamcloud.com/30858
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
6 years agoLU-6353 contrib: Remove wireshark plugin 02/30602/5
Nathaniel Clark [Tue, 9 Jan 2018 13:04:29 +0000 (08:04 -0500)]
LU-6353 contrib: Remove wireshark plugin

Wireshark dissection has been pushed upstream:
https://code.wireshark.org/review/24795 [lnet]
https://code.wireshark.org/review/24800 [lustre]

Both patches have landed to wireshark master.

Dissectors were ported to wireshark master and significantly expanded
and cleaned up.

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I0a9d54634599cdb7f9169f1186c58fa96666b246
Reviewed-on: https://review.whamcloud.com/30602
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
6 years agoLU-10347 tests: suspend the copytool in sanity-hsm/test_252 92/30492/7
Quentin Bouget [Tue, 12 Dec 2017 12:09:19 +0000 (12:09 +0000)]
LU-10347 tests: suspend the copytool in sanity-hsm/test_252

The copytool does not have to, and should not, process the archive
request emitted in test_252 of sanity-hsm: the test needs the request
to time out.

This patch suspends the copytool until the request times out.

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: Quentin Bouget <quentin.bouget@cea.fr>
Change-Id: I54d114eafb85b75e7f9160806cd3298bfa186b19
Reviewed-on: https://review.whamcloud.com/30492
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10364 test: Add version check to test_255b 81/30481/2
Wei Liu [Mon, 11 Dec 2017 21:22:18 +0000 (13:22 -0800)]
LU-10364 test: Add version check to test_255b

Skip test if server version is older than 2.8.54 since
it does not support ladvise.
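
The comparison behind such a check can be sketched in Python. The
test suite itself is shell, so this only illustrates the logic; the
function names are hypothetical.

```python
def parse_version(text):
    """Turn a dotted version such as "2.8.54" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def should_skip_ladvise_test(server_version, minimum="2.8.54"):
    """True when the server is older than the first ladvise-capable version."""
    return parse_version(server_version) < parse_version(minimum)
```

Comparing integer tuples rather than strings matters here: as strings
"2.10.58" sorts before "2.8.54", which would wrongly skip the test on
a newer server.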

Change-Id: Ie613c1b8d9d082b78529a3d72fd59150431f65ea
Test-Parameters: trivial
Signed-off-by: Wei Liu <wei3.liu@intel.com>
Reviewed-on: https://review.whamcloud.com/30481
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9664 hsm: protect cdt_state 34/27634/10
Hongchao Zhang [Sat, 2 Dec 2017 00:48:03 +0000 (08:48 +0800)]
LU-9664 hsm: protect cdt_state

In hsm_cancel_all_actions in mdt_coordinator.c, the cdt_state
could be set to the wrong state if more than one
hsm_cancel_all_actions call runs at the same time.

Assume the state is CDT_ENABLED before hsm_cancel_all_actions:

the first call               the second call
CDT_ENABLED is saved
cdt_state = CDT_DISABLED
                             CDT_DISABLED is saved
...                          cdt_state remains CDT_DISABLED

cdt_state = CDT_ENABLED      ...
                             cdt_state = CDT_DISABLED

This patch introduces cdt_state_lock to protect the state.
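
The effect of the lock can be sketched in Python. This is a toy model
rather than the coordinator code: it assumes the lock is held across
the whole save, disable and restore sequence, and the class, method
and parameter names are hypothetical.

```python
import threading

CDT_ENABLED, CDT_DISABLED = "enabled", "disabled"


class Coordinator:
    def __init__(self):
        self.state = CDT_ENABLED
        self._state_lock = threading.Lock()   # stands in for cdt_state_lock

    def cancel_all_actions(self, work=lambda: None):
        # Holding the lock means a second caller cannot save CDT_DISABLED
        # as the "previous" state while the first caller is still running.
        with self._state_lock:
            saved = self.state
            self.state = CDT_DISABLED
            try:
                work()                        # cancel the pending actions
            finally:
                self.state = saved            # restore what we found


coordinator = Coordinator()
threads = [
    threading.Thread(target=coordinator.cancel_all_actions) for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(coordinator.state)
```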

Test-Parameters: trivial testlist=sanity-hsm
Change-Id: I7c976a3a506300de7cf9f5fa1d53741b2e28b654
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/27634
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9474 tests: fix quoting in stack_trap 90/30490/9
Quentin Bouget [Mon, 11 Dec 2017 08:42:18 +0000 (08:42 +0000)]
LU-9474 tests: fix quoting in stack_trap

stack_trap() mishandled single quotes. This patch is not the cleanest
of fixes, but it works.

(sanity-hsm is the only test suite that uses the function, for now)

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: Quentin Bouget <quentin.bouget@cea.fr>
Change-Id: Ia43219e57079abdbfc75485105d572bbfa85caba
Reviewed-on: https://review.whamcloud.com/30490
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9019 lnd: remove remaining cfs_time wrappers 42/31042/2
James Simmons [Fri, 26 Jan 2018 20:18:09 +0000 (15:18 -0500)]
LU-9019 lnd: remove remaining cfs_time wrappers

Remove remaining libcfs time wrappers from ko2iblnd. Also fix bug
in ksocklnd to use cfs_time_seconds for calling schedule_timeout
instead of cfs_durations_sec. That was the opposite of the
conversion we needed. The remaining jiffy use is moved to
time64_t.

Change-Id: I5847d7260ac8a9be1b165423adb7b8e9a53998d2
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31042
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10420 flr: split a mirror from mirrored file 88/30388/18
Bobi Jam [Wed, 6 Dec 2017 04:12:26 +0000 (12:12 +0800)]
LU-10420 flr: split a mirror from mirrored file

Splits a mirror with mirror_id out of a mirrored file.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ib9c2ca7deb329ba0f95880ebeee77563317d0fca
Reviewed-on: https://review.whamcloud.com/30388
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10394 lnd: default to using MEM_REG 49/30749/5
Amir Shehata [Fri, 5 Jan 2018 19:50:33 +0000 (11:50 -0800)]
LU-10394 lnd: default to using MEM_REG

There is a performance drop when using IB_MR_TYPE_SG_GAPS. To
mitigate this, add a module parameter, use_fastreg_gaps, which
defaults to 0. When allocating the memory region, if this parameter
is set to 1 and the hardware supports gaps, use IB_MR_TYPE_SG_GAPS
and output a warning that performance may drop. Otherwise, always use
IB_MR_TYPE_MEM_REG.
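
The selection rule described above can be sketched as follows. This is
only an illustration in Python, not the ko2iblnd code: the function name
choose_mr_type, the hw_supports_gaps argument, and the warning text are
hypothetical.

```python
def choose_mr_type(use_fastreg_gaps, hw_supports_gaps, warn=print):
    """Return the name of the memory region type to allocate."""
    # Gaps are used only when explicitly requested AND supported by the HW.
    if use_fastreg_gaps == 1 and hw_supports_gaps:
        warn("using IB_MR_TYPE_SG_GAPS: performance may drop")
        return "IB_MR_TYPE_SG_GAPS"
    # Default path: plain memory registration.
    return "IB_MR_TYPE_MEM_REG"
```

With the default use_fastreg_gaps=0 the sketch always returns
IB_MR_TYPE_MEM_REG, whatever the hardware supports.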

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I08a8b72756b9b5b5bcb391bf3e979f6d28eb5cbb
Reviewed-on: https://review.whamcloud.com/30749
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoNew tag 2.10.58 2.10.58 v2_10_58 v2_10_58_0
Oleg Drokin [Thu, 8 Feb 2018 19:47:07 +0000 (14:47 -0500)]
New tag 2.10.58

Change-Id: I8dc17a8e4d8182548ccaafc83d643f5a0a09de9f
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10554 lnet: Remove LASSERT on userspace data 00/31100/3
Sonia Sharma [Wed, 31 Jan 2018 10:49:19 +0000 (02:49 -0800)]
LU-10554 lnet: Remove LASSERT on userspace data

If the net information is not provided while adding
NI, it results in an LBUG. Remove the LASSERT on
userspace input and handle it gracefully.

Test-Parameters: trivial
Change-Id: I9d2b6f94cb35e94bc81d5c52936d32cbf833e597
Signed-off-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-on: https://review.whamcloud.com/31100
Tested-by: Jenkins
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-8602 gss: Fix autoconf check for crypto_hash 95/31095/2
Olaf Faaland [Wed, 31 Jan 2018 01:36:38 +0000 (17:36 -0800)]
LU-8602 gss: Fix autoconf check for crypto_hash

If earlier crypto_hash checks resulted in enable_gss=no, do not enable
GSS when gss_conf_test = success.

Fixes regression introduced by https://review.whamcloud.com/27823/
LU-9073 gss: remove newer kernel support

Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: I6c135e638ec6b8350b916f18de73b83cc7dbfb09
Reviewed-on: https://review.whamcloud.com/31095
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10563 kernel: kernel update RHEL7.4 [3.10.0-693.17.1.el7] 34/31034/2
Bob Glossman [Thu, 25 Jan 2018 17:35:45 +0000 (09:35 -0800)]
LU-10563 kernel: kernel update RHEL7.4 [3.10.0-693.17.1.el7]

update RHEL 7.4 kernel to 3.10.0-693.17.1.el7

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I1302c2de6d8eebb33ca741dd357b65579ce71d7d
Reviewed-on: https://review.whamcloud.com/31034
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10565 mdc: add __GFP_COLD for back compatible 28/31028/2
Yang Sheng [Fri, 1 Dec 2017 14:44:46 +0000 (22:44 +0800)]
LU-10565 mdc: add __GFP_COLD for back compatible

__GFP_COLD has been removed upstream. Add it for backward
compatibility.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: I70d0978fa1cdfb9ea22f0b1511a26b28523048b6
Reviewed-on: https://review.whamcloud.com/31028
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10565 osd: use readdir while iterate is kabi_extend 18/31018/3
Yang Sheng [Thu, 25 Jan 2018 17:43:37 +0000 (01:43 +0800)]
LU-10565 osd: use readdir while iterate is kabi_extend

Sometimes the iterate interface is not initialized in ldiskfs.
Use readdir in that case.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: I43260a6d27003895b0ddd1bdf7b0539cc2ad64c5
Reviewed-on: https://review.whamcloud.com/31018
Tested-by: Jenkins
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10560 libcfs: remove extra headers from linux-debug.c 07/31007/3
Andreas Dilger [Fri, 2 Feb 2018 16:45:53 +0000 (08:45 -0800)]
LU-10560 libcfs: remove extra headers from linux-debug.c

Remove headers from linux-debug.c that are no longer (or were possibly
never) needed.  This avoids a compile problem with 4.14 kernels:

  CC [M] lustre/libcfs/libcfs/linux/linux-debug.o
  In file included from lustre/libcfs/libcfs/linux/linux-debug.c:50:0:
    ./arch/x86/include/asm/uaccess.h:31:9: In function set_fs():
      error: dereferencing pointer to incomplete type
      current->thread.addr_limit = fs;

It turns out <asm/uaccess.h> was included twice but we don't need
anything from that header.  Same for <linux/stat.h> and a number of
other extraneous headers.

Sort headers alphabetically so it is easier to avoid duplicates.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I751e796913624cd8c9c95052abe4ecbb823ebbe5
Reviewed-on: https://review.whamcloud.com/31007
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10497 tests: remove tests from sanity-dom 04/31004/2
James Nunez [Wed, 24 Jan 2018 20:47:09 +0000 (13:47 -0700)]
LU-10497 tests: remove tests from sanity-dom

sanity tests 42a, 42b, and 42c are known to fail and are
not run during normal testing due to LU-6493 and LU-9693.
Thus, stop running these sanity tests in sanity-dom.

Test-Parameters: trivial testlist=sanity-dom
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I2c99f44bcae845fd5f9fb0a17bb09a2b4b254ec9
Reviewed-on: https://review.whamcloud.com/31004
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
6 years agoLU-8028 build: fix make dependencies for --disable-modules 65/30965/4
Dmitry Eremin [Mon, 22 Jan 2018 09:48:48 +0000 (12:48 +0300)]
LU-8028 build: fix make dependencies for --disable-modules

When modules are excluded from a build, there is no rule to build
undef.h. As a result, the build fails with the following:
/usr/include/stdc-predef.h:40:1: fatal error:
        /work/lustre-release/undef.h: No such file or directory

Test-Parameters: trivial
Change-Id: I4bf031933964c7d11f9e1c4a88016e1827d11762
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: https://review.whamcloud.com/30965
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-4277 scripts: ofd status integrated with zpool status 07/30907/7
Nathaniel Clark [Wed, 24 Jan 2018 13:35:05 +0000 (08:35 -0500)]
LU-4277 scripts: ofd status integrated with zpool status

Add a zedlet to ZFS ZED that marks the OFD as degraded or undegraded
when a zpool is degraded or online, respectively.

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ia8ec3cf3a31ce24d8598d690bcb0356245712858
Reviewed-on: https://review.whamcloud.com/30907
Tested-by: Jenkins
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10452 lnet: cleanup YAML output 45/30845/6
Amir Shehata [Fri, 12 Jan 2018 03:29:54 +0000 (19:29 -0800)]
LU-10452 lnet: cleanup YAML output

The level of verbosity is high when exporting the YAML configuration
for the purpose of storing it to reconfigure a node. This patch
drops the YAML lines that are not needed when reconfiguring a node,
such as statistics and status.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ie57c761415cfb0ceee8b2dbc0b293e85ae415685
Reviewed-on: https://review.whamcloud.com/30845
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9158 quota: adjust quota ASAP 65/30765/8
Hongchao Zhang [Wed, 6 Dec 2017 08:49:54 +0000 (16:49 +0800)]
LU-9158 quota: adjust quota ASAP

In qsd_upd_thread, the quota adjust request is only scheduled to run
when the current time (in seconds) is larger than the queued time (in
seconds). The transactions in subtest 12b of sanity-quota are all
committed within the same second, so the quota is not freed.

Test-Parameters: alwaysuploadlogs \
envdefinitions=ENABLE_QUOTA=yes,DEBUG_SIZE=64,PTLDEBUG=rpctrace \
clientcount=2 osscount=2 mdscount=2 mdtcount=4 \
austeroptions=-R mdtfilesystemtype=zfs ostfilesystemtype=zfs \
testlist=sanity-quota,sanity-quota,sanity-quota,sanity-quota

Change-Id: I9310237d58a21ee8d47daab8901892bd12016339
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/30765
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9378 utils: split getstripe and find from lfs.1 64/30464/10
Andreas Dilger [Sat, 9 Dec 2017 11:26:54 +0000 (04:26 -0700)]
LU-9378 utils: split getstripe and find from lfs.1

Split the getstripe and find commands from the lfs.1 man page into
their own lfs-getstripe.1 and lfs-find.1 man pages.

While updating the lfs-find.1 man page I realized that the short
options for "-print" and "-print0" were incorrectly documented
in both the usage message as well as the man page, which implies
that the short options were rarely, if ever, used.

Fix the "--print" option to be correctly documented as "-P" instead
of "-p", and deprecate the usage of "-p" for "--print0" in favour
of "-0".  This gives us the opportunity to reclaim "-p" for "--pool",
which is already used as such for "lfs df", "lfs getstripe", and
"lfs setstripe", after some period of printing a deprecation warning.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I9aa7a415d109d269c646fd034ea77785a94cab07
Reviewed-on: https://review.whamcloud.com/30464
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-5152 quota: enforce block quota for chgrp 46/30146/23
Hongchao Zhang [Sat, 27 Jan 2018 21:21:35 +0000 (05:21 +0800)]
LU-5152 quota: enforce block quota for chgrp

When an unprivileged user calls chgrp to change the group
of one of their files, the block quota limit of the new group
should be checked to ensure that it is not exceeded.

The side effects of this patch could be:
1. chgrp from a non-privileged user will be very slow, whether or
not quota is enabled. Since we assume that chgrp issued by a
non-privileged user is very rare, the performance impact is probably
acceptable.
2. If the MDT crashes while performing chgrp, an inconsistency (in
group ownership between MDT and OST objects) will be created. It
should be acceptable.

This patch also fixes a bug in calculating the disk space of a
file for ldiskfs and zfs; the block unit is always 512.

Change-Id: I4b781e94493fe63c8cbd5700dc68293b2504c2ac
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/30146
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10193 tests: test migration between ldiskfs and zfs 06/30106/10
Fan Yong [Mon, 22 Jan 2018 05:44:14 +0000 (13:44 +0800)]
LU-10193 tests: test migration between ldiskfs and zfs

New test cases in conf-sanity.sh

test_108a: migrate from ldiskfs to zfs
test_108b: migrate from zfs to ldiskfs

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ibb4749c316f51b0820648e59235a03a9656f762e
Reviewed-on: https://review.whamcloud.com/30106
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10193 osd-ldiskfs: backup index object with plain format 11/30911/11
Fan Yong [Thu, 25 Jan 2018 08:48:00 +0000 (16:48 +0800)]
LU-10193 osd-ldiskfs: backup index object with plain format

This patch is mainly for migrating a filesystem between the ZFS
backend and the ldiskfs backend via server file-level backup
and restore. It dumps the ldiskfs specially formatted index
object to the local '/index_backup' directory, named with the
source index's FID string and a ".lbx" suffix, when the device
is unmounted.

The format of the backup is as follows (same as the ZFS case):
1) header: 512 bytes, including:
   magic:       4 bytes
   count:       4 bytes
   keysize:     4 bytes
   recsize:     4 bytes
   owner_fid:   16 bytes
   padding:     480 bytes

2) body: after the header, <key, rec> pairs one by one.
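
The layout above can be read back with a short Python sketch. It is not
part of the patch: the little-endian byte order, the packing of the
fields with no alignment gaps, and the assumption that every <key, rec>
pair occupies exactly keysize + recsize bytes are guesses made for
illustration, and the names parse_header and iter_pairs are hypothetical.

```python
import struct

HEADER_SIZE = 512
# Assumption: little-endian, fields in the listed order, no alignment gaps.
# 4 + 4 + 4 + 4 + 16 + 480 = 512 bytes.
HEADER_FMT = "<IIII16s480x"


def parse_header(buf):
    """Decode the 512-byte header; owner_fid is kept as raw bytes."""
    if len(buf) < HEADER_SIZE:
        raise ValueError("buffer shorter than the 512-byte header")
    magic, count, keysize, recsize, owner_fid = struct.unpack_from(
        HEADER_FMT, buf, 0)
    return {"magic": magic, "count": count, "keysize": keysize,
            "recsize": recsize, "owner_fid": owner_fid}


def iter_pairs(buf, hdr):
    """Yield (key, rec) byte strings stored back to back after the header."""
    offset = HEADER_SIZE
    for _ in range(hdr["count"]):
        key = buf[offset:offset + hdr["keysize"]]
        offset += hdr["keysize"]
        rec = buf[offset:offset + hdr["recsize"]]
        offset += hdr["recsize"]
        if len(key) != hdr["keysize"] or len(rec) != hdr["recsize"]:
            raise ValueError("truncated body")
        yield key, rec
```

Reading a whole ".lbx" file into memory and calling
list(iter_pairs(data, parse_header(data))) would then give all pairs,
provided the assumptions above hold for the real format.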

The backup is done when the server is unmounted. The backup behavior
is controlled via a new OSD lproc interface, "index_backup". It is
off by default. Write a non-zero value to this lproc interface to
enable backup on server umount.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I5ac81dd470f3cb29eb3c9ec0e01935c9b1a0fda9
Reviewed-on: https://review.whamcloud.com/30911
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10193 osd-zfs: backup index object with plain format 10/30910/16
Fan Yong [Thu, 25 Jan 2018 08:47:20 +0000 (16:47 +0800)]
LU-10193 osd-zfs: backup index object with plain format

Lustre uses ZAP to implement index objects. When an index object
is archived with tar via the backend ZPL for backup, it is treated
as a regular file; when it is extracted again, it is no longer ZAP
formatted, so Lustre cannot recognize the 'bad' formatted index
object.

On the other hand, each backend FS has its own special format
for index objects, so the index files cannot be migrated from
one backend to another directly.

To resolve this issue, the patch backs up the index object
in plain format to the local '/index_backup' directory, named
with the source index's FID string and a ".lbx" suffix, when
the device is unmounted.

The format of the backup is as follows:
1) header: 512 bytes, including:
   magic:       4 bytes
   count:       4 bytes
   keysize:     4 bytes
   recsize:     4 bytes
   owner_fid:   16 bytes
   padding:     480 bytes

2) body: after the header, <key, rec> pairs one by one.

The backup is done when the server is unmounted. The backup behavior
is controlled via a new OSD lproc interface, "index_backup". It is
off by default. Write a non-zero value to this lproc interface to
enable backup on server umount.

Test-Parameters: envdefinitions=SLOW=yes testlist=sanity-scrub mdtfilesystemtype=zfs ostfilesystemtype=zfs mdscount=2 mdtcount=4
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I01730bc9cfa3ae597f2d8652df9fb76418cf55ce
Reviewed-on: https://review.whamcloud.com/30910
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9564 build: Add server-build for Ubuntu with Kernel 4.4.0 15/29215/12
Martin Schroeder [Tue, 13 Jun 2017 14:42:57 +0000 (10:42 -0400)]
LU-9564 build: Add server-build for Ubuntu with Kernel 4.4.0

This enables compatibility with the current LTS flavours of Ubuntu.
Do note that you need the Xenial HWE Kernel for Ubuntu 14.04.5, as
that distribution originally used a 3.x series Kernel.

The patches have been developed to apply cleanly to the kernel versions
4.4.0-45.66 to 4.4.0-85.108 from the Ubuntu Xenial (and its Trusty backports).

This change also adjusts the Debian scripting to produce the
ldiskfs modules and the server utilities.

To create the server modules run "./configure" with "--enable-server"
and specify "--enable-ldiskfs" and "--with-zfs/-spl" as appropriate.

The call to "make debs" will then produce the server modules and
utils instead of their client versions.

NOTE: This contains a small hack taken from LU-9995 / #29130

Test-Parameters: trivial
Signed-off-by: Martin Schroeder <martin.h.schroeder@intel.com>
Change-Id: I02cd5e9314367ad4e1f8f3d81712f84441a8bc71
Reviewed-on: https://review.whamcloud.com/29215
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Thomas Stibor <t.stibor@gsi.de>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9919 lnet: safe access in debug print 71/28771/3
Amir Shehata [Mon, 28 Aug 2017 22:09:21 +0000 (15:09 -0700)]
LU-9919 lnet: safe access in debug print

Move debug print within the cpt lock to keep
peer access safe.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ic37ff0973367b3eb9cbc0059ffee9c31ecf98c34
Reviewed-on: https://review.whamcloud.com/28771
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-7501 utils: clean up lfs argument handling/docs 92/28592/15
Andreas Dilger [Thu, 17 Aug 2017 23:33:47 +0000 (17:33 -0600)]
LU-7501 utils: clean up lfs argument handling/docs

Change "mdt-hash" option in lfs_setstripe() and lfs_setdirstripe
to use C99 formatting as used for other options.

Add comments for already-used options to lfs_find(), lfs_getstripe(),
and lfs_setstripe() to avoid conflicts in the future.

A few initializers can fit onto the same line with minor formatting
changes, better to be more compact than a slave to exact formatting.

Remove options that are obsoleted by LUSTRE_VERSION_CODE after 2.10.
Remove over-zealous deprecation of "lfs mkdir -c".

Sort options to be in mostly alphabetical order, unless the long
option parsing would return a deprecated short option.

Add deprecation warnings for short/long options that were deprecated
already in commit cdeb2f3a56e8 (http://review.whamcloud.com/22581).

Fix up lfs-setdirstripe.1 and lfs-getdirstripe.1 man pages to list
preferred option names.  Also, lfs-getdirstripe.1 listed some options
that never existed, and others that were named incorrectly.

Move test scripts over to use preferred command and option names.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I74a59ce372115ae0906d0feb37c539a450bed6bd
Reviewed-on: https://review.whamcloud.com/28592
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9727 nodemap: add audit_mode flag to nodemap 13/28313/15
Sebastien Buisson [Wed, 2 Aug 2017 09:44:33 +0000 (18:44 +0900)]
LU-9727 nodemap: add audit_mode flag to nodemap

Add the ability to specify an audit_mode flag on a nodemap.
When set to 1, a client belonging to this nodemap is able to
record file system access events to the Changelogs, if Changelogs are
otherwise activated.
When set to 0, events are not logged into the Changelogs, regardless
of whether Changelogs are activated.
By default, the audit_mode flag is set to 1 in newly created nodemap
entries. It is also set to 1 on the 'default' nodemap.

The idea of disabling audit on a per-nodemap basis is that it would
be possible to have some nodes (e.g. backup, HSM agent nodes) that do
not flood the audit logs.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ieb6c461c443b1734312afef44680d903deee5398
Reviewed-on: https://review.whamcloud.com/28313
Reviewed-by: Jean-Baptiste Riaux <riaux.jb@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10513 acl: prepare small buffer for ACL RPC reply 16/28116/11
Fan Yong [Mon, 15 Jan 2018 14:45:37 +0000 (22:45 +0800)]
LU-10513 acl: prepare small buffer for ACL RPC reply

For most files, the ACL entries are very limited. In such cases,
it is unnecessary to prepare a very large reply buffer to hold
unknown-sized ACL entries for the getattr/open RPCs.
Instead, we can prepare a relatively small buffer, such as
LUSTRE_POSIX_ACL_MAX_SIZE_OLD (260) bytes, which is equal to the
ACL size before patch 64b2fad22a4eb4727315709e014d8f74c5a7f289.
If the target file has too many ACL entries and exceeds the
prepared reply buffer, the MDT replies with -ERANGE to the
client, and the client can then prepare a larger buffer and
try again. Since files with large ACLs are rare, retrying the
getattr/open RPCs will not affect real performance much.

The advantage is that it reduces the client side RAM pressure.
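
The retry scheme described above can be sketched in Python as follows.
This is an illustration of the idea, not the client code: the send_rpc
callable, the function name getattr_with_acl, and the 4096-byte fallback
size are hypothetical; only the 260-byte initial size and the -ERANGE
reply come from the description.

```python
import errno

SMALL_ACL_BUF = 260   # LUSTRE_POSIX_ACL_MAX_SIZE_OLD, per the description
LARGE_ACL_BUF = 4096  # hypothetical fallback size, for illustration only


def getattr_with_acl(send_rpc, small=SMALL_ACL_BUF, large=LARGE_ACL_BUF):
    """Try a small reply buffer first; retry once with a larger one.

    send_rpc(bufsize) is a hypothetical callable returning (rc, acl_bytes),
    where rc is 0 on success or a negative errno value.
    """
    rc, acl = send_rpc(small)
    if rc == -errno.ERANGE:
        # The ACL did not fit: pay for the large buffer only in this rare case.
        rc, acl = send_rpc(large)
    return rc, acl
```

The common case costs one RPC with a small buffer; only files with large
ACLs pay for a second round trip.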

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I4c01b19520cab1cc712e36f3b0225973fba00410
Reviewed-on: https://review.whamcloud.com/28116
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9727 lustre: record CLOSE if OPEN was recorded 29/27929/21
Sebastien Buisson [Tue, 4 Jul 2017 15:21:44 +0000 (00:21 +0900)]
LU-9727 lustre: record CLOSE if OPEN was recorded

Record CL_CLOSE events in changelogs only if the file was opened in
write mode, or if CL_OPEN was recorded.
The Changelogs mask may change between open and close operations,
but it is not a big deal if we have a CL_CLOSE entry with no
matching CL_OPEN. Besides, the Changelogs mask is unlikely to change
often.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I5984a4b07b84d84c3860b9b21abc3b19b7fd9b1a
Reviewed-on: https://review.whamcloud.com/27929
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Matthew S <matthew.sanderson@anu.edu.au>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
6 years agoLU-9727 lustre: implement CL_OPEN for Changelogs 14/28214/23
Sebastien Buisson [Tue, 25 Jul 2017 13:45:58 +0000 (09:45 -0400)]
LU-9727 lustre: implement CL_OPEN for Changelogs

Record OPEN events in Changelogs, and add a new changelog
extension named changelog_ext_openmode to hold open mode.
An OPEN changelog entry is in the form:
7 10OPEN  13:38:51.510728296 2017.07.25 0x242
t=[0x200000401:0x2:0x0] ef=0x7 u=500:500 nid=10.128.11.159@tcp m=-w-
By default, disable recording of OPEN events in Changelogs.
Note that CREAT are still recorded even if OPEN are disabled.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I72c479938ab4782523f1b16aef19fbbc96f43c7f
Reviewed-on: https://review.whamcloud.com/28214
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9727 lustre: add client NID to Changelogs entries 13/28213/18
Sebastien Buisson [Mon, 24 Jul 2017 15:58:05 +0000 (11:58 -0400)]
LU-9727 lustre: add client NID to Changelogs entries

Add a new changelog extension named changelog_ext_nid to hold
client's NID information.
NID info is added to every Changelog entry type except MARK, in
the form 'nid=<nid>':
1 01CREAT 15:50:20.834838318 2017.07.24 0x0 t=[0x200000401:0x2:0x0]
ef=0x3 u=500:500 nid=10.128.11.159@tcp p=[0x200000007:0x1:0x0] fileA
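
A consumer could pick the new field out of such an entry with a small
Python sketch like the one below. It assumes the entry has been joined
onto a single line and is whitespace-separated as in the example above;
the function name parse_changelog_line and the returned dict layout are
hypothetical, and file names containing spaces or '=' are not handled.

```python
def parse_changelog_line(line):
    """Split one changelog entry into fixed fields and key=value fields."""
    tokens = line.split()
    rec = {
        "index": int(tokens[0]),
        "type": tokens[1],
        "time": tokens[2],
        "date": tokens[3],
        "flags": tokens[4],
        "fields": {},
        "name": None,
    }
    for tok in tokens[5:]:
        key, sep, value = tok.partition("=")
        if sep and key.isalnum():
            # e.g. t=[...], ef=0x3, u=500:500, nid=..., p=[...]
            rec["fields"][key] = value
        else:
            # A trailing token without '=' is taken to be the file name.
            rec["name"] = tok
    return rec


entry = ("1 01CREAT 15:50:20.834838318 2017.07.24 0x0 "
         "t=[0x200000401:0x2:0x0] ef=0x3 u=500:500 "
         "nid=10.128.11.159@tcp p=[0x200000007:0x1:0x0] fileA")
record = parse_changelog_line(entry)
```

Here record["fields"]["nid"] holds the client NID string taken from the
'nid=<nid>' field.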

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I1049a699c17d3829d38abfade3187a28ca457bd1
Reviewed-on: https://review.whamcloud.com/28213
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-3397 lprocfs: create "export" /proc file on server 13/6713/17
Emoly Liu [Thu, 9 Oct 2014 16:50:00 +0000 (00:50 +0800)]
LU-3397 lprocfs: create "export" /proc file on server

Similar to the "import" file on the client for each client-to-server
connection, it would be useful to have a file on the server in the
per-nid directory obdfilter/*/exports/$NID/export. This contains
export connection information as in the "import" file, like:
 a793e354-49c0-aa11-8c4f-a4f2b1a1a92b:
     name: MGS
     client: 10.211.55.10@tcp
     connect_flags: [ version, barrier, adaptive_timeouts, ... ]
     connect_data:
        flags: 0x2000011005002020
        instance: 0
        target_version: 2.10.51.0
        export_flags: [ ... ]

Also, sanity.sh test_0d is added to verify this patch.

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I60896090e3a8ad872141a8d4299f0698f0a5636a
Reviewed-on: https://review.whamcloud.com/6713
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
6 years agoLU-10003 lnet: clarify lctl deprecation message 30/31030/2
Amir Shehata [Thu, 25 Jan 2018 22:18:10 +0000 (14:18 -0800)]
LU-10003 lnet: clarify lctl deprecation message

Print out the lctl command which is deprecated

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I2d07a609718c205fba172530202e6f0c1b1d2119
Reviewed-on: https://review.whamcloud.com/31030
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>